Ad Siphoning, Bad Bots, and Crawler Traffic Induced Performance Issues?
We recently started noticing abnormal sluggishness on our public website, on interior plugin store pages, and in the WordPress Dashboard. Thinking it might be related to standard Multisite plugin bloat (i.e., our busy-induced laziness), we went through a relatively thorough house cleaning and threw some performance tests at it too. That seemed to help, but it wasn't conclusive. Getting serious about it, we fired up some monitoring with AlertFox and Yottaa (both fantastic!) and quickly observed interesting non-patterns of peaks and valleys in page load times, as well as some sporadic micro-outages. What began as a performance troubleshooting issue quickly blended into the WordPress security arena.
Another clue came from some other sites we manage on a separate, single hosting account, which have also been experiencing massive performance issues recently. Logs revealed that "googlebot" was pounding the sites in that account at an alarming frequency. We certainly never asked Googlebot to work against its own nature, which already handles crawl throttling fairly smartly, so we began to wonder if something else was going on and started suspecting that substantial naughty bot traffic was hammering away.
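One way to test the suspicion that the "googlebot" in the logs is an impostor is to count requests per IP for user agents that claim to be Googlebot, then check the busiest IPs with a reverse DNS lookup followed by a forward lookup, which is the verification approach Google itself documents. The following is only a minimal sketch: it assumes the server writes the common "combined" access log format, and the function names are made up for illustration.

```python
import re
import socket
from collections import Counter

# Assumed "combined" log format:
# IP ident user [time] "request" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def claimed_googlebot_hits(lines):
    """Count requests per client IP whose user agent claims to be Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[match.group("ip")] += 1
    return hits


def is_real_googlebot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm.

    The resolver functions are parameters only so they can be swapped out
    in tests; gethostbyname_ex handles IPv4 addresses only.
    """
    try:
        host = reverse(ip)[0]
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # The forward lookup must lead back to the same IP address.
        return ip in forward(host)[2]
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as "not verified".
        return False
```

DNS lookups are slow, so it probably makes sense to verify only the handful of IPs at the top of `claimed_googlebot_hits(...).most_common()` rather than every address in the log.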
Enter Google AdWords and China
Ok, before I go further with my theories please feel free to call me crazy and explain why in the comments (naturally supported by data). We're still gathering data to validate what is going on in our scenario, so some of this might be circumstantial observation fueled by other anecdotal historical scenarios.
The facts so far:
- We have a mildly successful WordPress Multisite plugin available here and on WordPress.org.
- Watching Google Alerts, we've seen how many other sites / libraries ruthlessly scrape WordPress.org and pull the repositories of free plugins into their own content for automated SEO pirating. Sound harsh? Well... some of them might truly bring value, and we should be thankful for the free, effortless backlinks, right? But it still smells funny, and for the most part it just dilutes the value of WordPress.org, not to mention pollutes search results. That's an interesting topic all its own.
- We had an episode running Google AdWords where our ads (and paid clicks) were getting farmed out to irrelevant Chinese sites, killing our conversion rate. Someone might say that we had things configured incorrectly. But in the process of straightening that out we found massive gaps in Google's documentation, and we will be publishing a good article on our findings in the near future. The bottom line was that some things weren't working like they were supposed to, but we got it sorted out in the end.
- We know that when we aren't running a strong spam-comment blocking plugin we get a ton of automated spam comments (who doesn't? :) ).
- There's a social / search / authorship connection between our main site and the other sites where we've observed performance issues (even though total human traffic on those sites is very low).
- Now we have some (traffic-induced???) performance issues on our main site.
WordPress Security Measures Against Ad Siphoning and Bot Resource Consumption
Theory: Our site's performance issues are highly related to substantial bad bot traffic consuming resources. We're testing a solution that will validate (or disprove) this, and I will describe it in a moment. Has the moment arrived when super tight security is just part of the game when running a business website? I'm not referring to security against viruses, intrusions, or site compromises due to hacking and cracked passwords. I'm talking about security to protect bandwidth, resource utilization, and user experience in this age of bot swarms and legions of automated crawlers and scrapers. Those are limited resources. You could have the most highly optimized site, but those inhuman code spiders are almost never programmed to be considerate. When resources are especially scarce (like in shared hosting environments), you can't afford to lose any of them to virtual visitors that produce zero value for you and your business. And the thing is, you likely have no idea how much of that traffic is constantly hitting your site.
More disturbing, however, are some interesting implications for this crazy search-driven era. What happens when these bad bots follow your Google Ads? They're never going to convert, but they are happy to burn your paid clicks. Maybe Google has precautions against this? But from our logs it looks like bots can burn clicks just like humans. What if lots of bots burn lots of the clicks you paid for? I'm asking the question, and maybe someone has a definitive answer as to why my assumptions are incorrect. But this seems like a super wasteful possible outcome. In the meantime, Google has a never-ending supply of automated free money.
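For anyone who wants to check their own logs for this pattern, here is a rough sketch of one heuristic: flag any IP that lands on the site through an ad click and then requests a burst of further pages within a few seconds, which is not how a human reads. It assumes AdWords auto-tagging is turned on, so ad landing URLs carry a `gclid` query parameter. The function name, the input shape, and the thresholds are all invented for illustration, and a hit here is only a hint, not proof of click fraud.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit


def suspicious_ad_clicks(requests, burst_size=10, burst_seconds=5):
    """Return IPs that arrive via an ad click and then crawl in a burst.

    requests: iterable of (ip, unix_time, url) tuples in chronological order.
    An IP is flagged when at least burst_size further requests follow its
    most recent ad landing within burst_seconds.
    """
    landing_time = {}            # ip -> time of its most recent ad landing
    followups = defaultdict(int)  # ip -> requests seen since that landing
    flagged = set()
    for ip, when, url in requests:
        query = parse_qs(urlsplit(url).query)
        if "gclid" in query:
            # Assumption: a gclid parameter marks an ad click landing.
            landing_time[ip] = when
            followups[ip] = 0
        elif ip in landing_time and when - landing_time[ip] <= burst_seconds:
            followups[ip] += 1
            if followups[ip] >= burst_size:
                flagged.add(ip)
    return flagged
```

The thresholds would need tuning against real traffic, since a page that loads many assets can also produce a quick burst if asset requests are not filtered out first.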
Safety in the Power of Innovative Ecosystems like WordPress
So how do innovative ecosystems like the WordPress community fight back? One of our measures to test and defend against bad bot traffic is the amazing plugin called WordFence Security. The existence of high-caliber plugins like this for free blows my mind. We set this up primarily to use its firewall features and see if blocking / throttling our traffic can improve our site performance, but it is loaded with a ton of other excellent security features too. The log captured in this post's featured image is from the WordFence live monitoring capability, and shows what appears to be a crawler siphoning off one of our Google Ads and hopping in to crawl the site. You have to read it from the bottom up, since the live monitoring keeps the most recent activity at the top. Fire that plugin up and just watch the traffic in the Crawlers tab. You will be floored by the non-human activity on your site.
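To make the throttling idea concrete, here is a minimal sketch of a per-IP sliding-window limiter. This is not how WordFence implements its firewall (we haven't looked at its internals). It is just a generic illustration of the technique, and the class name and limits are invented for the example.

```python
from collections import defaultdict, deque


class SlidingWindowThrottle:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # client IP -> times of allowed requests

    def allow(self, client_ip, now):
        """Return True if the request may proceed, False if it should be throttled."""
        recent = self._recent[client_ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A caller would create something like `SlidingWindowThrottle(max_requests=60, window_seconds=60)` and call `allow(ip, time.time())` on each request. Rejected requests are not recorded in this sketch, so a client that slows down regains access as soon as its older requests age out.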
On top of this, there are some amazing innovations in the WP.org plugin library that were born in response to internet-wide threats to WordPress security (like the widespread botnet attacks on standard admin accounts). It's amazing to be a part of the self-defending and evolving Ecosystem of Innovation surrounding WordPress.