Programming, SEO, Theory

Will Google stop SERP scrapers by going Ajax?

Some of you might have noticed the buzz going around the net today about Google’s SERPs going Ajax. There is a great post about it here:

Google Web Search Ajax

I’ve heard more than a few people asking whether this will stop automation software from crawling Google’s SERPs to retrieve rankings or AdWords data. In short, the answer is no.

I doubt this development is intended solely to stop people from using automated rank checkers or content scrapers. Google has to know that whatever they change, people will simply adapt their software to keep pace. My personal opinion is that a move to Ajax SERPs would give Google tighter control over how they serve data, as well as a whole slew of new metrics they could leverage. On the flip side, this move could hamper the accessibility of Google’s service from outdated or underpowered computers and browsers.

Regardless, the possibility of Google going Ajax does raise some questions about how people should proceed with developing automation software. Not only is Ajax-delivered content more difficult to automate, but Ajax could also be used to track mouse movement. That would be one of the new metrics available to Google, and it could be a means by which Google begins to distinguish bots from humans (although, at Google’s scale, that would be an incredibly large amount of data to process).

While Ajax content is not as trivial to scrape as traditional content, it is still quite possible. I think that as Google develops ways to clean up their SERPs, programmers and marketers will evolve as well, and that software will inevitably come to mimic human browsing behavior. This includes filling out form data, moving the mouse, sending Google Toolbar data, using the back button, clicking links, storing cookies, etc. And don’t forget, all of that will have to be done at human speeds, not computer speeds.

The requirement to mimic human behavior also means that it will become more and more difficult to multi-thread and make simultaneous requests from one IP range. Rather than managing massive proxy farms, in the future it will be more cost-effective and productive to off-load automation requests to client computers which, when required, send data back to a master server for processing (no, I’m not talking about malware here; more like browser plugins).
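To make the "human speeds, not computer speeds" point a bit more concrete, here is a minimal Python sketch of pacing a sequence of actions with randomized pauses instead of firing them back to back. The function names (`human_delay`, `run_paced`) and the 2 to 8 second range are my own assumptions for illustration, not anything Google or a particular tool prescribes.

```python
import random
import time


def human_delay(rng, low=2.0, high=8.0):
    """Return a pause length in seconds, drawn uniformly from [low, high].

    The 2-8 second default is an arbitrary, assumed range.
    """
    return rng.uniform(low, high)


def run_paced(actions, rng=None, sleep=time.sleep, low=2.0, high=8.0):
    """Run each action in order, pausing a random interval between them.

    `sleep` is injectable so the pacing can be inspected without waiting.
    """
    rng = rng or random.Random()
    results = []
    for i, action in enumerate(actions):
        if i > 0:
            # Pause before every action except the first one.
            sleep(human_delay(rng, low, high))
        results.append(action())
    return results
```

Each action here is just a callable (load a page, click a link, and so on). Because every step waits several seconds, a single worker gets through far fewer requests per hour, which is exactly why the single-IP, multi-threaded approach stops scaling.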

There are already a slew of libraries that mimic browsing behavior and also handle Ajax. Off the cuff, scRUBYt! handles Ajax quite well. Watir can actually open an instance of IE, Firefox, Safari, or even Chrome, and thus fully mimic a browser. Marketers and programmers can use these and other libraries both to mimic human browsing behavior and to make the Ajax calls needed to automate SERPs. In the future, you can expect these kinds of libraries to become more advanced and more customized towards mimicking human behavior. Even so, if the Ajax calls are not encoded, you can simply extract the appropriate call out of the page and make it yourself. An example of this can be seen here.
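As a rough illustration of pulling an unencoded Ajax call out of a page, here is a small Python sketch. The page fragment, the `endpoint` variable name, the `/ajax/search` path, and the `q` parameter are all hypothetical stand-ins; a real page will look different, and this does not describe Google’s actual markup.

```python
import re
from urllib.parse import parse_qsl, urlencode, urljoin, urlsplit, urlunsplit

# Hypothetical page fragment: the script keeps its request URL in plain text.
PAGE = '''
<script>
  var endpoint = "/ajax/search?output=json&start=0";
  loadResults(endpoint + "&q=" + encodeURIComponent(query));
</script>
'''

# Matches the assumed `var endpoint = "..."` assignment in the script.
ENDPOINT_RE = re.compile(r'var\s+endpoint\s*=\s*"([^"]+)"')


def extract_endpoint(html):
    """Return the endpoint string found in the page, or None if absent."""
    match = ENDPOINT_RE.search(html)
    return match.group(1) if match else None


def build_request_url(base_url, endpoint, query):
    """Resolve the endpoint against the page URL and append the query term."""
    parts = urlsplit(urljoin(base_url, endpoint))
    params = parse_qsl(parts.query)
    params.append(("q", query))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Calling `build_request_url("https://example.com/results", extract_endpoint(PAGE), "blue widgets")` yields a URL you could then fetch with any HTTP client. If the call were encoded or signed, this shortcut would not work, and you would be back to driving a real browser.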

At the end of the day, a move on Google’s part to Ajax SERPs may throw a wrench into some people’s software, but it will certainly not herald an end to automation tools like Aaron Wall’s excellent SEOToolbar, or others.
