FidoSysop Blog

Baidu Scraper Bots Using EU IPs To Circumvent Blocking

Baidu Hong Kong IP Whois

If you’re a webmaster, you have most likely seen Baidu crawling your website using China and Hong Kong IPs.

I block all of China as a country because my sites offer little, if anything at all, that Chinese web surfers would be interested in. This has worked well for a couple of years.

But recently I noticed a large number of hits coming from EU (European) IPs. A quick whois lookup shows it’s Baidu circumventing my blocking of China and Hong Kong.
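
If you want to spot this kind of pattern in your own logs, tallying hits per client IP is a quick way to see who is hammering you. Here is a minimal Python sketch, assuming an Apache/Nginx style access log where the client IP is the first field on each line. The function name and sample lines are made up for illustration, and the addresses come from reserved documentation ranges, not real Baidu IPs.

```python
from collections import Counter

def hits_per_ip(log_lines):
    """Count requests per client IP, assuming the IP is the first
    whitespace-separated field on each line (common/combined log format)."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts

# Hypothetical sample lines; 203.0.113.x and 198.51.100.x are reserved
# documentation ranges, not addresses taken from a real log.
sample = [
    '203.0.113.7 - - [01/Jan/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.7 - - [01/Jan/2014:10:00:01 +0000] "GET /about HTTP/1.1" 200 734',
    '198.51.100.2 - - [01/Jan/2014:10:00:02 +0000] "GET / HTTP/1.1" 200 512',
]

top = hits_per_ip(sample).most_common(2)
print(top)  # [('203.0.113.7', 2), ('198.51.100.2', 1)]
```

Any IP that floats to the top of that list is a good candidate for a whois lookup.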

I use CloudFlare to manage my sites’ DNS and CDN, and to quickly block offending countries as a whole. But unfortunately for me, CloudFlare does not support blocking the EU as a whole.

So what can you do to stop these Baidu leeches from strip mining your websites? You can block the IPs either locally or in CloudFlare’s Threat Management console.
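
For the "block locally" route, the core of it is just checking each visitor IP against a list of ranges. Here is a rough sketch using Python’s standard `ipaddress` module. The `BLOCKED_NETWORKS` list and the `is_blocked` helper are my own illustration, and the ranges are reserved documentation blocks, so swap in whatever ranges your own whois lookups turn up.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks (RFC 5737),
# NOT real Baidu addresses. Replace them with the ranges you want to block.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]

def is_blocked(ip_string):
    """Return True if ip_string falls inside any blocked network."""
    try:
        ip = ipaddress.ip_address(ip_string)
    except ValueError:
        return False  # not a valid IP address, so nothing to match
    return any(ip in net for net in BLOCKED_NETWORKS)

print(is_blocked("203.0.113.42"))   # True, inside the /24
print(is_blocked("198.51.100.40"))  # False, outside the /28
```

Blocking by range rather than by single address saves you from chasing the crawler one IP at a time.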

I LOVE CloudFlare! They also cache your site’s pages, giving your visitors a cached version of a page if your site is offline. And the best part is the basic plan is FREE!

CloudFlare Threat Management

Are you using WordPress? If so, let me clue you in to two great FREE plugins that will show your site visitors graphically and add a quick block option. I haven’t blocked Baidu’s site scrapers on CloudFlare yet, but am instead using these two plugins to give them a 403 message every time they hit my sites.
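
If you are not on WordPress, the same "give them a 403" idea is easy to sketch yourself. This is not how the plugins work internally (I have not looked at their code), just a minimal Python WSGI illustration. `block_ips`, `hello_app`, `status_for` and the address in `BLOCKED_IPS` are all made up for the example.

```python
from wsgiref.util import setup_testing_defaults

# Placeholder address from a reserved documentation range, not a real Baidu IP.
BLOCKED_IPS = {"203.0.113.7"}

def block_ips(app, blocked=BLOCKED_IPS):
    """Wrap a WSGI app so that blocked client IPs get a 403 response."""
    def middleware(environ, start_response):
        if environ.get("REMOTE_ADDR") in blocked:
            body = b"403 Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)  # everyone else passes through
    return middleware

def hello_app(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, visitor"]

def status_for(ip):
    """Send a fake request from ip and return the status line it receives."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REMOTE_ADDR"] = ip
    seen = []
    block_ips(hello_app)(environ, lambda status, headers: seen.append(status))
    return seen[0]

print(status_for("203.0.113.7"))  # 403 Forbidden
print(status_for("192.0.2.1"))    # 200 OK
```

One caveat: if your site sits behind a proxy such as CloudFlare, `REMOTE_ADDR` will be the proxy’s address, so you would need to read the visitor IP from a forwarded header (CloudFlare sends one called `CF-Connecting-IP`) instead.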

The first is called Visitor Maps by Mike Challis. The second is a companion plugin that does the IP or referrer blocking: Visitor Maps Extended Referrer Field by President McCheese. With this companion plugin you can quickly block offending IPs, as I have done with Baidu’s leeches.

Baidu Spider Web Scraper Flying EU IP Flag

I really suggest you use CloudFlare to manage your website’s content delivery and enhance your site’s speed and security. It’s really easy to do and FREE!

Once you register your account on CloudFlare, go to the add website tab and type your domain in (I suggest using www). CloudFlare will scan your site and then give you two name servers to use. Just go to your domain’s registrar and edit your name servers to point to CloudFlare. That’s it, you’re done.

I suggest using the Medium security level and the basic cache. Their Rocket Loader will really speed your site up, but some plugins or themes may not function well with it. You can test your site’s speed here on Pingdom Tools.

Good luck! And if you have questions, please comment below. 🙂