Baidu spider

Baidu spider drives my server mad every day. Once every 24 hours, CPU usage jumps from about 1 core to 3 - 5 cores. I have blocked it via robots.txt, but it ignores it. Does anyone have a solution?

  1. Block the IP in your firewall?
  2. Limit the crawl rate in your robots.txt. E.g.

    Crawl-delay: 30

    Request-rate: 1/30



    Crawl-delay: 30 asks for a 30-second pause between requests (roughly 2 pages/minute)

    Request-rate: 1/30 asks for at most 1 page every 30 seconds



    Adjust to your own needs. Note that these directives are advisory; not every crawler honors them.
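If you want to see how a well-behaved crawler would interpret those directives, Python's standard-library robots.txt parser exposes them directly. A minimal sketch (the robots.txt content here is hypothetical, matching the directives above):

```python
import urllib.robotparser

# Hypothetical robots.txt using the directives discussed above
robots_lines = [
    "User-agent: *",
    "Crawl-delay: 30",
    "Request-rate: 1/30",
]

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_lines)

# A compliant crawler should wait 30 seconds between requests...
print(rp.crawl_delay("*"))                 # 30
# ...and fetch at most 1 page per 30 seconds
rate = rp.request_rate("*")
print(rate.requests, rate.seconds)         # 1 30
```

`request_rate()` requires Python 3.6+. Of course, this only tells you what a polite bot *should* do; a bot that ignores robots.txt ignores these too.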

Are you selling internationally?



I have a lot of countries' IP ranges blocked; it cuts down the spam tremendously.



kogi

[quote name='kogi' timestamp='1318812771' post='123839']

Are you selling internationally?



I have a lot of countries' IP ranges blocked; it cuts down the spam tremendously.



kogi

[/quote]



How do you block? With .htaccess?

Yes, with .htaccess, but you can do it in a variety of ways.



I block RIPE = Europe, the Middle East, and Central Asia

and LACNIC = Latin America and some Caribbean Islands





order allow,deny
#RIPE
deny from 2.0.0.0/8
deny from 46.0.0.0/8
deny from 31.0.0.0/8
deny from 62.0.0.0/8
deny from 77.0.0.0/8
deny from 78.0.0.0/7
deny from 80.0.0.0/5
deny from 88.0.0.0/6
deny from 92.0.0.0/6
deny from 109.0.0.0/8
deny from 176.0.0.0/8
deny from 178.0.0.0/8
deny from 188.0.0.0/8
deny from 193.0.0.0/8
deny from 194.0.0.0/7
deny from 212.0.0.0/7
deny from 217.0.0.0/8
#lacnic
deny from 186.0.0.0/8
deny from 187.0.0.0/8
deny from 189.0.0.0/8
deny from 190.0.0.0/8
deny from 191.0.0.0/8
deny from 200.0.0.0/8
deny from 201.0.0.0/8
allow from all
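As a side note, the `order allow,deny` syntax above is for Apache 2.2. If you're on Apache 2.4, the rough mod_authz_core equivalent would be something like this (a sketch, reusing the same example ranges, not a drop-in tested config):

```apache
# Apache 2.4 (mod_authz_core) equivalent of the 2.2 rules above:
# grant everyone, then carve out the denied ranges
<RequireAll>
    Require all granted
    #RIPE
    Require not ip 2.0.0.0/8
    Require not ip 46.0.0.0/8
    # ... remaining RIPE ranges from the list above ...
    #lacnic
    Require not ip 186.0.0.0/8
    # ... remaining LACNIC ranges from the list above ...
</RequireAll>
```

With `Order allow,deny` in 2.2, deny rules take precedence over the `allow from all`, which is why the list above works; `Require not ip` inside `<RequireAll>` achieves the same effect in 2.4.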

Does this method really work? Please advise.

Thanks

First, did you add this to your robots.txt?



User-agent: Baiduspider

Disallow: /
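You can sanity-check that the block is written correctly with Python's standard-library robots.txt parser (example URLs are hypothetical). This only confirms the syntax; whether Baidu actually obeys it is another matter, as the original post suggests:

```python
import urllib.robotparser

# Parse the exact two-line block above
rp = urllib.robotparser.RobotFileParser()
rp.parse(["User-agent: Baiduspider", "Disallow: /"])

# Baiduspider should be denied everywhere...
print(rp.can_fetch("Baiduspider", "http://example.com/some-page"))  # False
# ...while other crawlers are unaffected (no other rules defined)
print(rp.can_fetch("Googlebot", "http://example.com/some-page"))    # True
```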