Baidu's spider drives my server mad every day. Once every 24 hours, CPU usage jumps from 1 core to 3-5 cores. I have blocked it via robots.txt, but it ignores it. Any of you got a solution?
- Block the IP in your firewall?
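A minimal sketch if you do go the firewall route (assuming iptables on a Linux box; 180.76.0.0/16 is one range Baiduspider is often reported to crawl from, so verify against the offending IPs in your own access log first):
iptables -A INPUT -s 180.76.0.0/16 -j DROP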
- Limit the pages per minute in your robots.txt, e.g.
Crawl-delay: 30
Request-rate: 1/30
Crawl-delay: 30 asks the crawler to wait 30 seconds between fetches (about 2 pages per minute).
Request-rate: 1/30 asks for at most 1 page every 30 seconds.
Adjust for your own needs.
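Note that both directives only take effect inside a User-agent stanza, and both are nonstandard extensions that Baidu is not guaranteed to honor. A minimal robots.txt aimed just at Baidu (Baiduspider is Baidu's published user-agent) would be:
User-agent: Baiduspider
Crawl-delay: 30
Request-rate: 1/30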
Are you selling internationally?
I have a lot of countries' IP ranges blocked; it cuts down the spam tremendously.
kogi
[quote name='kogi' timestamp='1318812771' post='123839']
Are you selling internationally?
I have a lot of countries' IP ranges blocked; it cuts down the spam tremendously.
kogi
[/quote]
How do you block? With .htaccess?
Yes, with .htaccess, but you can do it in any of a variety of ways.
I block RIPE (Europe, the Middle East, and Central Asia) and LACNIC (Latin America and some Caribbean islands):
Order Allow,Deny
#RIPE
deny from 2.0.0.0/8
deny from 46.0.0.0/8
deny from 31.0.0.0/8
deny from 62.0.0.0/8
deny from 77.0.0.0/8
deny from 78.0.0.0/7
deny from 80.0.0.0/5
deny from 88.0.0.0/6
deny from 92.0.0.0/6
deny from 109.0.0.0/8
deny from 176.0.0.0/8
deny from 178.0.0.0/8
deny from 188.0.0.0/8
deny from 193.0.0.0/8
deny from 194.0.0.0/7
deny from 212.0.0.0/7
deny from 217.0.0.0/8
#LACNIC
deny from 186.0.0.0/8
deny from 187.0.0.0/8
deny from 189.0.0.0/8
deny from 190.0.0.0/8
deny from 191.0.0.0/8
deny from 200.0.0.0/8
deny from 201.0.0.0/8
allow from all
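If the goal is only to stop Baidu rather than whole regions, you can also deny by user-agent instead of by IP. A sketch using Apache's mod_setenvif, in the same Apache 2.2 Order/Allow/Deny style as above:
# Flag any request whose User-Agent contains "Baiduspider"
SetEnvIfNoCase User-Agent "Baiduspider" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
This only stops bots that identify themselves honestly, which is why the IP-range block above is the heavier hammer.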
Is this method really working? Please advise.
Thanks
First, did you add this to your robots.txt?
User-agent: Baiduspider
Disallow: /
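It is also worth confirming the traffic really is Baidu before fighting it: genuine Baiduspider IPs should reverse-resolve to a hostname under baidu.com (e.g. *.crawl.baidu.com). A quick check against an IP pulled from your access log (the address here is just an example):
host 180.76.15.1
If the PTR record does not point at a baidu.com hostname, it is an impostor spoofing the user-agent, robots.txt will never help, and blocking the IP is the only fix.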