Wow! I’m having a bugger of a time trying to get indexed with Google. Has anyone put together a robot.txt file so Google’s robot crawls the proper directories? If so, would you be so kind as to share it, as I really have no idea what I’m doing. I submitted the sitemap as they suggested, and after two weeks my site finally got crawled but I received an error saying the sitemap was in the wrong format. I submitted [url]http://www.domain.ca/index.html?target=sitemap[/url]
Additionally, I tried to create an HTML catalog and the script locked up and froze. I’m guessing my catalog is too big???
Help would be greatly appreciated.
You don’t need robots.txt to get crawled. If the bot finds inbound links to your site, it will crawl it anyway. Use robots.txt to block off the areas you do not want the bot to access.
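For example, a minimal robots.txt that lets every crawler in while keeping it out of a couple of directories might look like this (the paths here are just placeholders; substitute your own):

```text
# Applies to all crawlers
User-agent: *
# Keep bots out of these (hypothetical) directories
Disallow: /cgi-bin/
Disallow: /admin/
```

An empty Disallow line (`Disallow:`) means nothing is blocked at all. The file must sit at the root of the domain, e.g. http://www.domain.ca/robots.txt.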
Also, that is not the sitemap you should submit. You need to submit a sitemap in Google’s specific XML format. Search the forums for it.
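To give you an idea of the format: Google expects an XML file following the Sitemap protocol, not a link to your HTML site map. A bare-bones sketch for one page (the date and frequency values below are made-up examples) looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Full URL of the page -->
    <loc>http://www.domain.ca/index.html</loc>
    <!-- Optional hints for the crawler (example values) -->
    <lastmod>2006-01-15</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
```

You add one `<url>` entry per page, save it as something like sitemap.xml in your site root, and submit that URL instead.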
thanks for your help. I’ll go browsing the forums and hopefully can put together something that works this time.
What I read on Google was that if you have a robot.txt file it gets read every day, as opposed to waiting up to two weeks for their crawler to show up and crawl your site. And since I’ve already lost two weeks because of my mistake, I’m trying to get crawled sooner rather than later.
PS: what is a sig?
It’s robotS.txt, and you don’t really need it unless you want to disallow access to crawlers. Your site will not get crawled faster because you have one. You can create a blank robots.txt, however, so you won’t get hit by 404 Not Found errors in your logs.
The fastest way to get your new site indexed is to get some links from other sites. Even with a few links coming in, the first indexing takes a while.
A sig is a signature.