Help with wrong pages being indexed

Hi,

I have a lot of links being indexed by Google that are wrong.



“Close window” is the page title. Have a look:

[url]http://www.funky-sheila.co.uk/shop/image.php?object_type=detailed&image_id=3993&window=popup[/url]



The worst part is that this is multiplied many times. Please help with this.



This link is also wrong; it's not supposed to be indexed:



[url]http://www.funky-sheila.co.uk/shop/image.php?object_type=detailed&image_id=4026&window=popup[/url]

I know it has to do with robots.txt, but I don't know which changes I should make in there.



Thanks in advance for any help.

Add:


Disallow: /image.php
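
That single line covers every URL that starts with /image.php, including the popup ones with query strings from your first post; a minimal sketch:

User-agent: *
Disallow: /image.php
# blocks e.g. /image.php?object_type=detailed&image_id=3993&window=popup

URLs that are already indexed can take a while to drop out of Google's results after you add the rule.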

Hi,

Thanks! :p

Would you know how to disallow the links in the global configuration as well, such as the customer login, the breadcrumbs (apart from the sitemap), the promotions link and the affiliates link? I've also noticed that internal searches in the shop are being indexed; I don't know if that's good… probably not.

Also, can you tell me the difference between disallow and noindex?

Thanks again.

If you have SEO enabled then add:


Disallow: /index.php

[quote name=‘Tool Outfitters’]If you have SEO enabled then add:


Disallow: /index.php[/quote]

Very dangerous.

[quote name=‘gabrieluk’]…can you tell me the difference between disallow and noindex?

Thanks again.[/quote]



1. “Disallow” prevents robots (bots) from crawling your site, so it should not appear in search results.



2. “noindex” just tells the robots that you don't want the site indexed. The site will still be crawled, though, and can appear in other search engines and directories. For example, Google will respect the “noindex” directive, but if it finds your site somewhere else on the web, it may still be indexed.
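
To make the difference concrete: “Disallow” lives in the robots.txt file, while “noindex” normally goes on the page itself as a meta tag (or is sent as an X-Robots-Tag HTTP header). A rough illustration, not specific to any shop:

# robots.txt – tells bots not to crawl URLs starting with /image.php
User-agent: *
Disallow: /image.php

<!-- in the page's <head> – the page may be crawled, but shouldn't be indexed -->
<meta name="robots" content="noindex">

Keep in mind that if a URL is disallowed in robots.txt, Google can't fetch it at all, so it will never see a noindex tag placed on that page.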





I created a new thread about robots.txt here: [url]http://forum.cs-cart.com/showthread.php?p=78830#post78830[/url]

[quote name=‘indy0077’]Very dangerous.[/QUOTE]



Care to explain?



No problems in 3 years of use.

[quote name=‘Tool Outfitters’]Care to explain?



No problems in 3 years of use.[/quote]

It should be seen as a general warning to users who are not very familiar with .htaccess and the static/dynamic distinction. It can happen that they use this rule on a static page and … it will be gone…

[quote name=‘indy0077’]It should be seen as a general warning to users who are not very familiar with .htaccess and the static/dynamic distinction. It can happen that they use this rule on a static page and … it will be gone…[/QUOTE]



Please refrain from misleading other readers. There is nothing “very dangerous” about my post.

[quote name=‘Tool Outfitters’]Please refrain from misleading other readers. There is nothing “very dangerous” about my post.[/quote]

Just so you understand what I'm talking about: if you use

Disallow: /index.php

on a site which doesn't include this in its .htaccess file:

## Redirect index.php to root "/"
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.yourdomain.com [R=301,L]

or if you don't use the CS-Cart SEO addon, then some pages might not be indexed.



If you set:

Disallow: /index.php

it will block every URL that starts with index.php in the root directory.



Each directory can have a default page. In CS-Cart's case, the “index.php” file is served by default when a user reaches “yourdomain.com”. You can block the “index.php” file, but that would effectively block the “.com” at the same time.
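
Just to illustrate the prefix matching (these example URLs are made up, not taken from any real store):

Disallow: /index.php
# matched (blocked):  /index.php
# matched (blocked):  /index.php?dispatch=products.view&product_id=123
# not matched:        /some-category/some-product.html  (an SEO-addon style URL)

So with the SEO addon rewriting your public URLs, the rule only removes the duplicate index.php?dispatch=… versions; without the addon, it would block the store's actual pages.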

[quote name=‘Tool Outfitters’]If you have SEO enabled then add:


Disallow: /index.php[/quote]


[quote name=‘indy0077’]… or if you don’t use the CS-Cart SEO addon, then some pages might not be indexed.

[/quote]



Didn’t he just say that?

Hahaha… now I can see what you mean; I didn't notice that… sorry.

I have the SEO addon running, so I will use that. Thanks Tool, Indy and Jesse. This forum is really great and I'm really happy to have chosen CS-Cart :mrgreen:

[quote name=‘Tool Outfitters’]Add:


Disallow: /image.php[/QUOTE]



This is a copy of my robots.txt:



User-agent: *

Disallow: /classes/

Disallow: /images/

Disallow: /skins/

Disallow: /payments/

Disallow: /image.php

Disallow: /admin.php

Disallow: /store_closed.html

Sitemap: [url]http://www.funky-sheila.co.uk/shop/sitemap.xml[/url]



It's the default text that comes with CS-Cart. As you can see, image.php is already disallowed. How can this be happening?

The only reason for this, maybe, is that CS-Cart is installed in a subfolder. There's another CMS script in the root of the domain, which also has a robots.txt. I don't know whether it's correct to have two robots.txt files, one for the root folder and another for the CS-Cart subfolder, or whether I should have a single robots.txt in the root folder of the domain with this:

Disallow: /shop/image.php

What do you think? :rolleyes:

If it’s installed in a subfolder (not as a subdomain) then you have to use the robots.txt in your root:


Disallow: /subfolder/image.php



You can’t use a second robots.txt in the subfolder because you can’t submit a “subfolder” in the Google Webmaster Tools.
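
As a rough sketch (assuming the cart lives under /shop/ as in your earlier post), the root robots.txt could simply carry the CS-Cart default rules with the subfolder prefixed, next to whatever rules your root CMS already needs:

User-agent: *
Disallow: /shop/classes/
Disallow: /shop/images/
Disallow: /shop/skins/
Disallow: /shop/payments/
Disallow: /shop/image.php
Disallow: /shop/admin.php
Disallow: /shop/store_closed.html
Sitemap: http://www.funky-sheila.co.uk/shop/sitemap.xml

robots.txt also allows more than one Sitemap: line, so the sitemap of the root CMS can be listed in the same file.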

[quote name=‘indy0077’]If it’s installed in a subfolder (not as a subdomain) then you have to use the robots.txt in your root:


Disallow: /subfolder/image.php



You can’t use a second robots.txt in the subfolder because you can’t submit a “subfolder” in the Google Webmaster Tools.[/QUOTE]



Oh… I see… but there's something you said that I don't understand:

“you can't submit a ‘subfolder’ in the Google Webmaster Tools”

I submitted the subfolder to Google Webmaster Tools and it was accepted. Here's what I did:

I created a Webmaster Tools account and added www.mydomain.com to it, then verified the website with the google.html file. After that I registered www.mydomain.co.uk/shop and it was accepted too; I also verified the subfolder /shop.

I did this mostly because I was finding it difficult to send Google a single sitemap.xml covering two independent scripts (the other CMS plus CS-Cart). So I registered them independently, which lets me use the CS-Cart XML sitemap addon from SnoRocket and the sitemap addon from the other CMS.

Does that make sense??? :confused:

[quote name=‘gabrieluk’]Oh… I see… but there's something you said that I don't understand:

“you can't submit a ‘subfolder’ in the Google Webmaster Tools”

I submitted the subfolder to Google Webmaster Tools and it was accepted. Here's what I did:

I created a Webmaster Tools account and added www.mydomain.com to it, then verified the website with the google.html file. After that I registered www.mydomain.co.uk/shop and it was accepted too; I also verified the subfolder /shop.

I did this mostly because I was finding it difficult to send Google a single sitemap.xml covering two independent scripts (the other CMS plus CS-Cart). So I registered them independently, which lets me use the CS-Cart XML sitemap addon from SnoRocket and the sitemap addon from the other CMS.

Does that make sense??? :confused:[/quote]

Yes, you're right, I forgot that it supports subfolders as well. In that case you should disallow the CS-Cart subfolder in the root robots.txt.

[quote name=‘indy0077’]Yes, you're right, I forgot that it supports subfolders as well. In that case you should disallow the CS-Cart subfolder in the root robots.txt.[/QUOTE]

I don't know if that's a good idea, since I need to put the disallow image.php command in the robots.txt in the root folder of the domain. Even though the subfolder is registered as another website, Google is not reading the robots.txt from CS-Cart, as you can see from my images being indexed… I'm a little confused now…

I think that even if they are registered as two sites, Google still sees them as the same domain, just separated by the folder. So maybe, in this case, Google reads the first robots.txt it finds in the domain, regardless of whether there's a second website in a subfolder.

[quote name=‘gabrieluk’]I don't know if that's a good idea, since I need to put the disallow image.php command in the robots.txt of the main site. Even though the subfolder is registered as another website, Google is not reading it, as you can see from my images being indexed… I'm a little confused now…[/quote]

You will have to put the command into the robots.txt in the subfolder. By disallowing the subfolder you tell the robots not to go into that folder; the robots.txt in the subfolder will then take care of the crawling there.



However, I can understand that submitting subfolders is now allowed as well, because of the many scripts that use their own sitemaps etc., but I know from the past that robots accepted only one robots.txt file, namely the one in the root.

Still confused…

I'm afraid to disallow the subfolder in the root of the domain, as many pages are indexed and the site is doing fine with Google for now. Are you sure that if I disallow the subfolder in the robots.txt in the root of the domain, the second robots.txt in the subfolder will be active?? Are you 100 per cent sure?? So now you are saying I should have two robots.txt files… Could any other developer share their experience with us?