help with index of wrong pages

 
  • gabrieluk
  • Senior Member
  • Members
  • Join Date: 21-Jul 09
  • 133 posts

Posted 20 March 2010 - 12:09 AM #1

Hi,
I have a lot of links being indexed by Google that are wrong.

"Close window" is the title; have a look:
http://www.funky-she...93&window=popup

The worst part is that this is repeated many times. Please help with this.

This link is also wrong; it's not supposed to be indexed:

http://www.funky-she...26&window=popup
I know it has to do with robots.txt, but I don't know which changes I should make there.

Thanks in advance for any help
Number 1

 
  • The Tool
  • Been Here Way Too Long Member
  • Members
  • Join Date: 30-Mar 07
  • 3830 posts

Posted 20 March 2010 - 01:39 AM #2

Add:

Disallow: /image.php
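
As a rough illustration of what this rule covers, the sketch below feeds it to Python's standard `urllib.robotparser` and checks a popup-style URL. The host `www.example.com` and the shortened query string are placeholders, not the real shop URLs, and real crawlers may apply their own matching details.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served from the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /image.php
"""


def is_allowed(robots_txt, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Disallow matches on the start of the path, so any query string
    # after /image.php is covered by the same rule.
    print(is_allowed(ROBOTS_TXT, "http://www.example.com/image.php?window=popup"))
    print(is_allowed(ROBOTS_TXT, "http://www.example.com/index.php"))
```

The rule only helps if the path in the rule matches the path in the indexed URL exactly from the first slash onward.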


 
  • gabrieluk
  • Senior Member
  • Members
  • Join Date: 21-Jul 09
  • 133 posts

Posted 21 March 2010 - 01:43 AM #3

Hi,
thanks :p
Would you know how to disallow the links in the global configuration, such as the customer login and the breadcrumb links (apart from the sitemap), like the promotions and affiliates links? I noticed that internal searches in the shop are being indexed; I don't know if that's good... probably not.
Also, can you tell me the difference between Disallow and noindex?
Thanks again

 
  • The Tool
  • Been Here Way Too Long Member
  • Members
  • Join Date: 30-Mar 07
  • 3830 posts

Posted 21 March 2010 - 04:00 AM #4

If you have SEO enabled then add:

Disallow: /index.php


 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 21 March 2010 - 04:10 PM #5

If you have SEO enabled then add:

Disallow: /index.php

Very dangerous.
.
CS-Cart Professional €160.00 | CS-Cart Multi-Vendor €625.00 | CS-Cart Hosting | SSL Certificates
.
CS-Cart Optimized Servers *** USA & UK VPS Servers

 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 21 March 2010 - 04:57 PM #6

...can you tell me the difference between Disallow and noindex?
thanks again


1.
"Disallow" (a rule in robots.txt) tells robots (bots) not to crawl the matching URLs, so their content should not appear in search results. A disallowed URL can still be listed, without its content, if the robot finds links to it elsewhere on the web.

2.
"noindex" (a robots meta tag or HTTP header on the page itself) tells robots that you don't want the page indexed. The page is still crawled, and the robot has to be able to crawl it in order to see the instruction. For example, Google will honour "noindex", but if the page is also disallowed in robots.txt and Google finds it somewhere else on the web, the URL may still be indexed.


I created a new thread about robots.txt here: http://forum.cs-cart...78830#post78830
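
To make the difference between the two mechanisms concrete, here is a minimal sketch using only Python's standard library. It assumes "noindex" is expressed as a `<meta name="robots">` tag in the page, and it uses made-up URLs and HTML. The point it shows is that a robot can only act on "noindex" if robots.txt lets it fetch the page in the first place.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        for part in (attributes.get("content") or "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the page's robots meta tag contains "noindex"."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def crawler_sees_noindex(robots_txt, url, html, agent="*"):
    """A robot only reads the meta tag if robots.txt lets it fetch the page."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url) and has_noindex(html)
```

In this sketch, a page that is both disallowed and marked "noindex" returns False from `crawler_sees_noindex`, because the robot never gets to read the tag.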

 
  • The Tool
  • Been Here Way Too Long Member
  • Members
  • Join Date: 30-Mar 07
  • 3830 posts

Posted 22 March 2010 - 09:35 AM #7

Very dangerous.


Care to explain?

No problems in 3 years of use.

 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 22 March 2010 - 01:07 PM #8

Care to explain?

No problems in 3 years of use.

It should be seen as a general warning to users who are not very familiar with .htaccess and the difference between static and dynamic pages. They might use this rule on a static page and ... it will be gone...

 
  • The Tool
  • Been Here Way Too Long Member
  • Members
  • Join Date: 30-Mar 07
  • 3830 posts

Posted 22 March 2010 - 01:21 PM #9

It should be seen as a general warning to users who are not very familiar with .htaccess and the difference between static and dynamic pages. They might use this rule on a static page and ... it will be gone...


Please refrain from misleading other readers. There is nothing "very dangerous" about my post.

 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 22 March 2010 - 02:53 PM #10

Please refrain from misleading other readers. There is nothing "very dangerous" about my post.

Just so you understand what I'm talking about. If you use

Disallow: /index.php
on a site which doesn't include this in its .htaccess file:
## Redirect index.php to root "/"
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.yourdomain.com [R=301,L]
or on which you don't use the CS-Cart SEO addon, then some pages might not be indexed.
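
For readers unsure what the RewriteCond above actually matches, here is a small sketch that translates its pattern to Python's `re` module (Apache's escaped spaces become plain spaces). `%{THE_REQUEST}` holds the raw request line; the request lines and the query string used below are made-up examples.

```python
import re

# Python equivalent of the RewriteCond pattern. %{THE_REQUEST} is the raw
# request line sent by the client, e.g. "GET /index.php HTTP/1.1".
THE_REQUEST_PATTERN = re.compile(r"^[A-Z]{3,9} /index\.php HTTP/")


def would_redirect(request_line):
    """Return True if the request line satisfies the RewriteCond."""
    return THE_REQUEST_PATTERN.search(request_line) is not None


if __name__ == "__main__":
    for line in (
        "GET /index.php HTTP/1.1",               # exactly /index.php
        "GET /index.php?product_id=1 HTTP/1.1",  # hypothetical dynamic URL
        "GET / HTTP/1.1",                        # bare root
    ):
        print(line, "->", would_redirect(line))
```

Under this reading, only a request for exactly /index.php with no query string meets the condition; requests with a query string do not.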

If you set:
Disallow: /index.php
it will block pages whose URL starts with index.php in the root directory.

Each directory can have a default page. In CS-Cart's case, "index.php" is the default file served when a user reaches "yourdomain.com". You can block the "index.php" file, but that would effectively block the ".com" at the same time.
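
One way to test which URLs the rule touches is to run it through Python's standard `urllib.robotparser`, which applies plain prefix matching on the URL path. This is only a sketch: the host, the query string and the SEO-style URL are hypothetical, and a search engine's own matcher may behave differently.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php
"""

# Hypothetical URLs on a placeholder host.
CANDIDATES = [
    "http://www.example.com/",                        # bare root
    "http://www.example.com/index.php",               # default file, named explicitly
    "http://www.example.com/index.php?product_id=1",  # dynamic URL
    "http://www.example.com/some-product.html",       # SEO-style URL
]


def blocked_urls(robots_txt, urls, agent="*"):
    """Return the subset of `urls` that `agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, CANDIDATES):
        print("blocked:", url)
```

With prefix matching, the bare root "/" does not start with "/index.php", so only URLs that name index.php explicitly are blocked. Whether that matters depends on how the shop links to its own pages, which is why the SEO addon or the redirect is mentioned above.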

 

Posted 22 March 2010 - 03:09 PM #11

If you have SEO enabled then add:

Disallow: /index.php


("IF" Added for context) IF or you don't use the CS-Cart SEO addon, then some sites might not be indexed.


Didn't he just say that?
I've moved on from CS-Cart to WooC******** - If you need anything I can be of little help.

 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 22 March 2010 - 03:14 PM #12

Hahaha... I can see now what you mean, didn't see that... sorry.

 
  • gabrieluk
  • Senior Member
  • Members
  • Join Date: 21-Jul 09
  • 133 posts

Posted 22 March 2010 - 04:07 PM #13

I have the SEO addon running, so I will use that. Thanks Tool, Indy and Jesse. This forum is really great and I'm really happy to have chosen CS-Cart :mrgreen:

 
  • gabrieluk
  • Senior Member
  • Members
  • Join Date: 21-Jul 09
  • 133 posts

Posted 22 March 2010 - 06:53 PM #14

Add:

Disallow: /image.php


This is a copy of my robots.txt:

User-agent: *
Disallow: /classes/
Disallow: /images/
Disallow: /skins/
Disallow: /payments/
Disallow: /image.php
Disallow: /admin.php
Disallow: /store_closed.html
Sitemap: http://www.funky-she...hop/sitemap.xml

It's the default file that comes with CS-Cart. As you can see, image.php is already disallowed. How can that be...
The only reason I can think of is that CS-Cart is installed in a subfolder. There's another CMS script in the root of the domain, which also has a robots.txt. I don't know if it is correct to have two robots.txt files, one for the root folder and another for the CS-Cart subfolder, or if I should have one robots.txt in the root folder of the domain with this:
Disallow: /shop/image.php
What do you think? :rolleyes:

 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 22 March 2010 - 07:08 PM #15

If it's installed in a subfolder (not as a subdomain) then you have to use the robots.txt in your root:

Disallow: /subfolder/image.php

You can't use a second robots.txt in the subfolder because you can't submit a "subfolder" in the Google Webmaster Tools.
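
Whatever Webmaster Tools accepts, robots that follow the Robots Exclusion Protocol request robots.txt only from the root of the host, and rule paths are compared against the full URL path. A minimal sketch of both points follows; `www.example.com` is a placeholder host and `/shop` stands for the CS-Cart subfolder mentioned in this thread.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_txt_url(page_url):
    """Return the robots.txt URL a robot would consult for `page_url`."""
    parts = urlsplit(page_url)
    # Always the top-level path of the host, never a subfolder.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_allowed(robots_txt, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    popup = "http://www.example.com/shop/image.php?window=popup"
    print(robots_txt_url(popup))
    # Rule written as if the shop lived in the root: does not match /shop/...
    print(is_allowed("User-agent: *\nDisallow: /image.php\n", popup))
    # Rule with the subfolder included: matches.
    print(is_allowed("User-agent: *\nDisallow: /shop/image.php\n", popup))
```

This would explain why a default CS-Cart robots.txt sitting inside the subfolder has no visible effect on what gets indexed.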

 
  • gabrieluk
  • Senior Member
  • Members
  • Join Date: 21-Jul 09
  • 133 posts

Posted 22 March 2010 - 08:18 PM #16

If it's installed in a subfolder (not as a subdomain) then you have to use the robots.txt in your root:

Disallow: /subfolder/image.php

You can't use a second robots.txt in the subfolder because you can't submit a "subfolder" in the Google Webmaster Tools.


Oh... I see... but there's something you said that I don't understand:

you can't submit a "subfolder" in the Google Webmaster Tools

I submitted the subfolder to Google Webmaster Tools and it was accepted. Here is what I did:
I created a webmaster account and added www.mydomain.com to the account. I verified the website with the google.html file. Afterwards I registered www.mydomain.co.uk/shop and it was accepted. I also verified the subfolder /shop.
I did this mainly because I was finding it difficult to send Google a single XML sitemap covering two independent scripts (the other CMS plus CS-Cart). So I registered them independently, so that I could use the CS-Cart XML sitemap addon from SnoRocket and the addon from the other CMS script.
Does that make sense? :confused:

 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 22 March 2010 - 08:35 PM #17

Oh... I see... but there's something you said that I don't understand:

you can't submit a "subfolder" in the Google Webmaster Tools

I submitted the subfolder to Google Webmaster Tools and it was accepted. Here is what I did:
I created a webmaster account and added www.mydomain.com to the account. I verified the website with the google.html file. I registered www.mydomain.co.uk/shop and it was accepted. I also verified the subfolder /shop.
I did this mainly because I was finding it difficult to send Google a single XML sitemap covering two independent scripts (the other CMS plus CS-Cart). So I registered them independently, so that I could use the CS-Cart XML sitemap addon from SnoRocket and the addon from the other CMS script.
Does that make sense? :confused:

Yes, you're right, I had forgotten that it supports subfolders as well. In that case you should disallow the CS-Cart subfolder in the robots.txt in the root.

 
  • gabrieluk
  • Senior Member
  • Members
  • Join Date: 21-Jul 09
  • 133 posts

Posted 22 March 2010 - 08:46 PM #18

Yes, you're right, I had forgotten that it supports subfolders as well. In that case you should disallow the CS-Cart subfolder in the robots.txt in the root.

I don't know if this is a good idea, since I need to put the Disallow rule for image.php in the robots.txt in the root folder of the domain. Even though the subfolder is registered as another website, Google is not reading the robots.txt from CS-Cart, as you can see from my images being indexed... I'm a little confused now...
I think that even if they are registered as two sites, Google still sees them as being in the same domain, just separated by the folder. So maybe in this case Google reads the first robots.txt it finds in the domain, regardless of whether there's a second website in a subfolder.

 
  • indy0077
  • Senior Member
  • Banned
  • Join Date: 03-Nov 09
  • 1431 posts

Posted 22 March 2010 - 08:59 PM #19

I don't know if this is a good idea, since I need to put the Disallow rule for image.php in the robots.txt of the main site. Even though the subfolder is registered as another website, Google is not reading it, as you can see from my images being indexed... I'm a little confused now...

You will have to put the rule into the robots.txt in the subfolder. By disallowing the subfolder you tell the robots not to go into this folder. The robots.txt in the subfolder will take care of crawling.

However, I can understand that submitting subfolders is now allowed, because many scripts use their own sitemaps etc., but I know from the past that robots have accepted only one robots.txt file, namely the one in the root.

 
  • gabrieluk
  • Senior Member
  • Members
  • Join Date: 21-Jul 09
  • 133 posts

Posted 22 March 2010 - 09:43 PM #20

Still confused...
I'm afraid to disallow the subfolder in the root of the domain, as many pages are indexed and the site is in good standing with Google for now. Are you sure that if I disallow the subfolder in the robots.txt in the root of the domain, the second robots.txt in the subfolder will be active? Are you 100 percent sure? So now you are saying I should have two robots.txt files... Could any other developer share their experience with us?
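
The worry about disallowing the whole subfolder can be checked with a short sketch before touching the live file. It compares a narrow rule with a folder-wide rule in a root robots.txt, using Python's standard `urllib.robotparser`. The host and the product URL are hypothetical; `/shop` is the subfolder named earlier in the thread.

```python
from urllib.robotparser import RobotFileParser

# Two candidate versions of the robots.txt in the root of the domain.
NARROW = "User-agent: *\nDisallow: /shop/image.php\n"
BROAD = "User-agent: *\nDisallow: /shop/\n"

# Hypothetical URLs on a placeholder host.
PRODUCT = "http://www.example.com/shop/some-product.html"
POPUP = "http://www.example.com/shop/image.php?window=popup"


def is_allowed(robots_txt, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for name, rules in (("narrow", NARROW), ("broad", BROAD)):
        print(name, "product page allowed:", is_allowed(rules, PRODUCT))
        print(name, "popup allowed:", is_allowed(rules, POPUP))
```

In this sketch the folder-wide rule blocks the product pages together with the popups, while the narrow rule blocks only image.php, which is the behaviour asked for at the start of the thread.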