robots.txt

[quote name=‘jobosales’]Is this supposed to represent some widespread disregard for the meta tag? [/quote]

No, it’s not.

Hi Indy!

I have these URLs being indexed many times…

[url]http://www.fmydomain.co.uk/shop/index.php?target=gift_certificates&mode=free_products&search_products=Y&page=90[/url]



The only difference between them is the end of the URL, "Y&page=".



Should I add

Disallow: /index.php?target=gift_certificates&mode=free_products&search_products=Y&page=



??:rolleyes:

[quote name=‘gabrieluk’]Should I add

Disallow: /index.php?target=gift_certificates&mode=free_products&search_products=Y&page=



??:rolleyes:[/quote]

Yes, that’s right.

Avoid duplicate content?



Disallow: /*.html?subcats=Y

Disallow: /*.php?subcats=Y

[quote name=‘hyteckit’]Avoid duplicate content?



Disallow: /*.html?subcats=Y

Disallow: /*.php?subcats=Y[/quote]

You can still add it to the file; there is nothing wrong with it. The only caveat is the wildcards [ * ]: not all bots support those.

Plus, IMHO:



Disallow: /*?sort_by=

Disallow: /*?sl=
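Putting these snippets in context (my own sketch, not from the thread): Disallow lines only take effect inside a User-agent group, so a complete robots.txt combining the rules suggested above might look like this. The * wildcard is understood by Googlebot and Bing but not by every crawler, and plain rules match from the start of the path, so a shop living under /shop/ (as in the URL quoted earlier) needs that prefix.

```text
User-agent: *
# Googlebot-style wildcard rules; not every crawler supports "*":
Disallow: /*.html?subcats=Y
Disallow: /*.php?subcats=Y
Disallow: /*?sort_by=
Disallow: /*?sl=
# Plain prefix rule; note the /shop/ prefix from the URL quoted earlier:
Disallow: /shop/index.php?target=gift_certificates&mode=free_products&search_products=Y&page=
```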

Is it possible to prevent CS-Cart from generating dynamic links in the first place, so that the sitemap won’t include any?



Thanks!

[url]http://www.robotstxt.org/[/url]

[quote name=‘dlm3089’]Is it possible to prevent CS-Cart from generating dynamic links in the first place, so that the sitemap won’t include any?



Thanks![/quote]

You can use something like this:

Disallow: /index.php?dispatch=

It depends on which links you want to disallow. The example above will block crawling of all URLs whose path begins with ‘/index.php?dispatch=’.
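As a quick sanity check (my own sketch, not from the thread), Python’s standard urllib.robotparser applies this same start-of-path matching, so you can verify which URLs a rule blocks before deploying it; example.com below is just a placeholder domain. Note that robotparser, like many smaller bots, does not understand * wildcards.

```python
from urllib import robotparser

# A robots.txt body containing the rule discussed above.
rules = """\
User-agent: *
Disallow: /index.php?dispatch=
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Blocked: the path+query starts with the disallowed prefix.
print(rp.can_fetch("*", "http://example.com/index.php?dispatch=products.view"))

# Allowed: same script, different query string, so the prefix does not match.
print(rp.can_fetch("*", "http://example.com/index.php?target=home"))
```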

[quote name=‘indy0077’]You can use something like this:

Disallow: /index.php?dispatch=

It depends on which links you want to disallow. The example above will block crawling of all URLs whose path begins with ‘/index.php?dispatch=’.[/quote]



Is it OK to disallow the actual URLs instead, to prevent the duplication?



Disallow: /display-cabinets.html?subcats=Y

Disallow: /display-cabinets/corner-display-cabinets.html?sort_by=

Disallow: /display-cabinets/corner-display-cabinets.html?subcats=Y



Thank you.

[quote name=‘albert@officeuniversal.co’]Is it OK to disallow the actual URLs instead, to prevent the duplication?



Disallow: /display-cabinets.html?subcats=Y

Disallow: /display-cabinets/corner-display-cabinets.html?sort_by=

Disallow: /display-cabinets/corner-display-cabinets.html?subcats=Y



Thank you.[/quote]

What do you mean by ‘duplication’?

[quote name=‘indy0077’]What do you mean by ‘duplication’?[/quote]



I should have said: prevent crawling of the filter and sorting features.



Because I use extensive filtering, Google is crawling all of my pages and I am afraid of being flagged as spam.

I thought I would use the actual paths to prevent them from being crawled.



I know you are an expert, and I used the suggested robots.txt sample.

Is that correct, or should I take it out?



Thank you.

[quote name=‘albert@officeuniversal.co’]I should have said: prevent crawling of the filter and sorting features.



Because I use extensive filtering, Google is crawling all of my pages and I am afraid of being flagged as spam.

I thought I would use the actual paths to prevent them from being crawled.



I know you are an expert, and I used the suggested robots.txt sample.

Is that correct, or should I take it out?



Thank you.[/quote]

And what does the filter string begin with?

[quote name=‘indy0077’]And what does the filter string begin with?[/quote]



Something like:

[URL]http://domain.com/floor-clocks.html?subcats=Y&features_hash=V1275[/URL]

[quote name=‘albert@officeuniversal.co’]Something like:

[URL]http://domain.com/floor-clocks.html?subcats=Y&features_hash=V1275[/URL][/quote]

Hi, you can try this one:



Disallow: /*features_hash
Disallow: /index.php?type=extended&search_performed
Disallow: /index.php?subcats=Y

This should disallow all URLs which contain the string ‘features_hash’ (the leading /* wildcard is what makes it a contains-match).

[quote name=‘indy0077’]Hi, you can try this one:



Disallow: /*features_hash
Disallow: /index.php?type=extended&search_performed
Disallow: /index.php?subcats=Y

This should disallow all URLs which contain the string ‘features_hash’ (the leading /* wildcard is what makes it a contains-match).[/quote]



Thank you.

[quote name=‘albert@officeuniversal.co’]Thank you.[/quote]

You’re welcome… of course, I added the two other lines just to disallow some search strings…

… and how to disallow the following strings:



.html?sort_by=position&sort_order=asc

.html?sort_by=popularity&sort_order=desc&layout=products_multicolumns

.html?sort_by=product&sort_order=asc



by adding:



Disallow: ?sort_by=



or



Disallow: /?sort_by=



or



Disallow: position&sort_order

Disallow: popularity&sort_order

Disallow: product&sort_order



or…?

[quote name=‘nedd’]… and how to disallow the following strings:



.html?sort_by=position&sort_order=asc

.html?sort_by=popularity&sort_order=desc&layout=products_multicolumns

.html?sort_by=product&sort_order=asc



by adding:



Disallow: ?sort_by=



or



Disallow: /?sort_by=



or



Disallow: position&sort_order

Disallow: popularity&sort_order

Disallow: product&sort_order



or…?[/quote]

This one:


Disallow: /?sort_by=



is wrong, because in your URLs there isn’t a ‘/’ directly before ‘?sort_by=’.



It depends on what you want to disallow. If you choose ‘.html?sort_by=position&sort_order=asc’, then a URL is only disallowed if it contains that whole string. If you use just ‘?sort_by=’, then all URLs which contain that string will be disallowed.
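To make the matching behavior concrete, here is a small sketch (mine, not from the thread) of Googlebot-style pattern matching: a rule is anchored at the start of the path, ‘*’ matches any run of characters, and ‘$’ anchors the end of the URL. It shows why ‘/?sort_by=’ cannot match ‘/display-cabinets.html?sort_by=…’, while the wildcard form ‘/*?sort_by=’ can.

```python
import re

def robots_pattern_matches(pattern: str, path_and_query: str) -> bool:
    """Googlebot-style matching: anchored at the start of the path,
    '*' matches any run of characters, a trailing '$' anchors the end."""
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then turn the escaped '*' back into '.*'.
    regex = "^" + re.escape(pattern).replace(r"\*", ".*")
    if anchored_end:
        regex += "$"
    return re.search(regex, path_and_query) is not None

url = "/display-cabinets.html?sort_by=position&sort_order=asc"

print(robots_pattern_matches("/?sort_by=", url))       # no match: '/' is not directly followed by '?sort_by='
print(robots_pattern_matches("/*?sort_by=", url))      # matches: '*' covers 'display-cabinets.html'
print(robots_pattern_matches("/*features_hash", url))  # no 'features_hash' in this URL
```

Bots that do not implement wildcards fall back to plain prefix matching, which is why the thread keeps stressing that not every crawler honors the * forms.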

Hey Indy,



I have 84,263 duplicate title and meta description tags due to filters. Will you please guide me on what I should disallow in my robots.txt file?