?subcats=Y

I have a truckload of categories indexed in Google with the ?subcats=Y parameter. In the new version of CS-Cart this is supposedly handled properly via rel=canonical pointing to the main category URL? Is that sufficient to make Google de-index those duplicate pages with ?subcats=Y?

Or should I add a disallow in robots.txt?
Or perhaps handle that parameter in "URL parameter settings" in Google Search Console?

What is the correct way? I'd like those pages to be gone from site:mypage.com.

Search Console now indicates a whopping 3 million URLs with that parameter as "monitored" (whatever that means, because they are not all indexed).

What is the strategy to push Google to de-index masses of low/no-value links? Should I first put a "noindex" tag on them, wait for them to be de-indexed, and then block them?
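For reference, the rel=canonical handling mentioned above would put something like this in the &lt;head&gt; of the parameter page (example.com and the category path are placeholders):

```html
<!-- served on https://example.com/category/?subcats=Y -->
<link rel="canonical" href="https://example.com/category/">
```

Note that Google treats rel=canonical as a hint rather than a directive, so consolidation of the duplicates can take a while.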

There is no strategy :stuck_out_tongue: I have warned about this potential danger and its complications, but nobody listens.


Add the following line to your robots.txt


Disallow: /*?subcats=

Submit the site for reindexation and wait a couple of days.

Thanks demeldoo. What do you mean by "submit reindexation"?

Would you mind showing me your cs-cart website?

Submit the refreshed robots.txt file in Google Webmaster Tools... look at the left side for the robots.txt section.

or use

https://www.google.com/webmasters/tools/submit-url?pli=1

when you have made changes

Oh, yeah, I submitted a new robots.txt a week ago with other parameters blocked, but Google still has not de-indexed them.

Takes time

demeldoo, any idea what the gclid= parameter stands for? It gets appended (usually) to the homepage URL and makes no difference to the content.

Some kind of GET parameter. I have no clue… post a link here.


I have basically removed all query parameters in my robots.txt.


Can you please provide me with the topic where this was discussed or a bug report.

I will take a look.



Hello,

I will search the forum and attach some links, along with the links from the bugtracker.

Best wishes

From what I am reading, duplicate and thin content with URL parameters and "unfriendly" SEO names should first be "noindexed" in the HTML, and only after they get de-indexed can they be blocked in robots.txt.

I am afraid that just blocking them in robots.txt won't actually force them to be de-indexed.
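One way to implement that "noindex first" step without editing templates is an HTTP header. A minimal sketch, assuming Apache 2.4+ with mod_headers enabled; the regex covers only subcats and would need to be extended for the other parameters:

```apache
# Send a noindex header for any URL whose query string contains subcats=
<If "%{QUERY_STRING} =~ /(^|&)subcats=/">
    Header set X-Robots-Tag "noindex, follow"
</If>
```

Only after those URLs drop out of the index should the robots.txt Disallow go in; once a URL is blocked, Googlebot can no longer fetch it to see the noindex.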

Does anybody have experience pruning thin content from an old CS-Cart installation?

Maybe there is a way to add a "noindex" tag to pages like:

?subcats

?sale

?more_filters

?dispatch

?product_id

?category_id

Perhaps you could advise, imac?
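If the robots.txt route is chosen for that list, the pattern quoted earlier in the thread extends naturally. A sketch only; because of the literal ?, each rule matches the parameter only when it comes first in the query string (Disallow: /*subcats= would match it anywhere), so verify each line in Search Console's robots.txt tester before deploying:

```
Disallow: /*?subcats=
Disallow: /*?sale=
Disallow: /*?more_filters=
Disallow: /*?dispatch=
Disallow: /*?product_id=
Disallow: /*?category_id=
```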

I solved this problem with robots.txt, and when you browse the Google forums, the suggested solution is robots.txt, not noindex.


Unfortunately I still haven't found the time to post all the links from the forum and bugtracker about this matter.


Hopefully this weekend :wink:


Hello,

Here is a situation I faced:

http://forum.cs-cart.com/tracker/issue-5432-web-attack/

Robots tried to index millions of pages and the result was a total failure.

Here are some selected forum topics about robots.txt:

http://forum.cs-cart.com/topic/43510-blocking-dynamic-urls-from-google/?hl=robots&fromsearch=1
http://forum.cs-cart.com/topic/43037-google-out/?hl=robots#entry238887
http://forum.cs-cart.com/topic/35006-database-size-increased/page-2?hl=robots#entry225026
http://forum.cs-cart.com/topic/25374-robotstxt/page-2?hl=robots#entry195585
http://forum.cs-cart.com/topic/40094-google-crawling-features-hash-page-2/?hl=robots&fromsearch=1
http://forum.cs-cart.com/topic/37408-how-to-remove-items-per-page12/?hl=robots#entry194127
http://forum.cs-cart.com/topic/34346-features-hashv215v6293v197/?hl=robots#entry210410
http://forum.cs-cart.com/topic/39701-return-urlindexphp?dispatch=pagesviewpage-id3/?hl=robots#entry210406
http://forum.cs-cart.com/topic/38237-stop-items-per-page12-from-being-crawled/?hl=robots#entry199681
Also, please check whether CS-Cart replaces robots.txt during an update. I have seen some people on the forum write about facing such a situation.
As I mentioned to you, probably the best solution is to open the demo for indexation and monitor the data in Google Webmaster Tools.
Best wishes

In a situation like this, nothing is going to stop an abusive bot except blocking it from the server side.

Do you really think abusive bots abide by robots.txt or anything else for that matter?


I understand where you are going.

Let's separate them into: the good, the bad and the ugly ;)

What prevents the good one from doing the same as the ugly one?

You can't block Google at the server level... the ugly one is probably Baidu ;)
I recommend the movie :P

What prevents the good one from doing the same as the ugly one?

The good ones don't want to be reported and blacklisted.

You can't block Google at the server level... the ugly one is probably Baidu ;)

You can block anything you want, good or bad, though blocking the good ones is not recommended.

I recommend the movie :P

My favorite movie.

Back to subcats... what does this parameter even change? Anything? I can't notice any difference, but obviously I can't check all the URLs with that parameter because there are thousands of them. Can I just make a 301 redirect to remove the parameter?


As for the parameter:

It means that the server will search for products not only in the current category but also in all of its subcategories.

Actually it is the same as the "Show products from subcategories of the selected category:" option https://www.evernote.com/l/AQHgLyuU68pFg6SxfAqs4hEkSMmEw7WUCcY

So if the option is enabled you can add a 301 redirect - everything should work fine.
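A minimal sketch of that redirect for Apache mod_rewrite, covering the common case where subcats is the only query parameter (stripping it out of the middle of a longer query string is possible but fiddlier):

```apache
RewriteEngine On
# 301 from /category/?subcats=Y (or =N) to the clean category URL
RewriteCond %{QUERY_STRING} ^subcats=[YN]$ [NC]
RewriteRule ^ %{REQUEST_URI}? [R=301,L]
```

The trailing ? in the substitution is what drops the query string from the redirect target.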

Please let me know the results, because I'm not sure how this "subcats" parameter appears in the links.

It would be nice if you could provide me with a link to a page where I can see such URLs.

Thanks imac. I don't have this option checked. And I don't want to have products from subcats shown. But I don't have a button in the shop for customers to enable it, so I'm not sure how Google indexed it. The scale of it is enormous, though.


Please let me know if you find a page with such a link.
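For anyone auditing these URLs in bulk, here is a small sketch of computing the redirect target, i.e. removing one parameter while keeping the rest, using only Python's standard library (the example URL is made up):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_param(url: str, param: str) -> str:
    """Return the URL with the given query parameter removed."""
    parts = urlsplit(url)
    # Keep every query pair except the one being stripped
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))

print(strip_param("https://example.com/category/?subcats=Y&page=2", "subcats"))
# -> https://example.com/category/?page=2
```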