Missing trailing slashes in URL are causing duplicate content issue

 

Posted 23 November 2011 - 09:37 PM #1

Summary:

These two urls both return a 200 OK server response, and are therefore treated as TWO distinct pages:

mydomain/cscartstore/some-category/
mydomain/cscartstore/some-category
Both obviously resolve to the same category page in my cart, but this is causing:

1 - Google to warn me of duplicate content in GWT,
2 - page rank (PR) and links to be split between these 2 urls, making each less relevant, and
3 - traffic in GA to be split as well, clearly indicating that Google is treating these urls as distinct.

To avoid both page rank dilution and potentially severe content duplication penalties, I am trying to enforce the use of the trailing slash for all directories throughout my site, and this is what I've done so far:


Elsewhere in my site (besides the store directory), I've resolved this issue easily with the following htaccess redirect:

RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
that I've added to my root htaccess, and also to the htaccess files in each directory (if a directory has one). As a result, if someone types in

mydomain/somedir

in their browser, then the url is automatically changed to

mydomain/somedir/ (with the slash),

as the page is redirected. And I get these server responses:

mydomain/somedir   ==> 301 redirected to    mydomain/somedir/
mydomain/somedir/  ==> 200 OK
which are exactly the right responses I should get. So my site in general is in great shape and not having duplicate content issues,

EXCEPT


for the cart categories, that don't seem to obey the htaccess above that works perfectly elsewhere.

I've been pulling my hair looking for a solution to this, and would appreciate any advice please.

More details:

a) - As you see above, I have the store in a directory, not at the root of my site. At the same level as the store I have other directories, all obeying the htaccess directive and not creating any duplicate content issue. Examples:

mydomain/anydir/subdir1                   redirects to: mydomain/anydir/subdir1/
mydomain/another-dir/subdir2              redirects to: mydomain/another-dir/subdir2/
mydomain/yetanotherdir/subdir3/subsubdir  redirects to: mydomain/yetanotherdir/subdir3/subsubdir/
mydomain/cscartstore                      redirects to: mydomain/cscartstore/ ( ==> up to here the htaccess redirect works fine)
mydomain/cscartstore/a-category           DOES NOT redirect at all and shows the "A Category" page under url without trailing slash
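
A likely explanation for the difference, offered as an assumption rather than something confirmed in this thread: the `-d` condition only matches paths that exist as real directories on disk, while category urls produced by the SEO addon are presumably virtual paths that the cart's own rewrite rules hand to a script. An annotated copy of the rule above shows where the match would fail:

```apache
# -d is true only when the requested path is an existing directory on disk.
#   mydomain/cscartstore             -> real directory, condition passes
#   mydomain/cscartstore/a-category  -> no such directory (assumed to be a
#                                       virtual SEO url), condition fails
RewriteCond %{REQUEST_FILENAME} -d
# Only reached when the condition above passes, so a virtual category url
# is never redirected by this rule.
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
```

If that assumption holds, any fix has to test the shape of the url itself rather than the filesystem.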

b) - I am using the SEO addon, and in it I've chosen to display categories as /category/. That is working fine for links within the cart, so for instance if I go to the menu and select a category, it will show correctly with the "/" as in

mydomain/cscartstore/example-category/

c) - HOWEVER, if someone links to this category from their own site and forgets the trailing slash (which is not only common, but beyond my control in many cases), as in

mydomain/cscartstore/example-category

then Google will eventually crawl that link back to my store and index this url without the trailing slash, failing to identify it with the one having the trailing slash. And that's the source of my content duplication issues. So even though I am being consistent in my internal links, sitemap, etc, this issue still appears and I need to address it.

d) - Besides the htaccess redirect (which fails only for CS-Cart categories), another solution is to use a canonical tag in the "head" section of every single category page, pointing the "non-/" url to the "/" version. There are 3 issues with this solution:

1 - Google does not necessarily follow (respect) canonical tags as it would an htaccess 301 redirect

2 - Google may not notice the "/" at the end of the canonical url (web reports are conflicting on this), and so may not see the difference in this particular redirect, even though it separates "non-/" from "/" urls for other purposes, such as indexing. (And yes, Google is not consistent in its approach to this situation, as even they admit.)

3 - This canonical redirect would need to be done automatically, by adding code in the head section of the page-creating template, such as this:

<link rel="canonical" href="{page_url}">
that would become, in the html page created for a category named (let's say) "A Category":

<link rel="canonical" href="http://mydomain.com/cscartstore/a-category/">

So if you do know how to implement this, and would like to contribute information or code, then your contribution will be highly appreciated by me and others in this circumstance and I thank you in advance.

:)

 
  • Struck

Posted 23 November 2011 - 11:03 PM #2

Hi Storekeeper,

Just have time to post a quick link to a 3rd party addon which may be of benefit to you.

http://www.cscartroc...-fix-addon.html

Hope it helps! ;)
Cooking with Gas on Version 4.1.2 (But proceeding with caution....)

 
  • Adrian8

Posted 24 November 2011 - 03:15 PM #3

Hi Storekeeper,

Try this code before any other rewrite code:-
RewriteCond %{REQUEST_URI} !\.[^./]+$
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [R=301,L]

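The first line checks that the url does not end in a file extension (a dot followed by characters with no further dot or slash), so requests for files are left alone.
The second line checks that the url is missing its trailing slash, which avoids a 301 redirect when the slash is already there.
The last line appends the trailing slash and issues the 301 redirect, so only extension-less urls without a trailing slash are redirected. Replace domain.com with your own domain.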
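
The rule above hard-codes `http://domain.com` as a placeholder. A possible variant, shown only as a sketch under assumptions not stated in this thread (mod_rewrite is enabled, the rules sit in an .htaccess file, and requests for real files should never be redirected), keeps the same two checks but redirects on whatever host and path were requested:

```apache
RewriteEngine On

# Leave requests that map to a real file alone (an added safeguard,
# not part of the three lines above).
RewriteCond %{REQUEST_FILENAME} !-f
# Skip urls whose last segment looks like a file name with an extension.
RewriteCond %{REQUEST_URI} !\.[^./]+$
# Skip urls that already end in a slash.
RewriteCond %{REQUEST_URI} !/$
# Redirect to the same path plus a trailing slash; because the target
# starts with "/", Apache builds the redirect from the requested host.
RewriteRule ^ %{REQUEST_URI}/ [R=301,L]
```

As with the original three lines, this would need to come before any other rewrite code, and it is worth checking with a few test urls that form submissions and admin pages are not affected.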

Hope this solves your issue :)
www.ecopolar.com - V2.1.4
www.ledpolar.com - V2.2.4

 

Posted 24 November 2011 - 07:41 PM #4

Hi, Struck, thanks for taking the time to read my massive post all the way to the canonicals part at the end.

I did look into that addon, and while quite interesting and useful, it does not do the simple thing I need, which is to basically insert a tag redirecting a page to its complete url (with trailing slash) - and only for the category pages.

Also, it seems like I don't really need a whole addon, only to insert a couple of lines of code in one of the templates. I don't know how to do it, but I would love it if someone could fill in the blanks in my knowledge base... :rolleyes:

This addon is in fact so powerful and far-reaching in its results that it requires quite a bit of skill to use wisely; otherwise it may decimate your rankings, as I suspect may have happened (possibly, I'm not sure) to this poor fellow.

But again, thanks for suggesting it.




 

Posted 24 November 2011 - 08:12 PM #5

YEAAAAH !! :D

Thanks Adrian8, these 3 lines solved my problem!!!

So now all my categories redirect properly to the url with the trailing slash, and I don't have to worry about duplicate content anymore!

Great suggestion, and way better than the canonical solution I was resigned to adopting.

THANK YOU!

