Missing trailing slashes in URLs are causing a duplicate content issue

Summary:

These two URLs both return a 200 OK server response, and are therefore treated as two distinct pages:

```
mydomain/cscartstore/some-category/
mydomain/cscartstore/some-category
```

Both obviously resolve to the same category page in my cart, but this is causing:

1 - Google to warn me of duplicate content in GWT (Google Webmaster Tools),
2 - page rank (PR) and links to be split between these 2 URLs, making each less relevant, and
3 - traffic in GA (Google Analytics) to be split as well, clearly indicating that Google is treating these URLs as distinct.

To avoid both page dilution and potentially severe content duplication penalties, I am trying to enforce the trailing slash for all directories throughout my site, and this is [b]what I've done so far:[/b]

Elsewhere in my site (besides the store directory), I've resolved this issue easily with the following htaccess redirect

```apache
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
```
that I've added to my root htaccess, and also to the htaccess files in each directory (if a directory has one). As a result, if someone types in

mydomain/somedir

in their browser, then the URL is automatically changed to

mydomain/somedir[b]/[/b] ([i]with[/i] the slash),

as the page is redirected. And I get these server responses:

```
mydomain/somedir   ==> 301 redirected to    mydomain/somedir/
mydomain/somedir/  ==> 200 OK
```
which are exactly the responses I should get. So my site in general is in great shape and not having duplicate content issues,

[b]EXCEPT[/b]

for the cart categories, which don't seem to obey the htaccess rule above that works perfectly elsewhere.

I've been pulling my hair out looking for a solution to this, and would appreciate any advice please.

More details:

a) - As you see above, I have the store in a directory, not at the root of my site. At the same level as the store I have other directories, all obeying the htaccess directive and not creating any duplicate issue. Examples:

```
mydomain/anydir/subdir1                   redirects to: mydomain/anydir/subdir1/
mydomain/another-dir/subdir2              redirects to: mydomain/another-dir/subdir2/
mydomain/yetanotherdir/subdir3/subsubdir  redirects to: mydomain/yetanotherdir/subdir3/subsubdir/
mydomain/cscartstore                      redirects to: mydomain/cscartstore/ ( ==> up to here the htaccess redirect works fine)
mydomain/cscartstore/a-category           DOES NOT redirect at all and shows the "A Category" page under the URL without trailing slash
```
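
A possible explanation for the last line of the list above, offered as an assumption rather than something confirmed in this thread: with the SEO addon enabled, category URLs are most likely virtual paths that the cart's own rewrite rules route to its front controller, so no `a-category` directory exists on disk. The directory rule from earlier in this post is annotated below to show where that would make it give up; the comments are the only addition.

```apache
# /cscartstore            : a real directory on disk, so the -d test passes and the redirect fires.
# /cscartstore/a-category : assumed to be a virtual SEO URL with no matching directory,
#                           so the -d test fails and the rule is skipped without redirecting.
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
```

If that assumption holds, any fix has to decide from the shape of the URL rather than from what exists on the filesystem.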

b) - I am using the SEO addon, and in it I've chosen to display categories as /category/. That is working fine for links within the cart, so for instance if I go to the menu and select a category, it will show correctly with the "/" as in

mydomain/cscartstore/example-category/

c) - HOWEVER, if someone links to this category from their own site [i][u]and[/u][/i] forgets the trailing slash (which is not only common, but beyond my control in many cases), as in

mydomain/cscartstore/example-category

then Google will eventually crawl that link back to my store and index this URL [i]without[/i] the trailing slash, failing to identify it with the one that has the trailing slash. And that's the source of my content duplication issues. So even though I am being consistent in my internal links, sitemap, etc., this issue still appears and I need to address it.

d) - Besides the htaccess redirect (which is not working for CS-Cart categories only), another solution is to use a canonical tag in the "head" section of every single category page, pointing the "non-/" URL to the "/" version. There are 3 issues with this solution:

1 - Google does not necessarily follow (respect) canonical tags as it would an htaccess 301 redirect.

2 - Google may not notice the "/" at the end of the canonical URL (web reports are conflicting on this) and therefore not see the difference in this one particular circumstance, while still separating "non-/" from "/" URLs for other purposes, such as indexing. (And yes, Google is not consistent in its approach to this situation, as even they admit.)

3 - This canonical tag would need to be added automatically, by putting code in the head section of the page-creating template, such as this:

```html
<link rel="canonical" href="{page_url}">
```
that would become, in the HTML page created for a category named (let's say) "A Category":

```html
<link rel="canonical" href="http://mydomain.com/cscartstore/a-category/">
```

So if you know how to implement this, and would like to contribute information or code, your contribution will be highly appreciated by me and others in this circumstance, and I thank you in advance. :)

Hi Storekeeper,



Just have time to post a quick link to a 3rd party addon which may be of benefit to you.



CS-Cart SEO Ultimate: Canonical URL + Google Rich Snippets + SEO Name History + More



Hope it helps! ;)

Hi Storekeeper,



Try this code before any other rewrite code:

```apache
RewriteCond %{REQUEST_URI} !\.[^./]+$
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [R=301,L]
```

The first line checks that the URL does not end in a file extension (a dot followed by a suffix, as in a reference to a file).
The second line checks that the URL is missing a trailing slash (this avoids a 301 redirect if the trailing slash is already there).
The last line appends the trailing slash and redirects, so only extensionless URLs without a trailing slash are affected.

Hope this solves your issue :)
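
For anyone reusing the three lines above on more than one domain, here is a hedged variant, untested against CS-Cart itself. It relies only on standard mod_rewrite features: substituting `%{REQUEST_URI}` avoids hard-coding `domain.com` and keeps the full path even when the rule sits in a subdirectory's htaccess, and the extra `!-f` check (an addition that is not part of the suggestion above) leaves real files without an extension alone.

```apache
RewriteEngine On

# Leave requests for files that really exist on disk untouched.
RewriteCond %{REQUEST_FILENAME} !-f
# Leave URLs whose last segment ends in an extension (for example .html or .png) untouched.
RewriteCond %{REQUEST_URI} !\.[^./]+$
# Leave URLs that already end with a slash untouched.
RewriteCond %{REQUEST_URI} !/$
# Redirect everything else to the same path plus a trailing slash.
# With [R], mod_rewrite adds the current scheme and host, and the query string is carried over.
RewriteRule ^ %{REQUEST_URI}/ [R=301,L]
```

As with the original, this would need to sit before any other rewrite code. Forms that POST to a URL without the slash would be redirected too and could lose their data, so it is worth checking any such forms after enabling it.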

Hi, Struck, thanks for taking the time to read my massive post all the way to the canonicals part at the end.



I did look into that addon, and while quite interesting and useful, it does not do the simple thing I need, which is to basically insert a tag redirecting a page to its complete url (with trailing slash) - and only for the category pages.



Also, it seems like I don’t really need a whole addon, only to insert a couple of lines of code in one of the templates. I don’t know how to do it, but I would love it if someone could fill in the blanks in my knowledge base… :rolleyes:



This addon is in fact so powerful and far-reaching in its results that it requires quite a bit of skill to use wisely; otherwise it may decimate your rankings, as I suspect may have happened (I'm not sure) to this poor fellow.



But again, thanks for suggesting it.



YEAAAAH !! :D



Thanks Adrian8, these 3 lines solved my problem!!!



So now all my categories redirect properly to the url with the trailing slash, and I don’t have to worry about duplicate content anymore!



Great suggestion, and way better than the canonical solution I was resigned to adopting.



THANK YOU!



