Non-Www To Www Redirect With Subpath Removing Querystrings Such As Gclid

We are troubleshooting why gclid seems to be removed from the URL in many cases and stumbled upon what we think is an issue with CS-Cart: query strings (parameters) are not preserved when a URL with a trailing path, such as example.com/some-category.html, is redirected from non-www to www.

Is anyone else experiencing this behavior? If so, where is the file we need to fix? I looked over a few spots but didn’t see anything obvious besides the compile route in tygh/router.php.

Let’s say an ad drops a user onto a store category page (or similar) via the non-www host. The URL contains the gclid tracking parameter:

https://example.com/some-category.html?gclid=klj12h3jh12jk3hjkh123

What happens is that the URL is rewritten to use www, and in the process the entire query string is dropped before GA is loaded, leaving this:

https://www.example.com/some-category.html

This means that AdWords isn’t getting the data it needs. This doesn’t happen when there is no trailing path.

There is no vhost/htaccess rewrite going on that I can tell, and it works fine on our other platforms. The behavior is the same on cs-cart.com, which itself runs on CS-Cart, which is why we believe it’s a cart logic issue somewhere. Try it out:

How to create a B2B website? Get a personal demo.

Any thoughts are appreciated. Thanks

Be consistent in your use of your URL. www.example.com and example.com are two different sites (though they may share the same IP address). If you've set your config.local.php to use www.example.com and your environment is set up so that example.com and www.example.com share the same document root (hosting usually does this based on IP), then cs-cart will redirect requests for example.com to www.example.com, because example.com is NOT the URL for the cart. To ensure its own security, it will only act on references to the site listed in config.local.php.

You can either do the redirection yourself in your .htaccess file or, as the best approach, always use the same one of the two hostnames in all external references to your site.

Yeah, I get consistency… the issue is the redirect method truncating the query string during canonicalization, and the fact that there may be some rogue non-www requests.

Also, external resources can be humans too. Although they are unlikely to type a query string, they may copy and paste a URL without the “https://www.” or use a historical URL. Say you change your site's default from non-www to www: for everyone with saved bookmarks that use query strings in categories, previous quotes, emails, pre-selected products, etc., those parameters are lost (truncated).

We solved it in the meantime in .htaccess; however, something in the app is doing this, and I was just curious where.
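
A minimal sketch of such an .htaccess workaround, assuming Apache with mod_rewrite enabled, the .htaccess file in the document root, and the rule placed above the cart's own rewrite rules. The hostnames are placeholders, and this is an illustration rather than the exact rule used here:

```apache
# Assumption: Apache with mod_rewrite; example.com / www.example.com are placeholders.
RewriteEngine On

# Match requests that arrive on the bare (non-www) host.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

# Redirect to the www host, keeping the path. Because the substitution
# contains no "?", mod_rewrite passes the original query string through
# unchanged, so ?gclid=... survives the redirect.
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
```

With a rule like this the redirect is issued by the web server before the request reaches the cart, so the application's own host-mismatch redirect should never see the non-www request.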

the issue is the redirect method truncating the query string during canonicalization, and the fact that there may be some rogue non-www requests.

Think of this from a security perspective. Someone has accessed the site by a URL that shouldn't exist. Do you really think it's safe to simply pass the QUERY_STRING?

Personally I see this as the exception rather than the rule. I.e., a stripped gclid would not come from a human, and if it's a store bookmark, it might actually be best if it isn't passed so you don't get charged for the click.

Lol, a security perspective? Yes, I think it’s safe to pass a query string. It’s no different than passing it on the end URL. There is no zero-day for setting a Location header. And we have a WAF. With that in mind, I do believe you are grasping at straws here and trying to un-bug a bug for no apparent reason.

Anyways, the question is, where is the lib/router creating this issue?

You can look in either config.php or in app/functions/fn.init.php for where the fn_redirect() is done based on a mismatch between how you've defined your site and what's being passed via http.

I simply don't think it's a bug and believe cs-cart's use is the correct and most secure method for dealing with an incorrect reference to the HTTP_HOST that is defined for the site.