My server has been experiencing intermittent CPU-related performance issues for a while now, and I (may have) finally found the problem and solution. (My site runs CS-Cart version 2.2.4 on a VPS with a 3.4 GHz CPU and 2.3 GB of RAM.)
THE PROBLEM: After identifying periods of excessive load (100% CPU usage) and reviewing access logs for those times, a pattern emerged. During each of those times, robots (crawlers) were indexing URLs containing “features_hash” (due to use of CS-Cart's “Filters” feature).
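If you want to run the same check on your own logs, here is a rough Python sketch. It assumes your access log is in the common Apache/nginx "combined" format; the function name, the sample lines, and the hash values in them are made up for illustration, so adjust the pattern if your log layout differs.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_filter_hits(lines, needle="features_hash"):
    """Count requests whose URL contains `needle`, grouped by user agent."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and needle in match.group("path"):
            hits[match.group("agent")] += 1
    return hits

# Made-up sample lines, only to show the expected input shape.
sample = [
    '10.0.0.1 - - [01/Jan/2012:10:00:00 +0000] "GET /shoes/?features_hash=abc HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '10.0.0.1 - - [01/Jan/2012:10:00:01 +0000] "GET /shoes/?features_hash=abd HTTP/1.1" 200 5090 "-" "ExampleBot/1.0"',
    '10.0.0.2 - - [01/Jan/2012:10:00:02 +0000] "GET /shoes/ HTTP/1.1" 200 4800 "-" "Mozilla/5.0"',
]

for agent, count in count_filter_hits(sample).most_common():
    print(count, agent)
```

To use it on a real log, pass the open file instead of the sample list, e.g. `with open("access.log") as f: count_filter_hits(f)`. If the top user agents are all crawlers, you are probably seeing the same pattern I did.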
As many of you know, having several Filters, each with several options, produces a very large number of possible URL combinations. Even with caching turned on, these “features_hash” pages seem to be more processor-intensive than standard category and product URLs. (Maybe someone could test this theory and do some benchmarking?) Even if they're not more processor-intensive, the potential volume of pages is huge, and the value of having them indexed is (for me at least) very small.
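To get a feel for the numbers, here is a back-of-the-envelope sketch in Python. It assumes each Filter is either unset or set to exactly one of its options, which may not match how your store combines filters; the function name and the example figures are mine, not anything from CS-Cart.

```python
from math import prod  # Python 3.8+

def count_filter_urls(options_per_filter):
    """Distinct non-empty filter selections for one category page.

    Assumption: each filter is either unset or set to exactly one option,
    so a filter with n options contributes (n + 1) choices.
    """
    return prod(n + 1 for n in options_per_filter) - 1  # minus the "nothing selected" case

# Hypothetical store: 6 filters with 5 options each.
print(count_filter_urls([5, 5, 5, 5, 5, 5]))
```

That is per category, so multiply by the number of categories where the filters appear. If several options can be selected within one filter, the count grows even faster.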
THE SOLUTION: Change the robots.txt file to include the following:
User-agent: *
Disallow: /*?
or
User-agent: *
Disallow: /*features_hash
The first option blocks robots from crawling all URLs which contain a question mark. Because my site has SEO-friendly links, I've chosen that option, as I'm happy to keep robots away from all non-SEO-friendly URLs. The second option is more specific, and only blocks crawling of “features_hash” pages.
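To see which URLs each rule would catch, here is a small Python sketch of the wildcard matching as the major search engines document it: "*" matches any run of characters, a trailing "$" anchors the end of the URL, and everything else is matched literally from the start of the path. The function name and the test URLs are made up for illustration, and individual robots may interpret the patterns differently.

```python
import re

def rule_matches(pattern, url_path):
    """Return True if a robots.txt path pattern matches url_path.

    '*' matches any run of characters; a trailing '$' anchors the end.
    All other characters (including '?') are literal, and matching
    always starts at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, url_path) is not None

# Hypothetical URLs to compare the two rules.
urls = [
    "/shoes/",
    "/shoes/?features_hash=abc",
    "/index.php?dispatch=products.view&product_id=1",
]
for url in urls:
    print(url, rule_matches("/*?", url), rule_matches("/*features_hash", url))
```

The first rule catches both query-string URLs, while the second only catches the one containing features_hash, so pick the broad rule only if nothing you want crawled relies on a query string.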
For my site, the change was immediate. I understand that not all robots follow the robots.txt protocol, but most of my robot traffic now avoids the CPU-intensive indexing of unwanted pages.
I would appreciate any confirmation (or refutation) of this theory and fix.
Important note on the robots.txt file: be sure to modify the robots.txt file which resides in your site's root web directory (www.yoursite.com/robots.txt). I've seen some posters state (incorrectly) that you can install CS-Cart in a subdirectory (www.yoursite.com/yourfolder) and still expect the standard robots.txt installed by CS-Cart in that folder to be found and used. Robots only request robots.txt from the root of the site. If you doubt this, search your server logs for requests to /yourfolder/robots.txt
cheers,
Don