
Wordpress category pages in crawl growing

rivkamak
On my WordPress site, I see that a category URL loads the category archive named by the last folder in the URL, no matter what the middle of the URL is.

For example:

mysite.com/category/hobbies

will return the same posts as

 mysite.com/category/gibberish/hobbies

Even though gibberish is not a valid category on my site.

A. Is there any way to stop that, so that the URL returns a 404 when the middle category doesn't exist?

B. Somehow the Google crawler is picking up on this problem, and my site is generating links with duplicated categories, so my crawl list is just growing exponentially. How can I stop that?

The list is showing:

mysite.com/category/hobbies
mysite.com/category/hobbies/hobbies
mysite.com/category/hobbies/hobbies/hobbies

etc.

Fractional CTO
Distinguished Expert 2018
Commented:
It looks like you may have some sort of breadcrumb system at work that is incorrectly configured or coded.

It could also be some odd general setting, although this is doubtful.

Describe the steps you went through to set up your permalinks, plus any category/tag management plugin(s) you have installed.

Tip: Provide your real URL. It is very difficult to ask enough questions to form a starting point. Better to just have the site to test.

Author

Commented:
My permalink is set up as /%category%/%postname%/

Nothing is set for the category base in the optional settings.
The only plugin I have is Genesis. It's a pretty basic site with nothing added.

Thank you
David Favor, Fractional CTO
Distinguished Expert 2018

Commented:
A good test would likely be the following.

Clone your site to a dev site for destructive testing, so if you destroy your site, no problem.

Then, on your dev site:

1) Ensure all text/html part caching is 100% disabled, by looking at returned headers...

Use the output of curl -I -L https://DavidFavor.com/ as an example of draconian cache-busting headers to set.

2) After #1 is correct, manually flush your browser cache, to ensure you're starting from a point where your changes will actually render in your browser.

Note: If you skip #1 + #2 any change you make which fixes your problem may never show up in your browser.

3) Deactivate all your plugins + retest.

4) Switch to GeneratePress for your theme + retest.

5) If the problem persists, then there's database corruption to fix, which is a long, ugly conversation. This is highly unlikely.

6) Either #3 or #4 will likely fix the problem, so you'll then reactivate your theme and plugin(s), one by one, until the problem reoccurs.
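For step 1, the header check can be scripted rather than read by eye. The following is only a rough sketch: the function names are made up for illustration, and which Cache-Control directives count as "caching disabled" (no-store, no-cache, max-age=0) is an assumption, not something taken from the headers mentioned above.

```python
import urllib.request


def caching_disabled(headers):
    """Heuristic: True if Cache-Control forbids reuse without revalidation.

    `headers` is a plain dict of response headers. The set of directives
    treated as "disabled" is an assumption made for this sketch.
    """
    # Header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    cache_control = lowered.get("cache-control", "")
    directives = {part.strip() for part in cache_control.split(",")}
    return bool(directives & {"no-store", "no-cache", "max-age=0"})


def fetch_headers(url):
    """Fetch response headers with a HEAD request, following redirects."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())


if __name__ == "__main__":
    # Hypothetical dev-site URL; replace with your own clone.
    print(caching_disabled(fetch_headers("https://dev.mysite.com/")))
```

If this prints False for your dev site, changes you make in the later steps may be hidden by a cached copy of the page.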

You'll know your problem is fixed when a URL like mysite.com/category/hobbies/hobbies/hobbies throws a 404, rather than returning content.
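To make that retest repeatable after each theme or plugin change, a small script can request the bogus URLs and report their status codes. This is a minimal sketch, assuming the site lives at a base URL you pass in and that "hobbies" is a real category slug; the helper names are hypothetical.

```python
import urllib.error
import urllib.request


def bogus_category_urls(base, slug, depth=3):
    """Build category URLs that should 404 on a correctly behaving site."""
    base = base.rstrip("/")
    # A nonsense parent category in front of a real one.
    urls = [f"{base}/category/gibberish/{slug}"]
    # The real slug repeated 2..depth times, as seen in the crawl list.
    for count in range(2, depth + 1):
        urls.append(f"{base}/category/" + "/".join([slug] * count))
    return urls


def status_of(url):
    """Return the final HTTP status code for `url` (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return error.code


def is_fixed(statuses):
    """True only if every probed URL returned 404."""
    return all(code == 404 for code in statuses)


if __name__ == "__main__":
    # Hypothetical dev-site URL; replace with your own clone.
    urls = bogus_category_urls("https://dev.mysite.com", "hobbies")
    codes = [status_of(url) for url in urls]
    for url, code in zip(urls, codes):
        print(code, url)
    print("fixed" if is_fixed(codes) else "still returning content")
```

Run it once before deactivating anything to confirm it reports the problem, then again after each change.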