Solved

SEO, SSL, and Canonical URL Tags

Posted on 2016-11-25
Last Modified: 2016-11-28
Hi all -
SOME BACKGROUND:
I have an ecommerce site where I noticed that *none* of the common XML Sitemap generators found any pages. The site is https://bachelorettepartystation.com and here's a typical Product Page:  https://bachelorettepartystation.com/bachelorette-party/bachelorette-bash-in-progress-3-quot-button
A credible source of mine took a look and came up with this:
1) All of my "http://" pages are redirected to their "https://" equivalent (via .htaccess)
2) My "canonical" URL tags are set up to point to the "http://" version of my pages (not the https version)
and he gave me a workaround and a tool to generate an XML sitemap.
SO MY WORRY IS:
I can get an XML Sitemap now, BUT this makes me think that the Search Engines won't find these pages either.
HENCE MY QUESTION IS:
Should I correct items #1 and #2 above for SEO purposes, or do these things not matter to the Search Engines?
Question by:bleggee
5 Comments
 
LVL 61

Accepted Solution

by:
btan earned 500 total points
ID: 41902168
An XML sitemap is useful for SEO because bots crawl your site using the pages it declares. It makes it easier for Google to find and index your pages when it crawls your website, so that each individual page can be ranked, not only your website as a domain.

Note that once the generator has created your sitemap, you need to upload it to the root of your domain e.g. www.yoursite.com/sitemap.xml.

Having a sitemap on your site also passes more data to search engines:

- It lists all URLs from your site, including pages that search engines would not otherwise have found.
- It gives engines page priority and thus crawl priority. You can add a tag in your XML sitemap saying which pages are the most important, and bots will then focus on these priority pages first.
In your case, your sitemap should contain only the actual, existing pages. Google will not index a URL that does not really exist and only redirects somewhere else, as you described; the function of any redirect is to get Google to index the target page instead. URLs that are redirected and that carry a conflicting canonical will typically not be indexed, so listing them adds no value.
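
To illustrate the priority tag and the "only real pages" advice, here is a minimal Python sketch that writes a sitemap from a list of (URL, priority) pairs. The function name `build_sitemap` and the example.com URLs are made up for illustration; only the `urlset`, `url`, `loc` and `priority` elements come from the sitemap format itself, and search engines treat `priority` as a hint at most.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for (url, priority) pairs.

    Hypothetical helper: pass only the final URLs you want indexed
    (here the https versions), never URLs that redirect.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # Priority is a value between 0.0 and 1.0.
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", 1.0),
        ("https://example.com/red_sports_cars", 0.5),
    ]))
```

Save the output as sitemap.xml in the root of the domain, as noted above.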

In short, below are some principles
- You can pick a canonical (preferred) URL for each of your pages, but don't specify different URLs as canonical for the same page.
(e.g. one URL in a sitemap and a different URL for that same page using rel="canonical")
- Your sitemap should only contain current pages you want indexed on your site, not the pages that you're redirecting to another site.
(e.g. if Google can't find the link on your site proper, it won't index it from the sitemap anyway)
- URLs included in an XML sitemap must use the same protocol and subdomain as the sitemap itself. This means that https URLs should not be listed in a sitemap served over http. It also means that URLs on sample.domain.com cannot be listed in the sitemap on www.domain.com, and so on.
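
The protocol and subdomain rule in the last principle can be checked mechanically. The sketch below is one possible way to do it in Python, assuming you already have the sitemap's own URL and its XML text; `mismatched_urls` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def mismatched_urls(sitemap_url, sitemap_xml):
    """Return the <loc> URLs whose scheme or host differs from the sitemap's."""
    base = urlsplit(sitemap_url)
    expected = (base.scheme, base.netloc.lower())
    root = ET.fromstring(sitemap_xml)
    bad = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        parts = urlsplit(url)
        if (parts.scheme, parts.netloc.lower()) != expected:
            bad.append(url)
    return bad
```

An empty list means every listed URL matches the sitemap's own protocol and host; anything returned is an entry to fix or drop.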

I suggest you run your existing site through the two tools below and review their findings and suggestions for improving website performance and SEO factors. These help keep the site healthy and friendlier to browsers and to the bots crawling its pages.

 https://www.webpagetest.org
 http://seositecheckup.com

Another useful tool (not free) for testing a sitemap is the SEO Spider from Screaming Frog (see items 8 and 9 in https://www.screamingfrog.co.uk/10-features/). It loads your sitemap and crawls the URLs it contains. You can view the results of the crawl in real time, and if you have Graph View running during the crawl, you can graph the results visually as the crawler collects data.
Once you have the XML file saved to your computer, go to the ‘Mode’ menu in Screaming Frog and select ‘List’. Then click ‘Select File’ at the top of the screen, choose your file and start the crawl. Once the spider has finished crawling, you’ll be able to find any redirects, 404 errors, duplicated URLs and more in the ‘Internal’ tab.
 
LVL 1

Author Comment

by:bleggee
ID: 41902371
Great info, btan, thank you!   A couple of clarifications ...

1. So I should use all http:// in the sitemap (even though the .htaccess forces all pages to https). Is that correct?

2. In the case of using Mod Rewrite, I would want the rewritten URLs in the sitemap ... meaning I should use the "user-friendly" version of the URL for sitemap purposes, for example "http://example.com/red_sports_cars" and not "http://example.com/pagenm?stuff_after_the_question_mark"
 
LVL 61

Expert Comment

by:btan
ID: 41902413
1. Yes, to be consistent, as advised in my earlier post.
2. Yes, that will be fine, and this is analogous to the canonical URL too.
 
LVL 1

Author Comment

by:bleggee
ID: 41902534
Great, thanks again!
 
LVL 16

Expert Comment

by:Lucas Bishop
ID: 41904524
Should I correct items #1 and #2 above for SEO purposes, or do these things not matter to the Search Engines?

Yes, as a matter of best practice, you should.

Currently you're 301 redirecting the pages that you're specifying as the canonical source. A 301 redirect means the page has permanently moved. So when Google follows the redirect to the "new" (https) location, your canonical tag there names as the source a page that has moved permanently. This is technically a poor implementation of the canonical tag, and for the health of your site you should set it up correctly.

You really should update the canonical urls so they point to the https pages.
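
As a rough way to spot-check this on a saved copy of a page, the following Python sketch pulls the canonical link out of the HTML and reports whether it uses https. It uses only the standard library; `CanonicalFinder`, `canonical_url` and `canonical_is_https` are illustrative names, and the sketch assumes the canonical is declared with a `<link rel="canonical">` tag rather than an HTTP header.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def canonical_url(html):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def canonical_is_https(html):
    """True only if a canonical tag exists and points at an https URL."""
    url = canonical_url(html)
    return url is not None and url.lower().startswith("https://")
```

For a page served over https, `canonical_is_https` returning False is the situation described above.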

Also, make sure the sitemap references the https pages. When I view the sitemap, it looks like most of the urls are to the http version of the site:
https://bachelorettepartystation.com/index.php?route=feed/google_sitemap
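
If the sitemap is generated by your own code, switching the listed URLs to https can be as small as the sketch below. `to_https` is a hypothetical helper and the example.com URL is a placeholder; if the sitemap comes from a shopping cart extension, the equivalent change is more likely a store URL setting, which this sketch does not cover.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Return the URL with an http scheme switched to https.

    Host, path and query string are left untouched; URLs that are
    already https (or use another scheme) are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```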

Also, register both the http and https properties in Google Search Console, so you can control which version is indexed.