

SEO, SSL, and Canonical URL Tags

Posted on 2016-11-25
Medium Priority
Last Modified: 2016-11-28
Hi all -
I have an ecommerce site where I noticed that *none* of the common XML Sitemap generators found any pages. The site is and here's a typical Product Page:
A credible source of mine took a look and came up with this:
1) All of my "http://" pages are redirected to their "https://" equivalent (via .htaccess)
2) My "canonical" URL tags are set up to point to the "http://" version of my pages (not the https version)
and he gave me a workaround and a tool to generate an XML sitemap.
I can get an XML Sitemap now, BUT this makes me think that the Search Engines won't find these pages either.
Should I correct items #1 and #2 above for SEO purposes, or do these things not matter to the Search Engines?
Question by:bleggee
LVL 65

Accepted Solution

btan earned 2000 total points
ID: 41902168
An XML sitemap is useful for SEO because bots crawl your site using the pages it declares. It helps search engines such as Google find and index your individual pages when they crawl your website, so that each page can rank, not only your domain as a whole.

Note that once the generator has created your sitemap, you need to upload it to the root of your domain e.g.

Having a sitemap on your site also passes more data to search engines. It:

Lists all URLs from your site, including pages that search engines would not otherwise have found.

Gives search engines a page priority and thus a crawl priority. You can add a tag in your XML sitemap saying which pages are the most important, so that bots focus on these priority pages first.
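
To illustrate the priority tag mentioned above, here is a minimal Python sketch that builds a sitemap with a `<priority>` element per URL. The `build_sitemap` helper and the example.com URLs are made up for illustration and are not taken from the asker's site. Keep in mind that priority is only a hint, and a search engine may ignore it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, priority) pairs; priority runs 0.0 to 1.0."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", 1.0),
    ("https://www.example.com/products/widget", 0.8),
])
print(sitemap_xml)
```
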
In your case, your sitemap should contain only the actual, existing pages. Google will not index a URL that merely redirects somewhere else, as you described; the purpose of a redirect is to get Google to index the target page instead. Listing such URLs adds no value, because Google will typically skip them on account of both the redirect and the canonical tag.

In short, here are some principles:
- You can pick a canonical (preferred) URL for each of your pages, but don't specify different URLs as canonical for the same page.
(e.g. one URL in a sitemap and a different URL for that same page using rel="canonical")
- Your sitemap should only contain current pages you want indexed on your site, not the pages that you're redirecting to another site.
(e.g. if Google can't find the link on your site proper, it won't index it from the sitemap anyway)
- URLs included in XML sitemaps must use the same protocol and subdomain as the sitemap itself. This means that https URLs should not be listed in a sitemap that is served over http. This also means that urls on cannot be located in the sitemap on So on and so forth.
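
The last principle can be checked mechanically. Below is a rough Python sketch that flags any page URL whose protocol or host differs from the sitemap's own URL. The function name `mismatched_urls` and all URLs are hypothetical, chosen only for the example.

```python
from urllib.parse import urlsplit


def mismatched_urls(sitemap_url, page_urls):
    """Return the page URLs whose scheme or host differs from the sitemap's."""
    sitemap = urlsplit(sitemap_url)
    expected = (sitemap.scheme, sitemap.netloc.lower())
    mismatches = []
    for url in page_urls:
        parts = urlsplit(url)
        if (parts.scheme, parts.netloc.lower()) != expected:
            mismatches.append(url)
    return mismatches


# Hypothetical URLs, for illustration only.
bad = mismatched_urls(
    "https://www.example.com/sitemap.xml",
    [
        "https://www.example.com/products/widget",   # same scheme and host
        "http://www.example.com/products/widget",    # wrong protocol
        "https://shop.example.com/products/widget",  # wrong subdomain
    ],
)
print(bad)
```

Anything the function returns is a candidate for removal from that sitemap, or for a separate sitemap hosted on the matching protocol and subdomain.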

I suggest you run your existing site against the two below and review the findings and suggestions for improving website performance and SEO factors. These help keep the site healthy and friendlier to the browsers and bots crawling its pages.

Another useful tool (not free) for testing a sitemap is the SEO Spider from Screaming Frog (see items 8 and 9  in It will then load your sitemap and begin crawling the URLs it contains. You can view the results of the crawl in real time. And if you have Graph View up and running during the crawl, you can visually graph the results as the crawler collects data.
Once you have the XML file saved to your computer, go to the ‘Mode’ menu in Screaming Frog and select ‘List’. Then click on ‘Select File’ at the top of the screen, choose your file and start the crawl. Once the spider has finished crawling, you’ll be able to find any redirects, 404 errors, duplicated URLs and more in the ‘Internal’ tab.

Author Comment

ID: 41902371
Great info, btan, thank you!   A couple of clarifications ...

1. So I should use all http:// in the sitemap (even though the .htaccess forces all pages to https). Is that correct?

2. In the case of using Mod Rewrite, I would want the rewritten URLs in the Sitemap ... meaning I should use the "User Friendly" version of the URL for Sitemap purposes, for example using "" and not ""
LVL 65

Expert Comment

ID: 41902413
1. Yes, to be consistent, as advised in my earlier post.
2. Yes, that will be fine; this is analogous to the canonical URL too.

Author Comment

ID: 41902534
Great, thanks again!
LVL 18

Expert Comment

by:Lucas Bishop
ID: 41904524
Should I correct items #1 and #2 above for SEO purposes, or do these things not matter to the Search Engines?

Yes, as a matter of best practice, you should.

Currently you're 301 redirecting the pages that you're specifying as the canonical source. A 301 redirect means the page has permanently moved. So when Google follows the redirect to the "new" https location, your canonical tag names as the source a page that has moved permanently. This is technically a poor implementation of the canonical tag, and for the health of your site you should set it up correctly.

You really should update the canonical urls so they point to the https pages.
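
One way to spot-check this is a small script that reads a page's canonical tag and compares its protocol with the URL the page is actually served from. This is only a sketch using Python's standard library; `CanonicalFinder`, `canonical_matches_scheme`, and the example.com markup are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_matches_scheme(page_url, html):
    """True/False if the canonical's protocol matches the page's; None if no tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    return urlsplit(finder.canonical).scheme == urlsplit(page_url).scheme


# Hypothetical page markup: served over https, canonical still points at http.
sample_html = (
    '<html><head>'
    '<link rel="canonical" href="http://www.example.com/products/widget">'
    '</head><body></body></html>'
)
result = canonical_matches_scheme(
    "https://www.example.com/products/widget", sample_html
)
print(result)
```

A `False` result for an https page corresponds to the situation described in this thread: the canonical tag points at a URL that itself 301 redirects.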

Also, make sure the sitemap references the https pages. When I view the sitemap, it looks like most of the urls are to the http version of the site:

Also, register both the http and https properties in Google Search Console, so you can control which version is indexed.
