Solved

Need proper doctype for XML file

Posted on 2011-02-14
Last Modified: 2012-05-11
Hi!

Any specific suggestions as to the best doc type for this page:
http://www.topsecurityinc.com/sitemap.xml
Question by:TrueBlue
 
LVL 12

Expert Comment

by:Amick
ID: 34894263
As it is, it appears to be sitemap protocol compliant and well-formed XML.  Are you having a problem that you're trying to address?
 
LVL 82

Expert Comment

by:Dave Baldwin
ID: 34894746
<?xml version="1.0" encoding="UTF-8" ?>

which is at the top of the file, is the XML declaration, and it is all the file needs.  Strictly speaking it is not a DOCTYPE; a sitemap does not require one.
 

Author Comment

by:TrueBlue
ID: 34897720
I used the tool listed below, and it said that the sitemap was missing a doctype.

http://www.htmlhelp.com/tools/validator/

•Line 1, character 1:
<?xml version="1.0" encoding="UTF-8" ?>
^Error: character ï not allowed in prolog

 
LVL 12

Expert Comment

by:Amick
ID: 34898237
The validator at w3.org  (the web standards group) reports:
Schema validating with XSV 3.1-1 of 2007/12/11 16:20:05
•Target: http://www.topsecurityinc.com/sitemap.xml (Real name: http://www.topsecurityinc.com/sitemap.xml
Length: 12457 bytes
Last Modified: Tue, 25 Jan 2011 18:24:39 GMT Server: Microsoft-IIS/6.0)
• docElt: {http://www.sitemaps.org/schemas/sitemap/0.9}urlset

•Validation was strict, starting with type [Anonymous]
• schemaLocs: http://www.sitemaps.org/schemas/sitemap/0.9 -> http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
The schema(s) used for schema-validation had no errors
No schema-validity problems were found

See for yourself at:
http://www.w3.org/2001/03/webdata/xsv?docAddrs=http%3A%2F%2Fwww.topsecurityinc.com%2Fsitemap.xml&style=xsl

I suspect that the validator at htmlhelp.com is simply incomplete. You are standards compliant and there is really no need to worry.
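
For a quick local check along the same lines, here is a minimal Python sketch. It is not the XSV validator and it does not do full schema validation; it only confirms that the bytes parse as well-formed XML and that the root element is the sitemaps.org urlset reported as docElt in the output above. The function name check_sitemap and the sample URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemap protocol, as shown in the validator report.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(data):
    """Parse sitemap bytes and return the list of <loc> URLs.

    Raises ET.ParseError if the XML is not well formed, and ValueError
    if the root element is not a sitemaps.org <urlset>.
    """
    root = ET.fromstring(data)
    expected = "{%s}urlset" % SITEMAP_NS
    if root.tag != expected:
        raise ValueError("unexpected root element: %s" % root.tag)
    loc_tag = "{%s}loc" % SITEMAP_NS
    return [loc.text.strip() for loc in root.iter(loc_tag) if loc.text]


# Hypothetical sample standing in for sitemap.xml.
sample = b"""<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
</urlset>
"""

print(check_sitemap(sample))
```

A file that fails this check has a real problem; a file that passes may still need a proper schema validation such as the one linked above.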

 
LVL 12

Expert Comment

by:Amick
ID: 34898365
One thing I noticed about your file is that, when viewed byte by byte, the first three bytes are EF BB BF. This may be what is causing htmlhelp.com's validator to complain. These bytes don't show up when the file is viewed as text. I was able to eliminate them by opening sitemap.xml in a text editor and copying the text into a new document.  This probably isn't too important, but it does account for the prolog error message.

•Line 1, character 1:
<?xml version="1.0" encoding="UTF-8" ?>
^Error: character ï not allowed in prolog
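
To see where the ï comes from, here is a small Python sketch, assuming the file really does start with EF BB BF. The helper name has_utf8_bom is hypothetical, and the sample bytes only stand in for the start of sitemap.xml. A tool that reads the first byte as Latin-1 instead of treating the three bytes as a UTF-8 byte order mark sees 0xEF, which is ï.

```python
import codecs


def has_utf8_bom(data):
    """Return True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


# Sample bytes standing in for the start of sitemap.xml (an assumption).
head = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8" ?>'

print(has_utf8_bom(head))   # True
print(head[:3].hex(" "))    # ef bb bf

# The first byte, misread as Latin-1, is U+00EF (i with diaeresis).
first = head[:1].decode("latin-1")
print(first == "\u00ef")    # True
```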

 
LVL 82

Expert Comment

by:Dave Baldwin
ID: 34899145
Those characters are the Unicode byte order mark (BOM): http://en.wikipedia.org/wiki/Byte_order_mark.  Note that Firefox, IE8, Chrome, Safari, and Opera open that page without problems.  Firefox and Opera tell you that there is no style sheet associated with it, and Chrome and Safari display just the text without the tags.
 

Author Comment

by:TrueBlue
ID: 34899544
Amick:
I found the same thing in a hex editor, so I deleted the first three bytes and saved the file, but they returned. Then I changed them to 20 (the space character) and saved, but when I reopened the file they were back.
I even cut and pasted from the old page into a new page and got the same three bytes.
Could you post the file where you removed them?
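
If the editor is what puts the bytes back on save, one way around it is to strip them outside the editor. A minimal Python sketch, assuming the file is otherwise plain UTF-8; the function name rewrite_without_bom is hypothetical, and it is safer to try it on a copy first.

```python
import codecs
from pathlib import Path


def rewrite_without_bom(path):
    """Remove a leading UTF-8 byte order mark from the file, in place.

    Returns True if a BOM was found and removed, False if the file
    was left untouched.
    """
    p = Path(path)
    data = p.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Write back everything after the three BOM bytes, unchanged.
    p.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Calling rewrite_without_bom("sitemap.xml") on a copy of the file and then reopening it in the hex editor should show whether the bytes stay away. If they come back only after the next save in the editor, the editor's encoding setting is the place to look.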
 
LVL 82

Accepted Solution

by:
Dave Baldwin earned 125 total points
ID: 34899676
OK, you will never be able to 'validate' that page in an HTML validator, because it is not HTML.  It is not XHTML or any other form of HTML, just plain XML.

There is nothing wrong with your 'sitemap.xml' file.  It looks just like the ones for my websites.
 
LVL 12

Assisted Solution

by:Amick
Amick earned 125 total points
ID: 34900261
TrueBlue - This is not an issue.  As DaveBaldwin indicated, these bytes are the Unicode byte order mark, and the fact that the file doesn't pass the HTML validator is inconsequential because this is XML.  I only mentioned the bytes to explain the likely source of the validator's complaint about a character not being allowed in the prolog.

As a practical matter, you've created a valid, workable sitemap, and there is nothing more that you need to do.  You can turn your attention to more profitable matters with the assurance you've done this right.