coder1313514512456 asked:

C# XmlWriter produces file header with special characters

Ok, simple question.

I have a small block of code that creates some XML using XmlWriter in C# (running .NET 4.0.21006.0).

The code creates a simple XML file, but the header is preceded by what look like 3 special characters.  (When I type the file in DOS, it looks like a Union symbol followed by a double upper right border followed by a single upper right border).  If I open the file in something like Notepad or Notepad++, I do not see these characters, but they're there when I type the file from the command line.

I am using an XmlWriterSettings object with default settings.  Is this the problem?  The rest of the XML file appears to be perfectly fine.

The code snippet provided below shows the characters "n++" instead of the three characters; for some reason they were converted when I pasted them into the Experts Exchange website.

How can I fix this?


// trimmed down from the actual code
// (eg- try/catch blocks removed):

string filename = "something.xml";
XmlWriter writer;
XmlWriterSettings settings;
settings = new XmlWriterSettings();
settings.Indent = true;
settings.NewLineChars = "\r\n";
writer = XmlWriter.Create(filename, settings);
writer.WriteStartDocument(true);        // header

writer.WriteStartElement("garden");
writer.WriteAttributeString("total", "1");
writer.WriteElementString("vegetable", "carrot");
writer.WriteEndElement();
writer.Close();

// ----------------------------
// Produces this result:

n++<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<garden total="1">
  <vegetable>carrot</vegetable>
</garden>
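
One way to see what those three characters really are is to look at the raw bytes instead of the console rendering. The sketch below prints the first bytes of a file as hex; the class and method names are made up for illustration. If the result is EF-BB-BF, the file starts with the UTF-8 byte order mark. Assuming the console uses code page 437 (the usual US default), those bytes display as ∩, ╗ and ┐, which would match the description above, while editors such as Notepad silently hide them.

```csharp
using System;
using System.IO;

static class BomProbe
{
    // Returns the first `count` bytes of the file as hex, e.g. "EF-BB-BF".
    // Hypothetical helper for diagnosis only; not part of the original code.
    public static string FirstBytesHex(string path, int count)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] buffer = new byte[count];
            int read = stream.Read(buffer, 0, count);
            // Format only the bytes that were actually read.
            return BitConverter.ToString(buffer, 0, read);
        }
    }
}
```

For example, `Console.WriteLine(BomProbe.FirstBytesHex("something.xml", 3));` shows the three leading bytes of the generated file.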


dukestaTAI

Use an XmlTextWriter and set the encoding:

XmlTextWriter writer = new XmlTextWriter(filename, System.Text.Encoding.UTF8);

The code you provided should not write any "n++" to the output file. Below is the same code, refactored. Also check what else you are doing with this file.

string filename = "something.xml";
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.NewLineChars = Environment.NewLine;

using (XmlWriter writer = XmlWriter.Create(filename, settings))
{
    writer.WriteStartDocument(true);
    writer.WriteStartElement("garden");
    writer.WriteAttributeString("total", "1");
    writer.WriteElementString("vegetable", "carrot");
    writer.WriteEndElement();
}


coder1313514512456 (ASKER)

It would appear that both suggestions have the same effect:  they write those first 3 characters, and then do the rest of the xml.
In the case of the XmlTextWriter, I don't get the indenting, probably because I'm not using a constructor that has some kind of settings class, however I haven't seen one of those either.
 
Not sure what to do.  Any suggestions?  Again, when I use something like Notepad I don't see these preceding 3 characters (they look like "n++" on this website, but in the console they look quite different).
Thanks again for the suggestions.
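
On the indenting point: XmlTextWriter has no settings object, but it exposes Formatting and Indentation properties that serve the same purpose. A minimal sketch, assuming the same garden document as in the question (the class and method names are hypothetical). It only addresses indentation; with Encoding.UTF8 the three leading bytes are still written.

```csharp
using System.Text;
using System.Xml;

static class IndentedTextWriterSketch
{
    // Writes the sample document with XmlTextWriter and indentation enabled.
    public static void WriteGarden(string filename)
    {
        using (XmlTextWriter writer = new XmlTextWriter(filename, Encoding.UTF8))
        {
            writer.Formatting = Formatting.Indented; // indent child elements
            writer.Indentation = 2;                  // two spaces per level

            writer.WriteStartDocument(true);
            writer.WriteStartElement("garden");
            writer.WriteAttributeString("total", "1");
            writer.WriteElementString("vegetable", "carrot");
            writer.WriteEndElement();
        }
    }
}
```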
 
ASKER CERTIFIED SOLUTION
feenix

(The text of the accepted solution is not reproduced here; on the original page it is available only to Experts Exchange members.)

feenix, that would be the answer!  Thanks!
 
What IS that thing, anyway?  byte order marks?  Whose idea was this?  And this is for...???
 
Thanks again feenix, you get my thanks (and points)!
 
Why Microsoft made this the default, I have no idea.  Thanks feenix, perfect.  And thanks to the others, I just really needed to get to the bottom of this.

The byte order mark (BOM) is there to tell the reading program whether the data is in big-endian or little-endian format. UTF-8 has no byte-order ambiguity, so the mark is usually not needed there, but in UTF-16 it is useful. The characters were chosen so that they are very unlikely to appear together at the start of a normal text file (in any language), so there won't be any problems in detection.
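
The accepted solution itself is not shown in this thread, so the following is only a sketch of one common way to suppress the mark, not necessarily what feenix posted. It assumes the original XmlWriter code and swaps the default encoding for a UTF8Encoding constructed with `false`, which asks for UTF-8 output without the leading EF BB BF bytes. The class and method names are hypothetical.

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class NoBomXml
{
    // Writes the sample document as UTF-8 without a byte order mark.
    public static void WriteGarden(string filename)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        // false = do not emit the UTF-8 identifier (the BOM) at the start.
        settings.Encoding = new UTF8Encoding(false);

        using (XmlWriter writer = XmlWriter.Create(filename, settings))
        {
            writer.WriteStartDocument(true);
            writer.WriteStartElement("garden");
            writer.WriteAttributeString("total", "1");
            writer.WriteElementString("vegetable", "carrot");
            writer.WriteEndElement();
        }
    }

    // True if the file begins with the UTF-8 BOM bytes EF BB BF.
    public static bool StartsWithUtf8Bom(string filename)
    {
        byte[] bytes = File.ReadAllBytes(filename);
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }
}
```

The XML declaration should still read encoding="utf-8"; only the three leading bytes are expected to disappear, which can be checked with `NoBomXml.StartsWithUtf8Bom("something.xml")`.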