C# XmlWriter produces file header with special characters

Ok, simple question.

I have a small block of code that creates some XML using XmlWriter in C# (running .NET 4.0.21006.0).

The code creates a simple XML file, but the header is preceded by what looks like 3 special characters.  (When I type the file in DOS, it looks like a union symbol followed by a double upper-right border followed by a single upper-right border.)  If I open the file in something like Notepad or Notepad++, I do not see these characters, but they're there when I type the file from the command line.

I am using an XmlWriterSettings object with default settings.  Is this the problem?  The rest of the XML file appears to be perfectly fine.

The code snippet provided below shows the characters "n++" instead of the three characters; for some reason when I pasted them into the experts exchange website it did this.

How can I fix this?


// trimmed down from the actual code
// (eg- try/catch blocks removed):

string filename = "something.xml";
XmlWriter writer;
XmlWriterSettings settings;
settings = new XmlWriterSettings();
settings.Indent = true;
settings.NewLineChars = "\r\n";
writer = XmlWriter.Create(filename, settings);
writer.WriteStartDocument(true);        // header

writer.WriteStartElement("garden");
writer.WriteAttributeString("total", "1");
writer.WriteElementString("vegetable", "carrot");
writer.WriteEndElement();
writer.Close();

// ----------------------------
// Produces this result:

n++<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<garden total="1">
  <vegetable>carrot</vegetable>
</garden>

coder1313514512456 Asked:

dukestaTAI Commented:
Use an XmlTextWriter and set the encoding:

XmlTextWriter writer = new XmlTextWriter(filename, System.Text.Encoding.UTF8);

lazyberezovsky Commented:
The code you provided should not write any "n++" to the output file. Below is the same code, refactored. Also check what else you are doing with this file.
string filename = "something.xml";
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.NewLineChars = Environment.NewLine;

using (XmlWriter writer = XmlWriter.Create(filename, settings))
{
    writer.WriteStartDocument(true);
    writer.WriteStartElement("garden");
    writer.WriteAttributeString("total", "1");
    writer.WriteElementString("vegetable", "carrot");
    writer.WriteEndElement();
}

coder1313514512456 (Author) Commented:
It would appear that both suggestions have the same effect:  they write those first 3 characters, and then do the rest of the xml.
In the case of the XmlTextWriter, I don't get the indenting, probably because I'm not using a constructor that has some kind of settings class, however I haven't seen one of those either.
 
Not sure what to do.  Any suggestions?  Again, when I use something like Notepad I don't see these preceding 3 characters (they look like "n++" on this website, but in the console they look quite different).
Thanks again for the suggestions.
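
On the missing indentation: XmlTextWriter has no settings class, but it exposes formatting as properties on the writer itself. A minimal sketch, assuming the same sample document as in the question (the class and method names here are made up for illustration). Passing Encoding.UTF8 is copied from the suggestion above and, as noted, does not by itself change the leading bytes.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class IndentedTextWriterSketch
{
    // Hypothetical helper: writes the sample "garden" document with indentation.
    public static void WriteIndented(string path)
    {
        using (var writer = new XmlTextWriter(path, Encoding.UTF8))
        {
            // Indentation is configured on the writer, not through XmlWriterSettings.
            writer.Formatting = Formatting.Indented;
            writer.Indentation = 2;      // number of IndentChar per level
            writer.IndentChar = ' ';

            writer.WriteStartDocument(true);
            writer.WriteStartElement("garden");
            writer.WriteAttributeString("total", "1");
            writer.WriteElementString("vegetable", "carrot");
            writer.WriteEndElement();
        }
    }

    static void Main()
    {
        WriteIndented("something.xml");
        Console.WriteLine(File.ReadAllText("something.xml"));
    }
}
```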
 

feenix Commented:
The three characters are the byte order mark (BOM), which in UTF-8 is the byte sequence EF BB BF. The default UTF-8 encoding object (Encoding.UTF8) emits it. Create the encoding yourself with the BOM turned off, as follows, and you'll get rid of them.
settings.Encoding = new UTF8Encoding(false);
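
To see the effect of that one line, here is a small sketch that writes the question's document twice, once with Encoding.UTF8 and once with new UTF8Encoding(false), and then checks the first three bytes of each file. The class, method, and file names are assumptions for illustration; only the XmlWriter calls come from the thread.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class BomFreeXml
{
    // Writes the sample document from the question. The encoding is a parameter
    // so the BOM and BOM-free variants can be compared side by side.
    public static void WriteGarden(string path, Encoding encoding)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            Encoding = encoding
        };

        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            writer.WriteStartDocument(true);
            writer.WriteStartElement("garden");
            writer.WriteAttributeString("total", "1");
            writer.WriteElementString("vegetable", "carrot");
            writer.WriteEndElement();
        }
    }

    // True if the file starts with the UTF-8 byte order mark EF BB BF.
    public static bool StartsWithUtf8Bom(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }

    static void Main()
    {
        WriteGarden("with-bom.xml", Encoding.UTF8);
        WriteGarden("without-bom.xml", new UTF8Encoding(false));

        Console.WriteLine(StartsWithUtf8Bom("with-bom.xml"));     // True
        Console.WriteLine(StartsWithUtf8Bom("without-bom.xml"));  // False
    }
}
```

Both files still declare encoding="utf-8" in the XML header; only the three leading bytes differ.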

coder1313514512456 (Author) Commented:
feenix, that would be the answer!  Thanks!
 
What IS that thing, anyway?  byte order marks?  Whose idea was this?  And this is for...???
 
Thanks again feenix, you get my thanks (and points)!
 
coder1313514512456 (Author) Commented:
Why Microsoft made this the default, I have no idea.  Thanks feenix, perfect.  And thanks to the others; I just really needed to get to the bottom of this.

feenix Commented:
The byte order mark is there to tell the reading program whether the data is in big-endian or little-endian format. It's not usually needed in UTF-8, where byte order is not an issue and the mark only identifies the file as UTF-8, but in UTF-16 it can be useful. The bytes were chosen so that they are very unlikely to appear together at the start of a normal text file (in any language), so there won't be any problems in detection.
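
As a rough illustration of the point about byte order, the sketch below prints the preamble (the BOM bytes) that a few of the built-in .NET encodings report through Encoding.GetPreamble(). The class and helper names are made up for this example.

```csharp
using System;
using System.Text;

static class PreambleDemo
{
    // Formats an encoding's preamble (BOM) as hyphen-separated hex,
    // or an empty string when the encoding emits no BOM.
    public static string PreambleHex(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetPreamble());
    }

    static void Main()
    {
        Console.WriteLine(PreambleHex(Encoding.UTF8));              // EF-BB-BF
        Console.WriteLine(PreambleHex(new UTF8Encoding(false)));    // (empty line)
        Console.WriteLine(PreambleHex(Encoding.Unicode));           // FF-FE (UTF-16, little endian)
        Console.WriteLine(PreambleHex(Encoding.BigEndianUnicode));  // FE-FF (UTF-16, big endian)
    }
}
```

The two UTF-16 preambles are the same character stored in opposite byte orders, which is how a reader can tell the two layouts apart.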