Solved

An XML and File import/export question with C#

Posted on 2013-05-28
Last Modified: 2013-05-29
Hi experts,
I loaded an XML file and saved it to another XML file using LINQ to XML in C#:
_xmlDocument = XDocument.Load("input.xml", LoadOptions.PreserveWhitespace);
_xmlDocument.Save("output.xml", SaveOptions.DisableFormatting);

I then opened both files and printed out their contents byte by byte:
using (FileStream fs = File.Open(pathAndFileName, FileMode.Open))
{
    int size = (int)fs.Length;
    byte[] data = new byte[size];
    fs.Read(data, 0, size);
    foreach (byte b in data)
        Console.WriteLine(b);
}

But when I compared the printed output, the file "output.xml" contained some additional characters: 239, 187, 191, where
    239: Latin small letter i with diaeresis
    187: Right double angle quotes
    191: Inverted question mark

It also dropped a 32 (i.e., a space) that was in "input.xml".

My question is: is there any way to preserve the format of the input without adding funny characters or discarding the space character?

They look identical in a text editor, though.

Thanks in advance.
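
Editor's note: the byte values 239, 187, 191 (0xEF 0xBB 0xBF) are the UTF-8 byte order mark (BOM), which most text editors hide. That would explain why the two files look identical on screen. The following is a minimal sketch for checking whether a file starts with that preamble; the class and method names (`BomCheck`, `HasUtf8Bom`) are hypothetical and not part of the original post.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomCheck
{
    // Hypothetical helper: returns true when the file begins with the
    // UTF-8 preamble bytes EF BB BF (decimal 239, 187, 191).
    public static bool HasUtf8Bom(string path)
    {
        byte[] bom = Encoding.UTF8.GetPreamble();
        byte[] head = new byte[bom.Length];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop until done.
            while (total < head.Length)
            {
                int n = fs.Read(head, total, head.Length - total);
                if (n == 0) break;
                total += n;
            }
        }
        if (total < bom.Length) return false;
        for (int i = 0; i < bom.Length; i++)
        {
            if (head[i] != bom[i]) return false;
        }
        return true;
    }
}
```

Calling `BomCheck.HasUtf8Bom("output.xml")` and `BomCheck.HasUtf8Bom("input.xml")` should show whether the BOM is the only source of the extra bytes.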
Question by:dominicwong
LVL 42

Accepted Solution

by:
sedgwick earned 500 total points
ID: 39203627
Yes, use a StreamWriter with Encoding.ASCII:
using (StreamWriter sw = new StreamWriter(@"output.xml", false, Encoding.ASCII))
{
    _xmlDocument.Save(sw, SaveOptions.DisableFormatting);
}

 

Author Comment

by:dominicwong
ID: 39203646
Thanks sedgwick for your prompt response.
It resolves the funny character problem. But now it creates one issue:

The original file was:
<?xml version="1.0" encoding="utf-8"?>

Now, the saved file became:
<?xml version="1.0" encoding="us-ascii"?>
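
Editor's note: the declaration changes because `XDocument.Save(TextWriter, ...)` takes the encoding name from the writer. One alternative, sketched below under the assumption that the only unwanted bytes are the UTF-8 BOM, is to pass a `UTF8Encoding` constructed with `encoderShouldEmitUTF8Identifier: false`. The writer then reports "utf-8" but emits no preamble. The names `XmlRoundTrip` and `Copy` are hypothetical.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XmlRoundTrip
{
    // Hypothetical helper: loads an XML file and saves it again as UTF-8
    // without a byte order mark.
    public static void Copy(string inputPath, string outputPath)
    {
        XDocument doc = XDocument.Load(inputPath, LoadOptions.PreserveWhitespace);

        // UTF8Encoding(false) has an empty preamble, so StreamWriter
        // writes no BOM at the start of the file.
        using (StreamWriter sw = new StreamWriter(outputPath, false, new UTF8Encoding(false)))
        {
            doc.Save(sw, SaveOptions.DisableFormatting);
        }
    }
}
```

This does not guarantee a byte-for-byte copy: `LoadOptions.PreserveWhitespace` keeps whitespace in the document content, but the serializer may still normalize whitespace inside markup, such as a space within the XML declaration or a tag. If the bytes must match exactly, copying the file with `File.Copy` or computing the CRC on the original file avoids re-serialization altogether.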

Author Comment

by:dominicwong
ID: 39203650
FYI, the reason they need to be identical is that I need to calculate the CRC of the file later in the program.
LVL 42

Expert Comment

by:sedgwick
ID: 39203661
That's true; however, if you compare the bytes of the two XML files, you will find they are identical:

List<byte> xml1 = new List<byte>();
List<byte> xml2 = new List<byte>();
using (FileStream fs = File.Open(@"input.xml", FileMode.Open))
{
    int size = (int)fs.Length;
    byte[] data = new byte[size];
    fs.Read(data, 0, size);
    xml1.AddRange(data);
}
using (FileStream fs = File.Open(@"output.xml", FileMode.Open))
{
    int size = (int)fs.Length;
    byte[] data = new byte[size];
    fs.Read(data, 0, size);
    xml2.AddRange(data);
}

// Except() is a set difference: it yields the distinct byte values
// that occur in xml1 but nowhere in xml2.
var countDiffBytes = xml1.Except(xml2).Count();

countDiffBytes is equal to 0, meaning they will pass the CRC check.
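
Editor's note: `Enumerable.Except` compares distinct values and ignores order and repetition, so a count of 0 can be reported even when two files differ, and it is not a reliable predictor of a matching CRC. A minimal sketch of an ordered, byte-for-byte comparison follows; the names `FileCompare` and `AreIdentical` are hypothetical.

```csharp
using System.IO;
using System.Linq;

public static class FileCompare
{
    // Hypothetical helper: true only when both files contain the same
    // bytes in the same order (and therefore have the same length).
    public static bool AreIdentical(string pathA, string pathB)
    {
        byte[] a = File.ReadAllBytes(pathA);
        byte[] b = File.ReadAllBytes(pathB);
        return a.SequenceEqual(b);
    }
}
```

For example, files containing the bytes 1, 2, 2 and 2, 1, 1 give an `Except` count of 0, yet `AreIdentical` returns false for them.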

Author Comment

by:dominicwong
ID: 39203678
Sorry for the confusion. The later CRC calculation wasn't included in the code above (for clarity).
The actual code when it comes to calculating CRC is as follows.
Therefore, I do need them to be completely identical; otherwise, the CRC will be different.

Crc32 crc32 = new Crc32();
String hash = String.Empty;
using (FileStream fs = File.Open(pathAndFileName, FileMode.Open))
{
    foreach (byte b in crc32.ComputeHash(fs))
        hash += b.ToString("x2").ToLower();
}
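
Editor's note: the `Crc32` class used above is not shown in the thread, so its exact algorithm is unknown. As an illustration of why every byte matters, here is a minimal, self-contained sketch of the common CRC-32 variant (reflected polynomial 0xEDB88320, as used by ZIP and PNG). Whether it matches the asker's `Crc32` class is an assumption, and the name `SimpleCrc32` is hypothetical.

```csharp
using System;
using System.IO;

public static class SimpleCrc32
{
    // Lookup table for the reflected polynomial 0xEDB88320.
    private static readonly uint[] Table = BuildTable();

    private static uint[] BuildTable()
    {
        uint[] table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint c = i;
            for (int k = 0; k < 8; k++)
                c = (c & 1) != 0 ? 0xEDB88320u ^ (c >> 1) : c >> 1;
            table[i] = c;
        }
        return table;
    }

    // Computes the CRC-32 of a byte array.
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
            crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
        return crc ^ 0xFFFFFFFFu;
    }

    // Computes the CRC-32 of a file as eight lowercase hex digits.
    public static string ComputeFileHex(string path)
    {
        return Compute(File.ReadAllBytes(path)).ToString("x8");
    }
}
```

The standard check value for this variant is 0xCBF43926 for the ASCII string "123456789". Because the checksum covers every byte, a leading BOM or a single dropped space is enough to change the result.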


Author Comment

by:dominicwong
ID: 39203725
I managed to get the software requirement to change from "utf-8" to "us-ascii".
Now, it is OK. The problem is resolved.

Thanks for your help.

Author Closing Comment

by:dominicwong
ID: 39203727
Thank you.