Solved

An XML and File import/export question with C#

Posted on 2013-05-28
Last Modified: 2013-05-29
Hi experts,
I load an XML file and save it to another XML file using LINQ to XML in C#.
_xmlDocument = XDocument.Load("input.xml", LoadOptions.PreserveWhitespace);
_xmlDocument.Save("output.xml", SaveOptions.DisableFormatting);


I then opened both files and printed their contents byte by byte.
// Read the whole file and print every byte as a decimal value.
byte[] data = File.ReadAllBytes(pathAndFileName);
foreach (byte b in data)
    Console.WriteLine(b);


But when I compared the printed output, the file "output.xml" contained three additional bytes: 239, 187, 191,
where 239: Latin small letter i with diaeresis
      187: Right double angle quote
      191: Inverted question mark

It also dropped a 32 (i.e. a space) that was in "input.xml".

My question is: is there any way to preserve the format of the input without adding these extra characters or discarding the space character?

They look identical in a text editor though.

Thanks in advance.
Question by:dominicwong
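
A note on the three extra bytes described above: 239, 187, 191 (0xEF 0xBB 0xBF) are the UTF-8 byte order mark (BOM), not text content, which is why both files look the same in a text editor. Below is a minimal sketch for checking whether a file starts with that marker. The class and method names are hypothetical and only for illustration.

```csharp
using System.IO;

public static class BomCheck
{
    // The UTF-8 byte order mark: 0xEF 0xBB 0xBF (239, 187, 191 in decimal).
    private static readonly byte[] Utf8Bom = { 0xEF, 0xBB, 0xBF };

    // Returns true when the buffer begins with the UTF-8 BOM.
    public static bool StartsWithUtf8Bom(byte[] data)
    {
        if (data == null || data.Length < Utf8Bom.Length)
            return false;
        for (int i = 0; i < Utf8Bom.Length; i++)
        {
            if (data[i] != Utf8Bom[i])
                return false;
        }
        return true;
    }

    // Convenience helper that reads the file from disk first.
    public static bool FileStartsWithUtf8Bom(string path)
    {
        return StartsWithUtf8Bom(File.ReadAllBytes(path));
    }
}
```

Calling BomCheck.FileStartsWithUtf8Bom("output.xml") would show whether the saved file carries the marker. The dropped space (32) is probably a separate effect: LoadOptions.PreserveWhitespace keeps whitespace between nodes, but whitespace inside markup, such as a space before ?> in the declaration, is not part of the document model and is likely regenerated on save. That is a guess, since the input file is not shown.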
7 Comments
 
LVL 42

Accepted Solution

by:
sedgwick earned 2000 total points
ID: 39203627
Yes, use a StreamWriter with Encoding.ASCII:
using (StreamWriter sw = new StreamWriter(@"output.xml", false, Encoding.ASCII))
{
    _xmlDocument.Save(sw, SaveOptions.DisableFormatting);
}

 

Author Comment

by:dominicwong
ID: 39203646
Thanks sedgwick for your prompt response.
It resolves the extra character problem, but it creates a new issue.

The original file had:
<?xml version="1.0" encoding="utf-8"?>

Now the saved file has:
<?xml version="1.0" encoding="us-ascii"?>
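
Not from the thread, but a possible alternative when the utf-8 declaration has to stay: write through a StreamWriter that uses a UTF8Encoding constructed without a byte order mark. This is a minimal sketch; the class and method names are hypothetical. When saving to a TextWriter, the declaration's encoding follows the writer's encoding, so the output should keep encoding="utf-8" while omitting the 239, 187, 191 prefix.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XmlSaveSketch
{
    // Saves the document as UTF-8 without a byte order mark.
    public static void SaveUtf8NoBom(XDocument document, string path)
    {
        // 'false' means: do not emit the BOM (EF BB BF) at the start of the stream.
        var utf8NoBom = new UTF8Encoding(false);
        using (var writer = new StreamWriter(path, false, utf8NoBom))
        {
            document.Save(writer, SaveOptions.DisableFormatting);
        }
    }
}
```

Usage would be XmlSaveSketch.SaveUtf8NoBom(_xmlDocument, "output.xml"). This only addresses the BOM and the declaration; it does not guarantee byte-identical output, because the serializer may still normalize whitespace inside tags or attribute quoting.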
 

Author Comment

by:dominicwong
ID: 39203650
FYI, the files need to be identical because I calculate the CRC of the file later in the program.
LVL 42

Expert Comment

by:sedgwick
ID: 39203661
That's true. However, if you compare the bytes of the two XML files, you will find they are identical:

List<byte> xml1 = new List<byte>();
List<byte> xml2 = new List<byte>();
using (FileStream fs = File.Open(@"input.xml", FileMode.Open))
{
    int size = (int)fs.Length;
    byte[] data = new byte[size];
    fs.Read(data, 0, size);
    xml1.AddRange(data);
}
using (FileStream fs = File.Open(@"output.xml", FileMode.Open))
{
    int size = (int)fs.Length;
    byte[] data = new byte[size];
    fs.Read(data, 0, size);
    xml2.AddRange(data);
}

var countDiffBytes = xml1.Except(xml2).Count();

countDiffBytes is equal to 0, meaning they will pass the CRC check.
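
A caution on the comparison above: Enumerable.Except is a set difference, so it only reports distinct byte values that occur in the first file and never in the second. It ignores order, length and repetition, and a result of 0 does not show that two files are byte-for-byte identical. A minimal sketch contrasting it with Enumerable.SequenceEqual follows; the class and method names are hypothetical.

```csharp
using System.Linq;

public static class ByteCompare
{
    // Set-based check, as in the snippet above: counts distinct byte values
    // present in 'a' but absent from 'b'. Ignores order, length and repeats.
    public static int CountByExcept(byte[] a, byte[] b)
    {
        return a.Except(b).Count();
    }

    // Position-by-position check: true only when both arrays have the same
    // length and the same byte at every index.
    public static bool AreIdentical(byte[] a, byte[] b)
    {
        return a.SequenceEqual(b);
    }
}
```

For example, the byte arrays { 97, 98 } and { 98, 98, 97 } give a CountByExcept of 0 although AreIdentical is false. A CRC depends on every byte and its position, so only the second kind of check says anything about whether two files will produce the same checksum.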

Author Comment

by:dominicwong
ID: 39203678
Sorry for the confusion. I left the later CRC calculation out of the code above for clarity.
The actual code that calculates the CRC is as follows.
Therefore, I do need the files to be completely identical; otherwise, the CRC will be different.

Crc32 crc32 = new Crc32();
String hash = String.Empty;
using (FileStream fs = File.Open(pathAndFileName, FileMode.Open))
{
    // "x2" already yields lowercase hex digits.
    foreach (byte b in crc32.ComputeHash(fs))
        hash += b.ToString("x2");
}
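
The Crc32 class used above is not shown in the thread, so its exact algorithm is unknown. As an illustration of why a prepended BOM matters, here is a minimal sketch of the common CRC-32 variant (reflected polynomial 0xEDB88320, as used by zip and PNG). It is an assumption that the asker's class behaves similarly, and the class name below is hypothetical.

```csharp
public static class Crc32Sketch
{
    // Bitwise CRC-32 over a byte array (reflected polynomial 0xEDB88320).
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;              // standard initial value
        foreach (byte b in data)
        {
            crc ^= b;                        // fold the next byte into the low bits
            for (int bit = 0; bit < 8; bit++)
            {
                // Shift right; apply the polynomial when the dropped bit was 1.
                crc = (crc & 1u) != 0 ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
            }
        }
        return ~crc;                         // standard final inversion
    }
}
```

Every input byte feeds into the running value, so three extra bytes at the start of the file, or one missing space, change the result even when the XML content is logically the same.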

 

Author Comment

by:dominicwong
ID: 39203725
I managed to get the software requirement to change from "utf-8" to "us-ascii".
Now, it is OK. The problem is resolved.

Thanks for your help.
 

Author Closing Comment

by:dominicwong
ID: 39203727
Thank you.