trevor1940

C#; Merging XML files

I need to merge a bunch of XML files and save them into one file, removing any duplicate entries.
I'm getting an error in ConsoleApplication2 at the line:         private history = new history();

Severity	Code	Description	Project	File	Line	Suppression State
Error	CS1519	Invalid token '=' in class, struct, or interface member declaration	ConsoleApplication2	d:\VB\Test\ConsoleApplication2\ConsoleApplication2\Program.cs	13	Active



Any suggestions as to what the above error means?

Unless there is a better way, I was going to sort on id to filter out duplicates, with the newest file taking precedence over older ones.


using System.IO;
using xml;

namespace ConsoleApplication2
{
    class Program
    {
        private history = new history();

        static void Main(string[] args)
        {
            string RootDir = @"C:\Users\user\AppData\Roaming\AppName\user";
            
            if (Directory.Exists(RootDir))
            {
                var Files = Directory.EnumerateFiles(RootDir, "history.xml", SearchOption.AllDirectories);
                foreach (string File in Files)
                {
                    this.history += xml.history.Load(File);
                }

            }
        }
    }
}



xml.cs

using System.Xml.Serialization;

namespace xml
{
    
    public class history
    {
        List<post> post{ get; set; } = new List<post>();
        private static string xmlFile { get; set; }

        public static history Load(string xmlFile)
        {


            XmlSerializer deserializer = new XmlSerializer(typeof(history));
            using (TextReader reader = new StreamReader(xmlFile))
            {
                history history = (history)deserializer.Deserialize(reader);
                return history;
            }
        }

        public void Save(string xmlFile)
        {
            XmlSerializer serializer = new XmlSerializer(typeof(history));
            using (TextWriter writer = new StreamWriter(xmlFile))
            {
                serializer.Serialize(writer, this);
            }
        }


    }

    public class post
    {
        public string name { get; set; }
        public string url { get; set; }
        public int id { get; set; }
        public int number { get; set; }
        public int imageCount { get; set; }
        public int downloadedImagesCount { get; set; }
        public string finished { get; set; }
    }
}



 XML file

<history>
  <post>
    <name>My Collection </name>
    <url>https://example.com/hread.php?p=12345</url>
    <id>12345</id>
    <number>1</number>
    <imageCount>132</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
...............
</history>


.NET Programming, C#, XML

it_saige

You forgot to specify the Type; e.g. -
private static history history = new history();


-saige-
trevor1940

ASKER

Erm, why does history need to be static?

I've changed the program to the code below.

It didn't like history += history.Load(File);
So won't 'history' get overwritten by the next file?

Also, after loading each file, why can't I access history.post.id?

To do this
history.post.id.Sort((x, y) => x.Name.CompareTo(y.Name));



    class Program
    {
        private static history history = new history();

        static void Main(string[] args)
        {
           string RootDir = @"C:\Users\user\AppData\Roaming\AppName\user";
            
            if (Directory.Exists(RootDir))
            {
                var Files = Directory.EnumerateFiles(RootDir, "history.xml", SearchOption.AllDirectories);
                foreach (string File in Files)
                {
                    history = history.Load(File);
                }
                // no post after the dot
               history.
            }
        }
    }

it_saige

To answer your first question, it is static because you are accessing it via a static method.  In order to not require the static keyword, you would either call it via a non-static method or create a new instance of the Program class and use this instance to access the history field.
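To illustrate the static point, here is a minimal sketch. The FieldDemo class and its field names are made up for this example and are not part of the code in this thread. It shows that a static method can use a static field directly, but needs an object to reach an instance field.

```csharp
using System;
using System.Collections.Generic;

class FieldDemo
{
    // Static field: one copy that belongs to the type itself.
    private static List<string> sharedNames = new List<string>();

    // Instance field: one copy per FieldDemo object.
    private List<string> ownNames = new List<string>();

    public static string Run()
    {
        sharedNames.Add("static");      // OK: static method, static field

        // ownNames.Add("oops");        // would not compile (CS0120): a static method has no 'this'

        var demo = new FieldDemo();     // create an instance first...
        demo.ownNames.Add("instance");  // ...then reach the instance field through it

        return sharedNames[0] + ", " + demo.ownNames[0];
    }

    static void Main(string[] args)
    {
        Console.WriteLine(Run());       // prints "static, instance"
    }
}
```

For a small console program, keeping the field static is usually the simpler of the two options.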

As for your implementation, I believe this is the direction you are heading:
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

namespace EE_Q29135911
{
    class Program
    {
        static List<History> history = new List<History>();

        static void Main(string[] args)
        {
            string directory = @"C:\Users\user\AppData\Roaming\AppName\user";

            if (Directory.Exists(directory))
            {
                var files = Directory.EnumerateFiles(directory, "history.xml", SearchOption.AllDirectories);
                foreach (string file in files)
                {
                    history.Add(History.Load(file));
                }
            }
        }
    }

    class History
    {
        public List<Post> post { get; set; } = new List<Post>();

        public static History Load(string file)
        {
            var deserializer = new XmlSerializer(typeof(History));
            using (TextReader reader = new StreamReader(file))
            {
                return deserializer.Deserialize(reader) as History;
            }
        }

        public void Save(string file)
        {
            var serializer = new XmlSerializer(typeof(History));
            using (TextWriter writer = new StreamWriter(file))
            {
                serializer.Serialize(writer, this);
            }
        }
    }

    public class Post
    {
        public string Name { get; set; }
        public string Url { get; set; }
        public int Id { get; set; }
        public int Number { get; set; }
        public int ImageCount { get; set; }
        public int DownloadedImagesCount { get; set; }
        public string Finished { get; set; }
    }
}


-saige-
trevor1940

ASKER

I'm going to post what I've got so far because I don't understand a couple of things.

Why can't I access "History.post" or "history.post" at ###1?

and

When I try to run this, I get the following at ###2:
System.InvalidOperationException was unhandled
  HResult=-2146233079
  Message=MergeXML.History is inaccessible due to its protection level. Only public types can be processed.
  Source=System.Xml
  StackTrace:
       at System.Xml.Serialization.TypeDesc.CheckSupported()
       at System.Xml.Serialization.TypeScope.GetTypeDesc(Type type, MemberInfo source, Boolean directReference, Boolean throwOnError)
       at System.Xml.Serialization.ModelScope.GetTypeModel(Type type, Boolean directReference)
       at System.Xml.Serialization.XmlReflectionImporter.ImportTypeMapping(Type type, XmlRootAttribute root, String defaultNamespace)
       at System.Xml.Serialization.XmlSerializer..ctor(Type type, String defaultNamespace)
       at System.Xml.Serialization.XmlSerializer..ctor(Type type)
       at MergeXML.History.Load(String file) in d:\VB\Test\MergeXML\MergeXML\Program.cs:line 39
       at MergeXML.Program.Main(String[] args) in d:\VB\Test\MergeXML\MergeXML\Program.cs:line 24
       at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
       at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
       at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart()
  InnerException: 



using System.IO;
using System.Xml.Serialization;


namespace MergeXML
{
    class Program
    {
        static List<History> history = new List<History>();
        static void Main(string[] args)
        {
            string RootDir = @"C:\Users\user\AppData\Roaming\AppName\user";

            if (Directory.Exists(RootDir))
            {
                var files = Directory.EnumerateFiles(RootDir, "history.xml", SearchOption.AllDirectories);
                foreach (string file in files)
                {
                    history.Add(History.Load(file));
                }
            }
           // ###1
            History.Save(RootDir + "\\history_out.xml");
            
        }
    }
    class History
    {
        public List<Post> post { get; set; } = new List<Post>();

        #region Loading/Saving
        public static History Load(string file)
        {
            var deserializer = new XmlSerializer(typeof(History)); // ###2
            using (TextReader reader = new StreamReader(file))
            {
                return deserializer.Deserialize(reader) as History;
            }
        }

        internal static void Save(string file)
        {
            var serializer = new XmlSerializer(typeof(History));
            using (TextWriter writer = new StreamWriter(file))
            {
                serializer.Serialize(writer, file);
            }
        }
        #endregion
    }

    public class Post
    {
        public string name { get; set; }
        public string url { get; set; }
        public int id { get; set; }
        public int number { get; set; }
        public int imageCount { get; set; }
        public int downloadedImagesCount { get; set; }
        public string finished { get; set; }
    }
}

it_saige

History needs to be made public; e.g. -
using System.IO;
using System.Xml.Serialization;


namespace MergeXML
{
    class Program
    {
        static List<History> history = new List<History>();
        static void Main(string[] args)
        {
            string RootDir = @"C:\Users\user\AppData\Roaming\AppName\user";

            if (Directory.Exists(RootDir))
            {
                var files = Directory.EnumerateFiles(RootDir, "history.xml", SearchOption.AllDirectories);
                foreach (string file in files)
                {
                    history.Add(History.Load(file));
                }
            }
           // ###1
            History.Save(RootDir + "\\history_out.xml");
            
        }
    }

    public class History
    {
        public List<Post> post { get; set; } = new List<Post>();

        #region Loading/Saving
        public static History Load(string file)
        {
            var deserializer = new XmlSerializer(typeof(History)); // ###2
            using (TextReader reader = new StreamReader(file))
            {
                return deserializer.Deserialize(reader) as History;
            }
        }

        internal static void Save(string file)
        {
            var serializer = new XmlSerializer(typeof(History));
            using (TextWriter writer = new StreamWriter(file))
            {
                serializer.Serialize(writer, file);
            }
        }
        #endregion
    }

    public class Post
    {
        public string name { get; set; }
        public string url { get; set; }
        public int id { get; set; }
        public int number { get; set; }
        public int imageCount { get; set; }
        public int downloadedImagesCount { get; set; }
        public string finished { get; set; }
    }
}


-saige-
trevor1940

ASKER

Wow, this is getting really frustrating.
I really don't understand why I can't access "History.post" or "history.post".

I ran your code against the test file (I even moved it, thinking it might be a permissions problem).

I get the error below, claiming invalid XML.

I also attempted to create a post and save it as XML so I could compare:

      // ###1

            Post post = new Post()
            {
                name = "Hello World",
                url = "https://www.experts-exchange.com/questions/29135911/C-Merging-XML-files.html",
                id= 45678,
                number =7,
                imageCount =99,
                downloadedImagesCount = 99,
                finished ="true"

            };
            history.Add(post);
          History.Save(RootDir + "\\history_out.xml");



I get
Severity	Code	Description	Project	File	Line	Suppression State
Error	CS1503	Argument 1: cannot convert from 'MergeXML.Post' to 'MergeXML.History'	MergeXML	D:\Vb\Test\MergeXML\MergeXML\Program.cs	41	Active




Test XML (I added the XML declaration and namespace attributes thinking this might be the issue)
<?xml version="1.0" encoding="windows-1250"?>
<history xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <post>
    <name>My Collection </name>
    <url>https://example.com/hread.php?p=12345</url>
    <id>12345</id>
    <number>1</number>
    <imageCount>132</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
</history>



Error
System.InvalidOperationException was unhandled
  HResult=-2146233079
  Message=There is an error in XML document (2, 2).
  Source=System.Xml
  StackTrace:
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
       at System.Xml.Serialization.XmlSerializer.Deserialize(TextReader textReader)
       at MergeXML.History.Load(String file) in D:\Vb\Test\MergeXML\MergeXML\Program.cs:line 44
       at MergeXML.Program.Main(String[] args) in D:\Vb\Test\MergeXML\MergeXML\Program.cs:line 19
       at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
       at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
       at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart()
  InnerException: 
       HResult=-2146233079
       Message=<history xmlns=''> was not expected.
       Source=Microsoft.GeneratedCode
       StackTrace:
            at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderHistory.Read4_History()
       InnerException: 

it_saige

You can't add a Post to a List of History.  To add the post to a History instance, you would access its post list; e.g. -
var history = new History();
Post post = new Post()
{
    name = "Hello World",
    url = "https://www.experts-exchange.com/questions/29135911/C-Merging-XML-files.html",
    id= 45678,
    number =7,
    imageCount =99,
    downloadedImagesCount = 99,
    finished ="true"
};
history.post.Add(post);


-saige-
trevor1940

ASKER

I was hoping that would work.
All I'm doing now is trying to create an XML file with a single post; if I can do that, maybe I can read it back in again.

When I try to save, I now get:

There was an error generating the XML document.

  InnerException: 
       HResult=-2147467262
       Message=Unable to cast object of type 'System.String' to type 'MergeXML.History'.



at
        internal static void Save(string file)
        {
            var serializer = new XmlSerializer(typeof(History));
            using (TextWriter writer = new StreamWriter(file))
            {
                serializer.Serialize(writer, file); // Errors here
            }
        }


ASKER CERTIFIED SOLUTION
it_saige

[The text of the accepted solution is not available.]
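
The cast error quoted above happens because Serialize is handed the file name (a string) where a History object is expected. Below is a minimal sketch of a Save that takes the object to write as a second parameter. The signature Save(string file, History source) is borrowed from the later posts in this thread; the trimmed-down Post class and the temp-file path are assumptions made only for this example, and element naming is deliberately left aside.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Post
{
    public string name { get; set; }
    public int id { get; set; }
}

public class History
{
    public List<Post> post { get; set; } = new List<Post>();

    // The object to write is passed in explicitly.
    public static void Save(string file, History source)
    {
        var serializer = new XmlSerializer(typeof(History));
        using (TextWriter writer = new StreamWriter(file))
        {
            // The second argument must be a History, not the file name.
            serializer.Serialize(writer, source);
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        var history = new History();
        history.post.Add(new Post { name = "Hello World", id = 45678 });

        // Temp path used only for this sketch.
        string path = Path.Combine(Path.GetTempPath(), "history_out.xml");
        History.Save(path, history);

        Console.WriteLine(File.ReadAllText(path));
    }
}
```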
trevor1940

ASKER

Thank you, this is now working.

I need to understand why.

For instance, why do you need to declare XmlElement?

        [XmlElement(ElementName = "name")]
        public string Name { get; set; }



Also, I get this part: create a new History object, add a new post to it, then add the history to histories.

           History history = new History();
            Post post = new Post()
            {
                Name = "Hello World",
                Url = "https://www.experts-exchange.com/questions/29135911/C-Merging-XML-files.html",
                Id = 45678,
                Number = 7,
                ImageCount = 99,
                DownloadedImagesCount = 99,
                Finished = "true"
            };
            history.Posts.Add(post);
            histories.Add(history);
            History result = new History();

            foreach (var history in histories.Distinct())
            {
                result.Posts.AddRange(history.Posts);
            }

            
            History.Save(RootDir + "\\history_out.xml", result);



But why can't you save histories straight to a file?
History.Save(RootDir + "\\history_out.xml", histories);
gives
Severity	Code	Description	Project	File	Line	Suppression State
Error	CS1503	Argument 2: cannot convert from 'System.Collections.Generic.List<MergeXML.History>' to 'MergeXML.History'	MergeXML	D:\Vb\Test\MergeXML\MergeXML\Program.cs	42	Active

it_saige

1.  The elements in your XML files are all lower case, and XML serialization is case-sensitive.  I could have used all lower-case names for the classes and properties, but that goes against the coding standards/naming conventions of most of the industry, so the attributes map the lower-case element names onto the PascalCase properties.  For more information you can refer to - https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/inside-a-program/coding-conventions

2.  Because your static method, as a black box, should do one thing.  In this case it saves a single History: the serializer is created for typeof(History), so a List<History> has to be merged into one History before it can be passed in.
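
As a rough illustration of point 1, the sketch below maps lower-case element names onto PascalCase members with XmlRoot and XmlElement. The classes are cut down to two properties and the XML is an inline string, both assumptions made just to keep the example short. Without the attributes, the serializer would look for a root element named History and fail on history with the "was not expected" error seen earlier in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("history")]                 // root element is <history>, not <History>
public class History
{
    [XmlElement("post")]             // each <post> child becomes one list item
    public List<Post> Posts { get; set; } = new List<Post>();
}

public class Post
{
    [XmlElement("name")]
    public string Name { get; set; }

    [XmlElement("id")]
    public int Id { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        string xml = "<history><post><name>My Collection</name><id>12345</id></post></history>";

        var serializer = new XmlSerializer(typeof(History));
        using (var reader = new StringReader(xml))
        {
            var history = (History)serializer.Deserialize(reader);
            Console.WriteLine(history.Posts[0].Name);   // prints "My Collection"
            Console.WriteLine(history.Posts[0].Id);     // prints 12345
        }
    }
}
```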

-saige-
trevor1940

ASKER

Thank you for your time and patience.

There are aspects I don't totally understand.

"static List<History> histories = new List<History>();"

I think histories is a list of multiple instances of History, so 1 history.xml makes 1 History.

This
        History result = new History();

            foreach (var history in histories.Distinct())
            {
                result.Posts.AddRange(history.Posts);
            }


Combines all the posts into one History with a single list of posts, which in turn allows manipulation of that single list (e.g. saving, adding a single post, querying, etc.).
it_saige

Correct.  Per your requirements, you wanted to take multiple history files and merge them into one, so you need to add each History instance to a collection.

But Distinct really isn't needed: called on a list of History objects it only compares references, so it removes nothing here.  To get rid of duplicate posts, a better approach is to create a custom post comparer and use the Except method when adding posts; e.g. -
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;


namespace MergeXML
{
    class Program
    {
        static List<History> histories = new List<History>();
        static void Main(string[] args)
        {
            var files = Directory.EnumerateFiles(".", "*history.xml", SearchOption.AllDirectories);
            foreach (string file in files)
            {
                histories.Add(History.Load(file));
            }
            // Create a singular out history
            History result = new History();
            foreach (var history in histories)
            {
                result.Posts.AddRange(history.Posts.Except(result.Posts, new PostComparer()));
            }
            History.Save(@"history_out.xml", result);
        }
    }

    [XmlRoot(ElementName = "history", DataType = "string", IsNullable = true)]
    public class History
    {
        [XmlElement(ElementName = "post")]
        public List<Post> Posts { get; set; } = new List<Post>();
        public static History Load(string file)
        {
            var deserializer = new XmlSerializer(typeof(History));
            using (TextReader reader = new StreamReader(file))
            {
                return deserializer.Deserialize(reader) as History;
            }
        }

        internal static void Save(string file, History source)
        {
            var serializer = new XmlSerializer(typeof(History));
            using (TextWriter writer = new StreamWriter(file))
            {
                serializer.Serialize(writer, source);
            }
        }
    }

    public class Post
    {
        [XmlElement(ElementName = "name")]
        public string Name { get; set; }
        [XmlElement(ElementName = "url")]
        public string Url { get; set; }
        [XmlElement(ElementName = "id")]
        public int Id { get; set; }
        [XmlElement(ElementName = "number")]
        public int Number { get; set; }
        [XmlElement(ElementName = "imageCount")]
        public int ImageCount { get; set; }
        [XmlElement(ElementName = "downloadedImagesCount")]
        public int DownloadedImagesCount { get; set; }
        [XmlElement(ElementName = "finished")]
        public string Finished { get; set; }
    }

    class PostComparer : IEqualityComparer<Post>
    {
        public bool Equals(Post x, Post y)
        {
            if (ReferenceEquals(x, y))
            {
                return true;
            }

            if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
            {
                return false;
            }

            return Equals(x.DownloadedImagesCount, y.DownloadedImagesCount) &&
                Equals(x.Finished, y.Finished) &&
                Equals(x.Id, y.Id) &&
                Equals(x.ImageCount, y.ImageCount) &&
                Equals(x.Name, y.Name) &&
                Equals(x.Number, y.Number) &&
                Equals(x.Url, y.Url);
        }

        public int GetHashCode(Post obj)
        {
            if (ReferenceEquals(obj, null))
            {
                return 0;
            }
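            // Null-safe: a string property is null when its element is missing from the XML.
            return obj.DownloadedImagesCount.GetHashCode() ^ (obj.Finished?.GetHashCode() ?? 0) ^
                obj.Id.GetHashCode() ^ obj.ImageCount.GetHashCode() ^ (obj.Name?.GetHashCode() ?? 0) ^
                obj.Number.GetHashCode() ^ (obj.Url?.GetHashCode() ?? 0);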
        }
    }
}


With the same files above and one additional one:

3history.xml -
<history>
  <post>
    <name>My Collection 4</name>
    <url>https://example.com/hread.php?p=54321</url>
    <id>54321</id>
    <number>4</number>
    <imageCount>135</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
  <post>
    <name>My Collection 5</name>
    <url>https://example.com/hread.php?p=23456</url>
    <id>23456</id>
    <number>5</number>
    <imageCount>136</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
</history>


Produces the following output -
<?xml version="1.0" encoding="utf-8"?>
<history xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <post>
    <name>My Collection </name>
    <url>https://example.com/hread.php?p=12345</url>
    <id>12345</id>
    <number>1</number>
    <imageCount>132</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
  <post>
    <name>My Collection 2</name>
    <url>https://example.com/hread.php?p=67890</url>
    <id>67890</id>
    <number>2</number>
    <imageCount>133</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
  <post>
    <name>My Collection 3</name>
    <url>https://example.com/hread.php?p=9876</url>
    <id>9876</id>
    <number>3</number>
    <imageCount>134</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
  <post>
    <name>My Collection 4</name>
    <url>https://example.com/hread.php?p=54321</url>
    <id>54321</id>
    <number>4</number>
    <imageCount>135</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
  <post>
    <name>My Collection 5</name>
    <url>https://example.com/hread.php?p=23456</url>
    <id>23456</id>
    <number>5</number>
    <imageCount>136</imageCount>
    <downloadedImagesCount>0</downloadedImagesCount>
    <finished>true</finished>
  </post>
</history>
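
The comparer above treats two posts as duplicates only when every field matches. If instead the id alone should decide, with the newest file taking precedence as the original question suggested, one possible sketch is a dictionary keyed by id that is filled oldest batch first, so later entries overwrite earlier ones. PostMerger and MergeById are hypothetical names invented for this example, and Post is trimmed to two properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Post
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PostMerger
{
    // Hypothetical helper: the batches must be supplied oldest first.
    public static List<Post> MergeById(IEnumerable<IEnumerable<Post>> batchesOldestFirst)
    {
        var byId = new Dictionary<int, Post>();
        foreach (var batch in batchesOldestFirst)
        {
            foreach (var post in batch)
            {
                byId[post.Id] = post;   // a newer batch overwrites an older entry with the same id
            }
        }
        return byId.Values.OrderBy(p => p.Id).ToList();
    }
}

class Program
{
    static void Main(string[] args)
    {
        var older = new[] { new Post { Id = 1, Name = "old 1" }, new Post { Id = 2, Name = "old 2" } };
        var newer = new[] { new Post { Id = 2, Name = "new 2" }, new Post { Id = 3, Name = "new 3" } };

        foreach (var post in PostMerger.MergeById(new[] { older, newer }))
        {
            Console.WriteLine(post.Id + ": " + post.Name);
        }
        // prints:
        // 1: old 1
        // 2: new 2
        // 3: new 3
    }
}
```

With real files, "newest" still has to be defined; ordering the file list by File.GetLastWriteTimeUtc before loading would be one way to do it, assuming the file timestamps can be trusted.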


-saige-