SStory

Better understanding on C++ Class serialization and formats

Considering the former question, 28969745, I wanted to ask some new and further questions.

When making my classes for serialization, would I want to offer various means of serialization?
Example: I have a VB.NET application that saves data from an employee class as lines in a CSV file.
The | char is used as a delimiter and it works fine for this application. I am trying to redo this application in C++.
In C++, if I did the same and allowed serialization with | between fields, it would be fine for the current application. But if I later wanted to reuse one of my C++ classes, I could foresee not wanting that restriction and perhaps wanting to serialize via XML, binary, or some other format.

I would like to know:
  1. Should I offer various mechanisms on each class for serialization?
  2. Or does a class just spit out its members to the serialization object or method and let it worry about separation?
  3. Could anyone provide brief sample code showing this completely in practice?

I have gotten great answers on the former question, but not wanting to beat that one to death with the person answering, I decided to ask for further details in an additional question.
sarabande

Should I offer various mechanisms on each class for serialization

no. you may even consider separating serialization completely from your classes. that would allow you to have different streaming classes, each of which provides built-in operators for the basic member types, for multiple purposes (SQL, CSV, XML, sockets, ...)

based on these you can easily add the higher-level operators to serialize your classes.

#include <string>
#include <vector>

class Person
{
    std::string fn;
    std::string ln;
    Date bd;
    ...
};

class CSVStream
{
    std::string buf;

public:
    // each value is appended to the buffer, followed by the '|' delimiter
    // (member operators are declared without the std:: qualifier)
    CSVStream & operator<<(const std::string &s) { buf += s; buf += '|'; return *this; }
    CSVStream & operator<<(const char & c) { buf += c; buf += '|'; return *this; }
    ...
    // reads one char, then drops it and its delimiter from the buffer
    CSVStream & operator>>(char & c) { c = buf[0]; buf = buf.substr(2); return *this; }
};

class SerialBuffer
{
    // raw byte buffer; std::array would need a fixed size, so a vector is used
    std::vector<unsigned char> buf;
    int opos;
    int ipos;

public:
    SerialBuffer& operator<<(const std::string &s) { ... }
    SerialBuffer& operator<<(const char & c) { ... }
    ...
    SerialBuffer& operator>>(char & c) { ... }
};


with that you could easily have both methods by adding specific streaming operators:

Does a class just spit out its members to the serialization object or method and let it worry about separation

yes, do like

CSVStream & operator<<(CSVStream & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}

SerialBuffer & operator<<(SerialBuffer & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}
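
The pieces above can be put together into a small compilable sketch. It is only an illustration under some assumptions: `Date` is replaced by a plain string, the getters `GetFn`, `GetLn` and `GetBD` are written out, and the `str()` accessor, the string-reading `operator>>` and the helper `personToCsv` are hypothetical additions that do not appear in the code above.

```cpp
#include <string>
#include <utility>

// Stand-in for the Person class above; the birth date is simplified to a
// string (an assumption) so the sketch stays self-contained.
class Person {
    std::string fn, ln, bd;
public:
    Person(std::string first, std::string last, std::string birth)
        : fn(std::move(first)), ln(std::move(last)), bd(std::move(birth)) {}
    const std::string& GetFn() const { return fn; }
    const std::string& GetLn() const { return ln; }
    const std::string& GetBD() const { return bd; }
};

// Knows only the format: every value is followed by '|'.
class CSVStream {
    std::string buf;
public:
    CSVStream& operator<<(const std::string& s) {
        buf += s;
        buf += '|';
        return *this;
    }
    // Reads the next field up to the delimiter and removes it from the buffer.
    CSVStream& operator>>(std::string& s) {
        const std::string::size_type pos = buf.find('|');
        s = buf.substr(0, pos);
        buf.erase(0, pos == std::string::npos ? pos : pos + 1);
        return *this;
    }
    const std::string& str() const { return buf; }  // hypothetical accessor
};

// Free function: a member of neither Person nor CSVStream.
CSVStream& operator<<(CSVStream& s, const Person& p) {
    return s << p.GetFn() << p.GetLn() << p.GetBD();
}

// Hypothetical helper showing how the pieces are used together.
std::string personToCsv(const Person& p) {
    CSVStream out;
    out << p;
    return out.str();
}
```

With this layout, supporting another format means writing one more stream class plus one more free `operator<<`, while `Person` itself stays unchanged.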


Sara
SStory

ASKER

Thanks for the response! Ok. Would this code be inside of an Application or main class for the application that manages saving and loading? Or are these methods inside the Person class?

CSVStream & operator<<(CSVStream & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}

SerialBuffer & operator<<(SerialBuffer & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}


If these methods must be defined inside of each class then that seems very limiting as each time I want more capability I'd need to go add it to every class I have that I want to stream. On the other hand I guess only the Person object knows how to serialize itself.  It just seems there would be a generic way to serialize to whatever serializer is presented to the Person class.

Would the answer be to make Serializer a base class or abstract class, give Person methods that take such an object, and then derive specific classes like XMLSerializer from Serializer? Through polymorphism, Person would serialize to XML when passed an XMLSerializer and to CSV when passed a CSVSerializer. Is that a better way? Am I overcomplicating it?
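
The abstract-base-class idea described in this post could look roughly like the sketch below. Every name in it (`Serializer`, `write`, `CSVSerializer`, `LengthPrefixedSerializer`, `serialize`) is hypothetical, and the second format is a simple length-prefixed encoding rather than XML so the example stays short.

```cpp
#include <string>
#include <utility>

// Hypothetical abstract interface: the class being serialized sees only this.
class Serializer {
public:
    virtual ~Serializer() = default;
    virtual void write(const std::string& value) = 0;
};

// Writes each value followed by '|'.
class CSVSerializer : public Serializer {
    std::string out;
public:
    void write(const std::string& value) override {
        out += value;
        out += '|';
    }
    const std::string& str() const { return out; }
};

// Writes each value as "<length>:<bytes>", a simple binary-style framing.
class LengthPrefixedSerializer : public Serializer {
    std::string out;
public:
    void write(const std::string& value) override {
        out += std::to_string(value.size());
        out += ':';
        out += value;
    }
    const std::string& str() const { return out; }
};

class Person {
    std::string fn, ln;
public:
    Person(std::string first, std::string last)
        : fn(std::move(first)), ln(std::move(last)) {}
    // Person hands over its members; the serializer decides on the format.
    void serialize(Serializer& s) const {
        s.write(fn);
        s.write(ln);
    }
};
```

`Person::serialize` is written once and works with any class derived from `Serializer`. One limitation of this flat interface is that `write` receives only values, not field names, which matters for formats such as XML.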
ASKER CERTIFIED SOLUTION
sarabande

This solution is only available to members.
SStory

ASKER

Sara,
since the implementation is a one-liner it is not so much limiting. i don't think that you would need more than 2 serializer methods for a class (in General).
It really wasn't the one-liner that bothered me. I just thought that if I had numerous classes that I wanted to use in one project and serialize with one mechanism, but later wanted to reuse them in another project using XML or something I can't even conceive of now, I'd have to modify every single class to add the new capability. I wondered if that is what people generally do, or if there was a better way.  The template suggestion is perhaps closer to what I had in mind.  Would that be better than just making an abstract base class, Serializer, and inheriting all serializer classes from it, so I can pass in any Serializer descendant and call one of its virtual methods...maybe serialize()?  Would that be more readable? Which is the best practice?

Thanks again for all of your answers.
SOLUTION
This solution is only available to members.
SStory

ASKER

I like the template idea.  I am wondering what would actually take place in the SerializerObject? It looks like since I would have to use templates and put these lines
 template<class Serializer>
     Serializer & operator<<(Serializer & strm) 

in each class anyway, that I wouldn't be gaining anything from an ABC or inheritance?  Am I right about that? If so it would seem that templates would be the better way to do this. If I am wrong, what would go in SerializerObject to justify its existence?
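
A minimal sketch of the template variant discussed here, under some assumptions: a named member function `writeTo` is used instead of a member `operator<<`, and `PipeStream` and `CommaStream` are hypothetical stream types that share no base class.

```cpp
#include <string>
#include <utility>

// Two unrelated stream types; neither derives from a common base.
struct PipeStream {
    std::string buf;
    PipeStream& operator<<(const std::string& s) {
        buf += s;
        buf += '|';
        return *this;
    }
};

struct CommaStream {
    std::string buf;
    CommaStream& operator<<(const std::string& s) {
        buf += s;
        buf += ',';
        return *this;
    }
};

class Employee {
    std::string fn, ln;
public:
    Employee(std::string first, std::string last)
        : fn(std::move(first)), ln(std::move(last)) {}

    // Compiles for any Stream that provides operator<< for std::string.
    // The return type is the concrete stream type, so no cast is needed.
    template <class Stream>
    Stream& writeTo(Stream& strm) const {
        strm << fn << ln;
        return strm;
    }
};
```

Because `writeTo` is a template, the only requirement on the stream is that `strm << std::string` compiles. The check happens at compile time for each stream type used, and no virtual functions are involved.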

Or should the SerializationStream be an ABC, such that Employee or whatever takes an object of type SerializationStream that has virtual methods for << and >> that must be overridden?  Or am I thinking wrong?
Example:

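class Employee {
   // 'friend' makes this a non-member operator defined inside the class;
   // a member operator<< could take only one parameter
   friend SerializationStream & operator<<(SerializationStream & s, const Employee & e) {
          s << e.GetFn() << e.GetLn() << e.GetBD();
          return s;
   }
};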


Wouldn't this negate the need for a template and be better? That way the object doesn't care what type of serialization stream is used for serialize and deserialize.  It could be CSVSerializationStream, XMLSerializationStream, Other...  If I am missing something obvious as a C++ newbie that will bite me later on for some reason, please tell me.

Of course this begs the questions:
How would XMLSerializationStream know what to do? I mean if I passed it  
s << e.GetFn() << e.GetLn() << e.GetBD();

How would it know to create:
<employee first="John" last="Doe" bday="1/1/1970"></employee> or

<employee>
    <firstname>John</firstname>
    <lastname>Doe</lastname>
    <birthday>1/1/1970</birthday>
</employee>

I can't see how the internal XMLSerialization class could automagically know that.  So as I think through how to go about designing this thing, I continue to ponder such thoughts.
And if XMLSerializer requires special methods like void Element(std::string name, std::string value), then nothing would be solved by trying to use an ABC, as a lot of code for each type of serializer would need to be written in each class.
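
One way around the problem raised here is to let the class pass a name along with each value, so that an XML writer can build tags and a CSV writer can ignore the names. The sketch below assumes a hypothetical interface (`FieldWriter` with `begin`, `field` and `end`) and is not taken from the answers in this thread. It produces the nested-element form shown above and does not escape special XML characters.

```cpp
#include <string>
#include <utility>

// Hypothetical interface: each call carries the field name and the value.
class FieldWriter {
public:
    virtual ~FieldWriter() = default;
    virtual void begin(const std::string& typeName) = 0;
    virtual void field(const std::string& name, const std::string& value) = 0;
    virtual void end(const std::string& typeName) = 0;
};

// Uses the names to build elements (no escaping of <, > or &).
class XmlWriter : public FieldWriter {
    std::string out;
public:
    void begin(const std::string& typeName) override { out += "<" + typeName + ">"; }
    void field(const std::string& name, const std::string& value) override {
        out += "<" + name + ">" + value + "</" + name + ">";
    }
    void end(const std::string& typeName) override { out += "</" + typeName + ">"; }
    const std::string& str() const { return out; }
};

// Ignores the names and writes '|'-delimited values, one record per line.
class CsvWriter : public FieldWriter {
    std::string out;
public:
    void begin(const std::string&) override {}
    void field(const std::string&, const std::string& value) override {
        out += value;
        out += '|';
    }
    void end(const std::string&) override { out += '\n'; }
    const std::string& str() const { return out; }
};

class Employee {
    std::string fn, ln, bd;
public:
    Employee(std::string first, std::string last, std::string birth)
        : fn(std::move(first)), ln(std::move(last)), bd(std::move(birth)) {}
    // Written once; the writer decides what to do with the names.
    void serialize(FieldWriter& w) const {
        w.begin("employee");
        w.field("firstname", fn);
        w.field("lastname", ln);
        w.field("birthday", bd);
        w.end("employee");
    }
};
```

The class still lists its own fields, but only once. Whether a field becomes an element, an attribute or a delimited column is then a decision of the writer, not of `Employee`.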
sarabande

the advantage of the template functions is that the returned type is already the correct type, while with a baseclass you need a cast in the end.

as already said, i normally need only one kind of serializer for most of my classes, which writes the objects to a database or reads them from it. in a client-server application a few objects would need a different serializer to pass objects peer-to-peer, but those objects are rarely stored in the database as well.

nevertheless both work. you have to try which is more suitable for you. what you should avoid is having virtual functions that also use baseclass pointers or baseclass references. with that you get a mess of functions and casts that can hardly be maintained.

Sara
SStory

ASKER

Sorry for sounding so dense as I think through this with your help.

I was trying to have a way to pass in anything and have the class just serialize to it, but taking XML for example, it appears that only the class would know how to serialize itself to XML.  It would need to know to wrap the tags around elements, start with the main class element, and so on, wouldn't it?

After this answer I will award points to one of your solutions. I am asking a new related question strictly about XML serialization, should you want to participate:
28973527
SOLUTION
This solution is only available to members.