Solved

Better understanding on C++ Class serialization and formats

Posted on 2016-09-19
Last Modified: 2016-09-30
Considering the former question, 28969745, I wanted to ask some new and further questions.

When making my classes for serialization, would I want to offer various means of serialization?
Example: I have a vb.net application that saves data from an employee class as lines in a CSV file.
The | char is used as a delimiter and it works fine for this application. I am trying to redo this application in C++.
In C++ if I did the same, allowing serialization with | between fields, it would be fine for the current application, but if later I wanted to reuse one of my C++ classes I could foresee not wanting that restriction and perhaps wanting to serialize via XML or binary or some other format.

I would like to know:
  1. Should I offer various mechanisms on each class for serialization
  2. or Does a class just spit out its members to the serialization object or method and let it worry about separation
  3. Could anyone provide a brief sample code of this completely in practice

I have gotten great answers on the former question, but not wanting to beat that one to death with the person answering, I decided to ask for further details in an additional question.
Question by:SStory
9 Comments
 
LVL 32

Expert Comment

by:sarabande
Should I offer various mechanisms on each class for serialization

no. you may even consider separating serialization completely from your classes. that would allow you to have different streaming classes, each of which provides built-in operators for the basic member types, for multiple purposes (SQL, CSV, XML, sockets, ...)

based on these you easily can add the higher-level operators to serialize your classes.

class Person
{
    std::string fn;
    std::string ln;
    Date bd;
    ...
};

class CSVStream
{
    std::string buf;

public:
    // each field is written followed by the '|' delimiter
    CSVStream & operator<<(const std::string & s) { buf += s; buf += '|'; return *this; }
    CSVStream & operator<<(char c) { buf += c; buf += '|'; return *this; }
    ...
    // reads one char field and drops it together with its delimiter
    CSVStream & operator>>(char & c) { c = buf.at(0); buf.erase(0, 2); return *this; }
};

class SerialBuffer
{
    std::vector<unsigned char> buf;   // raw bytes (std::array would need a fixed size)
    int opos;
    int ipos;

public:
    SerialBuffer & operator<<(const std::string & s) { ... }
    SerialBuffer & operator<<(char c) { ... }
    ...
    SerialBuffer & operator>>(char & c) { ... }
};




with that you easily could have both methods  by adding specific streaming operators:

Does a class just spit out its members to the serialization object or method and let it worry about separation

yes, do like

CSVStream & operator<<(CSVStream & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}

SerialBuffer & operator<<(SerialBuffer & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}

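For reference, the two snippets above can be put together into something that compiles. This is only a minimal sketch under assumptions: the `Date` struct, the getters, the `int` overloads, the length-prefixed binary layout and the `demo...` helper functions are made up for illustration and are not part of the original answer. It also uses one function template per class instead of one overload per stream, which is just a shorter way to write the same two operators.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the classes used in the answer.
struct Date { int year, month, day; };

class Person {
    std::string fn_, ln_;
    Date bd_;
public:
    Person(std::string fn, std::string ln, Date bd)
        : fn_(std::move(fn)), ln_(std::move(ln)), bd_(bd) {}
    const std::string& GetFn() const { return fn_; }
    const std::string& GetLn() const { return ln_; }
    const Date& GetBD() const { return bd_; }
};

// Text stream: every field is followed by '|'.
class CSVStream {
    std::string buf_;
public:
    CSVStream& operator<<(const std::string& s) { buf_ += s; buf_ += '|'; return *this; }
    CSVStream& operator<<(int n) { return *this << std::to_string(n); }
    const std::string& str() const { return buf_; }
};

// Binary stream (assumed layout): ints as 4 little-endian bytes,
// strings as a 4-byte length followed by the characters.
class SerialBuffer {
    std::vector<std::uint8_t> buf_;
public:
    SerialBuffer& operator<<(int n) {
        const std::uint32_t u = static_cast<std::uint32_t>(n);
        for (int i = 0; i < 4; ++i)
            buf_.push_back(static_cast<std::uint8_t>(u >> (8 * i)));
        return *this;
    }
    SerialBuffer& operator<<(const std::string& s) {
        *this << static_cast<int>(s.size());
        buf_.insert(buf_.end(), s.begin(), s.end());
        return *this;
    }
    std::size_t size() const { return buf_.size(); }
};

// The higher-level operators only list the members; the format lives in the stream.
template <class Stream>
Stream& operator<<(Stream& s, const Date& d) { return s << d.year << d.month << d.day; }

template <class Stream>
Stream& operator<<(Stream& s, const Person& p) {
    return s << p.GetFn() << p.GetLn() << p.GetBD();
}

// Usage helpers (hypothetical).
std::string demoCsv() {
    CSVStream csv;
    csv << Person("Ada", "Lovelace", Date{1815, 12, 10});
    return csv.str();
}

std::size_t demoBinarySize() {
    SerialBuffer bin;
    bin << Person("Ada", "Lovelace", Date{1815, 12, 10});
    return bin.size();
}
```

Neither `Person` nor `Date` knows anything about delimiters or byte layouts here, which is the point of keeping the serialization outside the classes.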


Sara
 
LVL 25

Author Comment

by:SStory
Thanks for the response! Ok. Would this code be inside of an Application or main class for the application that manages saving and loading? Or are these methods inside the Person class?

CSVStream & operator<<(CSVStream & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}

SerialBuffer & operator<<(SerialBuffer & s, const Person & p)
{
       s << p.GetFn() << p.GetLn() << p.GetBD(); 
       return s;
}



If these methods must be defined inside of each class then that seems very limiting as each time I want more capability I'd need to go add it to every class I have that I want to stream. On the other hand I guess only the Person object knows how to serialize itself.  It just seems there would be a generic way to serialize to whatever serializer is presented to the Person class.

Would that be to make a Serializer class as a base class or abstract class and have person with methods that take such an object and then make specific classes like XMLSerializer that inherits Serializer and pass that in and through polymorphism it would serialize to XML, and when CSVSerializer it would serialize to CSV?  Is that a better way? Am I over complicating it?
 
LVL 32

Accepted Solution

by:
sarabande earned 500 total points
If these methods must be defined inside of each class then that seems very limiting
since the implementation is a one-liner it is not very limiting. i don't think that you would need more than 2 serializer methods for a class (in general).

nevertheless you could make the serialize and unserialize member functions template functions and pass the serializer class as the template type:

class Person
{
    ...
    template <class SO>
    SO & Serialize(SO & s)
    {
         s << firstname << lastname;
         address.Serialize(s);   // member object serializes itself
         s << birthday;
         return s;
    }
    ...
};



note, that i used member address which must have its own serialize function if you omit the public streaming operators.

with the above, you could do

Person p(....);
CSVStream csv;
p.Serialize<CSVStream>(csv);

SerialBuffer buf;
p.Serialize<SerialBuffer>(buf);



it saves some code but might be less readable.
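a compilable version of the template idea might look like the following. it is only a sketch: the two stream classes, the Address class, the underscore-suffixed member names and the demo functions are assumptions made up for this example. note that the template argument can be deduced, so `p.Serialize(csv)` works without spelling out `Serialize<CSVStream>`.

```cpp
#include <string>
#include <utility>

// Hypothetical serializer: fields separated by '|'.
class CSVStream {
    std::string buf_;
public:
    CSVStream& operator<<(const std::string& s) { buf_ += s; buf_ += '|'; return *this; }
    const std::string& str() const { return buf_; }
};

// Second hypothetical serializer: one field per line.
class LineStream {
    std::string buf_;
public:
    LineStream& operator<<(const std::string& s) { buf_ += s; buf_ += '\n'; return *this; }
    const std::string& str() const { return buf_; }
};

class Address {
    std::string street_, city_;
public:
    Address(std::string street, std::string city)
        : street_(std::move(street)), city_(std::move(city)) {}

    template <class SO>
    SO& Serialize(SO& s) const { s << street_ << city_; return s; }
};

class Person {
    std::string firstname_, lastname_;
    Address address_;
public:
    Person(std::string fn, std::string ln, Address a)
        : firstname_(std::move(fn)), lastname_(std::move(ln)), address_(std::move(a)) {}

    // Works with any type SO that offers operator<< for the member types.
    template <class SO>
    SO& Serialize(SO& s) const {
        s << firstname_ << lastname_;
        address_.Serialize(s);   // nested member serializes itself
        return s;
    }
};

std::string demoTemplateCsv() {
    Person p("John", "Doe", Address("1 Main St", "Springfield"));
    CSVStream csv;
    p.Serialize(csv);            // SO deduced as CSVStream
    return csv.str();
}

std::string demoTemplateLines() {
    Person p("John", "Doe", Address("1 Main St", "Springfield"));
    LineStream lines;
    p.Serialize(lines);          // same member list, different format
    return lines.str();
}
```

adding a third format means writing one new stream class; Person and Address stay untouched as long as the new class supports the same operators.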

then make specific classes like XMLSerializer that inherits Serializer and pass that in and through polymorphism it would serialize

yes, you may derive all serializer classes from the same base class and define appropriate virtual functions that do the streaming. std::basic_ostream is an example, as it is both a base class and a class template. but surely, such abstractions need much more effort, and perhaps you should use Boost.Serialization instead of reinventing that big piece of a wheel.
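to make the base class idea concrete, here is a small sketch of what such a hierarchy could look like. all names in it (Serializer, CsvSerializer, TaggedSerializer, write, the demo functions) are invented for the example, and the "tagged" format is just a placeholder for a second output format.

```cpp
#include <string>
#include <utility>

// Hypothetical abstract base: one virtual function per basic type.
class Serializer {
public:
    virtual ~Serializer() = default;
    virtual void write(const std::string& s) = 0;
    virtual void write(int n) = 0;
};

// Fields separated by '|'.
class CsvSerializer : public Serializer {
    std::string buf_;
public:
    void write(const std::string& s) override { buf_ += s; buf_ += '|'; }
    void write(int n) override { write(std::to_string(n)); }
    const std::string& str() const { return buf_; }
};

// Fields prefixed with a type tag, e.g. "s:John;" or "i:1970;".
class TaggedSerializer : public Serializer {
    std::string buf_;
public:
    void write(const std::string& s) override { buf_ += "s:" + s + ";"; }
    void write(int n) override { buf_ += "i:" + std::to_string(n) + ";"; }
    const std::string& str() const { return buf_; }
};

class Person {
    std::string fn_, ln_;
    int birthYear_;
public:
    Person(std::string fn, std::string ln, int birthYear)
        : fn_(std::move(fn)), ln_(std::move(ln)), birthYear_(birthYear) {}

    // Person only sees the abstract interface; the format is chosen by the caller.
    void Serialize(Serializer& s) const {
        s.write(fn_);
        s.write(ln_);
        s.write(birthYear_);
    }
};

std::string demoBaseCsv() {
    CsvSerializer s;
    Person("John", "Doe", 1970).Serialize(s);
    return s.str();
}

std::string demoBaseTagged() {
    TaggedSerializer s;
    Person("John", "Doe", 1970).Serialize(s);
    return s.str();
}
```

compared with the template version, Person::Serialize is an ordinary function here and can live in a .cpp file, at the price of a virtual call per field.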

a "smaller" approach is to derive all your classes from a baseclass, say class Serializable.

that baseclass would have pure virtual functions Load and Store. serialization is used to make memory objects persistent and to restore persistent objects to memory objects. together with the factory pattern you could even restore a serialized buffer to a different class. an example of this is a client-server application: the server would create server objects of class Person, say by SQL query. then it serializes the objects and sends the result set to the client. the client now creates new Contact objects and loads the data from the stream. the client class Contact is derived from class Person and therefore can use the Unserialize function of its base class.
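a rough sketch of that client-server scenario follows. it assumes a very simple wire format (a type tag line followed by '|'-terminated fields) and uses standard string streams in place of a real connection; the names Serializable, CreateForClient, DisplayName and demoRoundTrip are hypothetical, and the format breaks if a field itself contains '|'.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>

class Serializable {
public:
    virtual ~Serializable() = default;
    virtual void Store(std::ostream& out) const = 0;
    virtual void Load(std::istream& in) = 0;
};

class Person : public Serializable {
protected:
    std::string fn_, ln_;
public:
    Person() = default;
    Person(std::string fn, std::string ln) : fn_(std::move(fn)), ln_(std::move(ln)) {}

    void Store(std::ostream& out) const override { out << fn_ << '|' << ln_ << '|'; }
    void Load(std::istream& in) override {
        std::getline(in, fn_, '|');
        std::getline(in, ln_, '|');
    }
};

// Client-side class: inherits Load from Person unchanged.
class Contact : public Person {
public:
    std::string DisplayName() const { return ln_ + ", " + fn_; }
};

// Minimal factory: the client decides which class a type tag maps to.
std::unique_ptr<Serializable> CreateForClient(const std::string& tag) {
    if (tag == "person") return std::unique_ptr<Serializable>(new Contact);
    return nullptr;
}

std::string demoRoundTrip() {
    // "server": stores a Person, preceded by its type tag
    std::ostringstream wire;
    wire << "person\n";
    Person("John", "Doe").Store(wire);

    // "client": reads the tag, asks the factory for an object, loads it
    std::istringstream in(wire.str());
    std::string tag;
    std::getline(in, tag);
    std::unique_ptr<Serializable> obj = CreateForClient(tag);
    if (!obj) return "";
    obj->Load(in);

    const Contact* c = dynamic_cast<const Contact*>(obj.get());
    return c ? c->DisplayName() : "";
}
```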

Sara
 
LVL 25

Author Comment

by:SStory
Sara,
since the implementation is a one-liner it is not very limiting. i don't think that you would need more than 2 serializer methods for a class (in general).
It really wasn't the one-liner that bothered me. I just thought that if I had numerous classes that I wanted to use in one project and serialize with one mechanism, but later wanted to reuse them in another project using XML or something I can't even conceive of now, I'd have to modify every single class to add the new capability. I wondered if that is what people generally do, or if there was a better way.  The template suggestion is perhaps closer to what I had in mind.  Would that be better than just making an abstract base class, Serializer, and inheriting all serializer classes from it so I can pass in any Serializer descendant and call one of its virtual methods...maybe serialize()?  Would that be more readable? Which is the best practice?

Thanks again for all of your answers.

 
LVL 32

Assisted Solution

by:sarabande
sarabande earned 500 total points
The template suggestion is perhaps closer to what I had in mind.  Would that be better than just making an abstract base class

i don't think that you can easily decide between good and better.

if you ever have to find a bug in your code regarding deserializing some input stream, you will be happy if there is one source where you can see member by member what is expected and what is missing or goes wrong. it would be much worse if the input stream itself held all the meta data and you had to step deep into abstract code sequences without knowing whether the code is relevant for the issue or not.

templates allow you to use one piece of code for unrelated classes.

baseclasses allow you to use one piece of code for derived classes and to call virtually into derived classes. once you have reached a function of the most derived class, you can do as many special things as required without caring about abstraction.

you can combine the two by using templates for the serializing classes and class hierarchies for the object classes. in the object classes you should explicitly serialize and deserialize each class member in the class where the member was defined. i never would use abstract definitions for serializing that were defined somewhere else, since it would make your life very hard later.

example:

class Person : public SerializableObject
{
     std::string fn;
     std::string ln;
     Date bd;
public:
     template<class Serializer>
     Serializer & operator<<(Serializer & strm) 
     {      
            // call baseclass operator first
            SerializableObject::operator<<(strm);
            // then add own members
            strm << fn << ln << bd;
            return strm;
     }
      ....
};

class Employee : public Person
{
     std::string job;
     double salary;
     Date date;
public:
     template<class Serializer>
     Serializer & operator<<(Serializer & strm) 
     {   
            Person::operator<<(strm);
            strm << job << salary << date; return strm; 
      }
      ....
};



the operator functions could even be made virtual, which allows code like

XMLSerializer xml(filename);

std::vector<SerializableObject *> allMyObjects;
....

for (size_t n =  0; n < allMyObjects.size(); ++n)
{
       allMyObjects[n]->operator<<(xml);
}


(a member function template cannot be virtual in C++, though, so the baseclass would need one non-template virtual overload per serializer class to make the compiler happy).
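since a member function template cannot be declared virtual in C++, one way to get the virtual dispatch described here is to declare one ordinary virtual overload per serializer class and let each overload forward to a non-virtual template helper that lists the members once. the following is a sketch under that assumption; CsvOut, LineOut, Write, WriteMembers and demoVirtual are made-up names.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Two hypothetical serializer classes with the same operator<< interface.
class CsvOut {
    std::string buf_;
public:
    CsvOut& operator<<(const std::string& s) { buf_ += s; buf_ += '|'; return *this; }
    const std::string& str() const { return buf_; }
};

class LineOut {
    std::string buf_;
public:
    LineOut& operator<<(const std::string& s) { buf_ += s; buf_ += '\n'; return *this; }
    const std::string& str() const { return buf_; }
};

class SerializableObject {
public:
    virtual ~SerializableObject() = default;
    // One virtual overload per concrete serializer.
    virtual void Write(CsvOut& s) const = 0;
    virtual void Write(LineOut& s) const = 0;
};

class Person : public SerializableObject {
    std::string fn_, ln_;
protected:
    // The member list is written once, for any serializer type.
    template <class S>
    void WriteMembers(S& s) const { s << fn_ << ln_; }
public:
    Person(std::string fn, std::string ln) : fn_(std::move(fn)), ln_(std::move(ln)) {}
    void Write(CsvOut& s) const override { WriteMembers(s); }
    void Write(LineOut& s) const override { WriteMembers(s); }
};

class Employee : public Person {
    std::string job_;
    template <class S>
    void WriteAll(S& s) const {
        WriteMembers(s);   // base class members first
        s << job_;         // then own members
    }
public:
    Employee(std::string fn, std::string ln, std::string job)
        : Person(std::move(fn), std::move(ln)), job_(std::move(job)) {}
    void Write(CsvOut& s) const override { WriteAll(s); }
    void Write(LineOut& s) const override { WriteAll(s); }
};

std::string demoVirtual() {
    std::vector<std::unique_ptr<SerializableObject>> all;
    all.push_back(std::unique_ptr<SerializableObject>(new Person("John", "Doe")));
    all.push_back(std::unique_ptr<SerializableObject>(new Employee("Jane", "Roe", "Engineer")));

    CsvOut csv;
    for (const auto& obj : all) obj->Write(csv);   // dispatches to the dynamic type
    return csv.str();
}
```

the cost is that every new serializer class adds one virtual overload to the baseclass and a forwarding line to each derived class.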



Sara
 
LVL 25

Author Comment

by:SStory
I like the template idea.  I am wondering what would actually take place in the SerializableObject? It looks like since I would have to use templates and put these lines
 template<class Serializer>
     Serializer & operator<<(Serializer & strm) 


in each class anyway, I wouldn't be gaining anything from an ABC or inheritance.  Am I right about that? If so, it would seem that templates would be the better way to do this. If I am wrong, what would go in SerializableObject to justify its existence?

Or should the SerializationStream be an ABC such that Employee or whatever take an object of type SerializationStream that has virtual methods for << and >> that must be overloaded?  Or am I thinking wrong?
Example:

class Employee {
   // friend: a two-argument operator<< defined inside a class must be a non-member
   friend SerializationStream & operator<<(SerializationStream & s, const Employee & e) {
          s << e.GetFn() << e.GetLn() << e.GetBD();
          return s;
   }
};



Wouldn't this negate the need for a template and be better? That way the object doesn't care what type of serialization stream is used for serialize and deserialize.  It could be CSVSerializationStream, XMLSerializationStream, Other...  If I am missing something obvious as a C++ newbie that will bite me later on for some reason, please tell me.

Of course this begs the questions:
How would XMLSerializationStream know what to do? I mean if I passed it  
s << e.GetFn() << e.GetLn() << e.GetBD();


How would it know to create:
<employee first="John" last="Doe" bday="1/1/1970"></employee> or

<employee>
    <firstname>John</firstname>
    <lastname>Doe</lastname>
    <birthday>1/1/1970</birthday>
</employee>

I can't see how the internal XMLSerialization class could automagically know that.  So as I think through how to go about designing this thing I continue to ponder such thoughts.
And if XMLSerializer requires special methods like void Element(std::string name, std::string value) then nothing would be solved by trying to use an ABC, as a lot of code for each type of serializer would need to be written in each class.
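One possible way around this, sketched here purely as an illustration, is to put the field names into the abstract interface itself: the class always passes a name and a value, the XML writer uses the name as the element tag, and the CSV writer simply ignores it. All names below (FieldWriter, BeginObject, Field, EndObject, XmlWriter, CsvWriter, the demo functions) are hypothetical, and the XML writer does no escaping of special characters.

```cpp
#include <string>
#include <utility>

// Hypothetical abstract interface that carries field names along with values.
class FieldWriter {
public:
    virtual ~FieldWriter() = default;
    virtual void BeginObject(const std::string& name) = 0;
    virtual void Field(const std::string& name, const std::string& value) = 0;
    virtual void EndObject(const std::string& name) = 0;
};

// Uses the names as element tags (no escaping of <, >, & in this sketch).
class XmlWriter : public FieldWriter {
    std::string out_;
public:
    void BeginObject(const std::string& name) override { out_ += "<" + name + ">"; }
    void Field(const std::string& name, const std::string& value) override {
        out_ += "<" + name + ">" + value + "</" + name + ">";
    }
    void EndObject(const std::string& name) override { out_ += "</" + name + ">"; }
    const std::string& str() const { return out_; }
};

// Ignores the names: one record per line, fields separated by '|'.
class CsvWriter : public FieldWriter {
    std::string out_;
public:
    void BeginObject(const std::string&) override {}
    void Field(const std::string&, const std::string& value) override {
        out_ += value;
        out_ += '|';
    }
    void EndObject(const std::string&) override { out_ += '\n'; }
    const std::string& str() const { return out_; }
};

class Employee {
    std::string fn_, ln_, bd_;
public:
    Employee(std::string fn, std::string ln, std::string bd)
        : fn_(std::move(fn)), ln_(std::move(ln)), bd_(std::move(bd)) {}

    // Written once; works for every FieldWriter descendant.
    void Serialize(FieldWriter& w) const {
        w.BeginObject("employee");
        w.Field("firstname", fn_);
        w.Field("lastname", ln_);
        w.Field("birthday", bd_);
        w.EndObject("employee");
    }
};

std::string demoXml() {
    XmlWriter w;
    Employee("John", "Doe", "1/1/1970").Serialize(w);
    return w.str();
}

std::string demoNamedCsv() {
    CsvWriter w;
    Employee("John", "Doe", "1/1/1970").Serialize(w);
    return w.str();
}
```

Whether fields become child elements or attributes is then a decision made inside XmlWriter, not in Employee.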
 
LVL 32

Expert Comment

by:sarabande
the advantage of the template functions is that the returned type already is the correct type, while if using a baseclass you need a cast finally.

as already said, i normally need only one kind of serializer for most of my classes which writes the object to a database or reads them from database. in a Client Server application a few objects would need a different serializer to pass objects by p2p. but these objects rarely are stored in database additionally.

nevertheless both work. you have to try which is more suitable for you. what you should avoid is having virtual functions that also use baseclass pointers or baseclass references. with that you get a mess of functions and casts that can hardly be maintained.

Sara
 
LVL 25

Author Comment

by:SStory
Sorry for sounding so dense as I think through this with your help.

I was trying to have a way to pass in anything and have it just serialize to it, but taking XML for example, it appears that only the class would know how to serialize itself to XML.  It would need to know to wrap the tags around elements and start with the main class element and such, wouldn't it?

After this answer I will award points to one of your solutions. I am asking a new related question strictly about XML serialization, should you want to participate:
28973527
 
LVL 32

Assisted Solution

by:sarabande
sarabande earned 500 total points
It would need to know to wrap the tags around elements and start with the main class element and such, wouldn't it?
don't think so. for xml you would use an xml library, and your class could quite simply use a serializer that makes the calls to build the xml tree according to the document object model (DOM). note that an xml model is nothing but a tree model, which should already map well to the representation in your class. if that is not the case, there is something wrong: either the DOM doesn't really fit your class object model, or the class is not well defined.

Sara