
Deep copying or serialization

OK, here is my problem and what I have done so far. I need to parse input data and create a fairly complex object hierarchy from it. Once the hierarchy is created, a number of properties of these objects will be changed; the objects then need to be saved as a "template" and later reloaded and used as a starting base for new data. The data I am working with is dynamic, and there is no way to know its structure beforehand; all parsing and object creation is done by following a rule set. So basically it works like this:

1.) Sample data is parsed, producing an object hierarchy that represents the sample data's structure, but does not actually contain any values.

2.) Properties are then manually set by the user. Now in essence we have a template customized by the user.

3.) Currently I am saving the entire object hierarchy by serializing the highest object in the chain, and writing the data to a file.

So far I have accomplished all this and it is working perfectly. My problem is that many copies of these templates will have to be created as a starting base, filled with actual data, and processed further. Performance is of the utmost importance, since tens of thousands of these templates will have to be created in a single run, and this cycle will repeat many times during the day and night. The way I see it I have three options, but I am completely open to suggestions:

Option 1> I can load these templates from a file every time I need to create a new copy, but I am relatively sure this would be the worst option for performance.

Option 2> I can load the base template and add deep-copying ability to each of the objects, but since there are complex relationships between them, and the depth of the chain is variable, I am afraid that this might not be the most efficient way either.

Option 3> I can load the base template, store it in a memory stream, then deserialize it from memory each time a new object is created from this template. I am not sure if this option would even work, as I have not yet tested it. I just finished everything else.
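As a rough illustration of Option 3, here is a minimal sketch in Python (not the platform used in this thread). It assumes the template can be represented by nested dicts, uses `pickle` as the serializer, and uses `io.BytesIO` as the stand-in for a memory stream. The names `template`, `shared_rule`, and `new_copy` are hypothetical.

```python
import io
import pickle

# Hypothetical template: nested dicts stand in for the object hierarchy.
shared_rule = {"rule": "trim"}
template = {
    "name": "root",
    "value": None,
    "children": [
        {"name": "a", "value": None, "rule": shared_rule, "children": []},
        {"name": "b", "value": None, "rule": shared_rule, "children": []},
    ],
}

# Serialize once into an in-memory stream and keep the resulting bytes.
stream = io.BytesIO()
pickle.dump(template, stream, protocol=pickle.HIGHEST_PROTOCOL)
payload = stream.getvalue()


def new_copy():
    """Rebuild an independent hierarchy from the in-memory bytes."""
    return pickle.loads(payload)


first = new_copy()
second = new_copy()
first["children"][0]["value"] = 42  # only this copy changes
```

Each call to `new_copy()` yields a separate hierarchy, and objects that were shared inside the template remain shared inside each copy. Whether this is fast enough has to be measured on the real object model.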

Please keep in mind that performance is of paramount importance. Any thoughts, ideas, or suggestions would be greatly appreciated. Thanks.
exptech
Asked:
Meir Rivkin, Full stack Software Engineer, Commented:
Can you reflect your object hierarchy in an XML file?
exptech (Author) Commented:
Sedgwick, I don't see any reason that I could not. Though I think the question would still be: is it faster to create all of the objects in a complex object hierarchy each time they are needed, or is it faster to serialize a complete hierarchy to memory as the object source and deserialize it to create new objects? I hope I am explaining my thoughts correctly.
MogalManic Commented:
Hand-written code to clone an object is almost guaranteed to be faster than serialization/deserialization. What you have to figure out is whether it is worth spending 10-200 hours developing a cloning method, or whether to write a method that serializes an object to a MemoryStream and then reads the memory stream back into a new cloned object.
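For comparison, a hand-written deep clone might look roughly like the sketch below. It is written in Python for illustration only. The `Node` class, its fields, and the `memo` dictionary are assumptions, since the real classes are not shown in this thread. The memo map is one common way to cope with parent links and other cross-references so that the recursion terminates.

```python
class Node:
    """Hypothetical hierarchy element; the real classes are not shown here."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def clone(self, memo=None):
        """Deep-copy this node and everything reachable from it."""
        if memo is None:
            memo = {}
        existing = memo.get(id(self))
        if existing is not None:
            return existing              # already cloned: reuse it
        duplicate = Node(self.name)
        memo[id(self)] = duplicate       # register before recursing
        duplicate.value = self.value
        duplicate.parent = self.parent.clone(memo) if self.parent else None
        duplicate.children = [child.clone(memo) for child in self.children]
        return duplicate


root = Node("root")
a = root.add(Node("a"))
a.add(Node("leaf"))

copy_root = root.clone()
copy_root.children[0].value = 1          # the original node "a" is unaffected
```

Because the clone is registered in `memo` before its parent and children are visited, a child that points back to its parent receives the already-created copy instead of triggering another clone.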

Another consideration is that serialization/deserialization will consume either disk I/O or memory. If you use memory, the footprint can exceed three times the size of the original object graph, because three things are held at once:
   1. the original object
   2. the serialized copy in the stream (its size depends on the serialization format)
   3. the cloned object

I would time how long serialization/deserialization to a memory stream takes (and also note how large the memory stream gets). If the performance or memory usage is unacceptable, then write a deep-clone method.
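A measurement along these lines could be sketched as follows. This is Python for illustration, with a hypothetical nested-dict template of about 1,100 nodes; `build_template`, `clone`, and `time_it` are made-up helper names. The numbers will differ by language and serializer, so the same comparison should be repeated on the actual platform and classes.

```python
import pickle
import time


def build_template(width=10, depth=3):
    """Build a hypothetical nested-dict hierarchy (1,111 nodes by default)."""
    node = {"value": None, "children": []}
    if depth > 0:
        node["children"] = [build_template(width, depth - 1) for _ in range(width)]
    return node


def clone(node):
    """Hand-written deep copy for this tree shape (assumes no shared references)."""
    return {"value": node["value"],
            "children": [clone(child) for child in node["children"]]}


def time_it(make_copy, repeats=200):
    """Return the average number of seconds per copy over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        make_copy()
    return (time.perf_counter() - start) / repeats


template = build_template()
payload = pickle.dumps(template, protocol=pickle.HIGHEST_PROTOCOL)

results = {
    "serialized_bytes": len(payload),   # how large the in-memory copy gets
    "deserialize_s": time_it(lambda: pickle.loads(payload)),
    "hand_clone_s": time_it(lambda: clone(template)),
}
for key, value in results.items():
    print(key, value)
```

The script prints the serialized size alongside the average time per copy for each approach, which covers both the timing and the memory-stream size mentioned above.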
exptech (Author) Commented:
MogalManic,

I will run some tests, but I am leaning toward spending the extra time and writing the deep clone methods. The module's performance definitely takes precedence over the development time and effort.

 
exptech (Author) Commented:
MogalManic,

Just a follow-up: after writing and tweaking my code, the system is able to create approximately 1.5 million objects per second. Since the hierarchy contains about 1,000 objects, that works out to roughly 1,500 complete hierarchies per second.