Solved

C++ Q: suggestion on how to do an object database

Posted on 1998-08-13
Last Modified: 2010-04-01
Hi,

I'm in the process of making a database in which object instances store their data. This data can be of variable length and of undefined type!

The way I solved this (with a bit of help from an earlier Q here :) is to register the objects in the database and then let each object do whatever it wants in the database (like reading and writing). The advantage is that the database can store every kind of object (as long as it has the base database object as parent).

BUT

This is very error-prone. If one object f*cks up, the entire database might get corrupted.

Any suggestions here would be much appreciated,

 FlorizzzZz

[Compiler: doesn't matter, pure C++ Question]
[OS: look above]
Question by:TheMadManiac
 
LVL 22

Expert Comment

by:nietod
My approach is to use a two-tiered database. I use an indexed database where each record has the same format. The format describes the object and has other "fixed" data. This is hard to screw up and is, of course, managed by a single set of procedures. The actual object data is stored in a second file. The records in the first file contain offsets to the start of the data in the second file.
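As a rough illustration of the first tier, here is a minimal sketch in present-day C++. The struct name, the field names, and the field widths are all assumptions made for the example; the post does not give the actual layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical fixed-length record: every object gets one of these in the
// first file, whatever its real data looks like.
struct FixedRecord {
    char          type_name[16];  // class name, NUL-padded (assumed width)
    std::uint32_t version;        // format version of the object's data
    std::uint32_t data_offset;    // where the data starts in the second file
    std::uint32_t data_length;    // number of bytes stored there
};

// All records have the same size, so the position of record n can be
// computed instead of searched for.
inline std::uint64_t record_position(std::uint32_t record_number) {
    return static_cast<std::uint64_t>(record_number) * sizeof(FixedRecord);
}

// Builds a record; over-long type names are truncated to fit the field.
inline FixedRecord make_record(const std::string& type, std::uint32_t version,
                               std::uint32_t offset, std::uint32_t length) {
    FixedRecord r{};  // zero-filled, so type_name is always NUL-terminated
    std::strncpy(r.type_name, type.c_str(), sizeof(r.type_name) - 1);
    r.version = version;
    r.data_offset = offset;
    r.data_length = length;
    return r;
}
```

Because a damaged record cannot shift the position of any other record, one bad entry stays one bad entry.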
 
LVL 22

Expert Comment

by:nietod
Actually, now that I think about it, the second file is kept more structured as well. The second file allocates records of various sizes (because different objects need different amounts of data), but the sizes are always multiples of a base size. Each record begins with a header that describes the length of the record and contains information (a copy of the index key) that links the record back to its parent in the first file.
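The rounding rule can be sketched as follows. The base size of 32 bytes, the 12-byte key field, and the names `DataHeader` and `allocated_size` are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kBaseSize = 32;  // assumed base block size in bytes

// Hypothetical header written at the start of every variable-length record.
struct DataHeader {
    std::uint32_t record_length;  // total bytes allocated, a multiple of kBaseSize
    char          index_key[12];  // copy of the index key (assumed width)
};
static_assert(sizeof(DataHeader) == 16, "the sketch assumes a 16-byte header");

// Space to allocate for a payload: header plus payload, rounded up to the
// next multiple of the base size.
constexpr std::uint32_t allocated_size(std::uint32_t payload_bytes) {
    return (payload_bytes + static_cast<std::uint32_t>(sizeof(DataHeader))
            + kBaseSize - 1) / kBaseSize * kBaseSize;
}
```

With these numbers, a 17-byte payload needs 33 bytes including the header and therefore gets a 64-byte record.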

Now that I think about it, the real key to making this work is the fact that the objects never deal with the file directly. They produce a stream of data. (They write to a stream and don't care what happens to the data.) Then the data is written to the file by a single set of procedures.
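A minimal sketch of that separation, using standard string streams as the in-memory buffer. The `Storable` base class, the `Window` example, and the `capture`/`restore` helpers are hypothetical names; the text format is just the simplest thing that works for a demonstration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical base class: objects only ever see a stream, never the file.
class Storable {
public:
    virtual ~Storable() = default;
    virtual void save(std::ostream& out) const = 0;
    virtual void load(std::istream& in) = 0;
};

class Window : public Storable {
public:
    int left = 0, top = 0, right = 0, bottom = 0;
    void save(std::ostream& out) const override {
        out << left << ' ' << top << ' ' << right << ' ' << bottom;
    }
    void load(std::istream& in) override {
        in >> left >> top >> right >> bottom;
    }
};

// Database side: the object fills a memory buffer, and only the database
// code would go on to write those bytes into the data file.
std::string capture(const Storable& obj) {
    std::ostringstream buffer;
    obj.save(buffer);
    return buffer.str();
}

// Database side: hand the object a stream over exactly its own bytes.
void restore(Storable& obj, const std::string& bytes) {
    std::istringstream buffer(bytes);
    obj.load(buffer);
}
```

An object that misbehaves can at worst produce a bad buffer for itself; it has no file handle with which to damage anyone else's data.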
 
LVL 1

Author Comment

by:TheMadManiac
I don't think I understand what you mean... :)
Could you explain it a bit?

(Am I correct that this would involve a lot of pointers in the file?)
 
LVL 22

Expert Comment

by:nietod
Which part are you interested in? The storage format, I assume.

I use 3 files. 2 of them are a standard indexed random access database. Are you familiar with how an indexed random access database works? Briefly (I'll explain more if you need): there are 2 files. One file is an index file that can be used to locate records in the data file. This file stores index keys (an index key is a string of data that identifies a record) along with the record number in the data file where the record is stored. The data file uses a fixed-length record format. (Mine can be in lots of formats, but if you only want one format, I would recommend using xBase; then other programs can access it for debugging or use.) Since it is fixed length, the data can be easily located and there is less danger of corruption. (With variable record lengths, once one record is corrupted, the whole rest of the file is usually lost.)

The important thing is that the variable-length data is stored in a separate file. Now this data is not fixed length. As I said above, this is usually bad, because usually these files are read sequentially. When a file like this gets corrupted, it is usually impossible to read past the corruption, so you lose everything after the corruption. Not so with the approach I use. I don't read the file sequentially. The fixed-length data file contains the offset into the variable-length data file where the data for the object starts. Thus the variable-length data for each object can be accessed directly, without having to read the data for the previous objects. You just look up the object in the index, find its record in the fixed-length data file, get the offset where the object's data is stored, seek to that offset in the variable-length data file, and read the variable-length data.
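The lookup chain can be sketched with in-memory stand-ins for the three files: a `std::map` for the index, a `std::vector` for the fixed-length data file, and a `std::string` for the variable-length data file. All names here are hypothetical, and a real implementation would seek in files rather than index into containers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for one record of the fixed-length data file.
struct FixedRec {
    std::string type_name;
    std::size_t offset;  // start of the object's data in var_data
    std::size_t length;  // number of bytes belonging to the object
};

struct Database {
    std::map<std::string, std::size_t> index;    // index key -> record number
    std::vector<FixedRec>              records;  // fixed-length data file
    std::string                        var_data; // variable-length data file

    void store(const std::string& key, const std::string& type,
               const std::string& data) {
        records.push_back(FixedRec{type, var_data.size(), data.size()});
        index[key] = records.size() - 1;
        var_data += data;
    }

    // index -> fixed record -> offset -> "seek" -> read. No other object's
    // data is touched on the way.
    bool fetch(const std::string& key, std::string& out) const {
        auto it = index.find(key);
        if (it == index.end()) return false;
        const FixedRec& rec = records[it->second];
        if (rec.offset + rec.length > var_data.size()) return false;
        out = var_data.substr(rec.offset, rec.length);
        return true;
    }
};
```

Fetching the second object never reads the first one's bytes, which is why damage to one object's data does not make the later ones unreachable.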

Is that any better?  If you have specific questions it might help me target my comments.  Or am I just totally talking over your head?
 
LVL 1

Author Comment

by:TheMadManiac
ok.. that cleared things up :)
However, this would require the database to know each field an object uses (or doesn't use). The way I do it now, this isn't known.. (I know this is bad .. I need to change it anyway :)

thus one file would contain object definitions like:

 Name : "Object"
 Type : "VarData" (or int or whatever :)

for every datafield in an object (again, variable number)

 Object Definition file (ODF)/variable length records
 Index File (IF)/thus also variable length records
 Data File (DF)/also variable length records

so, ODF -> IF -> DF ?
then, IF contains variable length records.

(I just put your variable/fixed length data in one file.. I don't care about corruption not caused by the program :)

hmm, now that I think about it.. I don't really know what you meant ;-)
I cannot get fixed-size records in the index file.. any way I look at it. This is because there can be several different kinds of objects in the database.

How do you arrive at fixed-length records?
 
LVL 22

Expert Comment

by:nietod
The fixed-length file just contains data I always want stored for each object, like its type name, its format, its version information, etc. It doesn't contain any of the "real" data. That is stored in the variable-length data file.

 
LVL 22

Expert Comment

by:nietod
A very simple example. Say I want to store a window object that represents the main window of the program in the database. I give this object the index key "MainWind" (mine is much more complex, allowing it to find user-specific/computer-specific objects). Thus the object is found by searching the index for the key MainWind. From there it gets a record number. It reads the specified record from the fixed-length data file. This allows it to confirm that the object it is reading is of the correct type and that code exists to handle it (it has been registered). This allows a default-constructed object of the associated type to be created. Then it takes the offset stored in this record and finds the variable-length data in the variable-length data file. In this case it might be the coordinates of the window (I said this was a simple example), so the object reads in its 4 integer coordinates. If it had been a different type of object, it would have read in a different type of data.
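The "registered type, default-constructed object, then load" step might look like this in present-day C++. The `Registry` class and its interface are assumptions for the sketch, not the code described in the post.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

class Storable {
public:
    virtual ~Storable() = default;
    virtual void load(std::istream& in) = 0;
};

class Window : public Storable {
public:
    int left = 0, top = 0, right = 0, bottom = 0;
    void load(std::istream& in) override {
        in >> left >> top >> right >> bottom;  // the 4 integer coordinates
    }
};

// Hypothetical registry: maps the type name found in the fixed-length
// record to a function that default-constructs that type.
class Registry {
    using Factory = std::function<std::unique_ptr<Storable>()>;
    std::map<std::string, Factory> factories_;
public:
    void add(const std::string& type_name, Factory make) {
        factories_[type_name] = std::move(make);
    }
    // Returns nullptr when no code has been registered for the type.
    std::unique_ptr<Storable> create(const std::string& type_name,
                                     std::istream& data) const {
        auto it = factories_.find(type_name);
        if (it == factories_.end()) return nullptr;
        std::unique_ptr<Storable> obj = it->second();  // default-constructed
        obj->load(data);  // the object reads its own variable-length data
        return obj;
    }
};
```

A record whose type name is not registered yields a null pointer instead of an attempt to interpret unknown bytes.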

better?
 
LVL 1

Author Comment

by:TheMadManiac
ok.. so if I put all that info in the index file, I wouldn't have to put it in there as well.. making the need for a 4th file (in my case) unneccisary (yeah I know it's spelled wrong ;). (it would be nice for recovery purposes tho, in case the index gets wasted...)

The index file would then just contain the object name (class name if you will), a pointer into the variable data file, and the length etc. of the instance. Still, this index would be variable length tho :( aah well.. just have to make sure it doesn't blow up :)

Could you answer the question so I can assign points?
Then I will just ask another question about how to make a stream object ;) (to make sure the objects don't destroy the variable-length file.. I would just make them write to memory, and then let the database handle the file :)

thank you for your time
 
LVL 1

Author Comment

by:TheMadManiac
oh, and using the fixed-length record file would also make managing empty space in the variable-length data file easier :) (the index is a linked list of linked lists (every object type in its own list, for faster searching).. might get very messy)
 
LVL 22

Accepted Solution

by:
nietod earned 150 total points
You could combine the fixed-length data file and the index file. But I don't, for several reasons.
First of all, we use general-purpose database procedures that are also used for other types of files. That may not apply to you.

The information stored in an index should be kept to a minimum for the sake of speed. This information needs to be moved about as the index grows. Also, the information (the key at least) appears in many nodes of the index. You probably don't want the other information to appear multiple times, so you will have a harder time using the space efficiently.

Indexes have complex formats and can become corrupted--your original concern. Thus by storing the data in a fixed-length data file, we can rebuild the index from the data file if the index becomes corrupted.
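Rebuilding amounts to one pass over the fixed-length records. The sketch below assumes each fixed-length record carries its own index key and an in-use flag; neither field is spelled out in the post, and both are named here only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for a fixed-length record (assumed to store its own key).
struct FixedRec {
    std::string key;        // copy of the index key
    std::string type_name;
    std::size_t offset;     // start of the data in the variable-length file
    bool        in_use;     // false for deleted or free records
};

// Scan every record once and re-create the key -> record number mapping.
std::map<std::string, std::size_t>
rebuild_index(const std::vector<FixedRec>& records) {
    std::map<std::string, std::size_t> index;
    for (std::size_t n = 0; n < records.size(); ++n) {
        if (records[n].in_use) {
            index[records[n].key] = n;
        }
    }
    return index;
}
```

The index is then purely derived data: losing it costs a rescan, not the objects.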

The data file can be in a format that can be read by other programs and utilities. We support unlimited formats, but I would recommend xBase if you have only one.

Our design supports random access to these objects; that is why I need both the index and the fixed-length data file. If you wanted sequential access (reading the objects in order from beginning to end), you could use two files: just the fixed-length data file and the variable-length data file. For sequential access, you could even consider using one file. You would have the fixed-length data at the start followed by the variable-length data. This is not quite as safe, but has the advantage of a single file to track. It is still a little safer than just having the variable-length data in a stream, because corrupted objects don't affect the ones after them.


 
LVL 1

Author Comment

by:TheMadManiac
just one thing.. what is xBase ? ;-)

Moving the fixed-length data to the top would get difficult... objects in the database use inheritance :)
 
LVL 22

Expert Comment

by:nietod
Have you heard of dBase? That file format became very popular in the late 80's, so many database systems supported their format: Clipper, FoxPro, Paradox, etc. Other programs at least provided ways to read and write the data, like MS Access and Excel. Since so many programs recognized the format, it became known as xBase instead of dBase, with "x" as in the algebraic use of "x" to represent anything. In addition, dBase stopped using the format in version IV (and soon went bankrupt), so it really couldn't be called dBase format any longer. If you've never written a database before, it is advanced programming. You might want to consider buying a database library. The index is of course the hard part. The rest is pretty easy.
 
LVL 1

Author Comment

by:TheMadManiac
I know dBase.. didn't know they went bankrupt tho .. grin. Neither did I know that it was also known as xBase.

And yes, this is my first database, but lucky for me, the index doesn't have to be really advanced... as long as everything besides the index works at reasonable speed, I'm happy. Speed on the index isn't a big issue.. so I won't use binary trees or stuff like that. A sorted linked list will do nicely. And if not, it is very easy to change the index afterwards. (I could even reconstruct it from the fixed data file :)
 
LVL 1

Author Comment

by:TheMadManiac
hmm. it's actually my first 'real' object-oriented project (the source code, not the database :).. do you have some idea how to do it nicely? I usually make all the wrong objects and in the end it all becomes a big mess, messier than if I had just done it in C :(
(but then it's also a big mess.. well, just no data abstraction, otherwise the same)

I could also ask it in another Q if that's better.. I just thought: everything is in here already :)
