asked on

Saving a general tree to a flat file

I am parsing a tagged file and creating a tree structure if it. This tree is unordered and each node can have any number of child nodes. I need to use this tree often and do not want to parse and construct the tree each time. Is it possible to save the entire tree to a disk such that we can easily reconstruct it from the disk when needed ? C/C++ Code would greatly help. I have access to STL too.

Pacman

my suggestion:

walk through the tree and write it node for node to the file.
Link the nodes in the file by saving their filepositions.

tellis_george

ASKER

Hi Pacman, Walking thro the tree and recording each node is OK, but I am not too clear on the second point of saving file positions ???

tellis_george

ASKER

Hi Pacman, Walking thro the tree and recording each node is OK, but I am not too clear on the second point of saving file positions ???

KangaRoo

The exact details depend on the nodes, but this is the general idea. Its a bit simpler then 'reconstructing' the tree in the file as pacman suggested.

class Node
{
void write(file)
{
// write this nodes data
write(file, somedatamembers);
// for each of its children:
child->write(file);
}
void read(file)
{
read(file, somedatamembers);
while(there are children written in the file)
new_child->read(file);
}
}

One such 'detail' is a way to delimit the list of children, which is left open in the example.

tellis_george

ASKER

Could you explain in some more detail assuming each node is a simple structure with a data field, and a series of link fields.

Pacman

tellis george,

you want to do it with procedures ?
Then just code your WiteNode() and ReadNode() function:

WriteNode (file, node)
{
write(file, node)
for each child:
WriteNode (file, child);
}

ReadNode( ... )
{
// like WriteNode()
}

tellis_george

ASKER

I was reading it up and learnt that to reconstruct the tree exactly as it was in memory we need to traverse the tree in some fashion (inorder traversal) and mark the nodes with "keys". We then traverse these nodes again in the inorder fashion to write the nodes to file along with with these keys... and we can then reconstruct the tree when needed using these keys. But this method holds good gor unordered binary trees, mine is a general tree (any number of child nodes) so the problem gets more complicated.
In the solutions given above I am not sure how the tree structure will be maintained when reconstructed....

KangaRoo

Is the number of links fixed?
#define NODE_NO_LINKS
class Node
{
Node links[NODE_NO_LINKS];
DataType data;
public:
// stuff

void write(FILE* file);
{
write(file, data); // or whatever writes datatype to file
for(int i = 0; i < NODE_NO_LINKS;++i)
links[i].write(file);
}

void read(FILE* file)
read(file, data); // or whatever reads datatype from file
for(int i = 0; i < NODE_NO_LINKS;++i)
links[i].read(file);
}

};

Maybe it is better if you give the declaration of your Node. What type of files, C style FILE* or C++ iostreams do you want to use? Should the node use operator<<?

KangaRoo

And how do they vary. Is the number of children fixed for all the nodes of one specific tree or can it vary from node to node? Does a node 'know' exactly how many children it has, ie can we do something like
for(i = 0; i < GetChildCount(); ++i)...
or more like
while(!EndOfChildren())...

If the number of children is known we can write this in the file, just before we start writing the children. If not we need some way of making the beginning and end of the list of children in the file.

abstraction

Just an idea.. see if it works.. if the tree is something like below :

A
| |
B C
| |
D E

store it as
<HEADING DELIMITER>
1,1,<info of node A> <some delimiter> 2,1,<info of node B> <some delimiter> 2,2,<info of node C>
2,1,<info of node B> <some delimiter> 2,2,<info of node C>, etc.
<DELIMTER END>

Immediately after the information about all the nodes, you store the relations between the nodes. So, each entry can be something like 1,1|2,1 <some delimter> 1,1 | 2,2. etc.

Hope you got the general idea. I was trying to say that first of all you can store the nodes with an additional prefix that identifies the node's position. Next you store the relations between the nodes. So, when you read the tree from the file, you will fill your array / linked list (whatever data structure) accordingly and then fill in the relations.
Hope I'm able to help you.

tellis_george

ASKER

Answers from Kangaroo do not allow proper re-construction.
The comment by "abstraction" appears promising but there is some amount of data redundancy in it. Also how exactly do we traverse the tree while reading and writing ? Recursively ?
Also the tree can be anyway
Ex
A
/ | \
B C D
/| | /|\
E F GH I J
I need to reconstruct the tree exactly as above from the file.

ozo

http://www.faqs.org/rfcs/rfc1014.html

tellis_george

ASKER

OZO, I did not see any useful info in the link. The page was dealing with "XDR: External Data Representation Standard" which was a totally different topic

Rusk

For this tree
A
/ | \
B C D
/| | /|\
E F GH I J

The pre-order file would be like this:
A/B/E/o/F/o/o/C/G/o/o/D/H/o/I/o/J/o/o/o

the 'o's mean that the following node is the child of the parent(sibling)
so for each time you encounter an 'o'
just go up one level.

you can implement this recursively

briefly how to read:
Next is the next item on disk

Place(Root)
{while (Next!=o)
{Root=Next
Place(Root->Nextchild)
}
return
}
Writing would be really similar
it is hard to say w/o knowing the exact tree implementation

ozo

XDR can portably save a tree in a flat file using Optional-data (3.18)
(You could also store it in ASCII, in the same format as a C initializer for the data, which would be even more portable, but more work to generate)

tellis_george

ASKER

I am trying to implement Rusk's code. I'll get back as soon as I know the results.

tellis_george

ASKER

Rusk, The solution you provided appears like it will work just fine.
It will be great if you could provide some sample read/write code in C/C++ as I'm pretty new here.
The node is a structure
structure
{
string data;
linkList NodePtr(each list element in this linklist points to a node structure)
} node

Each node is as shown

[data][L1]-[L2]-[L3]
|
[data]-[L1]
|
[data]-[L1]

ASKER CERTIFIED SOLUTION

Rusk

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

KangaRoo

It doesn't matter how the tree looks, it would be reconstructed exactly the same. Redundancy can be avoided, but if there is redundancy you'r not writing a tree but a graph. You are pretty late with information on your node.

tellis_george

ASKER

Thanks everyone for the help....