Solved

load XML file iteration

Posted on 2013-05-28
5
485 Views
Last Modified: 2013-05-30
Hi,
I have an XML file (see attached file).
The file is composed of blocks, lines, words, and characters:

Every block is composed of 1,...,n lines
Every line is composed of 1,...,k words
Every word is composed of 1,...,l characters

I am trying to create objects as follows:
Block(int top, int left, int bottom, int right, vector<Line>)
Line(int top, int left, int bottom, int right, vector<Word>)
Word(int top, int left, int bottom, int right, vector<Character>)
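
For illustration, the nesting described above could be sketched with plain structs that hold their children by value. The names Box, makeBox and countCharacters are hypothetical helpers, not part of the original code, and the sanity check assumes coordinates grow downward and to the right.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bounding box shared by every level; the field names follow the question.
struct Box {
    int top, left, bottom, right;
};

// Assumption: coordinates grow downward and to the right.
inline Box makeBox(int top, int left, int bottom, int right) {
    assert(top <= bottom && left <= right);
    Box box = {top, left, bottom, right};
    return box;
}

struct Character { Box box; };
struct Word  { Box box; std::vector<Character> characters; };
struct Line  { Box box; std::vector<Word> words; };
struct Block { Box box; std::vector<Line> lines; };

// Walk the hierarchy and count every character inside a block.
inline std::size_t countCharacters(const Block& block) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < block.lines.size(); ++i) {
        const Line& line = block.lines[i];
        for (std::size_t j = 0; j < line.words.size(); ++j)
            total += line.words[j].characters.size();
    }
    return total;
}
```

Because each level owns a vector of the level below, copying or destroying a Block takes care of everything nested inside it.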



I am using TinyXML in C++, but I can't link the objects together. My code can only handle one object (block, line, word, or character) at a time.

void Keywords::checkChild(TiXmlElement *child)
{
    if (child)
    {
        if ((string)child->Value() == "block")
        {
            cout << child->Value() << endl;

            double x1 = atoi(child->Attribute("left"));
            double y1 = atoi(child->Attribute("top"));
            double x2 = atoi(child->Attribute("right"));
            double y2 = atoi(child->Attribute("bottom"));
            // vector<Line> lineList
            // blockList.push_back(newBlock(y1, x1, y2, x2, lineList));
        }

        checkChild(child->FirstChildElement());
        checkChild(child->NextSiblingElement());
    } // end if child
}



Thank you.
00000012-1-R.xml
Question by:HaniDaher
5 Comments
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 39204521
You need to have a different function for each type (or if they all have the same attributes you could use templates).

Something like
void Keywords::checkBlock(TiXmlElement *child)
{
    // Only <block> elements are handled; anything else has no Block to attach lines to.
    if (!child || (string)child->Value() != "block")
        return;

    cout << child->Value() << endl;

    double x1 = atoi(child->Attribute("left"));
    double y1 = atoi(child->Attribute("top"));
    double x2 = atoi(child->Attribute("right"));
    double y2 = atoi(child->Attribute("bottom"));
    blockList.push_back(new Block(y1, x1, y2, x2));

    // Hand every child element to checkLine together with the block just created.
    child = child->FirstChildElement();
    while (child)
    {
        checkLine(child, blockList.back());
        child = child->NextSiblingElement();
    }
}

void Keywords::checkLine(TiXmlElement *child, Block* block)
{
    if (!child || (string)child->Value() != "line")
        return;

    cout << child->Value() << endl;

    double m = atoi(child->Attribute("slope")); //or whatever
    double x0 = atoi(child->Attribute("intercept"));
    block->m_line_list.push_back(new Line(m, x0));

    // Same pattern one level down: words are attached to the line just created.
    child = child->FirstChildElement();
    while (child)
    {
        checkWord(child, block->m_line_list.back());
        child = child->NextSiblingElement();
    }
}
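
The template option mentioned at the start of this comment could look roughly like the sketch below. It is only an illustration: Node is a hypothetical stand-in for TiXmlElement so that the code builds without TinyXML, and Rect, intAttr and readRect are made-up names. It assumes every level carries the same left/top/right/bottom attributes.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for TiXmlElement, so the sketch builds without TinyXML.
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<Node> children;
};

struct Rect { int left = 0, top = 0, right = 0, bottom = 0; };
struct Word  : Rect {};
struct Line  : Rect { std::vector<Word> words; };
struct Block : Rect { std::vector<Line> lines; };

// Integer attribute lookup; a missing attribute yields 0 instead of a crash.
inline int intAttr(const Node& n, const std::string& key) {
    std::map<std::string, std::string>::const_iterator it = n.attrs.find(key);
    return it == n.attrs.end() ? 0 : std::atoi(it->second.c_str());
}

// One template fills the rectangle of any type derived from Rect.
template <typename T>
T readRect(const Node& n) {
    T obj;
    obj.left = intAttr(n, "left");
    obj.top = intAttr(n, "top");
    obj.right = intAttr(n, "right");
    obj.bottom = intAttr(n, "bottom");
    return obj;
}

// Build a Line and attach one Word per <word> child.
inline Line parseLine(const Node& n) {
    assert(n.name == "line");
    Line line = readRect<Line>(n);
    for (const Node& child : n.children)
        if (child.name == "word")
            line.words.push_back(readRect<Word>(child));
    return line;
}

// Build a Block and attach one Line per <line> child.
inline Block parseBlock(const Node& n) {
    assert(n.name == "block");
    Block block = readRect<Block>(n);
    for (const Node& child : n.children)
        if (child.name == "line")
            block.lines.push_back(parseLine(child));
    return block;
}
```

The attribute reading is written once in readRect, while the per-level functions only decide which child tag to descend into.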
      

 

Author Comment

by:HaniDaher
ID: 39204588
Yes Tommy, that's what I thought. I actually managed to find the following solution:
void parseFile(TiXmlElement* document, vector<Block*>& blocks)
{
  for (TiXmlElement* sub = document->FirstChildElement("block"); sub; sub = sub->NextSiblingElement("block"))
    blocks.push_back(parseBlock(sub));
}
Block* parseBlock(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Line*> lines;
  for (TiXmlElement* sub = element->FirstChildElement("line"); sub; sub = sub->NextSiblingElement("line"))
    lines.push_back(parseLine(sub));
  return new Block(x1, ..., lines);
}
Line* parseLine(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Word*> words;
  for (TiXmlElement* sub = element->FirstChildElement("word"); sub; sub = sub->NextSiblingElement("word"))
    words.push_back(parseWord(sub));
  return new Line(x1, ..., words);
}
Word* parseWord(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Char*> chars;
  for (TiXmlElement* sub = element->FirstChildElement("char"); sub; sub = sub->NextSiblingElement("char"))
    chars.push_back(parseChar(sub));
  return new Word(x1, ..., chars);
}
Char* parseChar(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  return new Char(x1, ...);
}



I think it is basically the same idea as yours.
What do you think about the above code?
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 250 total points
ID: 39204642
Yes, that is the same basic idea. Looks like it would work. Personally, I would try to avoid all those calls to new so you don't have to worry about cleaning up all the memory later.
Something like this
void parseFile(TiXmlElement* document, vector<Block>& blocks)
{
  for (TiXmlElement* sub = document->FirstChildElement("block"); sub; sub = sub->NextSiblingElement("block"))
  {
    blocks.push_back(Block());
    parseBlock(sub, blocks.back());
  }
}
void parseBlock(TiXmlElement* element, Block* block)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  for (TiXmlElement* sub = element->FirstChildElement("line"); sub; sub = sub->NextSiblingElement("line"))
  {
    block->m_lines.push_back(Line());
    parseLine(sub, block->m_lines.back());
  }

  //etc
}



Either way works. I've just found that using dynamic memory like that can lead to segfaults and memory leaks down the road (unless this is just a small one-off thing).
If you are building this as part of a larger program that other people may modify later, I would recommend only using new in a constructor or in a function that also has the delete.
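
As a rough illustration of that rule, the sketch below shows both options side by side. The class names and the simplified Line are hypothetical: Block stores its lines by value and never calls new, while LineBuffer is the only place that calls new and also the only place that calls delete.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Line { int top, left, bottom, right; };

// Option 1: store values. No new, no delete; the vector cleans up after itself.
class Block {
public:
    void addLine(const Line& line) { lines_.push_back(line); }
    std::size_t lineCount() const { return lines_.size(); }
private:
    std::vector<Line> lines_;
};

// Option 2: if raw memory is unavoidable, keep new and delete in one class.
class LineBuffer {
public:
    // new[] with () value-initializes, so every coordinate starts at 0.
    explicit LineBuffer(std::size_t n) : data_(new Line[n]()), size_(n) {}
    ~LineBuffer() { delete[] data_; }

    // Copying would lead to a double delete, so it is disabled.
    LineBuffer(const LineBuffer&) = delete;
    LineBuffer& operator=(const LineBuffer&) = delete;

    Line& at(std::size_t i) { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }
private:
    Line* data_;
    std::size_t size_;
};
```

With either option, callers never see a bare pointer that they would have to remember to delete.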
 
LVL 32

Accepted Solution

by:
sarabande earned 250 total points
ID: 39205123
If you use vector<Block> instead of vector<Block*>, the parseBlock function needs to take its second argument by reference and not by pointer:

...
   parseBlock(sub, blocks.back()); // blocks.back() returns a reference to the new Block
  }
}
void parseBlock(TiXmlElement* element, Block& block)



Nevertheless, as you already have a class 'Keywords', there is no need to fall back to C-style free functions. All the objects (Block, Line, Word, Character) share the attributes of a rectangle, so the following class tree seems to fit:

struct Rectangle
{
    int left;
    int top;
    int right;
    int bottom;
    Rectangle() : left(0), top(0), right(0), bottom(0) { }
    Rectangle(int l, int t, int r, int b) : left(l), top(t), right(r), bottom(b) { }
};

class Base
{
    int id;
    Rectangle rect;
    std::vector<Base*> subs;
public:
    virtual ~Base()
    { while (subs.empty() == false) { delete subs[0]; subs.erase(subs.begin()); } }
    void setRectangle(TiXmlElement* obj);
    virtual Base * createSub();
    virtual std::string getSubName();
    bool parseSubs(const std::string & keyword, TiXmlElement* obj);
};

class Block : public Base
{
...
    Base * createSub() { return new Line; }
    std::string getSubName() { return "line"; }
};

...

class Word : public Base
{
    std::string value;
    int confidence;
    std::string font;
    int type;
public:
    ...
    Base * createSub() { return new Character; }
    std::string getSubName() { return "character"; }
    
};



If you do so, you can use the Base container std::vector<Base*> subs as the container for lines, words, and characters, and implement the function parseSubs so that it works for all four classes. You would create new pointers of the 'sub' class by calling the virtual function createSub.

Note that the pointers in the containers are deleted when the Base object is destroyed, so there is no need to worry about leaks.
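
A minimal sketch of how such a shared parseSubs might look is given below, under several assumptions: Node is a hypothetical stand-in for TiXmlElement so the code builds without TinyXML, the rectangle and id handling is left out, and the keyword parameter from the declaration above is dropped in favour of getSubName. The accessors subCount and sub are added only to make the result inspectable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for TiXmlElement, so the sketch builds without TinyXML.
struct Node {
    std::string name;
    std::vector<Node> children;
};

class Base {
public:
    Base() = default;
    Base(const Base&) = delete;             // copying would double-delete subs
    Base& operator=(const Base&) = delete;
    virtual ~Base() {
        for (std::size_t i = 0; i < subs.size(); ++i) delete subs[i];
    }

    // Leaf classes keep these defaults: nothing to create below them.
    virtual Base* createSub() { return nullptr; }
    virtual std::string getSubName() { return ""; }

    // Shared by all levels: one sub-object per matching child, then recurse.
    bool parseSubs(const Node& obj) {
        const std::string subName = getSubName();
        if (subName.empty()) return true;   // leaf level
        for (std::size_t i = 0; i < obj.children.size(); ++i) {
            const Node& child = obj.children[i];
            if (child.name != subName) continue;
            Base* sub = createSub();
            if (!sub) return false;
            subs.push_back(sub);            // owned from here on
            if (!sub->parseSubs(child)) return false;
        }
        return true;
    }

    std::size_t subCount() const { return subs.size(); }
    const Base* sub(std::size_t i) const { assert(i < subs.size()); return subs[i]; }

private:
    std::vector<Base*> subs;
};

class Character : public Base {};

class Word : public Base {
public:
    Base* createSub() override { return new Character; }
    std::string getSubName() override { return "character"; }
};

class Line : public Base {
public:
    Base* createSub() override { return new Word; }
    std::string getSubName() override { return "word"; }
};

class Block : public Base {
public:
    Base* createSub() override { return new Line; }
    std::string getSubName() override { return "line"; }
};
```

The loop lives only in Base; each derived class just states which tag to look for and which object to create for it.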

Sara