Solved

load XML file iteration

Posted on 2013-05-28
Last Modified: 2013-05-30
Hi,
I have an XML file (see the attached file).
The file is composed of blocks, lines, words and characters:

Every block is composed of 1,...,n lines
Every line is composed of 1,...,k words
Every word is composed of 1,...,l characters

I am trying to create objects as follows:
Block(int top, int left, int bottom, int right, vector<Line>)
Line(int top, int left, int bottom, int right, vector<Word>)
Word(int top, int left, int bottom, int right, vector<Character>)
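A minimal sketch of how this hierarchy could be laid out as plain value types. The names Box and countCharacters are hypothetical and do not come from the attached file or from TinyXML; the field order (top, left, bottom, right) follows the constructors listed above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical rectangle shared by every level of the hierarchy.
struct Box {
    int top, left, bottom, right;
    Box(int t = 0, int l = 0, int b = 0, int r = 0)
        : top(t), left(l), bottom(b), right(r) {}
};

// Each level owns its children by value, so no manual cleanup is needed.
struct Character { Box box; };
struct Word      { Box box; std::vector<Character> characters; };
struct Line      { Box box; std::vector<Word> words; };
struct Block     { Box box; std::vector<Line> lines; };

// Walks block -> lines -> words and counts the characters stored below it.
std::size_t countCharacters(const Block& block) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < block.lines.size(); ++i) {
        const Line& line = block.lines[i];
        for (std::size_t j = 0; j < line.words.size(); ++j)
            total += line.words[j].characters.size();
    }
    return total;
}
```

Nesting the vectors this way mirrors the XML nesting directly, which is what makes a one-function-per-level parser straightforward.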



I am using TinyXML with C++, but I can't link the objects together. My code can only handle one object (block, line, word or character) at a time.

void Keywords::checkChild(TiXmlElement *child)
{
    if (child)
    {
        if (string(child->Value()) == "block")
        {
            cout << child->Value() << endl;

            // atoi returns int, and the coordinates are integers.
            int x1 = atoi(child->Attribute("left"));
            int y1 = atoi(child->Attribute("top"));
            int x2 = atoi(child->Attribute("right"));
            int y2 = atoi(child->Attribute("bottom"));
            // vector<Line> lineList;
            // blockList.push_back(newBlock(y1, x1, y2, x2, lineList));
        }

        // Depth-first walk: first the children, then the siblings.
        checkChild(child->FirstChildElement());
        checkChild(child->NextSiblingElement());
    } // end if child
}



Thank you.
00000012-1-R.xml
Question by:HaniDaher
Expert Comment

by:TommySzalapski
ID: 39204521
You need to have a different function for each type (or if they all have the same attributes you could use templates).

Something like:
void Keywords::checkBlock(TiXmlElement *child)
{
    // Only handle <block> elements; anything else is ignored here.
    if (!child || string(child->Value()) != "block")
        return;

    cout << child->Value() << endl;

    double x1 = atoi(child->Attribute("left"));
    double y1 = atoi(child->Attribute("top"));
    double x2 = atoi(child->Attribute("right"));
    double y2 = atoi(child->Attribute("bottom"));
    blockList.push_back(newBlock(y1, x1, y2, x2));

    // Hand every child of the block to the line parser.
    child = child->FirstChildElement();
    while (child)
    {
        checkLine(child, blockList.back());
        child = child->NextSiblingElement();
    }
}
void Keywords::checkLine(TiXmlElement *child, Block* block)
{
    // Only handle <line> elements; anything else is ignored here.
    if (!child || string(child->Value()) != "line")
        return;

    cout << child->Value() << endl;

    double m = atoi(child->Attribute("slope")); // or whatever attributes a line has
    double x0 = atoi(child->Attribute("intercept"));
    block->m_line_list.push_back(newLine(m, x0));

    // Hand every child of the line to the word parser.
    child = child->FirstChildElement();
    while (child)
    {
        checkWord(child, block->m_line_list.back());
        child = child->NextSiblingElement();
    }
}
      


Author Comment

by:HaniDaher
ID: 39204588
Yes Tommy, that's what I thought. I actually managed to find the following solution:
// Forward declarations so that each parser can call the next level down.
Block* parseBlock(TiXmlElement* element);
Line* parseLine(TiXmlElement* element);
Word* parseWord(TiXmlElement* element);
Char* parseChar(TiXmlElement* element);

void parseFile(TiXmlElement* document, vector<Block*>& blocks)
{
  // Visit only the direct children named "block".
  for (TiXmlElement* sub = document->FirstChildElement("block"); sub; sub = sub->NextSiblingElement("block"))
    blocks.push_back(parseBlock(sub));
}
Block* parseBlock(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Line*> lines;
  for (TiXmlElement* sub = element->FirstChildElement("line"); sub; sub = sub->NextSiblingElement("line"))
    lines.push_back(parseLine(sub));
  return new Block(x1, ..., lines);
}
Line* parseLine(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Word*> words;
  for (TiXmlElement* sub = element->FirstChildElement("word"); sub; sub = sub->NextSiblingElement("word"))
    words.push_back(parseWord(sub));
  return new Line(x1, ..., words);
}
Word* parseWord(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Char*> chars;
  for (TiXmlElement* sub = element->FirstChildElement("char"); sub; sub = sub->NextSiblingElement("char"))
    chars.push_back(parseChar(sub));
  return new Word(x1, ..., chars);
}
Char* parseChar(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  return new Char(x1, ...);
}

I think it is basically the same idea as yours.
What do you think about the above code?

Assisted Solution

by:TommySzalapski
TommySzalapski earned 250 total points
ID: 39204642
Yes, that is the same basic idea. Looks like it would work. Personally, I would try to avoid all those calls to new so you don't have to worry about cleaning up all the memory later.
Something like this:
void parseFile(TiXmlElement* document, vector<Block>& blocks)
{
  for (TiXmlElement* sub = document->FirstChildElement("block"); sub; sub = sub->NextSiblingElement("block"))
  {
    // Create the Block inside the vector first, then fill it in place.
    blocks.push_back(Block());
    parseBlock(sub, blocks.back());
  }
}
void parseBlock(TiXmlElement* element, Block* block)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  for (TiXmlElement* sub = element->FirstChildElement("line"); sub; sub = sub->NextSiblingElement("line"))
  {
    block->m_lines.push_back(Line());
    parseLine(sub, block->m_lines.back());
  }

  // etc.
}

Either way works. I've just found that using dynamic memory like that can lead to segfaults and memory leaks down the road (unless this is just a small one-off thing).
If you are building this as part of a larger program that other people may modify later, I would recommend only using new in a constructor or in a function that also contains the matching delete.
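A minimal sketch of the value-based approach, written so it compiles without TinyXML. FakeElement and attr are hypothetical stand-ins for TiXmlElement and its Attribute lookup, and only the "left" attribute is read; the point is the pattern of pushing a default-constructed object into the vector and then filling it through a reference to back().

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for TiXmlElement: a name, integer attributes, children.
struct FakeElement {
    std::string name;
    std::map<std::string, int> attrs;
    std::vector<FakeElement> children;
};

struct Line  { int left; Line() : left(0) {} };
struct Block { int left; std::vector<Line> lines; Block() : left(0) {} };

// Returns the attribute value, or 0 when the attribute is missing.
int attr(const FakeElement& e, const std::string& key) {
    std::map<std::string, int>::const_iterator it = e.attrs.find(key);
    return it == e.attrs.end() ? 0 : it->second;
}

void parseLine(const FakeElement& e, Line& line) {
    line.left = attr(e, "left");
}

void parseBlock(const FakeElement& e, Block& block) {
    block.left = attr(e, "left");
    for (std::size_t i = 0; i < e.children.size(); ++i) {
        if (e.children[i].name != "line") continue;
        block.lines.push_back(Line());                  // the vector owns the Line
        parseLine(e.children[i], block.lines.back());   // fill it in place
    }
}

void parseFile(const FakeElement& root, std::vector<Block>& blocks) {
    for (std::size_t i = 0; i < root.children.size(); ++i) {
        if (root.children[i].name != "block") continue;
        blocks.push_back(Block());                      // the vector owns the Block
        parseBlock(root.children[i], blocks.back());
    }
}
```

Because every object lives inside its parent's vector, destroying the outer vector releases the whole tree and no delete is ever written. The reference returned by back() stays valid while the object is being filled, since nothing else is pushed onto that same vector during the call.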

Accepted Solution

by:sarabande
sarabande earned 250 total points
ID: 39205123
If you use vector<Block> instead of vector<Block*>, the parseBlock function needs to take its second argument by reference and not by pointer:

...
   parseBlock(sub, blocks.back()); // blocks.back() returns a reference to the new Block
  }
}
void parseBlock(TiXmlElement* element, Block& block)

Nevertheless, since you already have a class 'Keywords', there is no need to fall back to C-style free functions. All the objects (Block, Line, Word, Character) share the attributes of a rectangle, so the following class tree seems to fit:

struct Rectangle
{
    int left;
    int top;
    int right;
    int bottom;
    Rectangle() : left(0), top(0), right(0), bottom(0) { }
    Rectangle(int l, int t, int r, int b) : left(l), top(t), right(r), bottom(b) { }
};

class Base
{
    int id;
    Rectangle rect;
    std::vector<Base*> subs;
public:
    virtual ~Base()
    { for (std::size_t i = 0; i < subs.size(); ++i) delete subs[i]; }  // Base owns its subs
    void setRectangle(TiXmlElement* obj);
    virtual Base * createSub();
    virtual std::string getSubName();
    bool parseSubs(const std::string & keyword, TiXmlElement* obj);
};

class Block : public Base
{
...
    Base * createSub() { return new Line; }
    std::string getSubName() { return "line"; }
};

...

class Word : public Base
{
    std::string value;
    int confidence;
    std::string font;
    int type;
public:
    ...
    Base * createSub() { return new Character; }
    std::string getSubName() { return "character"; }
    
};

If you do so, you can use the Base container std::vector<Base*> subs to hold the lines, words and characters, and implement the function parseSubs so that it works for all 4 classes. You create the new pointers of the 'sub' class by calling the virtual function createSub.

Note that the pointers in the containers are deleted when the Base object is destroyed, so there is no need to worry about leaks.
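A rough sketch of how such a shared parseSubs could look. It is simplified from the declaration above: FakeElement is a hypothetical stand-in for TiXmlElement so the code compiles without TinyXML, parseSubs takes only the element, the rectangle and id members are left out, and subCount and sub are accessors added for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for TiXmlElement: just a name and child elements.
struct FakeElement {
    std::string name;
    std::vector<FakeElement> children;
};

class Base {
    std::vector<Base*> subs;
    Base(const Base&);             // not copyable: each object owns its subs
    Base& operator=(const Base&);
public:
    Base() {}
    virtual ~Base() {
        for (std::size_t i = 0; i < subs.size(); ++i) delete subs[i];
    }
    virtual Base* createSub() { return 0; }           // leaf types have no subs
    virtual std::string getSubName() { return ""; }
    std::size_t subCount() const { return subs.size(); }
    const Base* sub(std::size_t i) const { return subs[i]; }

    // One implementation for every level: the derived class decides which
    // child name to look for and which object to create for it.
    void parseSubs(const FakeElement& obj) {
        const std::string wanted = getSubName();
        for (std::size_t i = 0; i < obj.children.size(); ++i) {
            if (obj.children[i].name != wanted) continue;
            Base* child = createSub();
            if (!child) return;
            subs.push_back(child);
            child->parseSubs(obj.children[i]);        // recurse one level down
        }
    }
};

class Character : public Base {};
class Word : public Base {
public:
    Base* createSub() { return new Character; }
    std::string getSubName() { return "character"; }
};
class Line : public Base {
public:
    Base* createSub() { return new Word; }
    std::string getSubName() { return "word"; }
};
class Block : public Base {
public:
    Base* createSub() { return new Line; }
    std::string getSubName() { return "line"; }
};
```

Calling parseSubs on a Block with the element for a block then builds the whole subtree, and destroying that Block frees every Line, Word and Character beneath it.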

Sara