Solved

load XML file iteration

Posted on 2013-05-28
Last Modified: 2013-05-30
Hi,
I have an XML file (see the attached file).
The file is composed of blocks, lines, words and characters:

Every block is composed of 1,...,n lines
Every line is composed of 1,...,k words
Every word is composed of 1,...,l characters

I am trying to create objects as follows:
Block(int top, int left, int bottom, int right, vector<Line>)
Line(int top, int left, int bottom, int right, vector<Word>)
Word(int top, int left, int bottom, int right, vector<Character>)
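A minimal sketch of how the hierarchy described above could be modelled with plain value types. The names Box, chars, words, lines and countCharacters are assumptions made for illustration and are not taken from the attached file.

```cpp
#include <cstddef>
#include <vector>

// Bounding box shared by every level of the hierarchy (assumed layout).
struct Box {
    int top = 0, left = 0, bottom = 0, right = 0;
};

// Each level stores its own box plus a vector of the level below it.
struct Character { Box box; };
struct Word      { Box box; std::vector<Character> chars; };
struct Line      { Box box; std::vector<Word> words; };
struct Block     { Box box; std::vector<Line> lines; };

// Walks lines and words to count every character in a block.
inline std::size_t countCharacters(const Block& block) {
    std::size_t total = 0;
    for (const Line& line : block.lines)
        for (const Word& word : line.words)
            total += word.chars.size();
    return total;
}
```

Because every level owns its children by value, destroying a Block releases the whole subtree without any explicit cleanup.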



I am using TinyXML in C++, but I can't link the objects together: my code can only handle one object (block, line, word or character) at a time.

void Keywords::checkChild(TiXmlElement *child)
{
    if (child)
    {
        if ((string)child->Value() == "block")
        {
            cout << child->Value() << endl;

            double x1 = atoi(child->Attribute("left"));
            double y1 = atoi(child->Attribute("top"));
            double x2 = atoi(child->Attribute("right"));
            double y2 = atoi(child->Attribute("bottom"));
            // vector<Line> lineList
            // blockList.push_back(newBlock(y1,x1,y2,x2,lineList));
        }

        checkChild(child->FirstChildElement());
        checkChild(child->NextSiblingElement());
    } // end if child
}



Thank you.
00000012-1-R.xml
Question by:HaniDaher
5 Comments
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 39204521
You need to have a different function for each type (or if they all have the same attributes you could use templates).

Something like
void Keywords::checkBlock(TiXmlElement *child)
{
    if (child && (string)child->Value() == "block")
    {
        cout << child->Value() << endl;

        int x1 = atoi(child->Attribute("left"));
        int y1 = atoi(child->Attribute("top"));
        int x2 = atoi(child->Attribute("right"));
        int y2 = atoi(child->Attribute("bottom"));
        blockList.push_back(new Block(y1, x1, y2, x2));

        // Hand every child element to the line parser, attached to the block just added.
        child = child->FirstChildElement();
        while (child)
        {
            checkLine(child, blockList.back());
            child = child->NextSiblingElement();
        }
    } // end if block
}

void Keywords::checkLine(TiXmlElement *child, Block* block)
{
    if (child && (string)child->Value() == "line")
    {
        cout << child->Value() << endl;

        double m = atof(child->Attribute("slope")); // or whatever
        double x0 = atof(child->Attribute("intercept"));
        block->m_line_list.push_back(new Line(m, x0));

        // Same pattern one level down: each child goes to the word parser.
        child = child->FirstChildElement();
        while (child)
        {
            checkWord(child, block->m_line_list.back());
            child = child->NextSiblingElement();
        }
    } // end if line
}
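The template idea mentioned above can be sketched as follows, under some loud assumptions: Node is a hypothetical stand-in for an XML element so that the example runs without TinyXML, the tag names ("line", "word", "char") are guesses about the file, and parseInto, childTag and readBox are made-up helper names. One function template handles every level that has children, and a plain overload handles the leaf.

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element (not part of TinyXML).
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<Node> children;
};

struct Box { int left = 0, top = 0, right = 0, bottom = 0; };

// Reads the four rectangle attributes; a missing attribute is treated as 0.
inline Box readBox(const Node& n) {
    auto get = [&n](const char* key) {
        auto it = n.attrs.find(key);
        return it == n.attrs.end() ? 0 : std::atoi(it->second.c_str());
    };
    Box b;
    b.left = get("left");
    b.top = get("top");
    b.right = get("right");
    b.bottom = get("bottom");
    return b;
}

struct Character { Box box; };
struct Word  { Box box; std::vector<Character> subs; };
struct Line  { Box box; std::vector<Word> subs; };
struct Block { Box box; std::vector<Line> subs; };

// Tag name of the children of each level (assumed names).
inline const char* childTag(const Block&) { return "line"; }
inline const char* childTag(const Line&)  { return "word"; }
inline const char* childTag(const Word&)  { return "char"; }

// Leaf level: a character has a box but no children.
inline void parseInto(const Node& n, Character& out) { out.box = readBox(n); }

// Every other level: read the box, then recurse into matching children.
template <typename T>
void parseInto(const Node& n, T& out) {
    out.box = readBox(n);
    for (const Node& child : n.children) {
        if (child.name != childTag(out)) continue;
        out.subs.emplace_back();
        parseInto(child, out.subs.back());
    }
}
```

Calling parseInto(blockNode, block) on a Block fills the whole subtree. With TinyXML the loop would walk FirstChildElement / NextSiblingElement instead of the children vector.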
      

 

Author Comment

by:HaniDaher
ID: 39204588
Yes Tommy, that's what I thought. I actually managed to find the following solution:
// Forward declarations, since each parser calls the one for the level below.
Block* parseBlock(TiXmlElement* element);
Line* parseLine(TiXmlElement* element);
Word* parseWord(TiXmlElement* element);
Char* parseChar(TiXmlElement* element);

// TinyXML names these FirstChildElement / NextSiblingElement (no "Get" prefix).
void parseFile(TiXmlElement* document, vector<Block*>& blocks)
{
  for (TiXmlElement* sub = document->FirstChildElement("block"); sub; sub = sub->NextSiblingElement("block"))
    blocks.push_back(parseBlock(sub));
}
Block* parseBlock(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Line*> lines;
  for (TiXmlElement* sub = element->FirstChildElement("line"); sub; sub = sub->NextSiblingElement("line"))
    lines.push_back(parseLine(sub));
  return new Block(x1, ..., lines);
}
Line* parseLine(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Word*> words;
  for (TiXmlElement* sub = element->FirstChildElement("word"); sub; sub = sub->NextSiblingElement("word"))
    words.push_back(parseWord(sub));
  return new Line(x1, ..., words);
}
Word* parseWord(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  vector<Char*> chars;
  for (TiXmlElement* sub = element->FirstChildElement("char"); sub; sub = sub->NextSiblingElement("char"))
    chars.push_back(parseChar(sub));
  return new Word(x1, ..., chars);
}
Char* parseChar(TiXmlElement* element)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  return new Char(x1, ...);
}



I think it is basically the same idea as yours.
What do you think about the above code?
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 250 total points
ID: 39204642
Yes, that is the same basic idea. Looks like it would work. Personally, I would try to avoid all those calls to new so you don't have to worry about cleaning up all the memory later.
Something like this
void parseBlock(TiXmlElement* element, Block& block);

void parseFile(TiXmlElement* document, vector<Block>& blocks)
{
  for (TiXmlElement* sub = document->FirstChildElement("block"); sub; sub = sub->NextSiblingElement("block"))
  {
    // Append an empty Block, then fill it in place.
    blocks.push_back(Block());
    parseBlock(sub, blocks.back());
  }
}
void parseBlock(TiXmlElement* element, Block& block)
{
  double x1 = atof(element->Attribute("left"));
  // ...
  for (TiXmlElement* sub = element->FirstChildElement("line"); sub; sub = sub->NextSiblingElement("line"))
  {
    block.m_lines.push_back(Line());
    parseLine(sub, block.m_lines.back());
  }

  // etc.
}



Either way works. I've just found that using dynamic memory like that can lead to segfaults and memory leaks down the road (unless this is just a small one-off thing).
If you are building this as part of a larger program that other people may modify later, I would recommend using new only in a constructor or in a function that also contains the matching delete.
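One way to follow that advice, assuming a C++11 compiler is available, is to let std::unique_ptr own each heap object so that no hand-written delete is needed at all. The sketch below is illustrative only: addLine, lineCount and the alive counter are made-up names, and the counter exists purely to make the cleanup observable.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Line {
    static int alive;                 // live-instance counter, for demonstration only
    Line() { ++alive; }
    ~Line() { --alive; }
    Line(const Line&) = delete;       // copying would bypass the counter
    Line& operator=(const Line&) = delete;
    int left = 0;
};
int Line::alive = 0;

class Block {
public:
    // Allocates a Line and hands ownership to the vector in the same statement.
    Line& addLine() {
        lines_.push_back(std::unique_ptr<Line>(new Line()));
        return *lines_.back();
    }
    std::size_t lineCount() const { return lines_.size(); }
private:
    // Each unique_ptr deletes its Line when the Block is destroyed.
    std::vector<std::unique_ptr<Line>> lines_;
};
```

When a Block goes out of scope, the vector destroys its unique_ptr elements and each of those deletes its Line, so Line::alive drops back to zero.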
 
LVL 34

Accepted Solution

by:
sarabande earned 250 total points
ID: 39205123
When using vector<Block> instead of vector<Block*>, the parseBlock function needs to take its second argument by reference, not by pointer:

...
   parseBlock(sub, blocks.back()); // the blocks.back returns a reference to the new Block
  }
}
void parseBlock(TiXmlElement* element, Block& block)



Nevertheless, as you already have a class 'Keywords', there is no need to fall back to C-style free functions. All the objects (Block, Line, Word, Character) share the attributes of a rectangle, so the following class tree seems to fit:

struct Rectangle
{
    int left;
    int top;
    int right;
    int bottom;
    Rectangle() : left(0), top(0), right(0), bottom(0) { }
    Rectangle(int l, int t, int r, int b) : left(l), top(t), right(r), bottom(b) { }
};

class Base
{
    int id;
    Rectangle rect;
    std::vector<Base*> subs;
public:
    virtual ~Base()
    { while (subs.empty() == false) { delete subs[0]; subs.erase(subs.begin()); } }
    void setRectangle(TiXmlElement* obj);
    virtual Base * createSub();
    virtual std::string getSubName();
    bool parseSubs(const std::string & keyword, TiXmlElement* obj);
};

class Block : public Base
{
...
    Base * createSub() { return new Line; }
    std::string getSubName() { return "line"; }
};

...

class Word : public Base
{
    std::string value;
    int confidence;
    std::string font;
    int type;
public:
    ...
    Base * createSub() { return new Character; }
    std::string getSubName() { return "character"; }
    
};



If you do so, you can use the Base container std::vector<Base*> subs to hold lines, words and characters alike, and implement parseSubs so that it works for all four classes. New objects of the 'sub' class are created by calling the virtual function createSub.

Note that the pointers in the containers are deleted when the Base object is destroyed, so there is no need to worry about leaks.
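A rough sketch of how one shared parsing routine could work with this class tree. Several things here are assumptions rather than the design above: Node is a hypothetical stand-in for TiXmlElement so the example runs on its own, the routine is called parse instead of parseSubs, the members are public for brevity, and std::unique_ptr (C++11) replaces the raw pointers and the hand-written destructor.

```cpp
#include <cstdlib>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for TiXmlElement.
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<Node> children;
};

class Base {
public:
    virtual ~Base() {}
    // Factory for the next level down; the leaf level returns nullptr.
    virtual std::unique_ptr<Base> createSub() const = 0;
    // Tag name of the next level down; empty for the leaf level.
    virtual std::string getSubName() const = 0;

    // Reads this element's rectangle, then recurses into matching children.
    void parse(const Node& n) {
        left = attr(n, "left");
        top = attr(n, "top");
        right = attr(n, "right");
        bottom = attr(n, "bottom");
        const std::string subName = getSubName();
        if (subName.empty()) return;  // leaf: nothing below this level
        for (const Node& child : n.children) {
            if (child.name != subName) continue;
            std::unique_ptr<Base> sub = createSub();
            sub->parse(child);
            subs.push_back(std::move(sub));
        }
    }

    int left = 0, top = 0, right = 0, bottom = 0;
    std::vector<std::unique_ptr<Base>> subs;

private:
    // Missing attributes are treated as 0.
    static int attr(const Node& n, const char* key) {
        auto it = n.attrs.find(key);
        return it == n.attrs.end() ? 0 : std::atoi(it->second.c_str());
    }
};

struct Character : Base {
    std::unique_ptr<Base> createSub() const override { return nullptr; }
    std::string getSubName() const override { return ""; }
};
struct Word : Base {
    std::unique_ptr<Base> createSub() const override { return std::unique_ptr<Base>(new Character); }
    std::string getSubName() const override { return "character"; }
};
struct Line : Base {
    std::unique_ptr<Base> createSub() const override { return std::unique_ptr<Base>(new Word); }
    std::string getSubName() const override { return "word"; }
};
struct Block : Base {
    std::unique_ptr<Base> createSub() const override { return std::unique_ptr<Base>(new Line); }
    std::string getSubName() const override { return "line"; }
};
```

The only per-class code is the pair of virtual functions; the traversal lives once in Base, which is the point of the design.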

Sara