Roll Yer Own XML Output in C++

DanRollins
The beauty of XML is that it is simple "eyeball-readable" text.  Bring it up in any text editor or web browser, and you have an easy direct way to debug it and find problems.  Sure, it can be handy to use a complex object like MSXML or a third-party XML parsing utility when processing and interpreting the XML, but to generate XML output, you can use simple string-manipulation functions.

Some time ago, I needed to output some database records using a certain XML schema.  I spent time figuring out how to make SQL Server do it automatically, and I found some other tools... but as soon as I needed to do anything at all unusual, I found myself spending more time working around the tool quirks than doing the actual programming.  These monolithic programs and objects made it too complicated.  After all, for the most part, XML is very simple:

    <tagName>data goes here</tagName>
...or...
    <tagName attrName="value">data goes here</tagName>

In the "data goes here" spot, there can be additional <tagName>...</tagName> sequences.  What could be simpler?

It's just an issue of inserting textual data in between tags.  To my eyes, it looks like a perfect spot to use C++'s printf-style formatting specifications and its Variable Argument List capability.  For instance:

   printf("<LastName>%s</LastName>\r\n", rc.sNameLast );
   printf("<Age>%d</Age>\r\n", rc.nAge );
   printf("<Hair color=\"%s\" style=\"%s\"/>\r\n", rc.sHairColor, rc.sHairStyle );

The other thing that jumps out is that closing tags are the exact same as opening tags except that they start with a slash (/) character.  

The ATL/MFC CString object has a printf-like formatting feature.  So one could write a short function like:
CString XmlOut( LPCSTR sTag, LPCSTR sData )
{
    CString sOut;
    sOut.Format("<%s>%s</%s>\r\n", sTag, sData, sTag );  // tag twice!
    return( sOut );
}


...and call it like:
    m_sOut += XmlOut( "LastName", rc.sNameLast );


I actually ended up using C/C++'s variable argument list capability to write a general-purpose function that would accept a tag name, a tag value, and any number of attribute name/value pairs.  The details of that are described here:

     Working with Variable Argument Lists in C/C++

The source code that does that is included in this article, as well.  But here, I'm going to focus on other aspects of an XML-generating class.
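To give a feel for the variable-argument idea before getting to the class, here is a minimal sketch in portable C++.  It uses std::string in place of CString, and the name ElemWithAttrs is a hypothetical stand-in, not the function from the earlier article.  It assumes the caller passes exactly nAttrCnt name/value pairs and that every one of them is a const char*.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical portable stand-in for a variadic element builder.
// nAttrCnt must match the number of name/value pairs that follow,
// and every variadic argument must be a const char*.
std::string ElemWithAttrs(const char* tag, const char* value, int nAttrCnt, ...)
{
    std::string out = "<";
    out += tag;

    va_list args;
    va_start(args, nAttrCnt);
    for (int i = 0; i < nAttrCnt; ++i) {
        const char* name = va_arg(args, const char*);
        const char* val  = va_arg(args, const char*);
        if (val != nullptr && val[0] != '\0') {   // skip empty attributes
            out += " ";
            out += name;
            out += "=\"";
            out += val;                           // assumed to be XML-safe already
            out += "\"";
        }
    }
    va_end(args);                                 // always pair va_start with va_end

    if (value != nullptr && value[0] != '\0') {
        out += ">";
        out += value;
        out += "</";
        out += tag;
        out += ">";
    } else {
        out += "/>";                              // no value: self-closing tag
    }
    return out;
}
```

A call such as ElemWithAttrs("Product", "Widget", 2, "size", "large", "color", "blue") builds a Product element with two attributes.  The compiler cannot check the count or the types of the extra arguments, which is the main risk of this style.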

Some XML Generating Functions
The printf-like features are useful, but the workhorse methods are simpler.  You'll want your XML generator to provide simple means to create a complete Element in one call, and a way to start an Element tag, add some Attributes, and then close the tag:
 
   XmlOutElem( tagName, value );
or
   XmlOut( "<tag" );
   XmlOutAttrib( attrName, value );
    ...
   XmlOut( ">" );
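As a rough illustration of those two calling styles, here is a small sketch built on std::string.  The TinyXmlOut type and its member names are hypothetical and only mirror the calls above; the real class appears later in this article.  It assumes attribute values are already safe to place inside quotes.

```cpp
#include <string>

// Hypothetical minimal accumulator; not the CXmlGen class shown later.
struct TinyXmlOut {
    std::string xml;

    // Append raw text, such as "<tag" or ">"
    void Out(const std::string& text) { xml += text; }

    // Append  name="value"  with a leading space; value is assumed XML-safe
    void OutAttrib(const std::string& name, const std::string& value) {
        xml += " " + name + "=\"" + value + "\"";
    }

    // Append a complete element in one call
    void OutElem(const std::string& tag, const std::string& value) {
        xml += "<" + tag + ">" + value + "</" + tag + ">";
    }
};

// Usage: start a tag, add attributes, then close it
std::string BuildHairTag()
{
    TinyXmlOut x;
    x.Out("<Hair");
    x.OutAttrib("color", "brown");
    x.OutAttrib("style", "ponytail");
    x.Out("/>");
    return x.xml;
}
```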

After using the individual pieces like that for a while, I saw some repeated functionality and worked out a general-purpose mechanism for generating a largish output section all in one go.

CString XmlElem( LPCSTR pszElemName, LPCSTR pszData, BOOL fLineBreak/*=FALSE*/, BOOL fNothingIfBlank/*=TRUE*/ )
{
    CString sData= pszData;
    CString sRet;
    CString sEndTag= pszElemName;
    int nOffset= sEndTag.Find(" " );
    if ( nOffset != -1 ) sEndTag= sEndTag.Left( nOffset );

    sData.TrimLeft( " \r\n");
    sData.TrimRight(" \r\n");

    if ( sData == "" ) {
        if ( ! fNothingIfBlank ) {
            sRet.Format("<%s></%s>", pszElemName, (LPCSTR)sEndTag );
        }
    }
    else {
        sRet.Format("<%s>%s</%s>", pszElemName, (LPCSTR)sData, (LPCSTR)sEndTag );
        if ( fLineBreak ) {
            sRet.Insert(0,"\r\n");
        }
    }
    return sRet;
}



This type of function will let you use a sequence like:
CString sXML=
XmlElem( "Name",
    XmlElem( "FirstName",  cr.m_sNameFirst  )
   +XmlElem( "MiddleName", cr.m_sNameMiddle )
   +XmlElem( "LastName",   cr.m_sNameLast   )
   +XmlElem( "Generation", cr.m_sNameSuffix )
);


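...to output XML like this (indented here for readability; the blank MiddleName is skipped because fNothingIfBlank defaults to TRUE):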
    <Name>
        <FirstName>Dan</FirstName>
        <LastName>Rollins</LastName>
        <Generation>III</Generation>
    </Name>
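The end-tag handling in XmlElem() is easy to try outside of MFC.  Here is a minimal sketch using std::string; the name ElemSketch is hypothetical, it does not trim whitespace the way the CString version does, and it always behaves as if fNothingIfBlank were TRUE.

```cpp
#include <string>

// Hypothetical std::string version of the end-tag logic in XmlElem().
std::string ElemSketch(const std::string& elemName, const std::string& data)
{
    // The closing tag is the element name up to the first space, so an
    // opening tag that carries attributes still closes with the bare name.
    // find() returns npos when there is no space, and substr(0, npos)
    // then yields the whole string.
    std::string endTag = elemName.substr(0, elemName.find(' '));

    if (data.empty()) return "";   // blank data produces no element at all
    return "<" + elemName + ">" + data + "</" + endTag + ">";
}
```

Passing an element name such as "Product size=\"large\"" shows why the cut at the first space matters: the attributes appear only in the opening tag.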

For generating longer XML sequences, you will want to write functions that build whole nodes.  For instance,

CString sAllCustomers, sElemOneCustomer;
while ( ! cr.IsEOF() ) {
    sElemOneCustomer=
        GetElemCustName( cr )   // <Name>...</Name>
       +GetElemCustAddr( cr )   // <Address>...</Address>
       +GetElemCustPhone( cr )  // <Phone>...</Phone>
       +XmlElem( "Rating", cr.m_sRating )
    ;
    sAllCustomers += sElemOneCustomer;
    cr.MoveNext();
}
CString sXML= XmlElem( "Customers", sAllCustomers );



The XmlGen Object
Being a good object-oriented programmer, you will want to wrap it all up into a class object.  I found it convenient to have the object accumulate and manage its own output string.  Rather than having functions return a string to the caller, I decided to use a series of XmlOutXxxxx functions that append data to the current output, and a GetXml() function to access that output.  The following is far from a definitive tool; it's designed to give you some functional code and a starting point for your own XML-generating toolkit.

Here's the header file:
// XmlGen.h
//
#pragma once
const int  CNUM_nMaxLenTmpXML= 10000;

class CXmlGen
{
public:
    CXmlGen( );
    virtual ~CXmlGen();

    CString GetXml();
    void    SetXml( LPCSTR szTxt );

    void LineBreaks( BOOL fBreaks=TRUE ) { m_fAddLineBreaks= fBreaks; }
    void SkipBlanks( BOOL fSkip=TRUE   ) { m_fNothingIfBlank= fSkip; }

    void XmlOutTag( LPCSTR szElem );

    void XmlOutElem( LPCSTR sElem, LPCSTR sVal );
    void XmlOutElem( LPCSTR sElem, int    nVal );

    void XmlOutAttr( LPCSTR sAttr, LPCSTR sVal );
    void XmlOutAttr( LPCSTR sAttr, int    nVal );

    void XmlOutFormat(  LPCSTR sFmt,... );

    void XmlOut(       LPCSTR sTxt );
    void XmlOutLine(   LPCSTR sTxt );

    void ToFile( LPCSTR szFilename );

    // Functions that don't affect m_sXml accumulator
    static CString XmlFormat( LPCSTR sFmt,... );
    static CString XmlElemWithAttrs( LPCSTR szTagName, LPCSTR szTagVal, int nAttrCnt, ... );

    static CString XmlElem( LPCSTR sElem, LPCSTR sVal, BOOL fLineBreak=FALSE );
    static CString XmlElem( LPCSTR sElem, int    nVal, BOOL fLineBreak=FALSE );

    static CString Quoted( LPCSTR sTxt );
    static CString FixForXml( LPCSTR szText ) ;
    static CString NumToStr( int n );

private:
    CString      m_sXML;
    CString      m_sTmpXML;
    int          m_nLenTmpXML;
    BOOL         m_fNothingIfBlank;
    BOOL         m_fAddLineBreaks;
};


And here's the C++ code file:
// XmlGen.cpp
// implements the CXmlGen class
//
#include "stdafx.h"
#include <stdarg.h>   // va_list, va_start, va_end
#include "XmlGen.h"

CXmlGen::CXmlGen( ) {
    m_fNothingIfBlank= FALSE;
    m_fAddLineBreaks= FALSE;
    m_sXML= "";
    m_sTmpXML.Preallocate( CNUM_nMaxLenTmpXML ); // see notes
    m_sTmpXML="";
    m_nLenTmpXML= 0;
}
CXmlGen::~CXmlGen() {
}

//-------------------------------------------------------
// Use printf-like formatting string, but %q means "quoted"
//
void CXmlGen::XmlOutFormat( LPCSTR szFmt,... )
{
    CString sTxt;
    va_list args;  va_start(args, szFmt);

    CString sFmt= szFmt;
    sFmt.Replace("%q","\"%s\"" );

    sTxt.FormatV( (LPCSTR)sFmt, args);
    va_end(args);  // every va_start needs a matching va_end
    XmlOut( sTxt );
}
//--------------------------------------------------------
void CXmlGen::XmlOutLine( LPCSTR sTxt )
{
    CString sTmp= sTxt; sTmp+="\r\n";
    XmlOut( sTmp );
}
//--------------------------------------------------------
void CXmlGen::SetXml( LPCSTR szTxt )
{
    m_sXML= szTxt;
    m_sTmpXML= "" ;
    m_nLenTmpXML= 0;
}
//--------------------------------------------------------
CString CXmlGen::GetXml()
{
    m_sXML += m_sTmpXML;
    m_sTmpXML= "" ;
    m_nLenTmpXML= 0;
    return( m_sXML );
}
//--------------------------------------------------------
// Append some output; Manage the temporary string
void CXmlGen::XmlOut( LPCSTR sTxt )
{
    m_sTmpXML += sTxt;
    if ( m_fAddLineBreaks ) {
        m_sTmpXML += "\r\n";
    }
    m_nLenTmpXML= m_sTmpXML.GetLength();  // includes any added line break
    if ( m_nLenTmpXML > CNUM_nMaxLenTmpXML ) {
        m_sXML += m_sTmpXML;
        m_sTmpXML= "" ;
        m_nLenTmpXML= 0;
    }
}

//---------------- these output nothing if nVal is -1 or szVal is NULL or ""
//
void CXmlGen::XmlOutTag( LPCSTR szElem ) {   // output a Tag only
    XmlOutFormat( "<%s>", szElem );
}

void CXmlGen::XmlOutElem( LPCSTR szElem, LPCSTR szVal ) {
    if ( szVal != NULL && szVal[0] != 0 ) {
        XmlOutFormat( "<%s>%s</%s>", szElem, szVal, szElem );
    }
}
void CXmlGen::XmlOutElem( LPCSTR szElem, int nVal )     {
    if ( nVal != -1 ) {
        XmlOutElem( szElem, NumToStr( nVal ) );
    }
}
void CXmlGen::XmlOutAttr( LPCSTR szAttr, LPCSTR szVal ) {
    if ( szVal != NULL && szVal[0] != 0 ) {
        XmlOutFormat( " %s=\"%s\"", szAttr, (LPCSTR)FixForXml(szVal) );
    }
}
void CXmlGen::XmlOutAttr( LPCSTR szAttr, int nVal )     {
    if ( nVal != -1 ) {
        XmlOutAttr( szAttr, NumToStr( nVal ) );
    }
}

//--------------------------------------------------------
void CXmlGen::ToFile( LPCSTR szFilename )
{
    CFile cFile( szFilename, CFile::modeCreate | CFile::modeWrite  );
    CString sOut= GetXml();
    cFile.Write( sOut, sOut.GetLength() );
}


//--------------------------------------------------------
// static utility functions
//
CString CXmlGen::Quoted( LPCSTR sText )
{
    CString sRet= "\"";
    sRet += sText;
    sRet += "\"";
    return( sRet );
}

//--------------------------------------------------------
CString CXmlGen::FixForXml( LPCSTR szText )
{
    CString sRet= szText;

    sRet.TrimRight(); sRet.TrimLeft();  // strip leading and trailing spaces

    if ( sRet.FindOneOf("&><\"'") != -1 ) {
        sRet.Replace("&","&amp;");      // must come first
        sRet.Replace(">","&gt;");
        sRet.Replace("<","&lt;");
        sRet.Replace("\"","&quot;");
        sRet.Replace("'","&apos;");
    }
    return( sRet );
}

//--------------------------------------------------------
CString CXmlGen::NumToStr( int n ) {
    CString sRet;
    sRet.Format("%d", n);
    return( sRet );
}

CString CXmlGen::XmlFormat( LPCSTR szFmt,... )
{
    CString sTxt;
    va_list args;  va_start(args, szFmt);

    CString sFmt= szFmt;
    sFmt.Replace("%q","\"%s\"" );

    sTxt.FormatV( (LPCSTR)sFmt, args);
    va_end(args);  // every va_start needs a matching va_end
    return ( sTxt );
}

CString CXmlGen::XmlElem( LPCSTR szElem, LPCSTR szVal, BOOL fLineBreak/*=FALSE*/ ) {
    CString sRet="";
    if ( szVal != NULL && szVal[0] != 0 ) {
        sRet= XmlFormat( "<%s>%s</%s>", szElem, szVal, szElem );
        if (fLineBreak) sRet += "\r\n";
    }
    return( sRet );
}
CString CXmlGen::XmlElem( LPCSTR szElem, int nVal, BOOL fLineBreak/*=FALSE*/ )     {
    CString sRet="";
    if ( nVal != -1 ) {
        sRet= XmlFormat( "<%s>%d</%s>", szElem, nVal, szElem );
        if (fLineBreak) sRet += "\r\n";
    }
    return( sRet );
}
//--------------------------------------------------------
// variable args.  IMPORTANT: nAttrCnt must be accurate, and
// every variable argument must be an LPCSTR
//
CString CXmlGen::XmlElemWithAttrs( LPCSTR szTagName, LPCSTR szTagVal, int nAttrCnt, ... )
{
    va_list pVarArg;

    CString sRet, sClose;
    sRet.Format("<%s ", szTagName );
    sClose.Format("</%s>", szTagName );

    CString sAttrName, sAttrVal;
    va_start( pVarArg, nAttrCnt );
    for ( int j=0; j<nAttrCnt; j++ ) {
        try {
            sAttrName= va_arg( pVarArg, LPCSTR);
            sAttrVal=  va_arg( pVarArg, LPCSTR);
        }
        catch( ... ) {
            // va_arg does not validate its arguments, so this is
            // only a partial safety net
            // LogErr("bad args in XmlElemWithAttrs" );
            ASSERT(0); // catch during debug runs
            sAttrName= sAttrVal="";
        }
        if ( sAttrVal > "" ) {
            sRet += sAttrName + "=\"";
            sRet += FixForXml( sAttrVal ) + "\" ";  // escape, as XmlOutAttr() does
        }
    }
    va_end( pVarArg );
    if ( CString(szTagVal) > "" ) { // lazy check for NULL or ""
        sRet += ">";
        sRet += szTagVal;
        sRet += sClose; // e.g., "</Element>";
    }
    else {
        sRet += "/>";
    }
    return( sRet );
}


Some notes about this object:
The XmlOutXxxxx functions all output to the object's accumulation string.  You must use the accessor function, GetXml(), to access the accumulated data.
Before going into the accumulator, all of the data is output to a temporary string; the functions occasionally concatenate that string to the real accumulator variable.

The reason for this is something I've called The Problem with Large Strings in other articles.  XML output is likely to get very lengthy, and string concatenation can end up being a real performance killer.  Adding even a single character to a string might force the string's concatenation logic to allocate a new buffer, copy the original data, and then add the new data to the end.  When the buffer gets large -- say 100K -- that overhead can slow your program to a crawl.  

In the above code, the temporary string is pre-allocated with a certain length.  When appending new data pushes it past that length, the data is flushed to the final output string.  The end result is that the Large String only has to be reallocated and copied a few times.  This two-buffer technique is better than pre-allocating the final string in cases where the maximum length is not known and should be left open-ended.  We want to avoid allocating a huge buffer if it might not be needed.
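The two-buffer idea described above can be tried in isolation.  Here is a minimal sketch using std::string; the TwoBufferOut class and its member names are hypothetical, and the flush counter exists only to make the behavior visible.  Note that std::string implementations usually grow their capacity geometrically, so the gain may be smaller there; measure before relying on it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical two-buffer accumulator built on std::string.
class TwoBufferOut {
public:
    explicit TwoBufferOut(std::size_t flushAt) : m_flushAt(flushAt), m_flushes(0) {
        m_tmp.reserve(flushAt);          // pre-allocate the small working buffer
    }

    // Append to the small buffer; flush once it has grown past the limit
    void Append(const std::string& text) {
        m_tmp += text;
        if (m_tmp.size() > m_flushAt) Flush();
    }

    // Flush whatever is pending and return the complete output
    std::string Get() {
        Flush();
        return m_all;
    }

    std::size_t FlushCount() const { return m_flushes; }

private:
    void Flush() {
        if (m_tmp.empty()) return;
        m_all += m_tmp;                  // the only place the large string grows
        m_tmp.clear();                   // common implementations keep the capacity
        ++m_flushes;
    }

    std::string m_all;                   // the large, open-ended output
    std::string m_tmp;                   // the small working buffer
    std::size_t m_flushAt;
    std::size_t m_flushes;
};
```

With a limit of 10 characters, five appends of a 4-character string touch the large string only twice: once when the working buffer passes the limit, and once in Get().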
The XmlGen object provides a number of static member functions that can be used to create temporary strings which you can then either send to the accumulator by using XmlOut(), or use as a parameter with any of the other XmlOutXxxxx functions.
The XmlElemWithAttrs() function is from an earlier article; its use of the va_list type and the va_start() macro, which also appear in the XmlFormat() function, is described there.
There are two overloaded versions of XmlOutElem() and XmlOutAttr() so that you can pass either a string or an integer value to the function.  This is part of the reason to "roll yer own" -- so that when you need a new feature, you can add it in a few minutes.
Also, regarding XmlOutElem() and XmlOutAttr():  There is a peculiarity in the code that you may not want... In my database handling, I often use -1 for numeric values when I need to indicate "No Data."  This happens to be true for me and my data, but it is not the common way to handle that.  So, here's a "heads up" notice:  You should rewrite those functions if you want different handling.
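If -1 is a legitimate value in your data, one possible alternative is to make "No Data" explicit.  Here is a minimal sketch that assumes a C++17 compiler and uses std::optional with std::string; the name ElemOptInt is hypothetical and is not part of the class above.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: an empty optional means "No Data", so -1 is
// treated like any other number and is written out normally.
std::string ElemOptInt(const std::string& tag, std::optional<int> value)
{
    if (!value) return "";   // no data: output nothing
    return "<" + tag + ">" + std::to_string(*value) + "</" + tag + ">";
}
```

The caller passes std::nullopt for a missing value and an ordinary int otherwise.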

Here's some example code that mixes and matches some of the options.
CXmlGen cXml;
cXml.LineBreaks( TRUE );

cXml.XmlOutFormat("<?xml version=%q encoding=%q?>\r\n", "1.0", "UTF-8" );
cXml.XmlOut("<Root>" );

cXml.XmlOutElem( "PersonData",
     cXml.XmlElem( "FirstName", "Dan" )
    +cXml.XmlElem( "LastName",  "Rollins" )
    +cXml.XmlElem( "Children",  2 )
    +cXml.XmlElem( "HairStyle", "ponytail" )
);
CString sElemProduct1= cXml.XmlElemWithAttrs(
    "Product", "Widget", 4,
    "size",   (LPCSTR)"large",
    "color",  (LPCSTR)"blue",
    "data",   (LPCSTR)"",
    "rating", (LPCSTR)"7"
);
CString sElemProduct2= cXml.XmlElemWithAttrs(
    "Product", "Gizmo", 1,
    "size",   (LPCSTR)"small"
);
cXml.XmlOutElem( "Products", sElemProduct1+sElemProduct2 );
cXml.XmlOut("</Root>" );

cXml.ToFile( "c:\\temp\\junk.xml" );



The output of that is:
<?xml version="1.0" encoding="UTF-8"?>

<Root>
<PersonData><FirstName>Dan</FirstName><LastName>Rollins</LastName><Children>2</Children><HairStyle>ponytail</HairStyle></PersonData>
<Products><Product size="large" color="blue" rating="7" >Widget</Product><Product size="small" >Gizmo</Product></Products>
</Root>


Or, as shown in a web browser:
[Figure: the XML output as seen in a web browser]