Parsing hex values from an XML file

I am converting a software application from reading and parsing a standard ASCII text file containing operational parameters to an input file in XML format. One of the parameters needs to be a sequence of bytes. Each byte can have a value from 0x00 to 0xFF. So a sample line from the XML file might then be:
<value>0x34 0x3 0x08 0xf2 0xD7 0X39</value>
Note that I purposely presented variations in the format of how a user might populate this field. What is the best way to convert that string into an array of bytes with these values?

philrosenberg commented:
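You could simplify Sara's code by using showbase, I think:
   std::fstream fin;
   //code to open file and read to correct place
   unsigned int hexnumber;
   unsigned char hexbyte;
   // std::hex extraction already accepts an optional 0x/0X prefix;
   // std::showbase only affects output, so it is harmless here
   fin >> std::showbase >> std::hex >> hexnumber;
   // narrow to one byte (values above 0xFF would be truncated)
   hexbyte = static_cast<unsigned char>(hexnumber);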
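
The same idea can be wrapped in a loop that handles the whole string. This is only a minimal sketch: the function name `parseHexBytes` is made up for illustration, and it assumes the text between the `<value>` tags has already been extracted by the XML parser. It relies on `std::hex` extraction accepting the `0x` or `0X` prefix, and it stops at the first token that is not a hex number or is larger than 0xFF.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse whitespace-separated hex tokens into bytes.
// Parsing stops at the first token that is not a hex number or exceeds 0xFF.
std::vector<std::uint8_t> parseHexBytes(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::uint8_t> bytes;
    unsigned int value = 0;
    // operator>> skips spaces, tabs and newlines between tokens
    while (in >> std::hex >> value) {
        if (value > 0xFF)
            break;  // does not fit in one byte
        bytes.push_back(static_cast<std::uint8_t>(value));
    }
    return bytes;
}
```

Calling `parseHexBytes("0x34 0x3 0x08 0xf2 0xD7 0X39")` should yield six bytes. Note that this lenient version also accepts tokens without a prefix, such as `34`.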


Pierre François (Senior consultant) commented:
No problem. In which language do you want to code?
Dave Baldwin (Fixer of Problems) commented:
Actually, there will be problems because a number of characters are not legal in XML. Only a few control characters are allowed, and markup characters such as '<' and '&' must be escaped to be included in an XML document.

XML is supposed to be a text format. The standard lets you write characters as numeric character references such as &#x41;. There is nothing, however, about a real 'byte array'.

bbrms (Author) commented:
I apologize. I forgot to mention the language. I am coding in C++.  
Per Dave's comment: please note that my example demonstrates that I am inputting a human-readable ASCII string that contains only digits, spaces, the 'x' of each prefix, and the letters A through F in any combination of lower and upper case. This should pose no problem for XML. The application will process the string and convert the text to an array of bytes corresponding to the string. I can code up my own solution using sscanf(), but I am hoping that there is a solution that is more elegant than that and less time-consuming to code. The string can have any number of hexadecimal values designated (i.e. the number of values can be from zero to N, where N is probably limited to a few hundred at the most, and that is a fringe case).
Dave Baldwin (Fixer of Problems) commented:
I misunderstood.  I thought you were trying to put the byte array into the XML.
Michel Plungjan (IT Expert) commented:
bbrms (Author) commented:
mplungjan provided a Google search for C++ string representation of hex. If you read my two posts under this topic carefully, you will see that I already have a string of hex characters being parsed by my XML parser. When I posted this originally, I had hoped that some expert user of XML would know of a routine, perhaps from some open source utility, that would process the text line as I described it and generate something like a vector<uint8_t> container of translated values from that string, without being sensitive to the case of the 'x' character or the number of spaces or tabs, and otherwise be very tolerant in its range of acceptable syntax for the input string. Based on the posts I am getting, I don't believe what I was looking for exists, so I have coded up my own solution.
Michel Plungjan (IT Expert) commented:
I asked because the top links gave examples of how to convert strings back to binary.
I am not at all a C programmer, but since the question had not been commented on at all, I hoped I could at least get you started.
Feel free to post what you ended up with and accept your own comment as answer.
sarabande commented:
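The code below should do it:

 // needs <sstream>, <string>, <vector> and <cctype>
 std::string s = "0x34 0x3 0x08 0xf2 0xD7 0X39";
 std::istringstream istmp(s);
 std::vector<unsigned char> v;
 unsigned int b;
 char c1, c2;
 // read the "0x" prefix as two characters, then the hex digits
 while (istmp >> c1 >> c2 >> std::hex >> b)
 {
     // stop at the first token that lacks the prefix or exceeds one byte
     if (c1 != '0' || toupper(c2) != 'X' || b >= 256)
         break;
     v.push_back((unsigned char)b);
 }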
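
If malformed input should be rejected rather than silently cutting the result short, each whitespace-separated token can be validated on its own. The following is a sketch only, not code from anyone in this thread: the function name `parseHexBytesStrict` and the rule that a token is `0x` or `0X` followed by one or two hex digits are assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical strict parser: every token must be "0x" or "0X" followed by
// one or two hex digits. Returns false at the first malformed token; in that
// case `out` holds only the bytes parsed before the error.
bool parseHexBytesStrict(const std::string& text, std::vector<std::uint8_t>& out)
{
    out.clear();
    std::istringstream in(text);
    std::string token;
    while (in >> token) {  // operator>> skips spaces, tabs and newlines
        if (token.size() < 3 || token.size() > 4)
            return false;  // prefix plus one or two digits expected
        if (token[0] != '0' || (token[1] != 'x' && token[1] != 'X'))
            return false;  // missing 0x / 0X prefix
        unsigned int value = 0;
        for (std::size_t i = 2; i < token.size(); ++i) {
            const unsigned char ch = static_cast<unsigned char>(token[i]);
            if (!std::isxdigit(ch))
                return false;  // not a hex digit
            const int digit = std::isdigit(ch) ? ch - '0'
                                               : std::toupper(ch) - 'A' + 10;
            value = value * 16 + static_cast<unsigned int>(digit);
        }
        out.push_back(static_cast<std::uint8_t>(value));
    }
    return true;
}
```

With this version an empty string gives an empty vector and a `true` result, while input such as `0x123` or a bare `34` is reported as an error.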

