Solved

Weird string value assignment problem

Posted on 2004-09-15
32
393 Views
Last Modified: 2012-06-22
I am writing a small class to handle my *.ini file functions. I need to to be cross platform, so I'm not using the built in Windows functions.

Anyway, I am having a strange problem in part of the code in regards to string assignment. Sometimes it is working, sometimes not. I will try to explain in more detail here.

Below is a snippit from the function that edits an existing  key/value in a given section. The program loops through the given *.ini file, and when it finds a line that starts with a "[", it assigns that value to CurrentSection. If the line contains an equals sign (=), then it assigns everything before the equals sign to CurrentKey (i.e MyKey=MyValue).

(Just FYI, UCase is my own function that trims the extra spaces from the sides of the string, and makes it all uppercase)
====================================
while (!inFile.eof()) // Read until the end of the file
{
  inFile.getline(buffer,255);  // Get one line of data
  string s = buffer;  // Convert the data to a string to make life easier
  UCase(s);  //Transform to UCase for case insensitive compare

  if (s[0]=='[')  // If the first char is a [
  {
    CurrentSection = s;  // Then set the CurrentSection flag  **** HERE IS THE PROBLEM
    UCase(CurrentSection);
  }

  if (s.find("=") != string::npos)  // If there's an = sign in this line
  {
    CurrentKey = s.substr(0,s.find_first_of('='));  //Then set the CurrentKey flag   **** AND ALSO HERE
    UCase(CurrentKey);
  }

  // If the CurrentSection and CurrentKey are the ones we are looking for then, then write out the new value
  if ((CurrentSection.compare(SectionName)==0)&&(CurrentKey.compare(KeyName)==0))
  {
    outFile << KeyName + "=" + Value << endl;
  } else {  // otherwise, just write out the line as we read it
    outFile << s << endl;
  }
}

This code works just fine, until I get to a section or a key/value on any sections or keys that I created in Notepad. However, when it gets to a section or key/value that was written by the program, then the assignment doesn't work!

I setup a watch on the variables s,CurrentSection and CurrentKey - sure enough, I can see that s has a value, but after it runs s=CurrentSection, CurrentSection is blank (only on lines that were created by the program).

I assume this might have something to do with the way the strings are terminated when I write them to the file - maybe I am doing something wrong? But I am very surprised that the line with the .substr() value is not assigning a value either.

Any ideas? I'll be happy to send a copy of the project and test file if needed.
 
0
Comment
Question by:toddhd
  • 12
  • 7
  • 6
  • +2
32 Comments
 
LVL 30

Expert Comment

by:Axter
Comment Utility
inFile.getline(buffer,255);  // Get one line of data

The above code does not read just one line of data.
Instead, it's reading the entire text file.
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
oops!!!
Disregard previous comment.

I recomend you change that code to the following:

string s;
std::getline(inFile, s);

This is much more efficient, and it doesn't matter if the target line has more then 255 characters.
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
Try opening the file after it has been modified by your program.
Check that you have each line ending with correct catridge return and next line characters set.
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
Axter,

Nope, it's reading one line. If nothing else, I can watch its value in Debug mode, and see that only one line is being read. The getline written this way will read up to 255 characters, or until it hits a '\n', whichever comes first. The third parameter is optional, and defaults to 'n'.

See this page for details:
http://www.cplusplus.com/ref/iostream/istream/getline.html
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
>>The third parameter is optional, and defaults to 'n'.

Actually, the default is '\n'
I misread your code, and I thought it was using the global std::getline function, which has a last argument specifying the terminator character.
When reading data from a text file, should use the global std::getline function because it can be used with the std::string type directly.
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
Sorry, just saw your retraction about the getline - I'll try the other method you suggested as well.

Yes, I do open file after each run. I've noticed now, after more tests, that is isn't just the lines generated by the program, but whatever the last section and key/value are that are getting screwed up somehow. I'll run some more tests and report what I find.
0
 
LVL 11

Expert Comment

by:bcladd
Comment Utility
(1) eof is true after reading past the end of the file. Thus it is typical to read before the loop and read again at the end of the loop. Or to use the more standard C++ idiom:
  std::string s;
  while(std::getline(inFile, s)) {
    // process string s
  }

The way your code is structured it does the body of the while loop after the getline reads PAST the end of the file. As s is left unchanged by the call, you actually process the last line twice.

(2) You will want to handle blank lines because if UCase(s) is empty, s[0] is an access violation.

(3) In
      CurrentKey = s.substr(0,s.find_first_of('='));  //Then set the CurrentKey flag   **** AND ALSO HERE
why use find_first_of? You already used find and it seems to make more sense.

Hope this helps,
-bcl
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
Well, some things I noticed:

First, after switching to the format you reccomended:

string s;
std::getline(inFile, s);

The way the program read the file changed. Using MY code, hitting a blank line showed in the debugger as:

s={""}      std::basic_string<char,std::char_traits<char>,std::allocator<char> >


Using YOUR code, it shows:

s={"( "}      std::basic_string<char,std::char_traits<char>,std::allocator<char> >

I have no idea what that means. Also, using your code, some lines that *aren't* blank are also showing up as that strange symbol. Maybe I'll see if I can dig up a hex editor and look more closely at that text file?
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
bcladd - Thanks for the input - you were right about it reading past the end of the file, which was causing another problem in the code. Your other suggestions were helpful as well.
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
Ugh, just some more info - Ok, here is the *.ini file I am reading:

[BOOT LOADER]
TESTKEY=TESTVALUE
TIMEOUT=100
DEFAULT=MULTI(0)DISK(0)RDISK(0)PARTITION(2)\WINDOWS
[OPERATING SYSTEMS]
TESTME=100
MULTI(0)DISK(0)RDISK(0)PARTITION(2)\WINDOWS="MICROSOFT WINDOWS XP PROFESSIONAL" /FASTDETECT
[TODD DAVIS]
KATJA=DAVIS

The first and second lines read into s just fine. When it gets to TIMEOUT, s="" according to the debugger. However, CurrentKey=s shows that CurrentKey = "TIMEOUT"!

The next two lines again read and assign properly. When it gets to TESTME, s= <the strange little symbol> , but again, CurrentKey gets the correct value!

Finally, when it gets to the [TODD DAVIS] and KATJA=DAVIS lines, everything shows up as "".

One more thing - in my post above (and in this one), the little symbol is showing up as a left paren "(", but the symbol I'm talking about looks something like the symbol for a "male" - the circle with an arrow coming out the top of it.
0
 
LVL 11

Expert Comment

by:bcladd
Comment Utility
(1) On my browser I read the odd character as character 11. I don't know if any magic munging went on between your paste and my read. Doesn't help as much as I would have liked to know that but it seems to be the case.

(2) Everything shows up as "" --- meaning that CurrentKey is not set correctly?

(3) Could you repost your code? That is, the modified version that you are now running?

-bcl
0
 
LVL 17

Expert Comment

by:rstaveley
Comment Utility
Can you check your .ini file with a binary/hex editor and see if TIMEOUT=100 has characters other than 0x30 0x30 0x30 for the "100"?
0
 
LVL 20

Expert Comment

by:ikework
Comment Utility
if you need a cross platform-solution for handling *.ini's, i would suggest you to look at the
mysql-source. they have excactly this stuff in their code, and some other good things, like
an easy-to-use commandline-handler, it's for almost each platform and under GPL, have a look
at it, it worths, at least you can copy and paste a lot... ;)

im using it in my crossplatform project too, and im happy with it ...


hope it helps
maik
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
1) I posted an image of the little symbol on the web. To be clear, it is a left paren, then the symbol:
http://www.seaburydesign.com/arrow.gif

2) Yes, that is correct - when it gets to [TODD DAVIS] and beyond, it acts as though a line is read, but s="", and CurrentSection/CurrentKey = "" as well.

3) My current code (PS - I tried remming out the UCase() function just in case, but it didn't make a difference):

bool CIniFile::SetValue(string KeyName, string Value, string SectionName, string FileName)
{
      string CurrentSection, CurrentKey;                                    // Flag to track which section/key we're in

      UCase(SectionName); UCase(KeyName); UCase(Value);
      
      if (!CIniFile::SectionExists(SectionName,FileName))            // Does the Section already exist?
      {                                                                                    
            AppendValue(KeyName,Value,SectionName,FileName);      // No, it doesn't, so we just have to append it
            return true;                                                            // And we're done
      }

      if (!CIniFile::KeyExists (KeyName,SectionName,FileName))// Does the key already exist?
      {
            InsertValue(KeyName,Value,SectionName,FileName);      // No, so just locate the section and add it
            return true;                                                            // And we're done
      }

      Bracket(SectionName);                                                       // Format the section name for our use

      ifstream inFile (FileName.c_str());                                    // Create an input filestream
      string OutFileName = FileName + ".tmp";                              // Name the temporary file
      ofstream outFile(OutFileName.c_str());                              // Create an output/truncate filestream

      if (!inFile.is_open()) return false;                              // If the input file doesn't open, then return
      if (!outFile.is_open()) return false;                              // If the output file doesn't open, then return

      string s;
      while(std::getline(inFile, s))                                            // Read until the end of the file
      {
            //if (!s.empty()) UCase(s);                                          //Transform to UCase for case insensitive compare

            if (s[0]=='[')                                                             // If the first char is a [
            {
                  CurrentSection = s;                                                // Then set the CurrentSection flag
                  UCase(CurrentSection);
            }

            if (s.find("=") != string::npos)                              // If there's an = sign in this line
            {
                  CurrentKey = s.substr(0,s.find('='));                  // Then set the CurrentKey flag
                  UCase(CurrentKey);
            }

            // If this is the Section and Key we are looking for, then output the key with the new value,
            // Otherwise, just output the Key/Value verbatim
            if ((CurrentSection.compare(SectionName)==0)&&(CurrentKey.compare(KeyName)==0))
            {
                  outFile << KeyName + "=" + Value << endl;
            } else {
                  outFile << s << endl;
            }
      }
      inFile.close();
      outFile.close();

      if(remove(FileName.c_str()) != -1) rename(OutFileName.c_str(), FileName.c_str());

      return false;
}
0
 
LVL 20

Expert Comment

by:ikework
Comment Utility
to prevent misunderstanding:
"at least you can copy and paste a lot" -> for sure only under the terms of the GPL !!!!!!!!
0
 
LVL 20

Expert Comment

by:ikework
Comment Utility
here you can download it:

 http://dev.mysql.com/downloads/
0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 
LVL 11

Expert Comment

by:bcladd
Comment Utility
Also, the .ini file, as shown, has no blank lines in it. Is that accurate? The reason I ask is you are still going to have problems with blank lines.

You probaly want something like
  while (std::getline(inFile, s)) {
    UCase(s);
    if (!s.empty()) {
      // the body of your code (down to the last if...
      if ((CurrentSection.compare(SectionName)==0)&&(CurrentKey.compare(KeyName)==0))
      {
           s = KeyName + "=" + Value;
 
    }
    outFile << s << endl;
  } // end of the while not eof loop

-bcl
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
To rstavely -

It has "31 30 30" (which is what I assume you meant).
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
To ikework - Thanks, I'll look for that code!
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
Just to make easier, I zipped up the VS.NET project and posted it
http://www.seaburydesign.com/testinifile.zip

It was written in VS.NET 2003, pretty much just a class and test file. The file c.txt is in there too, it's the INI file I'm testing (ok, so its a text file, but you know what I mean).

If you start a trace at the "while" statement, and watch the variables s,CurrentSection and CurrentKey, you'll see the weirdness.
0
 
LVL 11

Expert Comment

by:bcladd
Comment Utility
Just looking over the code:

There is no need to UCase CurrentKey (s has been captitalized by the time you get there).
Suggestions for debugging:
Print the value in s each time through the loop. I know, you're looking with the debugger. I like double checking the debugger sometimes. I also like the old cout << style of debugging because I can usually analyze the output using a text processor if necessary.

Print out the length of s. This will tell you if there are any non-printing characters in the string.

What compiler are you using? MS VS? Or one of Borland's products?

-bcl
0
 
LVL 11

Expert Comment

by:bcladd
Comment Utility
You are breakiing before s is read in.

More in a moment.

-bcl
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
I'm using MS VS 2003. Maybe this weekend I'll load it up into Linux and compile it there to see what happens. You never know with MS sometimes...
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
One of the problems I had before was that "Katja=Davis was repeated many times after running the program. That was solved by your suggestion regarding the EOF() handling. Now Katja=Davis only shows up twice, which is still once more than needed.

But here is the weird thing. Move the [Todd Davis] section to the top, like this:
[TODD DAVIS]
KATJA=DAVIS
[BOOT LOADER]
TESTKEY=TESTVALUE
TIMEOUT=100
DEFAULT=MULTI(0)DISK(0)RDISK(0)PARTITION(2)\WINDOWS
[OPERATING SYSTEMS]
TESTME=100
MULTI(0)DISK(0)RDISK(0)PARTITION(2)\WINDOWS="MICROSOFT WINDOWS XP PROFESSIONAL" /FASTDETECT

Now when you run it, it works fine???
0
 
LVL 11

Expert Comment

by:bcladd
Comment Utility
The comment on your break point: You break on the while loop which is before the getline executes. Thus you are seeing (or can be seeing) garbage in the value of s.

It appears that the program works for me with the following loop:

      while(std::getline(inFile, s))                                            // Read until the end of the file
      {
            cout << "s = \"" << s << "\": " << s.size() << endl;
            if (!s.empty()) {
                  UCase(s);                                          //Transform to UCase for case insensitive compare

                  if (s[0]=='[')                                                             // If the first char is a [
                  {
                        CurrentSection = s;                                                // Then set the CurrentSection flag
                        UCase(CurrentSection);
                  }

                  else
                  {
                        if (s.find("=") != string::npos)                              // If there's an = sign in this line
                        {
                              CurrentKey = s.substr(0,s.find('='));                  // Then set the CurrentKey flag
                              UCase(CurrentKey);
                        }

                        // If this is the Section and Key we are looking for, then output the key with the new value,
                        // Otherwise, just output the Key/Value verbatim
                        if ((CurrentSection.compare(SectionName)==0)&&(CurrentKey.compare(KeyName)==0))
                        {
                              s =  KeyName + "=" + Value;
                        }
                  }
            }
            outFile << s << endl;
      }

The problem is the extra blank line at the end of the .ini file.

-bcl
0
 
LVL 17

Expert Comment

by:rstaveley
Comment Utility
How very odd. I think I'm seeing what you see, toddhd.

On the first getline, s._Bx._Buf gets "BOOT LOADER" and thus s == "BOOT LOADER"
On the second getline, s._Bx._Buf gets gibber (but you see s._Bx._Ptr "TESTKEY=TESTVALUE")

It looks like getline isn't doing what it is supposed to do with the std::string implementation, but that's just too weird. I'll look at this again tomorrow with a clear head. It's not obvious what's going on.
0
 
LVL 17

Expert Comment

by:rstaveley
Comment Utility
If you change it to...
--------8<--------
      string s;
      char buf[1000];
      //while(std::getline(inFile, s))                                            // Read until the end of the file
      while(inFile.getline(buf,1000))                                            // Read until the end of the file
      {
            /*string*/ s = buf;
--------8<--------
...it still gets clobbered.

If you change it to...
--------8<--------
      //string s;
      char buf[1000];
      //while(std::getline(inFile, s))                                            // Read until the end of the file
      while(inFile.getline(buf,1000))                                            // Read until the end of the file
      {
            string s = buf;
--------8<--------
...it works.

It looks like s gets clobbered somewhere in the loop, but I can't see where. Once it has been clobbered it cannot have new values assigned to it?!
0
 
LVL 17

Expert Comment

by:rstaveley
Comment Utility
...I really will leave this until tomorrow, unless someone else spots the problem first.
0
 
LVL 11

Assisted Solution

by:bcladd
bcladd earned 50 total points
Comment Utility
Using the microsoft implementation of string, there is a built-in 16 character buffer (for making short strings easy to use) and there is a pointer at a dynamically allocated buffer (_Buf and _Ptr respectively in the _Bx field of the string). Once the string has gone past 16 characters, _Buf is filled with some values (it looks like that is where the arrow character comes from). If you open up the fields inside the string and watch them function, you can watch the first few SHORT strings get stored inside _Buf and then, when the length has to get bigger, they are stored in _Ptr only.

The other thing to keep in mind is that you want to examine the value in the string AFTER the call to getline.

The value in the string is not being clobbered; instead you have to look deeper into the structure of string to interpret what you are seeing.

-bcl
0
 
LVL 17

Accepted Solution

by:
rstaveley earned 75 total points
Comment Utility
Thanks for that bcl. That makes sense and with hindsight I vaguely remember reading about this approach in Sutter/Meyers.

It *looks* like things are going awry in the debugger VS .NET 2003 in toddhd's project, when the string is allocated something shorter than 17 characters, when the previous string was 17 characters or more.

Here's what we see in the debugger:
Line 1:s = "[BOOTLOADER]",_mysize=13,_myres=15,_Buf="[BOOTLOADER]",_Ptr=gibber
Line 2:s="TESTKEY=TESTVALUE",_mysize=17,_myres=31,_Buf=gibber,_Ptr="TESTKEY=TESTVALUE"
Line 3:s="{???}",_mysize=11,_myres=31,_Buf=gibber,_Ptr="TIMEOUT=100"

However, I think what's happening is that the debugger gets confused when _myres>16 and _mysize<=11. I'll bet that rather than looking at _myres to figure out if the string is in _Ptr or _Buf, it is looking at _mysize and that's what's causing the confusion.

toddhd, methinks we've both been tricked by the VS .NET 2003 debugger.
0
 
LVL 17

Expert Comment

by:rstaveley
Comment Utility
You can reproduce this bug quite simply with the following code.

--------8<--------
int _tmain(int argc, _TCHAR* argv[])
{
      std::string str;
      char *str_list[] = {
            "0123456789"            // <= 16 characters long
            ,"0123456789ABCDEFGHIJ"      // > 16 characters long
            ,"0123456789"            // <= 16 characters long
            ,0};
      char **p = str_list;
      while (*p) {
            str = *p++;
            ;      // <- Put the breakpoint here
                  // Watch str, str._Bx._Buf, str._Bx._Ptr,
                  // str._MySize and str._Myres in the watch window
                  // in the debugger
                  // See what happens after the 3rd assignment
                  // where _Mysize <= 16 and _Myres > 16
      }
      return 0;
}
--------8<--------

I'll report this bug to MS - for what it's worth
0
 
LVL 8

Author Comment

by:toddhd
Comment Utility
Great work guys - I eventually ended up trashing this code and going with a model that reads everything into memory and does its work there. Still working on that. But at least now I know not to trust the debugger in this case - I'll have to stick to cout and other methods to debug.
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Unlike C#, C++ doesn't have native support for sealing classes (so they cannot be sub-classed). At the cost of a virtual base class pointer it is possible to implement a pseudo sealing mechanism The trick is to virtually inherit from a base class…
Article by: SunnyDark
This article's goal is to present you with an easy to use XML wrapper for C++ and also present some interesting techniques that you might use with MS C++. The reason I built this class is to ease the pain of using XML files with C++, since there is…
The viewer will learn additional member functions of the vector class. Specifically, the capacity and swap member functions will be introduced.
The viewer will be introduced to the member functions push_back and pop_back of the vector class. The video will teach the difference between the two as well as how to use each one along with its functionality.

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now