Solved

Method for replacing string in a file

Posted on 2002-06-30
Last Modified: 2008-02-01
I'm sure someone has done this before.
I'm looking for a function or method to replace a keyword in a file.

Does anyone have some source code that can do this?

I need a function that can replace the strings even if they're different sizes.

Like
replace("Hello", "Goodbye!")
replace("!", "!!")
replace("Good", "God")
replace("God", "")




Question by:hestercybil
41 Comments
 
LVL 4

Expert Comment

by:mblat
Comment Utility
1. Load file into memory.
2. Search for each occurrence of the string.
3. Replace it.
4. Write it back out.


That seems to be the easiest way to do it.


Hope it helps....
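
A minimal sketch of the four steps above, assuming a narrow-character file small enough to hold in memory. The names replace_all and replace_in_file are hypothetical and do not come from this thread; std::string::replace takes care of search and replacement strings of different lengths.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Replace every occurrence of `from` in `text` with `to`; returns the number of replacements.
// (Hypothetical helper name, for illustration only.)
std::size_t replace_all(std::string& text, const std::string& from, const std::string& to)
{
    if (from.empty()) return 0;               // an empty pattern would match forever
    std::size_t count = 0;
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);   // grows or shrinks the string as needed
        pos += to.size();                     // resume after the text just written
        ++count;
    }
    return count;
}

// Steps 1-4: load the file, search, replace, write it back. Returns false on I/O failure.
bool replace_in_file(const std::string& path, const std::string& from, const std::string& to)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    std::string text((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
    in.close();

    replace_all(text, from, to);

    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << text;
    out.flush();
    return out.good();
}
```

Resuming the search after the replacement (rather than at the match position) keeps a call such as replace_all(s, "!", "!!") from looping on its own output.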
0
 

Author Comment

by:hestercybil
Comment Utility
Thanks mblat, but I'm looking for code.
I've already tried creating my own replace method, and the problem turns out to be more complicated than it looks.

If it's easy, then can you please provide the code?
0
 

Author Comment

by:hestercybil
Comment Utility
Loading the file into memory and finding the string is easy enough, but the replace method gets tricky if the strings/keywords are not the same size.
0
 

Expert Comment

by:Hill8982
Comment Utility
Is this for an MFC project?
0
 

Author Comment

by:hestercybil
Comment Utility
It's non-MFC, and it needs to work in both Windows and Unix.

I'll also add an extra 100 points if the function can work on Wide-String char files.
0
 

Expert Comment

by:CornDog932
Comment Utility
>>I've already tried creating my own replace method, and
>>the problem turns out to be more complicated than it
>>looks.

Can you post the code you have?
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
You can try the following function:

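#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <functional>

// Replaces every occurrence of OldStr in SrcString with NewStr.
template<typename T>
void ReplaceArray(T &SrcString, const T &OldStr, const T &NewStr)
{
     if (OldStr.empty()) return; // an empty search string would match forever
     typename T::iterator i =  SrcString.begin();
     size_t SizeNewStr = NewStr.size();
     size_t SizeOldStr = OldStr.size();
     size_t SizeDiff = (SizeNewStr>SizeOldStr)?SizeNewStr-SizeOldStr:SizeOldStr-SizeNewStr;
     while((i=std::search(i,SrcString.end(),OldStr.begin(),OldStr.end())) != SrcString.end())
     {
          // Keep an offset, because insert() may reallocate and invalidate "i"
          size_t OffsetPtr = i-SrcString.begin();
          if (SizeDiff)
          {
               if (SizeNewStr > SizeOldStr)
               {
                    // Grow the string by SizeDiff characters; they are overwritten below
                    SrcString.insert(SrcString.begin()+OffsetPtr, NewStr.begin(), NewStr.begin()+SizeDiff);
               }
               else
               {
                    // Shrink the string by SizeDiff characters
                    SrcString.erase(SrcString.begin()+OffsetPtr, SrcString.begin()+OffsetPtr+SizeDiff);
               }
          }
          // The match now occupies exactly SizeNewStr characters; overwrite them
          if (SizeNewStr) SrcString.replace(SrcString.begin()+OffsetPtr, SrcString.begin()+OffsetPtr+SizeNewStr, NewStr.begin(), NewStr.end());
          i = SrcString.begin()+ OffsetPtr+ SizeNewStr;
     }
}

// Returns false if the file could not be opened.
bool ReplaceStrInFile(const std::string &FileName, const std::string &OldStr, const std::string &NewStr)
{
     std::ifstream fin(FileName.c_str(),std::ios::in);
     if (!fin) return false;
     std::string FileData;
     // Reads up to the first '\255' character (octal, i.e. byte 0xAD) or end of file;
     // a file that contains that byte is cut off at that point.
     std::getline(fin, FileData, fin.widen('\255'));
     fin.close();
     ReplaceArray(FileData, OldStr, NewStr);
     std::ofstream fout(FileName.c_str(),std::ios::out);
     if (!fout) return false;
     fout << FileData;
     fout.close();
     return true;
}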
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
Example usage:
ReplaceStrInFile("c:\\test.txt", "Hello", "Goodbye!");
0
 

Author Comment

by:hestercybil
Comment Utility
Axter, your code seems to work pretty well, but I have a few questions.
I was wondering about this line in your code:
SrcString.insert(SrcString.begin()+OffsetPtr, NewStr.begin(), NewStr.begin()+SizeDiff);

Why are you inserting part of the NewStr there?

Also, I notice instead of using "i", you're using an OffsetPtr variable.

I tried changing your code to use "i" instead of OffsetPtr, but it crashes.
Why does it work with OffsetPtr, but it doesn't work with "i"?

And my last question: how can I get your code to work with wide character files?
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
>>Why are you inserting part of the NewStr there?
It doesn't really matter what I insert in that line of the code.  The only purpose for that line is to expand the string.

>>Also, I notice instead of using "i", you're using an
>>OffsetPtr variable.

"i" is not guaranteed to remain valid after an insert.
If the current string buffer is not big enough to complete the insertion, a new buffer has to be created.
If a new buffer is created, the "i" iterator becomes invalid, because it still points into the old buffer.

Because the OffsetPtr variable is the offset point from the start of the string, it doesn't matter if a new buffer is created.  The offset is the same in the newly created buffer as it was in the old buffer.
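
To illustrate the point about offsets, here is a small sketch. The helper name insert_before is hypothetical and only exists for this example; it remembers the match position as an offset, performs an insert large enough to force a new buffer, and then builds a fresh iterator from begin().

```cpp
#include <string>

// Inserts `extra` in front of the first `marker` in `s` and returns the index
// at which the marker starts afterwards (npos if the marker is not found).
// (Hypothetical helper, for illustration only.)
std::string::size_type insert_before(std::string& s,
                                     const std::string& marker,
                                     const std::string& extra)
{
    const std::string::size_type offset = s.find(marker);
    if (offset == std::string::npos) return offset;

    // An iterator such as s.begin() + offset taken HERE may dangle after the
    // insert below, because the string may move to a new buffer.
    s.insert(offset, extra);

    // The offset is still meaningful; derive a fresh iterator from it.
    std::string::iterator it = s.begin() + offset + extra.size();
    return static_cast<std::string::size_type>(it - s.begin());
}
```

The same reasoning applies to erase() and replace(): any operation that can change the string's size may invalidate iterators, while an index stays usable.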

>>And my last question, is how can I get your code to work
>>with wide character files?
Hmmm.  I'll have to get back to you on that.
0
 
LVL 9

Expert Comment

by:miron
Comment Utility
my teacher taught me to read the file from and and use reverse function, so that the final write can be done using "a+"
0
 
LVL 9

Expert Comment

by:miron
Comment Utility
arghh, typo, not and, end with 'E'
0
 

Author Comment

by:hestercybil
Comment Utility
>>my teacher taught me to read the file from and and use
>>reverse function, so that the final write can be done
>>using "a+"

I have no idea what you're trying to say here....

Anyway, I'm really looking for code, not a description of a method.
0
 

Author Comment

by:hestercybil
Comment Utility
Axter,
Thanks for the explanation.
If no one else comes up with a better method, I'll award you the points.
0
 


Expert Comment

by:ricarditopicaron
Comment Utility
/*Better tabulation*/
#define FALSE 0
#define TRUE 1
#include<ctype.h>
#include<stdio.h>

/*Copies sourcelength characters from source to dest; no terminator is added*/
void CopyString(char *dest, const char *source, long sourcelength)
{
     long count;

     for(count=0; count<sourcelength; ++count)
          *(dest+count)=*(source+count);
}

/*Returns the position of the last occurrence of string1 in string2, or -1*/
long FindStringInString(const char *string1, long len1, const char *string2, long len2)
{
/*String1 must not be longer than string2*/
     long cont, cont2;

     if(len1<=0)
          return -1;
     for(cont2=len2-len1; cont2>=0; --cont2)
     {
          for(cont=0; cont<len1 && string2[cont2+cont]==string1[cont]; ++cont);
          if(cont==len1)
               return cont2;
     }
     return -1;
}

long StrSize(const char *string)
{
     long cont;  /*It returns the '\0' position*/

     for(cont=0L; *(string+cont)!='\0'; ++cont);
     return cont;
}

/*Returns a new[]-allocated, '\0'-terminated copy of the file, or NULL on error*/
char *FileToMemory(const char *file)
{
     FILE *fpointer;
     long cont=0;
     char *memory, *mem2;
     if((fpointer=fopen(file, "rb"))==NULL)
          return NULL;
     memory=new char[2];
     while( fread((void *)(memory+cont),1,1,fpointer))
     {
          /*Grow the buffer by one byte for the next read (or the final '\0')*/
          mem2=new char[cont+2];
          CopyString(mem2, memory,cont+1);
          delete[] memory;
          memory=mem2;
          ++cont;
     }
     fclose(fpointer);
     memory[cont]='\0';
     return memory;
}

/*Returns FALSE when the file could not be opened for writing*/
int MemoryToFile(const char *file, const char *string)
{
     FILE *fp;
     fp=fopen(file,"wb"); /*"wb" truncates, so a shorter result leaves no old data behind*/
     if(fp==NULL)
          return FALSE;
     fwrite(string, 1, StrSize(string), fp);
     fclose(fp);
     return TRUE;
}

// THE FINAL ULTRA FUNCTION!!!

int ReplaceStringInFile(const char *strfind, const char *strchange, const char *file)
/*Returns FALSE when replace could not be done(FALSE on error)*/
/*Replaces a single occurrence only: the last one in the file*/
//Ex: ReplaceStringInFile("BadHello", "BetterHello","/pub/Myfile.MyExtension")
{
     long len1, len2, lenfile, pos,lennewfile;
     char *thefile, *thenewfile;
     int ok;

     thefile=FileToMemory(file);
     if(thefile==NULL)
          return FALSE;//The file could not be opened
     lenfile=StrSize(thefile);
     len1=StrSize(strfind);
     len2=StrSize(strchange);
     pos=FindStringInString(strfind,len1,thefile,lenfile);
     if(pos==-1)
     {
          delete[] thefile;
          return FALSE;//The search did not match anything
     }
     lennewfile=lenfile+len2-len1;
     thenewfile=new char[lennewfile+1];//+1 for the terminator
     CopyString(thenewfile, thefile, pos);
     CopyString((thenewfile+pos),strchange, len2);
     CopyString((thenewfile+pos+len2), (thefile+pos+len1), lenfile-pos-len1);
     thenewfile[lennewfile]='\0';
     ok=MemoryToFile(file, thenewfile);
     delete[] thefile;
     delete[] thenewfile;
     return ok;
}
0
 
LVL 49

Expert Comment

by:DanRollins
Comment Utility
This one may be faster since the memory buffer never needs to be moved around or reallocated.

#include <stdio.h>
#include <string.h>
int main()
{
     char szFileIn[]     = "c:\\temp\\testIn.txt";
     char szFileOut[]    = "c:\\temp\\testOut.txt";
     char szToFind[]     = "Description";
     char szReplacement[]= "Information all about";

     FILE* f= fopen( szFileIn, "rb" );
     if ( !f ) return 1;                 // input file could not be opened
     fseek( f, 0,SEEK_END );
     long nFileLen= ftell( f );
     char* pBuf= new char[nFileLen+1];   // +1 for the terminator that strstr() needs
     rewind( f );
     fread( pBuf, 1, nFileLen, f ); // read entire file
     pBuf[nFileLen]= '\0';
     fclose( f );

     f= fopen( szFileOut, "wb" );
     if ( !f ) { delete[] pBuf; return 1; } // output file could not be created

     int  nReplacementLen= (int)strlen( szReplacement );
     int  nToFindLen=      (int)strlen( szToFind );
     char* pCur=  pBuf;
     char* pNext= 0;

     int nReplacementCnt= 0;

     do {
          pNext= strstr( pCur, szToFind );
          if ( pNext ) {
               fwrite( pCur, pNext - pCur, 1, f );            // text before the match
               fwrite( szReplacement, nReplacementLen, 1, f );
               pCur= pNext + nToFindLen;
               nReplacementCnt++;
          }
          else {
               fwrite( pCur, &pBuf[nFileLen] - pCur, 1, f );  // remaining tail
          }
     } while( pNext );
     fclose( f );
     delete[] pBuf;
     printf( "%d replacements made", nReplacementCnt );
     return 0;
}

If it will get me points, I'll write you the UNICODE version -- just a few changes needed.

-- Dan
0
 

Author Comment

by:hestercybil
Comment Utility
DanRollins,
I tried running your code, but I got a runtime error.

I put your code inside a function so I can test it.

#include <stdio.h>
#include <string.h>

void ReplaceStrInFile(const char*szFileIn, const char*szToFind, const char*szReplacement)
{
    char szFileOut[]    = "c:\\temp\\testOut.txt";
    FILE* f= fopen( szFileIn, "rb" );
    fseek( f, 0,SEEK_END );
    long nFileLen= ftell( f );
    char* pBuf= new char[nFileLen+1]; // +1 for the terminator that strstr() needs
    rewind( f );
    fread( pBuf, 1, nFileLen, f ); // read entire file
    pBuf[nFileLen]= '\0';
    fclose( f );

    f= fopen( szFileOut, "wb" );

    int  nReplacementLen= strlen( szReplacement );
    int  nToFindLen=      strlen( szToFind );
    char* pCur=  pBuf;
    char* pNext= 0;

    int nReplacementCnt= 0;

    do {
         pNext= strstr( pCur, szToFind );
         if ( pNext ) {
              fwrite( pCur, pNext - pCur, 1, f );
              fwrite( szReplacement, nReplacementLen, 1, f );
              pCur= pNext + nToFindLen;
              nReplacementCnt++;
         }
         else {
              fwrite( pCur, &pBuf[nFileLen] - pCur, 1, f );
         }
    } while( pNext );
    fclose( f );
    delete[] pBuf;
}

int main(int argc, char* argv[])
{
     ReplaceStrInFile("c:\\TestIn.txt", "Hello", "Goodbye!");
     ReplaceStrInFile("c:\\TestIn.txt", "!", "!!");
     ReplaceStrInFile("c:\\TestIn.txt", "Good", "God");
     ReplaceStrInFile("c:\\TestIn.txt", "God", "");

     return 0;
}

The value of TestIn.txt to start with is
"Hello there.  This is a Hello test to see if it works or not. Hello"

Anyway, thanks for trying, but I prefer a method that doesn't use a temporary file.
0
 

Author Comment

by:hestercybil
Comment Utility
DanRollins,
I figured out the run time error.  It was the "c:\\temp\\testOut.txt" file name causing problems, because I don't have a c:\temp directory.
I changed it so it didn't use the additional file, and the code works now.
Here's what I got.

void ReplaceStrInFile(const char*szFileIn, const char*szToFind, const char*szReplacement)
{
    FILE* f= fopen( szFileIn, "rb" );
    fseek( f, 0,SEEK_END );
    long nFileLen= ftell( f );
    char* pBuf= new char[nFileLen+1]; // +1 for the terminator that strstr() needs
    rewind( f );
    fread( pBuf, 1, nFileLen, f ); // read entire file
    pBuf[nFileLen]= '\0';
    fclose( f );

    f= fopen( szFileIn, "wb" );

    int  nReplacementLen= strlen( szReplacement );
    int  nToFindLen=      strlen( szToFind );
    char* pCur=  pBuf;
    char* pNext= 0;

    int nReplacementCnt= 0;

    do {
         pNext= strstr( pCur, szToFind );
         if ( pNext ) {
              fwrite( pCur, pNext - pCur, 1, f );
              fwrite( szReplacement, nReplacementLen, 1, f );
              pCur= pNext + nToFindLen;
              nReplacementCnt++;
         }
         else {
              fwrite( pCur, &pBuf[nFileLen] - pCur, 1, f );
         }
    } while( pNext );
    fclose( f );
    delete[] pBuf;
}

Although your code works, Axter's method seems to be a more STL approach.

If I get nothing better, I'll decide between your method and Axter's method, and then award the points.
0
 

Author Comment

by:hestercybil
Comment Utility
ricarditopicaron,
Thanks for the code, but I like Axter's and DanRollins's methods better.
0

 
LVL 49

Expert Comment

by:DanRollins
Comment Utility
To help you decide which to choose, I suggest that you single-step through both pieces of code.  For instance, Axter's
    SrcString.insert(SrcString.begin()+OffsetPtr, NewStr.begin(), NewStr.begin()+SizeDiff);
and
     if (SizeNewStr) SrcString.replace(SrcString.begin()+OffsetPtr, SrcString.begin()+OffsetPtr+SizeNewStr, NewStr.begin());
will be a real joy to debug and to explain to the rest of your team :o)

-- Dan
0
 
LVL 49

Expert Comment

by:DanRollins
Comment Utility
>>I figured out the run time error... I don't have a c:\temp directory.

Wow, at least I guessed correctly that you had a C: drive and that somewhere on it was a file named TestIn.txt -- just psychic, I guess!

-- Dan
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
>>To help you decide which to choose, I suggest that you
>>single-step through both pieces of code.  For instance,
>>Axter's

For those who find the previous code too complicated, here's a simpler version.
template<typename T>
void ReplaceArray(T &SrcString, const T &OldStr, const T &NewStr)
{
    if (OldStr.empty()) return; // an empty search string would match forever
    size_t SizeNewStr = NewStr.size(), SizeOldStr = OldStr.size();
    size_t SizeDiff = (SizeNewStr>SizeOldStr)?SizeNewStr-SizeOldStr:SizeOldStr-SizeNewStr;
     for(typename T::size_type  pCur =  0;;)
     {
        typename T::size_type  pNext= SrcString.find(OldStr, pCur);
        if (pNext !=  T::npos)
          {
               if (SizeDiff)
               {
                    if (SizeNewStr > SizeOldStr)
                    {
                         // Grow the string; the inserted characters are overwritten below
                         SrcString.insert(pNext, NewStr.data(), SizeDiff);
                    }
                    else
                    {
                         SrcString.erase(pNext, SizeDiff);
                    }
               }
               if (SizeNewStr) SrcString.replace(pNext, SizeNewStr, NewStr);
               pCur= pNext + SizeNewStr; // resume just past the replacement
        }
        else break;
     }
}
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
The following modified version should work on both char type and wchar_t type.


template<typename T>
void ReplaceArray(T &SrcString, const T &OldStr, const T &NewStr)
{
    if (OldStr.empty()) return; // an empty search string would match forever
    size_t SizeNewStr = NewStr.size(), SizeOldStr = OldStr.size();
    size_t SizeDiff = (SizeNewStr>SizeOldStr)?SizeNewStr-SizeOldStr:SizeOldStr-SizeNewStr;
     for(typename T::size_type  pCur =  0;;)
     {
        typename T::size_type  pNext= SrcString.find(OldStr, pCur);
        if (pNext !=  T::npos)
          {
               if (SizeDiff)
               {
                    if (SizeNewStr > SizeOldStr)
                    {
                         // Grow the string; the inserted characters are overwritten below
                         SrcString.insert(pNext, NewStr.data(), SizeDiff);
                    }
                    else
                    {
                         SrcString.erase(pNext, SizeDiff);
                    }
               }
               if (SizeNewStr) SrcString.replace(pNext, SizeNewStr, NewStr);
               pCur= pNext + SizeNewStr; // resume just past the replacement
        }
        else break;
     }
}

template<typename T, typename TS>
void ReplaceStrInFile_STL_(const std::string &FileName, const TS &OldStr, const TS &NewStr)
{
    std::basic_ifstream<T, std::char_traits<T> > fin(FileName.c_str(),std::ios::in);
    TS FileData;
    std::getline(fin, FileData, fin.widen('\255'));
    fin.close();
    ReplaceArray(FileData, OldStr, NewStr);
    std::basic_ofstream<T, std::char_traits<T> > fout(FileName.c_str(),std::ios::out);
    fout << FileData;
    fout.close();
}

void ReplaceStrInFile_STL(const std::string &FileName, const std::string &OldStr, const std::string &NewStr)
{
     ReplaceStrInFile_STL_<char>(FileName, OldStr, NewStr);
}

void ReplaceStrInFile_STL(const std::string &FileName, const std::wstring &OldStr, const std::wstring &NewStr)
{
     ReplaceStrInFile_STL_<wchar_t>(FileName, OldStr, NewStr);
}
0
 
LVL 30

Accepted Solution

by:
Axter earned 400 total points
Comment Utility
hestercybil,
I notice that your question lists multiple replace requests.
If you plan on doing multiple replacements in one shot, the following code would work better:

template<typename T>
void ReplaceArray(T &SrcString, const T &OldStr, const T &NewStr)
{
    if (OldStr.empty()) return; // an empty search string would match forever
    size_t SizeNewStr = NewStr.size(), SizeOldStr = OldStr.size();
    size_t SizeDiff = (SizeNewStr>SizeOldStr)?SizeNewStr-SizeOldStr:SizeOldStr-SizeNewStr;
     for(typename T::size_type  pCur =  0;;)
     {
        typename T::size_type  pNext= SrcString.find(OldStr, pCur);
        if (pNext ==  T::npos) break;
          if (SizeDiff)
          {
               if (SizeNewStr > SizeOldStr)
                    SrcString.insert(pNext, NewStr.data(), SizeDiff); // grow; overwritten below
               else
                    SrcString.erase(pNext, SizeDiff);
          }
          if (SizeNewStr) SrcString.replace(pNext, SizeNewStr, NewStr);
          pCur= pNext + SizeNewStr; // resume just past the replacement
     }
}

template<typename T, typename TC>
void ReplaceStrInFile_(const std::string &FileName, const TC &StrPair)
{
    std::basic_ifstream<T, std::char_traits<T> > fin(FileName.c_str(),std::ios::in);
    std::basic_string<T, std::char_traits<T> > FileData;
    std::getline(fin, FileData, fin.widen('\255'));
    fin.close();
     for( typename TC::const_iterator i = StrPair.begin();i != StrPair.end();++i)
     {
          ReplaceArray(FileData, i->first, i->second);
     }
    std::basic_ofstream<T, std::char_traits<T> > fout(FileName.c_str(),std::ios::out);
    fout << FileData;
    fout.close();
}

void ReplaceStrInFile(const std::string &FileName, const std::string &OldStr, const std::string &NewStr)
{
     std::vector<std::pair<std::string, std::string> > StrPair;
     StrPair.push_back(std::pair<std::string, std::string>(OldStr, NewStr));
     ReplaceStrInFile_<char>(FileName, StrPair);
}

void ReplaceStrInFile(const std::string &FileName, const std::wstring &OldStr, const std::wstring &NewStr)
{
     std::vector<std::pair<std::wstring, std::wstring> > StrPair;
     StrPair.push_back(std::pair<std::wstring, std::wstring>(OldStr, NewStr));
     ReplaceStrInFile_<wchar_t>(FileName, StrPair);
}
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
The following is a usage example for the above code.
Use Method1 to do one replacement at a time.
Use Method2 to do multiple replacements.

void Method1()
{
     const char* WFileName = "c:\\Test_wchar.txt";
     std::wofstream mywout(WFileName);
     const wchar_t * wc = L"Hello World!  This is a Hello test to see if this works!";
     mywout << wc << std::endl;
     mywout << wc << std::endl;
     mywout << wc << std::endl;
     mywout.close();

     const char* cFileName = "c:\\Test_char.txt";
     std::ofstream myout(cFileName);
     const char * cc = "Hello World!  This is a Hello test to see if this works!";
     myout << cc << std::endl;
     myout << cc << std::endl;
     myout << cc << std::endl;
     myout.close();

     ReplaceStrInFile(WFileName, L"test", L"best");
     ReplaceStrInFile(cFileName, "test", "best!");
     
     ReplaceStrInFile(WFileName, L"Hello", L"Goodbye!");
    ReplaceStrInFile(WFileName, L"!", L"!!");
    ReplaceStrInFile(WFileName, L"Good", L"God");
    ReplaceStrInFile(WFileName, L"God", L"");


     ReplaceStrInFile(cFileName, "Hello", "Goodbye!");
    ReplaceStrInFile(cFileName, "!", "!!");
    ReplaceStrInFile(cFileName, "Good", "God");
    ReplaceStrInFile(cFileName, "God", "");
}

void Method2()
{
     const char* cFileName = "c:\\Test_char2.txt";
     std::ofstream myout(cFileName);
     const char * cc = "Hello World!  This is a Hello test to see if this works!";
     myout << cc << std::endl;
     myout << cc << std::endl;
     myout << cc << std::endl;
     myout.close();

     std::vector<std::pair<std::string, std::string> > StrPair;
     StrPair.push_back(std::pair<std::string, std::string>("Hello", "Goodbye!"));
     StrPair.push_back(std::pair<std::string, std::string>("!", "!!"));
     StrPair.push_back(std::pair<std::string, std::string>("Good", "God"));
     StrPair.push_back(std::pair<std::string, std::string>("God", ""));
     ReplaceStrInFile_<char>(cFileName, StrPair);
}
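
One thing to keep in mind with Method2 is that the pairs are applied in order, so each replacement sees the output of the previous one. The sketch below works on an in-memory string to make that visible; apply_in_order and PairList are hypothetical names used only for this illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > PairList;

// Applies each (old, new) pair to `text` in sequence and returns the result.
// (Hypothetical helper, for illustration only.)
std::string apply_in_order(std::string text, const PairList& pairs)
{
    for (PairList::const_iterator p = pairs.begin(); p != pairs.end(); ++p) {
        if (p->first.empty()) continue;            // nothing to search for
        std::string::size_type pos = 0;
        while ((pos = text.find(p->first, pos)) != std::string::npos) {
            text.replace(pos, p->first.size(), p->second);
            pos += p->second.size();               // skip past what was just written
        }
    }
    return text;
}
```

With the four pairs from the question, "Hello World!" becomes "Goodbye! World!", then "Goodbye!! World!!", then "Godbye!! World!!", and finally "bye!! World!!". A different order of the same pairs can give a different result.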
0
 
LVL 49

Expert Comment

by:DanRollins
Comment Utility
Axter,
That rewrite is much cleaner.  However, I can't get the wchar version to work.  I created a unicode file with Notepad, but when I run the code, I end up with an empty file.

There is also something else that needs to be looked into.  I ran a little timing as follows:

     DWORD dwTick= GetTickCount();
//     int nReplacementCnt= QuickReplaceStrInFile( sFile, "yadda yadda", "zap" );
     int nReplacementCnt= ReplaceStrInFile_STL( sFile,  "yadda yadda", "zap" );
     DWORD dwElapsed= GetTickCount()- dwTick;

     printf( "%d replacements in %d ms", nReplacementCnt, dwElapsed );

on a largish XML file (ANSI 8-bit) and the results are enlightening:

5681 replacements in 47 ms        <--- Output of my "quick" algorithm
5681 replacements in 94516 ms <--- Output of your STL code

That is, my code was finished in 5/100ths of a second (a veritable blink of an eye) while your code took over 90 seconds.  One and 1/2 minutes.  I calculate that my code is therefore in nice round numbers, about 2000 times faster than yours.  What is happening?

-- Dan
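
GetTickCount() is Windows-only, and the question asks for code that also runs on Unix. As a hedged alternative, here is a sketch of the same kind of measurement using std::chrono, which assumes a C++11 compiler (not available when this thread was written). The name elapsed_ms is hypothetical.

```cpp
#include <chrono>

// Runs `work()` once and returns the elapsed wall-clock time in milliseconds.
// (Hypothetical helper, for illustration only.)
template <typename F>
long long elapsed_ms(F work)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    work();
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

A call might look like elapsed_ms([&]{ ReplaceStrInFile(sFile, "yadda yadda", "zap"); }), where sFile is whatever test file is being used. A single run is only a rough indication; repeating the measurement gives a more reliable comparison.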
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
>>That is, my code was finished in 5/100ths of a second (a
>>veritable blink of an eye) while your code took over 90
>>seconds.
Sounds like you have something set up wrong.  But since I don't know the specifics of your test (file size, compiler and version, operating system, STL version, hardware, number of replacements, etc.) it would be difficult for me to comment on it one way or another.
0
 
LVL 49

Expert Comment

by:DanRollins
Comment Utility
It's easy to verify.  Just try it with any large file.  I used a 250K XML file.  It seems to take most of its time in the initial read.  If you click the debugging 'pause' button, it usually stops in the memory allocation ASM code.

-- Dan  
0
 
LVL 4

Expert Comment

by:Chizl
Comment Utility
#include <algorithm>
#include <string>

Then use:

std::string MyString = YourCharArray;
std::replace(MyString.begin(),MyString.end(), 'abc','cba');

That is:
Start, End, Find, ReplaceWith
0
 
LVL 4

Expert Comment

by:Chizl
Comment Utility
Oh..  The way back:

sprintf(YourCharArray, "%s", MyString.c_str());
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
>>std::replace(MyString.begin(),MyString.end
>>(), 'abc','cba');

Good try, but that replaces individual char's, and not a specific string.
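
A short sketch of the difference, using hypothetical helper names: std::replace compares and swaps single elements, while a substring replacement needs find() plus std::string::replace.

```cpp
#include <algorithm>
#include <string>

// std::replace works element by element: every `from` character becomes `to`.
std::string swap_chars(std::string s, char from, char to)
{
    std::replace(s.begin(), s.end(), from, to);
    return s;
}

// Substring replacement: every occurrence of `from` becomes `to`,
// even when the two strings have different lengths.
std::string swap_substrings(std::string s, const std::string& from, const std::string& to)
{
    if (from.empty()) return s;   // nothing to search for
    for (std::string::size_type pos = 0;
         (pos = s.find(from, pos)) != std::string::npos;
         pos += to.size())
    {
        s.replace(pos, from.size(), to);
    }
    return s;
}
```

Note that 'abc' in single quotes is a multi-character literal, not a string, so it cannot be used to search for the text "abc".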
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
DanRollins,
I assume you're using VC++ 6.0.

It's the VC++ std::getline implementation that is causing the problem in your results.
I'm sure if you tried this with a different compiler and/or OS, the results would be far different.
That's why it's never good to make an efficiency judgement call on any code using one compiler's results.
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
The following code should not perform that much differently if you remove or keep the commented out #define.
However, in VC++ 6.0, there's a big difference, which clearly indicates that the STL stream implementation is very inefficient.

This is certainly not the case in other implementations/compilers.

//#define USE_STL_STREAM_LIB
template<typename T, typename TC>
void ReplaceStrInFile(const std::string &FileName, const TC &StrPair)
{
#ifdef USE_STL_STREAM_LIB
     std::basic_ifstream<T, std::char_traits<T> > fin(FileName.c_str(),std::ios::in | std::ios::binary);
     if (!fin) return;
     fin.seekg(0, std::ios::end);
     // Pre-size the string so the whole file is read in one call
     std::basic_string<T, std::char_traits<T> > FileData(static_cast<size_t>(fin.tellg()), T());
     fin.seekg(0, std::ios::beg);
     fin.read(&FileData[0], FileData.size());
     fin.close();
#else
     // Raw binary read; for wchar_t this treats the file as an array of wchar_t
     FILE* f= fopen( FileName.c_str(), "rb" );
     if (!f) return;
     fseek( f, 0,SEEK_END );
     std::basic_string<T, std::char_traits<T> > FileData(ftell( f ) / sizeof(T), T());
     rewind( f );
     fread(&FileData[0], sizeof(T), FileData.size(), f);
     fclose(f);
#endif

    for( typename TC::const_iterator i = StrPair.begin();i != StrPair.end();++i)
    {
          ReplaceArray(FileData, i->first, i->second);
    }
#ifdef USE_STL_STREAM_LIB
     std::basic_ofstream<T, std::char_traits<T> > fout(FileName.c_str(),std::ios::out | std::ios::binary);
     // Write the data once (the earlier "fout << FileData" wrote it a second time)
     fout.write(&FileData[0], FileData.size());
     fout.close();
#else
     f= fopen(FileName.c_str(), "wb" );
     if (!f) return;
     fwrite(&FileData[0], sizeof(T), FileData.size(), f);
     fclose(f);
#endif
}

0
 
LVL 49

Expert Comment

by:DanRollins
Comment Utility
>> However, in VC++ 6.0, there's a big difference, which clearly indicates that
>> the STL stream implementation is very inneficient.

The implementation differences are irrelevant here and you know it.  That is why you rewrote your function.  Why not 'fess up?   You changed your code from:

    std::getline(fin, FileData, fin.widen('\255'));
to
    std::basic_string<T, std::char_traits<T> > FileData(fin.tellg(), 0);
    fin.seekg(0, std::ios::beg);
    fin.read(&FileData[0], FileData.size());

As always, the error is in the algorithm and the programming technique, but the blame goes to Microsoft.  It is to laugh.  

There is no need to rewrite your entire function, just pre-allocate the string buffer before reading into it.
-=-==-=-=-=-=-=-=-=-=-=-
Any ideas on the problems with the wchar_t version?  I'm pretty sure Bill Gates is somehow responsible.

-- Dan
0
 
LVL 30

Expert Comment

by:Axter
Comment Utility
>>That is why you rewrote your function.  Why not 'fess
>>up?   You changed your code from:
Yes, I rewrote the function because the getline implementation was ridiculously slow.
It shouldn't take that long, and I'm sure other implementations don't have this problem.

>>As always, the error is in the algorithm and the
>>programming technique, but the blame goes to Microsoft.  
>>It is to laugh.  

I'm not one to blame MS for everything, and in fact I'm more of a fan than a hate monger when it comes to MS.
BUT, in this particular situation, I do blame MS, because they've had plenty of time to improve their STL code, but they have not.
They depend too much on the MFC code, and pretty much ignore optimizations they could be doing on the STL implementation.

In general I like VC++, but the STL implementation really bites.  I see that as VC++'s greatest weakness.
0
 
LVL 49

Expert Comment

by:DanRollins
Comment Utility
OK.  But I think you'll find similar problems with all STL implementations when concatenating single bytes to a lengthening string.  I hit this problem with CString, but it crops up all over the place.  

Consider that when the buffer is 2,000,000 bytes long, and you add one byte to it, the underlying code (be it STL or any library) must now allocate a new buffer that is 2,000,001 bytes long.  It then needs to copy all 2,000,000 bytes of the existing data to the new buffer and append the single byte, then free the previous allocation.

This problem has been solved various ways... It is why there is often an 'allocation chunk size' option for many collection classes -- to avoid having to increase the buffer with each append operation.  If you can know the size of the buffer in advance (as we do here), it saves an enormous amount of time to pre-allocate the max size, then release the unused part after doing all of the appending.

-- Dan
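
How often a given library actually reallocates can be checked rather than guessed. The sketch below (count_reallocations is a hypothetical name) appends one character at a time and counts how often capacity() changes, with and without a reserve() call up front. Typical current implementations grow the buffer in chunks, so the count without reserve() is usually small, but that is an implementation detail and worth measuring on the compiler in question.

```cpp
#include <cstddef>
#include <string>

// Appends n characters one by one and returns how many times the string's
// capacity changed, i.e. how many times the buffer was reallocated.
// (Hypothetical helper, for illustration only.)
std::size_t count_reallocations(std::size_t n, bool reserve_first)
{
    std::string s;
    if (reserve_first) s.reserve(n);              // pre-allocate the final size
    std::size_t reallocations = 0;
    std::string::size_type last_capacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {      // buffer was replaced
            ++reallocations;
            last_capacity = s.capacity();
        }
    }
    return reallocations;
}
```

With reserve() the count is zero, because capacity() is already at least n before the first append.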
0
 
LVL 4

Expert Comment

by:Chizl
Comment Utility
http://www.chizl.com/Dev/C++/ChizlsUtils/ChizlsUtils.zip

I wrote a few functions that acted like the cooler VB functions.

Mid()
Instr()
InstrRev()
Split()
Replace()

Let me know if you like it, find bugs, or know a better way to do what I'm doing.
0
 

Author Comment

by:hestercybil
Comment Utility
DanRollins,
I posted a question for you.
See....
http://experts-exchange.com/jsp/qManageQuestion.jsp?ta=cplusprog&qid=20319452

Thanks everyone for your help.  Axter's and DanRollins's methods were both good, but I had to pick one.
0
 
LVL 9

Expert Comment

by:miron
Comment Utility
well,

there is only one true implementation of highly efficient, very optimized string manipulation compiler - look around :)
0
 
LVL 4

Expert Comment

by:Chizl
Comment Utility
Non-MFC CString is one.  Used that back in 97, don't know what I ever did with it..
0
