Solved

Overwrite a line in a file

Posted on 2003-04-01
Last Modified: 2008-02-01
What I want to know is how to delete a line in a file and replace it with something else?

I made the file using the following code:

ofstream file;
file.open("data", ios_base::out|ios_base::app);

file<<parity_bits[0]<<parity_bits[1]<<binary[0]<<parity_bits[2]<<binary[1]<<binary[2]<<binary[3]<<parity_bits[3]<<binary[4]<<binary[5]<<binary[6]<<endl;
file.close();

As you can see, I store all my data in a file called data.

I use this command to pick the line from the data file:
void pick_line()
{
     string fileName="data";
     string readStr;

     ifstream file;
     file.open(fileName.c_str());

     Numbers.clear();

     // Testing getline itself avoids pushing a spurious empty line at end of file.
     while (getline(file, readStr))
     {
          Numbers.push_back(readStr);
     }

     file.close();
}

Now can someone tell me how to delete that line and replace it with a different line?
0
Comment
Question by:ifrit417
20 Comments
 
LVL 2

Expert Comment

by:almes
ID: 8245115
You can read the whole file and then rewrite it without the line you want. It is not a fast solution, but it works and it is easy.
0
 
LVL 12

Expert Comment

by:Salte
ID: 8245858
to modify a file by modifying parts of its contents is a complicated issue.

In general it is only useful for binary files since they have an array of fixed size records and replacing one record with another of the same size is no problem.

For text files it is a completely different issue. The problem is that you can only modify n bytes by writing exactly n bytes to the file over the n bytes you want to modify. There are several reasons why text files are not good at handling that situation:

1. Typically a record in a text file is a 'line' and the lines have varying length. Replacing a line of 20 bytes with a line of 12 bytes or 25 bytes is a bad idea, so modifying the file in place won't work. The only possibility is when the line is 20 bytes and the line you want to replace it with also happens to be exactly 20 bytes; then you can modify the file as you do for binary files.

2. Text files often translate sequences of bytes to characters. For example, on Windows an end of line character '\n' is stored in the file as the sequence of two bytes CR and LF. So if one byte is CR and the next byte is LF you don't read two characters but only one single '\n'.

3. If the text file uses UTF-8 or some other encoding a character can be 1, 2, 3 or 4 bytes so it is very hard to predict how many bytes a string of 5 unicode characters is, it can be anything from 5 bytes to 20 bytes and any value in between those two extremes. This comes in addition to the end of line translation mentioned in 2.

So all in all attempting to modify a text file as if it was binary and modify in place is generally a bad idea. It might work in your particular case but it might not and it will in general not work for general text substitution.
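
To make point 3 concrete, here is a minimal sketch (not from the thread) that counts UTF-8 characters by skipping continuation bytes. The name utf8_code_points is hypothetical, and the input is assumed to be valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Count code points in a string that is assumed to hold valid UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx and are not counted.
std::size_t utf8_code_points(const std::string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80)
            ++count;
    return count;
}
```

For a string holding 'a', U+00E9 and U+20AC, size() reports 6 bytes while this function reports 3 characters, so a byte offset cannot be derived from a character count alone.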

For this reason text files are usually modified by reading the source file and writing a new file with the modifications. When this process is done the new file with the modifications is used instead of the original source file.

The safest way to do this is by using a backup file so first step is to move the source file to a backup file, if the source file is data then you can move it to data.bak or some such. If there is a file data.bak already and your file move function cannot handle that situation you must make sure that the old data.bak is deleted before you move data to data.bak.

Then you simply read data.bak and write data with the modifications using the source data.bak and the modifications as input.
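
The backup approach described above can be sketched roughly as follows. This is only an illustration: the function name replace_line_with_backup and the ".bak" suffix handling are assumptions, and every line is written back with a plain '\n'.

```cpp
#include <cstdio>    // std::remove, std::rename
#include <fstream>
#include <string>

// Replace line `target` (1-based) of `path` with `replacement`.
// The original is first moved to path + ".bak" and then re-read from there.
// Returns false if the file could not be moved or either file could not be opened.
bool replace_line_with_backup(const std::string& path, int target,
                              const std::string& replacement)
{
    const std::string backup = path + ".bak";
    std::remove(backup.c_str());                  // drop a stale backup, if any
    if (std::rename(path.c_str(), backup.c_str()) != 0)
        return false;

    std::ifstream in(backup.c_str());
    std::ofstream out(path.c_str());
    if (!in || !out)
        return false;

    // Copy every line, substituting the replacement at the target line.
    std::string line;
    for (int n = 1; std::getline(in, line); ++n)
        out << (n == target ? replacement : line) << '\n';
    return true;
}
```

If anything goes wrong halfway, the untouched original is still available as the .bak file.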

Now, for the rare situation that you CAN modify the file in place and for binary files where this is more common you can use seek to do it:

step 1. Open the file for both input and output:

fstream file("data",ios::in|ios::out|ios::binary);

I would add binary there even if the file is a text file, I don't want to take any chance that we add extra bytes that we don't want.

step 2. Read the file until you find the line you want to modify. Have a file position beg pointing to the beginning of the line you want to modify and also another file position end pointing to the end of the line. Use file.tellg() to get the current file position at any time.

step 3. Verify that the line you want to replace is exactly end - beg bytes long.

step 4. do a file.seekp(beg);

step 5. write the bytes: file.write(line,length);

step 6. do a seekp() to the end of the file: file.seekp(0,ios::end);

step 7. Close the file (or let the destructor close it).
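
Steps 1 to 7 could look roughly like this. It is a sketch under assumptions: the function name overwrite_line_in_place is made up, lines are assumed to end in a single '\n' byte, and the replacement is given without its end of line marker.

```cpp
#include <fstream>
#include <string>

// Overwrite line `target` (1-based) in place, but only if `replacement`
// has exactly the same number of bytes as the existing line.
// Returns true on success, false if the line is missing or the sizes differ.
bool overwrite_line_in_place(const std::string& path, int target,
                             const std::string& replacement)
{
    // Step 1: open for input and output; binary avoids end of line translation.
    std::fstream file(path.c_str(),
                      std::ios::in | std::ios::out | std::ios::binary);
    if (!file)
        return false;

    // Step 2: read lines, remembering where the current line begins.
    std::string line;
    std::streampos beg = file.tellg();
    for (int n = 1; std::getline(file, line); ++n) {
        if (n == target) {
            // Step 3: the replacement must occupy exactly the same bytes.
            if (line.size() != replacement.size())
                return false;
            file.clear();                      // in case eof was hit on the last line
            file.seekp(beg);                   // step 4
            file.write(replacement.data(),     // step 5
                       static_cast<std::streamsize>(replacement.size()));
            file.seekp(0, std::ios::end);      // step 6
            return file.good();                // step 7: destructor closes the file
        }
        beg = file.tellg();
    }
    return false;   // the file has fewer than `target` lines
}
```

Because the '\n' after the line is never touched, only the visible characters are compared and overwritten.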

Hope this explains.

Alf
0
 
LVL 8

Expert Comment

by:fl0yd
ID: 8247332
A better solution would be to use memory mapped files. They are available for just about any OS I've come across, but the API is platform specific. Since you didn't state anything about platforms in your question I will just leave it at that. If you are interested post your system requirements and we can go from there.

.f
0
 
LVL 12

Expert Comment

by:Salte
ID: 8251677
Again,

Memory mapped files work best as binary files or, for text files, as read only (just read the text). A write-only memory mapped file is sort of useless since plain writing does the job just as well.

A read/write text file works badly for the same reason modifying a text file works badly. It works fine only when the text you want to change and the new text have exactly the same number of bytes.

Alf
0
 

Author Comment

by:ifrit417
ID: 8252182
I use this function to pick my line in the file
void pick_line()
{
    string fileName="data";
    string readStr;

    ifstream file;
    file.open(fileName.c_str());

    Numbers.clear();

    // Testing getline itself avoids pushing a spurious empty line at end of file.
    while (getline(file, readStr))
    {
         Numbers.push_back(readStr);
    }

    file.close();
}

Numbers[line-1]; where line is the line I want to pick from the data file.


Salte, how would I do the steps you mentioned above? Can you give a bit more detail? Thanks.
0
 
LVL 12

Expert Comment

by:Salte
ID: 8252411
Basically if you do as you appear from the code above, then you read the whole file in one bunch so the Numbers vector contain the whole file.

If that is the case it is easy to replace a line and then write the file out again.

Just start from beginning and dump the whole structure out to file again.

The point is that you cannot just write the line you changed.

Since the reading/writing is the slow part it is obviously best to do it this way:

step 1. read the whole file into the Numbers vector.
step 2. Modify the Numbers vector as you please.
step 3. write the whole Numbers vector out to file again.

The idea is that step 2 is a step where you do all your modifications, so if you want to do two modifications to the vector, for example change line 2 and line 5, then you do this:

step 1. read the file
step 2. change line 2 and line 5.
step 3. write the file.

and not this:

step 1. read the file
step 2 change line 2
step 3 write the file
step 4. read the file
step 5. change line 5
step 6. write the file.

The steps 3 and 4 are absolutely not necessary and even if you want such a 'in between' save you can do it by keeping step 3 (write the file) but as you already have the file in the vector you don't have to read it again.
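
As a sketch of the read, modify, write sequence above (the helper names read_lines and write_lines are hypothetical, and lines are written back with a plain '\n'):

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Step 1: read every line of the file into a vector.
std::vector<std::string> read_lines(const std::string& path)
{
    std::vector<std::string> lines;
    std::ifstream in(path.c_str());
    std::string line;
    while (std::getline(in, line))   // stops cleanly at end of file
        lines.push_back(line);
    return lines;
}

// Step 3: write the whole vector back, replacing the old contents.
bool write_lines(const std::string& path,
                 const std::vector<std::string>& lines)
{
    std::ofstream out(path.c_str(), std::ios::out | std::ios::trunc);
    if (!out)
        return false;
    for (std::size_t i = 0; i < lines.size(); ++i)
        out << lines[i] << '\n';
    return out.good();
}
```

Step 2 is then ordinary vector access between the two calls: read once, assign to lines[1] and lines[4] to change line 2 and line 5, and write once.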

Alf
0
 

Author Comment

by:ifrit417
ID: 8252490
Right now I'm using another method as the vector method seems too hard:

#include <fstream>
#include<iostream>
#include <string>
using namespace std;
void main()
{
  int input;
  fstream file("test.txt",ios::in | ios::out);

  cout<<"Which line:";
  cin>>input;
  input=input*11;
 
  file.seekg(input);
  file<<"\n12345678111\n";

  file.close();
}

Because each line in my txt file has 11 digits. The trouble is that it doesn't do it right after line 1 :(
0
 

Author Comment

by:ifrit417
ID: 8252512
So basically for the first line:
0
11
37
50
and so the difference is 13
0
 

Author Comment

by:ifrit417
ID: 8252523

 #include <fstream>
#include<iostream>
#include <string>
using namespace std;
void main()
{

     int input;
      fstream file("test.txt",ios::in | ios::out);

 

     cout<<"Which line:";
     cin>>input;
     input=(input*13)-2;
     file.seekg(input);
     file<<"\n12345678111\n";


      file.close();
}
0
 

Author Comment

by:ifrit417
ID: 8252529
But it doesn't work for line 0, which is the start of the file, the reason being that I have a \n. Any suggestions for getting around it? Or do I have to put an if statement inside?
0
 
LVL 8

Expert Comment

by:fl0yd
ID: 8252691
@ Salte:

Memory mapped files: You are wrong in assuming that they have to be write only [weird way of opening a file anyway]. On the contrary, they must be opened with read/write access. Copying parts around using memmove() is a lot easier than any other approach you've suggested so far. If I were to accomplish something like replacing a line in a file I wouldn't want to reinvent the wheel.

.f
0
 
LVL 12

Accepted Solution

by:
Salte earned 300 total points
ID: 8252832
fl0yd,

The problem is that inserting text in a text file where a line has varying length, require you to do a lot of memmove. Now, if you open the file with read/write access that might make that slightly easier but a memmove on a file mapped into memory is slower than a memmove in your regular RAM (which is already mapped). You have therefore little to gain from it.

For text files it is essentially only useful if you want to read the file, the advantage is that you have the whole file in a big 'buffer' and can move around as you please without worrying about reaching an end of buffer so you have to read the data over again.

If you plan to modify the file, mapping it into memory generally has little benefit compared to just writing the file.

ifrit417,

First off, multiplying line number with 11 is correct if the line is exactly 11 bytes. If the line has an end of line marker equal to CR LF that means 9 digits. If the line has an end of line marker equal to '\n' (only one byte) then the line is exactly 10 digits.

If you have spaces in the line either before or after the number those must also be counted in the 9 or 10.

If you use any other format other than plain ASCII (utf-8 for example) then this is a whole lot more complicated.

Trusting that each line has exactly the same length, the file can actually be treated the same way as a binary file and you can modify a single line without affecting the other lines. You can also easily pick a specific line by multiplying the line number with the line length. Remember that file offsets start from 0 while line numbers traditionally starts from 1, so you would do something like:

file.seekg((line_number - 1) * line_length);

where line_length is the length of a line including the end of line marker. If all lines have the same length, line_length is the value of the file offset just after reading the first '\n'.

For such a file you can modify a specific line:

ostream & modify_line(fstream & file, int line, int length, const char * buf)
{
   // Modify line number line (counted from 1) in the open file.
   // Each line has length length, end of line marker included;
   // the new line at line number line is specified in buf.
   file.seekp((line - 1) * length);
   file.write(buf,length);
   return file;
}

remember to move seekp to end of file before you close it.
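
A checked variant of the fixed-length overwrite above can be sketched as follows. The names measure_line_length and overwrite_fixed_line are hypothetical; the file is assumed to be opened with ios::in | ios::out | ios::binary, and the new record must already include its end of line marker.

```cpp
#include <fstream>
#include <string>

// Length of one line including its end of line marker, measured as the
// file offset just after the first '\n'. Returns -1 if there is no
// complete first line.
std::streamoff measure_line_length(std::fstream& file)
{
    file.clear();
    file.seekg(0, std::ios::beg);
    std::string first;
    if (!std::getline(file, first) || file.eof())
        return -1;
    return file.tellg();
}

// Overwrite line `line` (1-based) with `record`, which must be exactly
// `line_length` bytes long, end of line marker included.
// Note: this does not check that the line already exists in the file.
bool overwrite_fixed_line(std::fstream& file, int line,
                          std::streamoff line_length,
                          const std::string& record)
{
    if (line < 1 || line_length <= 0 ||
        static_cast<std::streamoff>(record.size()) != line_length)
        return false;
    file.clear();
    file.seekp((line - 1) * line_length, std::ios::beg);
    file.write(record.data(), static_cast<std::streamsize>(record.size()));
    file.flush();
    return file.good();
}
```

Refusing a record of the wrong size is the design choice here: a shorter or longer record would corrupt the neighbouring lines.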

Alf
0
 
LVL 8

Expert Comment

by:fl0yd
ID: 8253842
Salte,

I don't know what weird sort of memmove-orgy you are planning on. It is commonly accepted that 1 (one!) memmove is sufficient to insert/overwrite/replace 1 line. Obviously, disk access is slower than memory access on most devices, however, the OS is making sure to map the appropriate portion of the file being altered into physical memory, making a memmove no faster or slower than if it were used on regular ram. What have we gained? A clean way to access a file and being able to alter the contents _without_ intermediate copies or backup files. If you care, please explain to an inexperienced developer like me, where you would get all your excessive memmoves from.

.f
0
 
LVL 12

Expert Comment

by:Salte
ID: 8254051
fl0yd,

it takes one memmove but you will generally have a page fault each time you cross a page. If you plan to do a lot of moving around of the data that might work but a better idea is simply to arrange the data as you want on a heap and then write it out to the file one time the way you want it.

If your way of doing it is to move data around first and then write the file one time, mapping it to memory is a bad idea since you don't really have control of when the data is written to disk.

I am not saying it won't work, but I am saying it is not efficient and there are better ways. Memory mapped files do have their use but editing text files isn't one of them.

Editing binary files on the other hand might work fine. The reason is that data are of fixed size so you can replace one record with new data without having to move all the data after it to make room for the new record.

If you replace lines in a text file whose lines are of fixed size, it counts as a binary file and will also work for the same reason.

ifrit417,

You have a weird formula for computing the position of the line:

input = input * 13 - 2;

That doesn't look right.

If each line has fixed size (13 bytes) then the first line (line 1) is at position 0, second line at position 13, third line at position 26 etc...

First line is 13 bytes from position 0 to 12.
second line is 13 bytes from position 13 to 25
third line is 13 bytes from position 26 to 38
fourth line is 13 bytes from position 39 to 51

etc...

so the formula is:

input = (input - 1) * 13;

This is under the assumption that the 13 bytes include the end of line marker at end of each line and will position yourself just after the end of line marker that separates the chosen line from the previous line.

If the file has a header (a first line of a different size) you have to add the size of the header also:

input = header + (input -1) * 13;

or if header is constant and known at compile time you can do:

input = input * 13 + header - 13;

This is easier for the compiler to collapse into a simple 'multiply and add' formula since if the header is constant the header and 13 will be subtracted at compile time and only the difference will be added to the result of the multiplication.
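
The offset formulas above, written as one small helper. The name line_offset and its parameter names are assumptions made for illustration only.

```cpp
#include <ios>

// Byte offset of line `line` (1-based) in a file whose lines are all
// `line_length` bytes long (end of line marker included), preceded by
// an optional header of `header_bytes` bytes.
std::streamoff line_offset(int line, std::streamoff line_length,
                           std::streamoff header_bytes = 0)
{
    return header_bytes + static_cast<std::streamoff>(line - 1) * line_length;
}
```

With 13-byte lines and no header, lines 1 to 4 start at offsets 0, 13, 26 and 39, matching the table above. The result can be passed to seekg or seekp.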

Alf
0
 
LVL 8

Expert Comment

by:fl0yd
ID: 8255124
Premature optimizations are the root of all evils, but enough of that. I won't give you any more hints on how to avoid potential problems -- you are stubborn enough to have to learn it the hard way anyways. Thanks again for the comedy, Salte.

Just for the record: Your assumptions on the speed difference and all that squabbling about page faults is hilariously funny. Once again, I enjoyed the comedy.

.f
0
 
LVL 12

Expert Comment

by:Salte
ID: 8255666
fl0yd,

I believe I know what I am talking about. It's not a comedy.

Memory mapping is usually implemented in a manner similar to the following:

The system scans through the virtual memory for the process and search for a block big enough to map the (part of the) file. When such a block is found (or if the hinted page defines a free block) the pages of that block is changed from being 'no memory' to 'memory of file FFFF' specifically the blocks on the disk where that file is found.

The file is - as a general rule - not read at this point.

When the program accesses such a page it will generate a page fault and the OS reads the block from hard disk that is associated with that page. This will correspond to the block of the file that is supposed to be located on that page. While this reading takes place the process is sleeping.

When the system need a page and a page hasn't been accessed for a while and that page happen to be in that memory mapped area then that page is written to disk and the file is actually updated. Of course, this only happen if the page is "dirty" that is, it has been modified from the original. If the page is "clean" i.e. it is identical to the block on disk the page is simply discarded. A file that is opened for read only and mapped to memory will have only clean pages and so the pages will simply be discarded at this point. A file opened for read/write may have a dirty page and that page is written to disk.

Typically the page entry is then modified to reflect that the page is no longer in memory as it was originally so if the page is again accessed, you will have another wait while the page is read from disk.

Problem with a read/write text file mapped to memory is that if you want to insert a text " fl0yd" between 'o' and ',' in "hello, how are you" so that the text becomes "hello fl0yd, how are you" it means that you have to move the text of all pages of the file after that point. That will be a lot of bytes moving. It may seem like only one memmove to you but in reality it will likely be several page faults and so it will be a memmove() that takes some time.

Now, I think there are much better ways to do the same. For example reading one file and writing another and when you come to the point where you want to insert the text you simply write the extra text even though it did not originate from the source file. If you want to delete text you simply read it but do not write it.

This is cleaner and easier and simpler. It even works on systems that do not support mapping a file to memory.

If you want to read a file on the other hand, mapping a file to memory has certain advantages. For one thing you have the whole file so whatever pointer into the file points to a valid memory address. You don't have to worry that the pointer is outside the buffer so you have to refill the buffer with correct data before you can use the pointer etc.

Since the file is read only the pages will never be dirty and simply discarded when they are "too old". If you access them again you will re-read them from disk.

Of course, if you only map a window of the file you may find that the pointer points outside the window but in that case you probably didn't set the window properly or the file is simply so big that you have to take into account the more complicated logic.

Anyway, mapping files to memory has their uses but not always and not at all times.

Alf
0
 
LVL 8

Expert Comment

by:fl0yd
ID: 8256245
Whatever. Don't use a clean approach then. Don't know who guaranteed you that your 'do it all on the heap' method doesn't use file-mapping in the underlying file-access functions?? Anyway, this is getting ridiculous as any other discussion you seem to get involved. Just a hint: get a life, seriously.

.f
0
 
LVL 12

Expert Comment

by:Salte
ID: 8260037
fl0yd,

As an expert you are supposed to provide whatever knowledge you have to those who request it.

Your postings are increasingly off-topic and of little or no help to those who post their questions.

I ask you to calm down and rethink before you post next time.

As far as the 'do it all on the heap' method it will often use memory mapping. For example the gnu malloc() uses memory mapping to make its own heap and also whenever you request a size that is above a certain threshold.

Such a memory mapping do not go to a specific file though but instead map a part of the swap file to that memory. The effect is the same as a regular file though as far as speed is concerned.

Now, the advantage of doing it all in heap isn't because it is faster per se but because you are not limited to have exactly the same layout in memory as you plan to have in the final file. For example you could have a list in memory and update pointers instead of moving the data physically around, or in the case of working on a file you could allocate a bigger buffer on heap and store the data before the place you want to edit towards the beginning of the buffer and the data after the place you want to edit towards the end and then leave a gap in the middle. You can't do that on the file since that would leave a big hole in the file but you can do it on the heap. Then you can insert text by placing it in the gap WITHOUT having to move the data after it, that data is already stored after the gap and as long as the gap is big enough to hold the new data you're fine. If you want to delete text you simply make the gap bigger and the data portions correspondingly smaller etc..

Then when you are done you write the heap to file by making two separate writes, one that write the data before the gap and one that write the data after the gap.

So in short, yes, I do believe that 'doing it all on the heap' is faster and better than mapping file to memory in the context we are talking about here.

This is true even though the heap is also in a sense a file mapped to memory.

Alf
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9587415
No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:

Answered: Points to Salte

Please leave any comments here within the next seven days. Experts: Silence
means you don't care.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

-bcl (bcladd)
EE Cleanup Volunteer
0
