icysmarty
asked on
Extracting a field from a data file
I am writing a program to extract fields from a data file.
I am having trouble storing the bytes read from the file into the appropriate fields.
Because the file is not properly aligned, I have to skip the 0x09 (tab), 0x0A (line feed) and 0x20 (space) characters every time I read a byte from the file.
When I read a byte and pass it to the function that stores it in a field, the byte does not end up in the right field, because each field has a different number of bytes.
I tried solving this problem by using
fstream extract_data;
...
...
extract_data >> record.data1 >> record.data2 >> record.data3 ...
This works fine except when a record has a "blank" field.
What other options do I have?
That depends on where you want to put the effort. If you can easily change the data layout, introduce a special value that represents a blank (NULL) field. Otherwise, change your parsing code to cope with blank fields. I'd go with the latter, especially if you know, for example, that two consecutive delimiters do in fact mean a blank field.
ASKER
I don't get you.
What do you mean by two consecutive delimiters mean a blank field?
Well, let's say you have
SomePerson|1978/07/21|$200 0|A nice guy
and
SomeOtherPerson|1972/03/21 ||Huh?
you'd see that the fields are delimited by "|" and that "||" means "no content"...
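To make the idea concrete, here is a minimal sketch of splitting one record on "|" while keeping empty fields. The function name split_fields is made up for illustration and is not from this thread; it assumes the whole record is already in a std::string.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split one record on a delimiter and keep empty fields,
// so that "a||b" yields three fields with the middle one blank.
std::vector<std::string> split_fields(const std::string& line, char delim = '|') {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        // Look for the next delimiter at or after the start of this field.
        std::string::size_type pos = line.find(delim, start);
        if (pos == std::string::npos) {
            // No more delimiters: the rest of the line is the last field.
            fields.push_back(line.substr(start));
            break;
        }
        // Two consecutive delimiters give pos == start, i.e. an empty field.
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```

Calling split_fields on the second sample line gives four fields, and the third one is an empty string. The sketch does no trimming, so the trailing space in "1972/03/21 " stays part of that field.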
ASKER
Then how do I know where a field ends?
Say
SomePerson|1978/07/21|$200 0|A nice guy
Whoever|1978/07/21|$30000| A bad guy
The first fields, SomePerson and Whoever, have different lengths. How do I detect the end of a field if I am reading it byte by byte from a data file?
For blanks, I can keep a counter for the spaces: whenever I encounter more than 2 spaces, that means the field is blank, right?
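One possible way to find the end of a field while reading byte by byte is to compare every byte against the delimiter instead of counting spaces. The sketch below assumes "|" separates fields and a line feed ends a record, as in the samples above; read_record is a hypothetical name, not something from the original program.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this out without a file
#include <string>
#include <vector>

// Hypothetical helper: read one record byte by byte from `in`.
// A '|' ends the current field; a '\n' or end of input ends the record.
// Returns false when there was nothing left to read.
bool read_record(std::istream& in, std::vector<std::string>& fields) {
    fields.clear();
    std::string current;   // bytes collected for the field being read
    char byte;
    bool read_any = false;
    while (in.get(byte)) {
        read_any = true;
        if (byte == '\n') {
            fields.push_back(current);   // the last field of this record
            return true;
        }
        if (byte == '|') {
            fields.push_back(current);   // may be empty: that is a blank field
            current.clear();
        } else {
            current += byte;             // fields can be of any length
        }
    }
    if (read_any) {
        fields.push_back(current);       // final record without a trailing '\n'
    }
    return read_any;
}
```

Because a field simply grows until the next delimiter, SomePerson and Whoever need no special handling, and a blank field shows up as an empty string rather than as a run of spaces. The same function works on a std::ifstream or on a std::istringstream built from a test string.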
Icysmarty suggested that he was using the fstream >> operator, which to my knowledge always skips whitespace (just like cin). So, although your example with | as the delimiter is good, I don't think that's the case here.
Icysmarty, could you be more specific about what kind of input data you're reading from the file, and what kind of variables you are trying to store it in?
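A small sketch of why >> has trouble with blank fields: formatted extraction skips leading whitespace by default, so an empty field between two tabs is never seen. The helper name extract_tokens and the tab-separated sample are assumptions made up for this illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every token that operator>> yields from `text`.
std::vector<std::string> extract_tokens(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string token;
    // Each extraction skips any run of spaces, tabs and line feeds first,
    // so two adjacent tabs look exactly the same as one.
    while (in >> token) {
        tokens.push_back(token);
    }
    return tokens;
}
```

For the input "40\tabc.de\t\t30,000\n" this returns three tokens, not four: the blank third field disappears, and every later value would shift into the wrong member of the record.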
ASKER
I decided not to use the fstream >> operator because it does not work for my case.
I am reading a data file with 7 fields and storing them in a record of struct type.
I create an array for each field and read the data file byte by byte.
The data file looks like the following:
1st
40 abc.de john_smith 30,000 23 5
2nd
50 xxyy.d barbara 20,000 25 2
...
...
Obviously there will be a tab and a line feed after "1st", so I need to check for these conditions. But there are too many conditions to cover for the program to run correctly. That's why I am asking whether there are other C++ functions that I am not aware of.
ASKER
>> By looking for the delimiter, which is "|" in the above sample
I got it. The program works. I didn't know that two consecutive delimiters mean a blank field. Thank you very much.