I am trying to extract two parts of unknown size from a database (a text file).
It is composed as follows:
>Title\n //The '>' precedes a section of the database and \n terminates this header
//That line thus contains the header or title for the following section
AACCCTAGCTAGCATCAGCACGACG\n
ACGACTAGCACTNACGGCGACCTCG\n
ACACGAGCTGCGCCATAGCAGCAGG\n
//Each line is terminated after some characters with a linefeed '\n'
//Can be anywhere from a few to several hundred thousand characters.
//Only ACGNT however
>Next section\n
....
>Third section\n
...
>and so on for 100 MB\n
I would like to extract the header to a variable called scaffold
and the corresponding section of the database to a variable called code.
Both are char*.
I use the following approach, but it only works for the first section; thereafter it produces weird
results.
bool CDna2Protein::GetNextPart(const char * _filePath)
{
    if(!databasein.is_open()) //Declared as class variable so I do not have
                              //to keep opening and closing a file.
                              //ifstream databasein;
        databasein.open(_filePath, ios_base::in);
    char ch, test = '>';
    int pos = 0;
    char *line = new char[256];
    //Read in header for this part of the database
    databasein.getline(line,256,'\n');
    scaffold = new char[256];
    sprintf(scaffold,"%s",line);
    delete[] line; //allocated with new[], so release with delete[]
    //Now get the current position in the file
    int before = databasein.tellg();
    before--; //Does seem to give the next position so account for that
    int after = 0;
    while(databasein.get(ch)) {
        if(ch == test) { //test = '>';
            databasein.putback(ch);
            //I would like the next read to start with the character I just put back.
            after = databasein.tellg();
            after--; //see above
            break;
        }
    }
    int len = after-before;
    //Knowing the size of the database section create a buffer to hold it
    code = new char[len+1];
    databasein.seekg(before);
    databasein.get(code,len,test); //test='>';
    databasein.seekg(after);
    //should hopefully end up at desired position within file
    code[len]='\0';
    //do some output for testing
    len = int(strlen(code));
    char *temp = new char[10];
    sprintf(temp,"%d",len);
    CString str = "Laenge von "+CString(scaffold)+": "+CString(temp);
    delete[] temp; //allocated with new[], so release with delete[]
    StringToFile("c:\\gpfc++\\len.log",str);
    trim();
    //Need to figure out how to test for termination.
    return(true);
}
Besides the problem of the wrong parts of the database ending up in scaffold,
I also need to return false on reaching the end of the file (feof).
Thank you for any suggestions
Jens