Solved

crap in textfile

Posted on 1998-10-05
13
Medium Priority
384 Views
Last Modified: 2010-05-19
Hello,

I'm reading a text file like this:

(global) var lessX:integer;

var f:Textfile;
    s:string;

begin
AssignFile(f,'c:\test1.txt');
Reset(f);
while not eof(f) do
  begin
   readln(f,s);
   if s<'100' then inc(lessX);
  end;
CloseFile(s);
end;

Sometimes the text file contains some crap, and the while
loop stops before the real end of file is reached.
Is there a way to avoid this early exit?
The stray characters range from #0 to #255.

regards
0
Comment
Question by:benni
13 Comments
 
LVL 1

Expert Comment

by:BlackDeath
ID: 1341799
could you mail me such a crappy file?
my email address can be found in my profile.

regs,
Black Death.
0
 
LVL 2

Expert Comment

by:rene100
ID: 1341800
you can try it with a TMemoryStream and the
method LoadFromFile(FileName).
perhaps this works

regards
rene
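A minimal sketch of how strings could be pulled out of a TMemoryStream, assuming an 8-bit text file with CRLF line ends. The names LoadAll and SplitCRLF and the file demo.txt are made up for illustration. Note that this holds the file in memory twice (stream plus string), so it may not suit very large files.

```pascal
program MemStreamLines;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

{ Hypothetical helper: loads the whole file into an AnsiString. }
function LoadAll(const FileName: string): AnsiString;
var
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  try
    MS.LoadFromFile(FileName);
    { Copy the raw bytes; characters such as #26 are kept as they are }
    SetString(Result, PAnsiChar(MS.Memory), Integer(MS.Size));
  finally
    MS.Free;
  end;
end;

{ Hypothetical helper: splits Data at CRLF pairs only. }
procedure SplitCRLF(const Data: AnsiString; Lines: TStrings);
var
  Start, I: Integer;
begin
  Start := 1;
  I := 1;
  while I < Length(Data) do
  begin
    if (Data[I] = #13) and (Data[I + 1] = #10) then
    begin
      Lines.Add(string(Copy(Data, Start, I - Start)));
      Inc(I, 2);
      Start := I;
    end
    else
      Inc(I);
  end;
  { Keep a final line that has no trailing CRLF }
  if Start <= Length(Data) then
    Lines.Add(string(Copy(Data, Start, Length(Data) - Start + 1)));
end;

var
  Lines: TStringList;
  Demo: TFileStream;
const
  Sample: AnsiString = 'one'#13#10'two'#26'x'#13#10'three';
begin
  { Write a small demo file that contains a stray #26 }
  Demo := TFileStream.Create('demo.txt', fmCreate);
  try
    Demo.WriteBuffer(Sample[1], Length(Sample));
  finally
    Demo.Free;
  end;
  Lines := TStringList.Create;
  try
    SplitCRLF(LoadAll('demo.txt'), Lines);
    WriteLn(Lines.Count, ' lines'); { prints "3 lines" }
  finally
    Lines.Free;
  end;
end.
```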
0
 

Author Comment

by:benni
ID: 1341801
Black Death
hmm the files are 40 MB or bigger ...

Rene
how do I extract strings from this method ?


0
 
LVL 4

Expert Comment

by:erajoj
ID: 1341802
Hi,
Can you try the code below and tell us whether the result (in the message box) is equal to the file size? Then I know whether to put any energy into this.

var
  f: File;
  p: pointer;
  iSize, cRead, cTotal: Integer;
begin
  AssignFile( f, 'c:\test1.txt' );
  FileMode := 0;
  Reset( f, 1 );
  iSize := 1 shl 16; // 64kB
  cTotal := 0;
  GetMem( p, iSize );
  repeat
    BlockRead( f, p^, iSize, cRead );
    Inc( cTotal, cRead );
  until ( cRead<>iSize );
  FreeMem( p );
  CloseFile( f ); // not 's'!
  ShowMessage( IntToStr( cTotal ) + ' bytes read.' );
end;

/// John
0
 
LVL 4

Expert Comment

by:erajoj
ID: 1341803
Hi again,
What would happen if the line contains '0123', '+123' or ' 123'?
Would you miscalculate "lessX"?
It seems so, since all three strings above compare as less than '100'
because of their first character's position in the ASCII charset.

/// John
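
A small sketch of the point above, assuming the lines are meant to be compared as integers (the thread does not confirm this). IsLessThan is a hypothetical helper that converts with Val before comparing, so '0123' is no longer counted as less than 100.

```pascal
program NumericCompare;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

{ Hypothetical helper: True when S, with surrounding blanks trimmed,
  is a valid integer below Limit. Lines Val rejects give False. }
function IsLessThan(const S: string; Limit: Integer): Boolean;
var
  Value, Code: Integer;
begin
  Val(Trim(S), Value, Code);
  Result := (Code = 0) and (Value < Limit);
end;

begin
  { String comparison: True, although 123 is not below 100 }
  WriteLn('0123' < '100');
  { Numeric comparison after conversion }
  WriteLn(IsLessThan('0123', 100));
  WriteLn(IsLessThan(' 123', 100));
  WriteLn(IsLessThan('99', 100));
end.
```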
0
 
LVL 1

Accepted Solution

by:ow
ow earned 400 total points
ID: 1341804
Hi benni,

you have to scan for the strings like this:

  var
    F :file of char;
    C :char;
    S :string;
  begin
  AssignFile(F, TEXT_FILE);
  Reset(F);
  {Initialize S}
  S := '';
  while not EoF(F) do
    begin
    {Read one character}
    Read(F, C);
    if (C <> #13) then
      S := S + C
    else
      begin
      {Here you do with S what you want...}
      {...}
      ListBox1.Items.Add(S);
      {Reinitialize S}
      S := '';
      {Overread linefeed, unless the file ends right after the CR}
      if not EoF(F) then
        Read(F, C);
      end;
    end;
  CloseFile(F);
  end;

Regards
  ow
0
 
LVL 1

Expert Comment

by:ow
ID: 1341805
Please delete the line
 ListBox1.Items.Add(S);
from the code (its from another use).

ow
0
 
LVL 4

Expert Comment

by:erajoj
ID: 1341806
Hi,
Yes, that answer really provides a fast solution for large files and is so much better than Borland's own "ReadLn" implementation, reading one char at a time!!! ;-(
Will the answer work better than the original code if there are stray EOLs in the file? ...NO, it won't!
Please try my code example first, if you want a serious solution to the problem.

Typical solution from a sysadmin, scratching the surface! ;-)

/// John
0
 
LVL 1

Expert Comment

by:BlackDeath
ID: 1341807
ouch - circus maximus ?
>:->

benni - 40mb _zipped_ ?

Black Death.
0
 
LVL 1

Expert Comment

by:ow
ID: 1341808
Hi Benni, hi John!

The described solution will work better than Borland's ReadLn, because ReadLn works on type "text", while my code uses a typed "file". Therefore my code reads the whole file and does not terminate on char(26).
But you are right, I forgot about single CRs.
To prevent single CRs from being seen as line ends, we have to check two characters.
With my simple example I wanted to show that it is necessary to look at every char to determine the line ends.
And of course the following code is much faster:

  type
    tCardinalArray = array[0..High(integer) div 2] of char;
  var
    FileStream :tFileStream;
    FileSize   :integer;
    Buffer     :^tCardinalArray;
    Index      :integer;
    LastIndex  :integer;
    C, C1      :char;
    Count      :integer;
    S          :string;
  begin
  FileStream := tFileStream.Create(TEXT_FILE, fmOpenRead);
  FileSize := FileStream.Size;
  Buffer := AllocMem(FileSize);
  FileStream.ReadBuffer(Buffer^, FileSize);
  FileStream.Free;
  {Initialize S}
  S := '';
  C1 := ' ';
  LastIndex := 0;
  for Index := 0 to FileSize - 1 do
    begin
    {Read one character}
    C := Buffer^[Index];
    {Test for CRLF}
    if (C1 = #13) and (C = #10) then
      begin
      Count := Index - LastIndex - 1;
      SetLength(S, Count);
      {Skip the copy for empty lines: S[1] is not valid for an empty string}
      if (Count > 0) then
        Move(Buffer^[LastIndex], S[1], Count);
      LastIndex := Index + 1;
      {Here you may do with S what you want...}
      {...}
      end;
    {Remember C for next loop}
    C1 := C;
    end;
  FreeMem(Buffer);  
  end;

regards
  ow

0
 
LVL 1

Expert Comment

by:ow
ID: 1341809
Hi Benni,

I remembered that you want to work on very large files.
So if you don't have enough RAM to read the whole file in, here is another version, which uses only a small buffer.
It is almost as fast as the version above (10 MB in 5 s on a P90).

  const
    MAX_BUF_SIZE = $FFF;
  var
    FileStream :tFileStream;
    FileSize   :integer;
    Buffer     :pByteArray;
    BufSize    :integer;
    Index      :integer;
    LastIndex  :integer;
    C, C1      :char;
    S          :string;

  procedure FromBufToS(BufPos :integer);
    var
      Count :integer;
      SLen  :integer;
    begin
    Count := BufPos - LastIndex;
    if (Count > 0) then
      begin
      SLen := Length(S);
      SetLength(S, SLen + Count);
      Move(Buffer^[LastIndex], S[SLen + 1], Count);
      end;
    end;

  begin
  BufSize := MAX_BUF_SIZE;
  GetMem(Buffer, BufSize);
  FileStream := tFileStream.Create(TEXT_FILE, fmOpenRead);
  FileSize := FileStream.Size;
  {Initialize C1, S}
  C1 := ' ';
  S := '';
  while (FileSize > 0) do
    begin
    if (FileSize < BufSize) then
      BufSize := FileSize;
    FileStream.ReadBuffer(Buffer^, BufSize);
    Dec(FileSize, BufSize);
    LastIndex := 0;
    for Index := 0 to BufSize - 1 do
      begin
      {Read one character}
      C := char(Buffer^[Index]);
      {Test for CRLF}
      if (C1 = #13) and (C = #10) then
        begin
        if (Index = 0) then
          {Remove CR }
          Delete(S, Length(S), 1)
        else
          {Move chars to S, exclude CRLF}
          FromBufToS(Index - 1);
        LastIndex := Index + 1;
        {Here you do with S what you want...}
        {...}
        {Reset S}
        S := '';
        end;
      {Remember C for next loop}
      C1 := C;
      end;
    {Move Buffer to S}
    FromBufToS(BufSize);
    end;
  FreeMem(Buffer);
  end;


regards
  ow
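
For reference, a compact sketch that ties the chunked reading above to the original goal of counting lines below 100. It is not ow's code: it appends byte by byte, which favours simplicity over speed, and it assumes the lines are meant as integers. CountLessThan, IsLessThan and demo.txt are hypothetical names. A last line without a trailing CRLF is counted too.

```pascal
program CountSmallLines;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

{ Hypothetical helper: True when Line is a valid integer below Limit. }
function IsLessThan(const Line: AnsiString; Limit: Integer): Boolean;
var
  Value, Code: Integer;
begin
  Val(Trim(string(Line)), Value, Code);
  Result := (Code = 0) and (Value < Limit);
end;

{ Reads FileName in binary chunks, splits at CRLF only, and counts
  the lines whose integer value is below Limit. }
function CountLessThan(const FileName: string; Limit: Integer): Integer;
const
  BUF_SIZE = 64 * 1024;
var
  Stream: TFileStream;
  Buffer: array of Byte;
  BytesRead, I: Integer;
  Line: AnsiString;
  Prev, C: AnsiChar;
begin
  Result := 0;
  Line := '';
  Prev := #0;
  SetLength(Buffer, BUF_SIZE);
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      BytesRead := Stream.Read(Buffer[0], BUF_SIZE);
      for I := 0 to BytesRead - 1 do
      begin
        C := AnsiChar(Buffer[I]);
        if (Prev = #13) and (C = #10) then
        begin
          { The CR was already appended to Line: drop it }
          SetLength(Line, Length(Line) - 1);
          if IsLessThan(Line, Limit) then
            Inc(Result);
          Line := '';
        end
        else
          Line := Line + C;
        { Line and Prev survive across chunk boundaries }
        Prev := C;
      end;
    until BytesRead = 0;
  finally
    Stream.Free;
  end;
  { Final line without a trailing CRLF }
  if (Line <> '') and IsLessThan(Line, Limit) then
    Inc(Result);
end;

var
  Demo: TFileStream;
const
  Sample: AnsiString = '5'#13#10'0123'#13#10'ju'#26'nk'#13#10'99';
begin
  { Write a small demo file that contains a stray #26 }
  Demo := TFileStream.Create('demo.txt', fmCreate);
  try
    Demo.WriteBuffer(Sample[1], Length(Sample));
  finally
    Demo.Free;
  end;
  WriteLn(CountLessThan('demo.txt', 100)); { prints 2: the lines '5' and '99' }
end.
```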

0
 

Author Comment

by:benni
ID: 1341810
thanks ow - seems that it works ...

btw: your method is about 5 % slower than the readln. It seems
that my implementation of your source has a little more overhead - but don't worry, time doesn't matter at this point :-) !

thx again

egono

0
 

Author Comment

by:benni
ID: 1341811
for all the other boys and girls - I accepted ow's last comment and not his answer !!!

0
