• Status: Solved

How to get strings from a file quickly

I am trying to extract the strings from a selected file as quickly as possible. My routine takes about 1 second to process a 1 MB file, which is too slow. Any help is appreciated!
procedure tform2.getstrings(f: string; var s: TStringlist);
var fs: tfilestream;
    temp: string;
    ch: char;
    i: integer;
begin
  fs:=tfilestream.Create(f, fmsharedenynone);
  for i:=0 to fs.Size-1 do
    begin
      fs.Read(ch, sizeof(ch));
      if (ch in ['A'..'Z','a'..'z','0'..'9',':','/','.']) then
        temp:=temp+ch
         else
        begin
          if (length(temp) > 8) then
            s.Add(temp);
          temp:='';
        end;
    end;
  fs.Free;
end;

Asked by: DSOM
1 Solution

ThievingSix commented:
Well, the reason it is so slow is that you're reading 1 byte at a time. Every call to fs.Read goes through to the operating system, and that per-call overhead on file I/O can really bottleneck a program.

Let me make sure I understand what you are doing. From what I can tell, you go through the entire file collecting runs of certain characters. When a character isn't one of them, you keep the string collected so far if it is longer than 8 characters; otherwise you discard it. Correct?
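
To make that reading concrete, here is a minimal sketch of the rule applied to data that is already in memory. The name `CollectRuns` and the `MinLen` parameter are made up for illustration and are not from your code; it is only meant to confirm the logic, not to be fast.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed when compiling with Free Pascal
uses
  Classes;

// Hypothetical helper: adds every run of marker characters in Data that is
// longer than MinLen to SL. Any other byte ends the current run.
procedure CollectRuns(const Data: AnsiString; MinLen: Integer; SL: TStrings);
const
  Markers = ['A'..'Z','a'..'z','0'..'9',':','/','.'];
var
  I: Integer;
  Current: AnsiString;
begin
  Current := '';
  for I := 1 to Length(Data) do
    if Data[I] in Markers then
      Current := Current + Data[I]
    else
    begin
      if Length(Current) > MinLen then
        SL.Add(String(Current));
      Current := '';
    end;
  // Like the posted routine, a run that reaches the very end of Data
  // without a terminating byte is not added.
end;
```

Calling `CollectRuns(Data, 8, SL)` corresponds to the `length(temp) > 8` test in the posted routine.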

ThievingSix commented:
Here is my go at it. I tested both your function and mine to compare speed, using a DLL from Java that was about one megabyte.

Yours clocked in at about 1600 ms (about 1.6 seconds).

The one below took about 60 ms.

It's probably not the most optimized, but it should point you in the right direction. If you want more detail on any part, let me know.
procedure GetStrings(FileName: String; var SL: TStringList);
const
  // Characters that count as part of a string
  Markers = ['A'..'Z','a'..'z','0'..'9',':','/','.'];
  BlockSize = 4096;
var
  hFile : THandle;
  Position : DWORD;
  FileSize : DWORD;
  Buffer : PAnsiChar; // single bytes, also in Unicode versions of Delphi
  BytesToRead : DWORD;
  BytesRead : DWORD;
  I : Integer;
  CurrentString : AnsiString;
begin
  If Not(FileExists(FileName)) Then Exit;
  // Allow shared access, like fmShareDenyNone in the original routine
  hFile := CreateFile(PChar(FileName),GENERIC_READ,FILE_SHARE_READ or FILE_SHARE_WRITE,nil,OPEN_EXISTING,0,0);
  If hFile = INVALID_HANDLE_VALUE Then Exit;
  Try
    Position := 0;
    CurrentString := '';
    // Seeking to the end returns the file size (files under 4 GB only)
    FileSize := SetFilePointer(hFile,0,nil,FILE_END);
    If (FileSize = 0) or (FileSize = $FFFFFFFF) Then Exit;
    SetFilePointer(hFile,0,nil,FILE_BEGIN);
    Buffer := AllocMem(BlockSize); // raises an exception on failure
    Try
      Repeat
        BytesToRead := BlockSize;
        If BytesToRead > (FileSize - Position) Then
          BytesToRead := FileSize - Position;
        // Stop on a read error or a short read of zero bytes instead of looping forever
        If Not ReadFile(hFile,Buffer^,BytesToRead,BytesRead,nil) Then Break;
        If BytesRead = 0 Then Break;
        Inc(Position,BytesRead);
        For I := 0 To Integer(BytesRead) - 1 Do
          begin
          If Buffer[I] in Markers Then
            CurrentString := CurrentString + Buffer[I]
          Else
            begin
            If Length(CurrentString) > 8 Then
              SL.Add(String(CurrentString));
            CurrentString := '';
          end;
        end;
      Until (Position >= FileSize);
      // Keep a string that runs right up to the end of the file
      If Length(CurrentString) > 8 Then
        SL.Add(String(CurrentString));
    Finally
      FreeMem(Buffer);
    end;
  Finally
    CloseHandle(hFile);
  end;
end;
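
If you would rather stay with `TFileStream`, a similar speed-up should be possible by reading the whole file with a single call and scanning it in memory; for a file of about 1 MB the memory cost is small. The sketch below works under that assumption. `GetStringsInMemory` is a hypothetical name, the routine has not been timed, and unlike your original it also keeps a string that runs up to the end of the file. It remembers where each run starts and copies it out once, instead of growing a string one character at a time.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed when compiling with Free Pascal
uses
  Classes, SysUtils;

// Hypothetical variant: one read for the whole file, then an in-memory scan.
procedure GetStringsInMemory(const FileName: String; SL: TStrings);
const
  Markers = ['A'..'Z','a'..'z','0'..'9',':','/','.'];
var
  FS: TFileStream;
  Data: AnsiString;
  I, RunStart, Len: Integer;
begin
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    Len := Integer(FS.Size); // assumes the file fits comfortably in memory
    SetLength(Data, Len);
    if Len > 0 then
      FS.ReadBuffer(Data[1], Len); // a single read instead of one per byte
  finally
    FS.Free;
  end;
  RunStart := 0; // 0 means "not inside a run"
  for I := 1 to Len + 1 do
    // Index Len + 1 acts as a terminator so the last run is flushed too
    if (I <= Len) and (Data[I] in Markers) then
    begin
      if RunStart = 0 then RunStart := I;
    end
    else
    begin
      if (RunStart > 0) and (I - RunStart > 8) then
        SL.Add(String(Copy(Data, RunStart, I - RunStart)));
      RunStart := 0;
    end;
end;
```

The block-based version above is the better fit for very large files, since it never holds more than one block in memory.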
