hijinx

Read a txt file into a TStringList, remove duplicates, write the result to a new txt file

I have a list of phone numbers in a txt file. I want to read them out of the file into a TStringList (or whatever is best), remove any duplicates, and then write them to a new txt file without any duplicates. There are about 10,000-20,000 phone numbers in the txt file. P.S. I also want to eliminate any blank lines, and I want to append data to the new file later.
ASKER CERTIFIED SOLUTION
erajoj

hijinx

ASKER

Edited text of question
In what way do you mean append?


procedure TMainForm.Button8Click(Sender: TObject);
var
  Index   : Integer;
  StrList : TStringList;
  TempStr : string;
begin
  Screen.Cursor := crHourGlass;
  StrList := TStringList.Create;
  try
    StrList.LoadFromFile('C:\SLASK.TXT');
    StrList.Sort;
    TempStr := '';
    Index := 0;
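    // "while" instead of "repeat ... until": an empty file must not
    // reach StrList[0], which would raise a list index error.
    while Index < StrList.Count do
    begin
      if (StrList[Index]=TempStr)  // same as the last line kept (list is sorted)
      or (StrList[Index]='') // <<< Here! (blank line)
      then begin
        StrList.Delete(Index);
      end else begin
        TempStr := StrList[Index];
        Inc(Index);
      end;
    end;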
    StrList.SaveToFile('C:\SLASK2.TXT');
  finally
    StrList.Free;
    Screen.Cursor := crDefault;
  end;
end;

/// John
hijinx

ASKER

What I really mean is this: I want to create a file, read character data into it from another file, sort it and remove the duplicates, and then add that data to a third file that already holds sorted, de-duplicated data. So far I've done this by creating two sorted, duplicate-free files with the little program you sent and then reading the data from both into a third file. This is not a very elegant or fast way to do it, especially if each file holds a lot of data, say 100,000 records of 75-100 characters each.
If you want higher speeds, you have to increase the points.
Maybe you can use a hashtable for the duplicate checking?
Are the records of uniform length?

/// John
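
A possible way to skip the two intermediate files, offered only as a sketch: a TStringList with Sorted set to True and Duplicates set to dupIgnore keeps itself ordered and silently drops lines it already holds, so new data can be added straight into the existing file's contents. The procedure name MergeUnique, the helper WriteLines, and the file names master.txt and incoming.txt are made up for this illustration; the existing file is assumed to be free of blank lines already.

```pascal
program MergeDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Merges the lines of NewFile into MasterFile, skipping blank lines and
// lines that MasterFile already contains. Both names are hypothetical.
procedure MergeUnique(const MasterFile, NewFile: string);
var
  Master, Incoming: TStringList;
  I: Integer;
begin
  Master := TStringList.Create;
  Incoming := TStringList.Create;
  try
    Master.Sorted := True;          // list stays ordered as lines are added
    Master.Duplicates := dupIgnore; // Add ignores a line that is already present
    if FileExists(MasterFile) then
      Master.LoadFromFile(MasterFile);
    Incoming.LoadFromFile(NewFile);
    for I := 0 to Incoming.Count - 1 do
      if Trim(Incoming[I]) <> '' then
        Master.Add(Incoming[I]);
    Master.SaveToFile(MasterFile);
  finally
    Incoming.Free;
    Master.Free;
  end;
end;

// Demo helper (hypothetical): writes the given lines to a text file.
procedure WriteLines(const FileName: string; const Lines: array of string);
var
  L: TStringList;
  I: Integer;
begin
  L := TStringList.Create;
  try
    for I := Low(Lines) to High(Lines) do
      L.Add(Lines[I]);
    L.SaveToFile(FileName);
  finally
    L.Free;
  end;
end;

begin
  WriteLines('master.txt', ['555-0100', '555-0102']);
  WriteLines('incoming.txt', ['555-0101', '', '555-0100']);
  MergeUnique('master.txt', 'incoming.txt');
  // master.txt now holds 555-0100, 555-0101 and 555-0102, one per line.
end.
```

Note that a TStringList compares case-insensitively by default, so 'abc' and 'ABC' would count as duplicates here; that does not matter for phone numbers but may for other data. Each Add into a sorted list is a binary search plus an insertion, which should be fine for files of this size but has not been timed.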
hijinx

ASKER

John,

The records vary between 10 and 75 characters, none shorter than 10 and none longer than 75.  However, you know, I think the way I'm doing it will be OK.  Thanks a lot for your help.

I will try a hash table and see if it works any better, though.


regards
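
For reference, the hash table idea mentioned above could look roughly like the sketch below. It assumes a compiler that ships the Generics.Collections unit (recent Delphi versions, or Free Pascal 3.2 and later), which is not something the thread itself specifies; the procedure name CopyUnique is made up for this illustration. Unlike the sort-based approach, it keeps the lines in their original order and compares them case-sensitively.

```pascal
program HashDedupDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, Generics.Collections;

// Copies the non-blank lines of Source to Dest, keeping only the first
// occurrence of each line. The dictionary serves as a "seen" set, so each
// duplicate check is a hash lookup rather than a scan of the list.
procedure CopyUnique(Source, Dest: TStrings);
var
  Seen: TDictionary<string, Boolean>;
  I: Integer;
begin
  Seen := TDictionary<string, Boolean>.Create;
  try
    for I := 0 to Source.Count - 1 do
      if (Source[I] <> '') and not Seen.ContainsKey(Source[I]) then
      begin
        Seen.Add(Source[I], True);
        Dest.Add(Source[I]);
      end;
  finally
    Seen.Free;
  end;
end;

var
  Src, Dst: TStringList;
  I: Integer;
begin
  Src := TStringList.Create;
  Dst := TStringList.Create;
  try
    Src.Add('555-0102');
    Src.Add('555-0100');
    Src.Add('');
    Src.Add('555-0102');
    CopyUnique(Src, Dst);
    for I := 0 to Dst.Count - 1 do
      WriteLn(Dst[I]); // first occurrences only, in input order
  finally
    Dst.Free;
    Src.Free;
  end;
end.
```

Whether this beats sorting for 100,000 records would have to be measured; the sketch only shows the shape of the technique.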
hijinx

ASKER

John,

One further and, I hope, final question.  I want to use a construct to find and extract the string that follows a given word.  So I guess I need to use IndexOf; is this the correct construct:

"If IndexOf(TstrList[Index]: 'And:') > -1
      Then Begin"


IndexOf only finds a list entry that matches 'And:' exactly (case insensitively). If that is what you're looking for, it will work; otherwise
you'll have to use brute force + "Pos" or divide & conquer + "Pos",
or some other method.
If you need an example or further help, then give me my points or reject my answer so that somebody else can help you!
Tip: do not use "TstrList" as an identifier; simply use "StrList"
or something else. "T" is used de facto as a "type/class" prefix.
Following that convention makes things easier later on.

/// John
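
As an illustration of the brute-force + "Pos" route mentioned above, here is a minimal sketch. The function name TextAfter and the sample lines are made up for this example; it walks every line, looks for the marker with Pos, and copies whatever follows it. Pos is case-sensitive, so 'and:' would not match 'And:'.

```pascal
program ExtractAfterDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Returns the trimmed text that follows the first occurrence of Marker
// in Line, or '' when Marker does not occur in Line.
function TextAfter(const Line, Marker: string): string;
var
  P: Integer;
begin
  P := Pos(Marker, Line); // 1-based position, 0 when not found
  if P > 0 then
    Result := Trim(Copy(Line, P + Length(Marker), MaxInt))
  else
    Result := '';
end;

var
  StrList: TStringList;
  I: Integer;
  Found: string;
begin
  StrList := TStringList.Create;
  try
    StrList.Add('Name: Smith And: Jones');
    StrList.Add('no marker here');
    for I := 0 to StrList.Count - 1 do
    begin
      Found := TextAfter(StrList[I], 'And:');
      if Found <> '' then
        WriteLn(Found); // prints "Jones" for the first sample line
    end;
  finally
    StrList.Free;
  end;
end.
```

A line where the marker is present but nothing follows it also yields '', so a caller that needs to tell the two cases apart would have to check Pos directly.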