Solved

WORKING WITH VERY LARGE WORD LISTS

Posted on 2003-11-16
Last Modified: 2010-04-05
Dear Delphi Experts,

I am starting to write a program that compares two very large word lists (about 40,000 and 70,000 words) and finds out which words
are on one list and not on the other (and/or vice versa).

Both word lists are text files (.txt), and the results must be
presented in a third text file or in a list box.

Many Thanks

LeTchev
Question by:letchev

Expert Comment

by:Smortex
ID: 9759832
Try this:
Load each word list into a TStringList (one word per line) and sort them.

Create a function that searches for an item in a list.

Here is an example. To make it easier to read, I used two TListBox controls. When an item in the first one is clicked, a TLabel gets the caption "True" if the selected item is found in the second TListBox, and "False" if it is not:

procedure TForm1.ListBox1Click(Sender: TObject);
  function FindTheOther(AWord: string; AList: TStringList): Boolean;
  var
    Offset, Step, CompResult: integer;
  begin
    Offset := AList.Count div 2;
    Step   := AList.Count;
    while Step <> 0 do
    begin
      Step := Step div 2;
      if Offset + Step >= AList.Count then
        Step := AList.Count - Offset - 1;
      CompResult := CompareText(AWord,AList[offset]);
      if CompResult = 0 then
      begin
        Result := True;
        Exit;
      end
      else
        if CompResult > 0 then
          Offset := Offset + Step
        else
          Offset := Offset - Step;
    end;
    Result := False;
  end;
begin
  Label1.Caption := BoolToStr(FindTheOther(ListBox1.Items[ListBox1.ItemIndex],TStringList(ListBox2.Items)),True);
end;

Hope that helps :)

Regards
 

Accepted Solution

by:
Smortex earned 500 total points
ID: 9760080
Ooops....

This function should work better ;)

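// Binary search: returns True if AWord occurs in AList.
// AList must already be sorted case-insensitively. AnsiCompareText (SysUtils)
// is used so that the comparison follows the same ordering that
// TStringList.Sort applies by default (CaseSensitive = False).
function FindTheOther(const AWord: string; AList: TStrings): Boolean;
var
  First, Last, Middle, CompResult: Integer;
begin
  Result := False;
  First := 0;
  Last := AList.Count - 1;      // an empty list gives Last = -1, so the loop is skipped
  while First <= Last do
  begin
    Middle := First + (Last - First) div 2;
    CompResult := AnsiCompareText(AWord, AList[Middle]);
    if CompResult = 0 then
    begin
      Result := True;
      Exit;
    end
    else if CompResult > 0 then
      First := Middle + 1       // AWord sorts after the middle item: search the upper half
    else
      Last := Middle - 1;       // AWord sorts before the middle item: search the lower half
  end;
end;

procedure TForm1.ListBox1Click(Sender: TObject);
begin
  if ListBox1.ItemIndex < 0 then
    Exit;                       // nothing selected
  // ListBox2 must be sorted (ListBox2.Sorted := True) for the search to work.
  // Items is passed as TStrings; it is not a TStringList, so no cast is used.
  Label1.Caption := BoolToStr(FindTheOther(ListBox1.Items[ListBox1.ItemIndex], ListBox2.Items), True);
end;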

Sorry....

Regards
 

Author Comment

by:letchev
ID: 9791630
Dear Smortex,

Sorry, but that is not quite what I want. First, I need a routine that reads the word lists from .txt or .csv files, for example:

List1.LoadFromFile('c:\1.txt');
List2.LoadFromFile('c:\2.txt');

ListBox3.Items.Add(...);  // here, the common words found in List1 and List2

Is this possible?

Thank you for your patience.

Letchev
 

Expert Comment

by:Smortex
ID: 9799850
You can do this very easily using my function:

  List1 := TStringList.Create;
  List2 := TStringList.Create;
  try
    List1.LoadFromFile('c:\1.txt');
    List2.LoadFromFile('c:\2.txt');

    List1.Sort;
    List2.Sort;

    for i := 0 to Pred(List1.Count) do
      if FindTheOther(List1[i],List2) then
        ListBox1.Items.Add(List1[i]);

  finally
    List1.Free;
    List2.Free;
  end;
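
The original question also asks for the words that are on one list and not on the other. A minimal sketch of that variant follows, assuming one word per line. It swaps in TStringList.Find (the built-in binary search) in place of FindTheOther; the names WordsNotIn and CompareFiles and the command-line handling are only illustrative, not part of the answer above.

```pascal
program WordListDifference;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Fills OnlyInSource with every line of Source that does not occur in Lookup.
// TStringList.Find performs a binary search and needs a sorted list, so
// Lookup is switched to Sorted mode here (this reorders Lookup in place).
procedure WordsNotIn(Source, Lookup, OnlyInSource: TStringList);
var
  i, Index: Integer;
begin
  Lookup.Sorted := True;
  OnlyInSource.Clear;
  for i := 0 to Source.Count - 1 do
    if not Lookup.Find(Source[i], Index) then
      OnlyInSource.Add(Source[i]);
end;

// Hypothetical helper: reads two word files and writes the words found
// only in the first one to a third file.
procedure CompareFiles(const File1, File2, ResultFile: string);
var
  List1, List2, Missing: TStringList;
begin
  List1 := TStringList.Create;
  List2 := TStringList.Create;
  Missing := TStringList.Create;
  try
    List1.LoadFromFile(File1);
    List2.LoadFromFile(File2);
    WordsNotIn(List1, List2, Missing);
    Missing.SaveToFile(ResultFile);
  finally
    Missing.Free;
    List2.Free;
    List1.Free;
  end;
end;

begin
  if ParamCount = 3 then
    CompareFiles(ParamStr(1), ParamStr(2), ParamStr(3))
  else
    WriteLn('Usage: WordListDifference <list1.txt> <list2.txt> <result.txt>');
end.
```

For the "vice versa" direction, call WordsNotIn again with the first two arguments swapped. To show the result in a list box instead of a file, something like ListBox3.Items.Assign(Missing) should work.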

Regards
 

Expert Comment

by:Smortex
ID: 9844803
Note that if you never search in the first list (List1) with my "FindTheOther" function, you do not have to sort it :)
So this line can be removed:
    List1.Sort;
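
If the comparison has to go both ways (words only in List1 and words only in List2), both lists do need to be sorted. In that case a single merge-style pass over the two sorted lists is an alternative to repeated binary searches. The sketch below is only one way to do it: SplitDifferences and CompareWords are hypothetical names, and it sorts with CustomSort and CompareText so that the sort order and the comparison inside the loop are guaranteed to agree.

```pascal
program WordListMerge;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Sort callback: case-insensitive comparison, the same one used in the merge loop.
function CompareWords(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareText(List[Index1], List[Index2]);
end;

// Sorts A and B in place, then walks both lists in step.
// Words found only in A go to OnlyA, words found only in B go to OnlyB.
procedure SplitDifferences(A, B, OnlyA, OnlyB: TStringList);
var
  i, j, Cmp: Integer;
  Current: string;
begin
  A.CustomSort(CompareWords);
  B.CustomSort(CompareWords);
  OnlyA.Clear;
  OnlyB.Clear;
  i := 0;
  j := 0;
  while (i < A.Count) and (j < B.Count) do
  begin
    Cmp := CompareText(A[i], B[j]);
    if Cmp = 0 then
    begin
      // Common word: skip it, including any repeats, in both lists.
      Current := A[i];
      while (i < A.Count) and (CompareText(A[i], Current) = 0) do Inc(i);
      while (j < B.Count) and (CompareText(B[j], Current) = 0) do Inc(j);
    end
    else if Cmp < 0 then
    begin
      OnlyA.Add(A[i]);
      Inc(i);
    end
    else
    begin
      OnlyB.Add(B[j]);
      Inc(j);
    end;
  end;
  // Whatever is left in either list has no counterpart in the other.
  while i < A.Count do begin OnlyA.Add(A[i]); Inc(i); end;
  while j < B.Count do begin OnlyB.Add(B[j]); Inc(j); end;
end;

var
  List1, List2, Only1, Only2: TStringList;
begin
  List1 := TStringList.Create;
  List2 := TStringList.Create;
  Only1 := TStringList.Create;
  Only2 := TStringList.Create;
  try
    List1.Add('pear'); List1.Add('apple'); List1.Add('kiwi');
    List2.Add('plum'); List2.Add('Apple');
    SplitDifferences(List1, List2, Only1, Only2);
    WriteLn('Only in list 1: ', Only1.CommaText);
    WriteLn('Only in list 2: ', Only2.CommaText);
  finally
    Only2.Free;
    Only1.Free;
    List2.Free;
    List1.Free;
  end;
end.
```

A word that is repeated in one list and missing from the other is reported once per repeat; removing duplicates first avoids that if it matters.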

Regards
