Solved

WORKING WITH VERY LARGE WORD LISTS

Posted on 2003-11-16
Last Modified: 2010-04-05
Dear Delphi Experts,

I am starting to write a program that involves comparing two very large word lists (~40,000 and ~70,000 words) and finding out which words
are on one list and not on the other (and/or vice versa).

Each large word list must be a text file (.txt), and the results must be
presented in a third text file or a list box.

Many Thanks

LeTchev
Question by:letchev

Expert Comment

by:Smortex
ID: 9759832
Try this:
Load each word list into a TStringList (one word per line) and sort them.

Create a function that searches for an item in a list.

Here is an example. To make it easier to read, I used two TListBox controls. When an item is clicked, a TLabel gets the caption "True" if the selected item is found in the second TListBox, or "False" if it is not:

procedure TForm1.ListBox1Click(Sender: TObject);
  function FindTheOther(AWord: string; AList: TStringList): Boolean;
  var
    Offset, Step, CompResult: integer;
  begin
    Offset := AList.Count div 2;
    Step   := AList.Count;
    while Step <> 0 do
    begin
      Step := Step div 2;
      if Offset + Step >= AList.Count then
        Step := AList.Count - Offset - 1;
      CompResult := CompareText(AWord,AList[offset]);
      if CompResult = 0 then
      begin
        Result := True;
        Exit;
      end
      else
        if CompResult > 0 then
          Offset := Offset + Step
        else
          Offset := Offset - Step;
    end;
    Result := False;
  end;
begin
  Label1.Caption := BoolToStr(FindTheOther(ListBox1.Items[ListBox1.ItemIndex],TStringList(ListBox2.Items)),True);
end;

Hope that helps :)

Regards
 

Accepted Solution

by:
Smortex earned 500 total points
ID: 9760080
Ooops....

This function should work better ;)

// Needs SysUtils in the uses clause (AnsiCompareText, BoolToStr).
procedure TForm1.ListBox1Click(Sender: TObject);

  // Binary search. AList must already be sorted case-insensitively,
  // for example with TStringList.Sort or a TListBox with Sorted = True.
  function FindTheOther(const AWord: string; AList: TStrings): Boolean;
  var
    First, Last, Middle, CompResult: Integer;
  begin
    Result := False;
    First := 0;
    Last  := AList.Count - 1;   // an empty list gives Last = -1, so the loop is skipped
    while First <= Last do
    begin
      Middle := (First + Last) div 2;
      // AnsiCompareText matches the order TStringList.Sort uses by default
      CompResult := AnsiCompareText(AWord, AList[Middle]);
      if CompResult = 0 then
      begin
        Result := True;
        Exit;
      end
      else if CompResult > 0 then
        First := Middle + 1   // AWord sorts after AList[Middle]
      else
        Last := Middle - 1;   // AWord sorts before AList[Middle]
    end;
  end;

begin
  // ItemIndex is -1 when nothing is selected
  if ListBox1.ItemIndex < 0 then
    Exit;
  // ListBox2.Items is a TStrings, so it is passed as is (no cast to TStringList)
  Label1.Caption := BoolToStr(FindTheOther(ListBox1.Items[ListBox1.ItemIndex], ListBox2.Items), True);
end;

Sorry....
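As an alternative to a hand-written search, TStringList already offers a binary search through its Find method once the list is sorted. The following is only a minimal sketch, assuming a console project; the program name FindDemo and the three sample words are made up for illustration.

```pascal
program FindDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Words: TStringList;
  Index: Integer;
begin
  Words := TStringList.Create;
  try
    // Sample data, invented for this sketch
    Words.Add('pear');
    Words.Add('Apple');
    Words.Add('cherry');

    // Setting Sorted sorts the list now and keeps it sorted afterwards
    Words.Sorted := True;

    // Find does a binary search; with the default CaseSensitive = False
    // the comparison ignores case
    if Words.Find('APPLE', Index) then
      Writeln('found at index ', Index)
    else
      Writeln('not found');
  finally
    Words.Free;
  end;
end.
```

When Sorted is True, IndexOf also goes through Find, so existing code that calls IndexOf gets the faster lookup without further changes.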

Regards
 

Author Comment

by:letchev
ID: 9791630
Dear Smortex,

Sorry, but that is not quite what I want. First, I need a routine for reading the word lists from .txt or .csv files, for example:

List1.LoadFromFile('c:\1.txt');
List2.LoadFromFile('c:\2.txt');

ListBox3.Items.Add (here the common words found in List1 and List2)

Is it possible?

Thank you for your patience.

Letchev
 

Expert Comment

by:Smortex
ID: 9799850
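You can do this very easily using my function:

  // Assumes these declarations: List1, List2: TStringList; i: Integer;
  // FindTheOther must be declared as a standalone function (not nested
  // inside ListBox1Click) so that it can be called from here.
  List1 := TStringList.Create;
  List2 := TStringList.Create;
  try
    List1.LoadFromFile('c:\1.txt');
    List2.LoadFromFile('c:\2.txt');

    List1.Sort;
    List2.Sort;

    // BeginUpdate/EndUpdate stops the list box repainting after every Add
    ListBox1.Items.BeginUpdate;
    try
      for i := 0 to Pred(List1.Count) do
        if FindTheOther(List1[i], List2) then
          ListBox1.Items.Add(List1[i]);
    finally
      ListBox1.Items.EndUpdate;
    end;

  finally
    List1.Free;
    List2.Free;
  end;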
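For the original question (words that are on one list and not on the other, saved to a third text file), something along these lines might work. This is only a sketch and has not been run against real data: the input paths are the ones used in this thread, while CollectMissing, only_in_1.txt and only_in_2.txt are names made up for illustration. It relies on TStringList.IndexOf doing a binary search when Sorted is True.

```pascal
program WordListDiff;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: adds to Missing every line of Source that does not
// occur in Lookup. Lookup should have Sorted = True so that IndexOf uses a
// binary search instead of a linear scan.
procedure CollectMissing(Source: TStrings; Lookup: TStringList; Missing: TStrings);
var
  i: Integer;
begin
  Missing.BeginUpdate;
  try
    for i := 0 to Source.Count - 1 do
      if Lookup.IndexOf(Source[i]) < 0 then
        Missing.Add(Source[i]);
  finally
    Missing.EndUpdate;
  end;
end;

var
  List1, List2, OnlyIn1, OnlyIn2: TStringList;
begin
  List1 := TStringList.Create;
  List2 := TStringList.Create;
  OnlyIn1 := TStringList.Create;
  OnlyIn2 := TStringList.Create;
  try
    // One word per line is assumed in both files
    List1.LoadFromFile('c:\1.txt');
    List2.LoadFromFile('c:\2.txt');

    // Sort both lists, because each one is searched in turn
    List1.Sorted := True;
    List2.Sorted := True;

    CollectMissing(List1, List2, OnlyIn1);  // words in list 1 but not in list 2
    CollectMissing(List2, List1, OnlyIn2);  // words in list 2 but not in list 1

    // Output file names are assumptions for this sketch
    OnlyIn1.SaveToFile('c:\only_in_1.txt');
    OnlyIn2.SaveToFile('c:\only_in_2.txt');

    Writeln('Only in list 1: ', OnlyIn1.Count);
    Writeln('Only in list 2: ', OnlyIn2.Count);
  finally
    OnlyIn2.Free;
    OnlyIn1.Free;
    List2.Free;
    List1.Free;
  end;
end.
```

To show the results in a list box instead of a file, the same helper could be called with ListBox3.Items as the Missing parameter, since it only needs a TStrings.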

Regards
 

Expert Comment

by:Smortex
ID: 9844803
Note that if you don't run any search (with my "FindTheOther" function) on the first list (List1), you do not have to sort it :)
So this line can be removed:
    List1.Sort;

Regards