Solved

Problem with TStringList.IndexOf: too slow

Posted on 2004-09-24
Last Modified: 2008-01-09
I have 10,000+ numbers, some of which are duplicates, and they are not sorted in any order (neither ascending nor descending).
e.g.  num1 = {100,1234,4566,66,100,1,4566,1234,9999 ......}

I need to remove the duplicate numbers without changing the order of the remaining numbers, so the result should be:

num1 = {100,1234,4566,66,1,9999 ........}

So what I do is:

var num1,num2 :tstringlist;

for i:=0 to num1.count - 1 do begin
  if num2.indexof(num1.strings[i])<0 then begin
    num2.add(num1.strings[i]);
  end;
end;

result := num2.text;

But the problem is that it runs very slowly: it needs about 40+ seconds to process 10,000+ numbers.

Another issue is that the total count keeps changing: sometimes I have 10 numbers in total, and sometimes 10,000+.

Any ideas?

Thanks
Question by:chongkeng_woon
13 Comments

Expert Comment

by:Ivanov_G
ID: 12141858
Use TStringList.CustomSort with your own comparison routine, because comparing numbers as strings is not very accurate...

Expert Comment

by:Ferruccio Accalai
ID: 12142062
Why don't you simply use TStringList.Duplicates := dupIgnore;

procedure TForm1.Button1Click(Sender: TObject);
var
  List : TStringList;
  i  : integer;
begin
  List := TStringList.Create;
  List.Duplicates := dupIgnore;
  List.Sorted := true;
  List.LoadFromFile('FileName'); //or just add NUM1 as text
   
  for i := 0 to List.Count - 1 do
  begin
    //DO whatever you want
  end;

  List.Free;
end;

Expert Comment

by:Wim ten Brink
ID: 12142244
@Ferruccio68, dupIgnore does nothing if the stringlist is not sorted. And if I understand correctly, this list cannot be sorted.

@chongkeng_woon, why did you decide to use a stringlist for storing numbers in the first place? It would be a lot easier to store them in a dynamic array of integers. Or a TList if using dynamic arrays is too complex for you. If you use numbers only, store them as numbers! Then the comparisons will be a lot faster too.
At http://www.workshop-alex.org/Sources/untDuplicateCheck.pas you will find an interesting unit called untDuplicateCheck, written by me. Basically, it provides a mechanism to check for duplicates, and it does this quite fast. Use something like:

var
  DuplicateChecker: IDuplicateChecker;
  I: Integer;
begin
  DuplicateChecker:= NewDuplicateChecker; // Parameters are all optional.
  for i:=0 to num1.count - 1 do begin
    if AddAndCheck(num1.strings[i]) then num2.add(num1.strings[i]);
  end;
  DuplicateChecker:= nil;
end;

Above code will probably perform "slightly" faster... ;-)

Expert Comment

by:Wim ten Brink
ID: 12142265
One minor warning, though... Speed comes at a price. My method will eat a lot of memory. It takes 10 megabytes of memory when active. But the memory will be freed again once you're done and assigned nil to the duplicate checker. It's the price you have to pay for a very neat speed increase...
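
To illustrate the memory-for-speed trade-off in general terms (this is not the untDuplicateCheck unit, whose internals are not shown in this thread), here is a minimal sketch using TBits from the Classes unit. It assumes every entry is a non-negative integer no larger than a known bound. The procedure name RemoveDuplicateNumbers and its MaxValue parameter are made up for this example.

```pascal
program DedupeWithBits;
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

// Copies Source to Dest in the original order, skipping values already seen.
// ASSUMPTION: every line is an integer in 0..MaxValue (MaxValue is hypothetical).
procedure RemoveDuplicateNumbers(Source, Dest: TStrings; MaxValue: Integer);
var
  Seen: TBits;
  I, N: Integer;
begin
  Seen := TBits.Create;
  try
    Seen.Size := MaxValue + 1;   // one bit per possible value
    for I := 0 to Source.Count - 1 do
    begin
      N := StrToInt(Source[I]);  // raises EConvertError on non-numeric input
      if not Seen[N] then
      begin
        Seen[N] := True;         // remember the value, constant-time lookup later
        Dest.Add(Source[I]);
      end;
    end;
  finally
    Seen.Free;
  end;
end;

var
  Num1, Num2: TStringList;
begin
  Num1 := TStringList.Create;
  Num2 := TStringList.Create;
  try
    // sample data from the question
    Num1.CommaText := '100,1234,4566,66,100,1,4566,1234,9999';
    RemoveDuplicateNumbers(Num1, Num2, 9999);
    WriteLn(Num2.CommaText);
  finally
    Num2.Free;
    Num1.Free;
  end;
end.
```

The bit array costs roughly MaxValue / 8 bytes regardless of how many numbers are in the list, so this only makes sense when the upper bound is known and reasonably small. For the sample from the question it should print the numbers in the order the asker wants: 100,1234,4566,66,1,9999.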

Expert Comment

by:Wim ten Brink
ID: 12142377
if AddAndCheck(num1.strings[i]) then num2.add(num1.strings[i]);

Should be:

if DuplicateChecker.AddAndCheck(num1.strings[i]) then num2.add(num1.strings[i]);

Silly me... :-)

Expert Comment

by:Ferruccio Accalai
ID: 12142385
--> @Ferruccio68, dupIgnore does nothing if the stringlist is not sorted. And if I understand correctly, this list cannot be sorted.
Gosh, I've totally misread the question! :((


Expert Comment

by:Ferruccio Accalai
ID: 12142444
Well, just a test....
Using Pos instead of IndexOf seems to be faster.
This is quite fast using listboxes, so I guess that using TStringLists, which skip the drawing, should be faster still.

procedure TForm1.Button1Click(Sender: TObject);
var
i,y: Integer;
begin
for i := 1 to 3 do
   for y := 1 to 10000 do
     if y mod 2 = 0 then
      Listbox1.Items.Add(inttostr(y))
     else
      Listbox1.Items.Insert(Y-1,inttostr(y))
end;

procedure TForm1.Button2Click(Sender: TObject);
var
i: Integer;
s: STring;
begin
for i := 0 to Listbox1.Items.Count-1 do
begin
  // wrap both sides in commas so that '1' is not "found" inside '100,'
  If pos(','+Listbox1.Items[i]+',', ','+s) = 0 then
   s := s+Listbox1.Items[i]+',';
 end;
 ListBox2.Items.CommaText := s;
end;

Accepted Solution

by:
gary_williams earned 75 total points
ID: 12142929

function SortCompare2(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := Integer(List.Objects[Index1]) - Integer(List.Objects[Index2]);
end;

function SortCompare1(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareText(List[Index1], List[Index2]);
  if (Result = 0) then
    Result := SortCompare2(List, Index1, Index2);
end;

{
  This procedure assumes the Objects property of the string list is unpopulated.
}
procedure RemoveDuplicatesFromStringListWhilePreservingOriginalOrder(const SL: TStringList);
var
  I: Integer;
begin
  SL.CommaText;
 
  for I := 0 to (SL.Count - 1) do
    SL.Objects[I] := TObject(I);

  SL.CustomSort(SortCompare1);

  for I := (SL.Count - 1) downto 1 do
    if (SL[I] = (SL[I - 1])) then
      begin
      Assert(Integer(SL.Objects[I]) > Integer(SL.Objects[I - 1]));
      SL.Delete(I);
      end;

  SL.CustomSort(SortCompare2);

  for I := 0 to (SL.Count - 1) do
    SL.Objects[I] := nil;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.Assign(Memo1.Lines);
    RemoveDuplicatesFromStringListWhilePreservingOriginalOrder(SL);
    Memo2.Lines.Assign(SL);
  finally
    SL.Free;
  end;
end;

Expert Comment

by:gary_williams
ID: 12143012
My solution avoids IndexOf completely.  SortCompare1 alphabetizes the list, and SortCompare2 restores the original sequence.  The original sequence is temporarily stuffed into the Objects property of the list, so this is only appropriate when you're not already using the Objects property.

Expert Comment

by:gary_williams
ID: 12143029
You can remove the reference to SL.CommaText; in my solution. It was only put in temporarily for debugging, as the linker removed the method and I wanted to watch the sequence in the debugger. I just forgot to remove it.

Expert Comment

by:gary_williams
ID: 12143039
You can also remove the Assert.  Sorry about that.

Expert Comment

by:tzxie2000
ID: 12151000
I suggest you convert the strings to integers, as string comparison is very slow.
The code is below.
I wanted to measure the run time, but it runs so quickly (under 1 second) that I did not bother.

procedure TForm1.Button1Click(Sender: TObject);
var
  i,j:integer;
  s:string;
  sl:TStrings;
  na:array[0..10000] of integer;
  found:boolean;
begin
  // build a comma-separated test string of 10000 random numbers
  s:=IntToStr(random(10000));
  for i := 2 to 10000 do
  begin
    s:=s+','+IntToStr(random(10000));
  end;
  sl:=TStringList.Create;
  try
    sl.Delimiter:=',';   // split the text on commas
    sl.DelimitedText:=s;
    // convert once to integers so the inner loop compares numbers, not strings
    for i:=0 to sl.Count-1 do
    begin
      na[i]:=StrToInt(sl[i]);
    end;
    s:=IntToStr(na[0]);
    for i:=1 to sl.Count-1 do
    begin
      found:=false;
      // look only at earlier entries so the first occurrence is the one kept
      for j:=0 to i-1 do
      begin
        if(na[j]=na[i])then
        begin
          found:=true;
          break;
        end;
      end;
      if not found then
      begin
        s:=s+','+IntToStr(na[i]);
      end;
    end;
  finally
    sl.Free;
  end;
  Application.MessageBox(PChar(s),'');
end;

Expert Comment

by:Wim ten Brink
ID: 12152434
As I see it, chongkeng_woon is moving strings from one stringlist to a second one but wants to skip all duplicates. Another alternative would be to use two stringlists, adding each value to both of them: one sorted list and one unsorted list. With the sorted list you can use the IndexOf method to see whether a value has already been added (on a sorted list it does a binary search instead of a linear scan).

var num1,num2, num3 :tstringlist;

num2 := TStringList.Create;
num2.Sorted := False;
num3 := TStringList.Create;
num3.Sorted := True;
num3.Duplicates := dupIgnore;
for i:=0 to num1.count - 1 do begin
  if num3.indexof(num1.strings[i])<0 then begin
    num2.add(num1.strings[i]);
    num3.add(num1.strings[i]);
  end;
end;
num3.Free;

Then num2 is your list of items in the preferred order and num3 is just there to check for duplicates. But again, this is what I used my duplicateCheck unit for, which tends to be a bit faster.
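
For anyone who wants the two-list idea as a reusable routine, here is a minimal self-contained sketch. The procedure name CopyWithoutDuplicates is made up for this example, and it calls TStringList.Find directly, which is the binary search available on a sorted list. Note that entries are compared as strings, so '0100' and '100' would count as different values.

```pascal
program DedupeTwoLists;
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

// Copies Source to Dest in the original order, skipping strings already copied.
// Seen is a sorted helper list used only for fast "have I seen this?" lookups.
procedure CopyWithoutDuplicates(Source, Dest: TStrings);
var
  Seen: TStringList;
  I, Idx: Integer;
begin
  Seen := TStringList.Create;
  try
    Seen.Sorted := True;
    for I := 0 to Source.Count - 1 do
      if not Seen.Find(Source[I], Idx) then  // binary search on the sorted list
      begin
        Seen.Add(Source[I]);  // the sorted list places it at the right position
        Dest.Add(Source[I]);  // the unsorted list keeps the original sequence
      end;
  finally
    Seen.Free;
  end;
end;

var
  Num1, Num2: TStringList;
begin
  Num1 := TStringList.Create;
  Num2 := TStringList.Create;
  try
    // sample data from the question
    Num1.CommaText := '100,1234,4566,66,100,1,4566,1234,9999';
    CopyWithoutDuplicates(Num1, Num2);
    WriteLn(Num2.CommaText);
  finally
    Num2.Free;
    Num1.Free;
  end;
end.
```

Compared with the bare loop in the question, the only real change is that the lookup happens in a sorted list, so each check costs on the order of log n comparisons instead of n.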

Question has a verified solution.
