Solved

Linked list

Posted on 2000-05-04
23
158 Views
Last Modified: 2010-04-04
I have 2linked list,
i need to see if any of the records on list2 are on list1. And i need to do that in a verry fast way.

the linkedlists are:
type  plist =^list
list= record
       name:string
       size:integer;
       next: plist
end
0
Comment
Question by:lfrodrigues
  • 9
  • 8
  • 3
  • +2
23 Comments
 
LVL 13

Expert Comment

by:Epsylon
Comment Utility
I didn't test it but I think this should be it:


function LinkedListCompare(list1, list2: plist): Boolean;
var found: Boolean;
    i, j: plist;
begin
  i := list2;
  repeat
    found := false;
    j := list1;
    repeat
      if i.name = j.name then
        found := true;
      j := j.next;
    until (found) or (j = nil);
    i := i.next;
  until (not found) or (i = nil);
  result := found;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  if LinkedListCompare(list1, list2) then
    ShowMessage('All items found')
  else
    ShowMessage('One or more items missing');
end;
0
 
LVL 20

Expert Comment

by:Madshi
Comment Utility
Hi Eps, looks alright, but you will get an exception, if one of the lists is "nil".   :-)

Regards, Madshi.
0
 
LVL 10

Expert Comment

by:ECollin
Comment Utility
Just assuming that PD1 points on the first element of List1 and PD2 points on the first element of List2


Function Search : Boolean;
var Found : Boolean;
    L1,L2 : PList;
Begin
L1 := PD1;
Found := False;
While (L1 <> nil) and (not Found) do
  Begin
  L2 := PD2;
  While (L2 <> Nil) and (not Found) do
    Begin
    Found := L2^.Name = L1^.Name;
    L2 := L2^.Next;
    End; // While
  L1 := L1^.Next;
  End; // While

Result := Found;
End; // Function



0
 
LVL 13

Expert Comment

by:Epsylon
Comment Utility
Thanks Madshi for pointing that out. I think this should be safe:


function LinkedListCompare(list1, list2: plist): Boolean;
var found: Boolean;
    i, j: plist;
begin
  found := false;
  if (list1 = nil) or (list2 = nil) then
  begin
    // We have 3 situations here
    if (list1 = nil) and (list2 = nil) then
      found := true
    else if list2 = nil then
      found := true
    else if list1 = nil then
      found := false;
  end
  else begin
    i := list2;
    repeat
      found := false;
      j := list1;
      repeat
        if i.name = j.name then
          found := true;
        j := j.next;
      until (found) or (j = nil);
      i := i.next;
    until (not found) or (i = nil);
  end;
  result := found;
end;
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
Hi experts :-)

You both left out the size member in the record.

lfrodrigues, is speed a significant matter in this question or is it merely the algorithm you are after?

If speed is an issue and we can make some assumption then comparison can greatly be improved.

Ciao, Mike
0
 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
i know how to do that the way you showed me.
I need a very fast way of comparing
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
Well then, let's talk about assumptions we can make to ease the comparison. How are the lists organized? What I mean is:

1) Are they sorted in any way?
2) What does the size member specify?

Additionally:

3) Have the node texts specific properties like a fixed length, only upper case or lower case or similar?
4) Do you need case sensitve comparison?
5) What actually is to be considered as being equal lists? I assume:
  a) strings must be equal
  b) nodes must be in the same order
  c) size member must be equal
  d) count of nodes must be equal
  Is this correct? What more, what less?
6) About what dimensions are we talking here (1 million nodes or runs, less, more)?
7) Can it be that a node appears in both lists or are the memory locations always different? If a node would be in both lists then this would mean that all the nodes following it would be in both lists too.

Ciao, Mike
0
 
LVL 13

Expert Comment

by:Epsylon
Comment Utility
>i know how to do that the way you showed me.

Your question does not tell me that so how could I know that. Now it seems that I have been wasting my time   :o(
0
 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
Lischke
1)
the nods aren't ordered,
2)
name: a filename
size: the file size
4) case sensitive would be usefull but not mandatory
5)
to be equal the names must be equal.
probably less than one million but could be more

what i want is a way to find if any of the filenames on list2 are on list1.

hope that helps you



-------------------------
sorry epsylon i thought that way was to basic and that anyone would know a better way.

0
 
LVL 20

Expert Comment

by:Madshi
Comment Utility
It depends on how often you have to look through that list. If you do that only once, then I think, there's no better way than Eps'. If you do that more than once, it might be faster to sort one list.
But you can save time when we talk about string comparison. If the names are file names, then the compare logic should be case insensitive, that means "c:\autoexec.bat" should be the same as "C:\Autoexec.Bat".
Which function do you use now? CompareText? Or "UpperCase(str1) = UpperCase(str2)"? Doesn't matter, the following function is faster anyway:

var lowCharTable : array [#0..#$FF] of char =
  (#$00,#$01,#$02,#$03,#$04,#$05,#$06,#$07,#$08,#$09,#$0A,#$0B,#$0C,#$0D,#$0E,#$0F,
   #$10,#$11,#$12,#$13,#$14,#$15,#$16,#$17,#$18,#$19,#$1A,#$1B,#$1C,#$1D,#$1E,#$1F,
   #$20,#$21,#$22,#$23,#$24,#$25,#$26,#$27,#$28,#$29,#$2A,#$2B,#$2C,#$2D,#$2E,#$2F,
   #$30,#$31,#$32,#$33,#$34,#$35,#$36,#$37,#$38,#$39,#$3A,#$3B,#$3C,#$3D,#$3E,#$3F,
   #$40,#$61,#$62,#$63,#$64,#$65,#$66,#$67,#$68,#$69,#$6A,#$6B,#$6C,#$6D,#$6E,#$6F,
   #$70,#$71,#$72,#$73,#$74,#$75,#$76,#$77,#$78,#$79,#$7A,#$5B,#$5C,#$5D,#$5E,#$5F,
   #$60,#$61,#$62,#$63,#$64,#$65,#$66,#$67,#$68,#$69,#$6A,#$6B,#$6C,#$6D,#$6E,#$6F,
   #$70,#$71,#$72,#$73,#$74,#$75,#$76,#$77,#$78,#$79,#$7A,#$7B,#$7C,#$7D,#$7E,#$7F,
   #$80,#$81,#$82,#$83,#$84,#$85,#$86,#$87,#$88,#$89,#$9A,#$8B,#$9C,#$8D,#$9E,#$8F,
   #$90,#$91,#$92,#$93,#$94,#$95,#$96,#$97,#$98,#$99,#$9A,#$9B,#$9C,#$9D,#$9E,#$FF,
   #$A0,#$A1,#$A2,#$A3,#$A4,#$A5,#$A6,#$A7,#$A8,#$A9,#$AA,#$AB,#$AC,#$AD,#$AE,#$AF,
   #$B0,#$B1,#$B2,#$B3,#$B4,#$B5,#$B6,#$B7,#$B8,#$B9,#$BA,#$BB,#$BC,#$BD,#$BE,#$BF,
   #$E0,#$E1,#$E2,#$E3,#$E4,#$E5,#$E6,#$E7,#$E8,#$E9,#$EA,#$EB,#$EC,#$ED,#$EE,#$EF,
   #$F0,#$F1,#$F2,#$F3,#$F4,#$F5,#$F6,#$D7,#$F8,#$F9,#$FA,#$FB,#$FC,#$FD,#$FE,#$DF,
   #$E0,#$E1,#$E2,#$E3,#$E4,#$E5,#$E6,#$E7,#$E8,#$E9,#$EA,#$EB,#$EC,#$ED,#$EE,#$EF,
   #$F0,#$F1,#$F2,#$F3,#$F4,#$F5,#$F6,#$F7,#$F8,#$F9,#$FA,#$FB,#$FC,#$FD,#$FE,#$FF);

function IsTextEqual(const s1, s2: string) : boolean;
var c1, c2 : cardinal;
begin
  c1 := Length(s1);
  result := cardinal(Length(s2)) = c1;
  if result then
    for c2 := 1 to c1 do
      if lowCharTable[s1[c2]] <> lowCharTable[s2[c2]] then begin
        result := false;
        break;
      end;
end;

Regards, Madshi.
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
Ah yes, this simplifies the stuff a lot and I do agree with Madshi that Epsylon's solution is the best one so far (plus the case insensitive match).

But consider that this approach needs (in worst case) M * N comparisons (and inner loops) where M is the number of names in list one and N the number of names in list two. The time for the search will so increase quadratic with increasing file numers. Hence I strongly recommend to use a kind of sorted lists. Since you have linked lists using pointers there will be only a small penalty to insert new nodes in a sorted manner (no need to move large memory blocks around, like in linear lists). Doing a binary search (which would be the fastest way) is still a bit expensive because you cannot directly determine mid points in a linked list. So a binary tree (patricia tree, whatever) would be the best option.

Since there's significant more work to do it depends on the importance of a fast comparison.

Ciao, Mike
0
Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
Lischke  you are correct i need to use the worst case. How do i sort the list1 and how do i search the two lists in a REALLY fast way?
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
Hi Rodrigues,

here's my solution for you. First the code then the discussion:

type
  PFileList = ^TFileList;
  TFileList = record
    Name: String;
    Size: Cardinal; // perhaps Int64 is better here for files > 4GB
    Next: PFileList;
  end;

procedure InsertNodeSorted(var List: PFileList; Name: String; Size: Cardinal);

// Inserts a new node with the given values into the list List where the position is
// determined by the given Name (alpha sorted list).
// Note: To make this work you must have a dummy node as root which is not allocated here
//       and should not be passed to the compare function (Use Root1.Next and Root2.Next instead).

var
  NewEntry: PFileList;
 
begin
  // find last node which is less than Name
  while Assigned(List.Next) and (List.Next.Name < Name) do List := List.Next;
  // create new node...
  New(NewEntry);
  NewEntry.Name := Name;
  NewEntry.Size := Size;
  // ...and insert it
  NewEntry.Next := List.Next;
  List.Next := NewEntry;
end;

function ContainsAnyFile(List1, List2: PFileList): Boolean;

// Determines whether files from List2 are in List1.
// Returns True if at least one file of List2 is in List1, otherwise False.
// Note: Both lists must be sorted alphabetically with shorter names placed before longer names!

begin
  // make sure we don't try to access empty lists
  Result := Assigned(List1) and Assigned(List2);
  if Result then
  begin
    repeat
      // loop until we found one entry in both lists
      Result := List1.Name = List2.Name;
      if Result then Break;
      // skip entries in List1 which are smaller than the first entry in List2
      while Assigned(List1) and (List1.Name < List2.Name) do List1 := List1.Next;
      // Result is False at this point
      if List1 = nil then Break;
      // skip entries in List2 which are smaller than the first entry in List1
      while Assigned(List2) and (List2.Name < List1.Name) do List2 := List2.Next;
    until List2 = nil;
  end;
end;

Under consideration that we are talking about file lists I assumed they don't become very large, so insertion should not be optimized too much for the cost of much more complex code.

On a linked list with N + 1 entries it takes on average N /2 comparisons to find a node sequentially. This is only important for the sorted insertion here!

The searched speeds up to M + N + 1 comparisons in worst case. E.g. 10,000 'a's and 1 'z' in list 1, 10,000 'b's in list 2 lead to 20,001 comparisons here in opposition to 1,000,000 comparisons in the old case!

Until now the source code is still very trivial. If you still need a faster search then you need to implement binary trees or an array instead of a linked list. The array approach is much better because of its property to be binary searchable, which results in an average comparison count of at most ~ lg N. This is for 1,000,000 entries 20! But in this case insertion cost a lot of time because possibly large memory areas have to be moved around. With N items in list2 and M items in list 1 this results in about N * lg M comparisons on average. Quite impressive, isn't it? This is the reason why I prefer arrays, but you would need to stop using linked lists.

Binary trees are better when inserting in a sorted manner, but need quite complex code and searching costs on average 2 * lg N comparisons (worst case N comparisons). But this is for special needs only and has other drawbacks.

Ciao, Mike
0
 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
Its not fast enouhgt, you are probably right it would be better to use an array, but how do i use binary search with strings?
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
This is easy. Look at this code which is directly copied from my virtual treeview implementation which handles one million nodes very smoothly:

function TBaseVirtualTree.FindInNodeCache(Node: PVirtualNode; var Index: Integer): Boolean;

// Looks through the node cache and returns the index of the found entry in Index. When successful then the result
// is True else False (in which case Index contains a value as nearest as possible to the given node).

var
  L, H, I, C: Integer;

begin
  Result := False;
  L := 0;
  H := High(FNodeCache);
  while L <= H do
  begin
    I := (L + H) shr 1;
    C := Integer(FNodeCache[I].Node) - Integer(Node);
    if C < 0 then L := I + 1 else
    begin
      H := I - 1;
      if C = 0 then
      begin
        Result := True;
        L := I;
      end;
    end;
  end;
  Index := L;
end;

FNodeCache is of type TCache which is defined as:

  TCacheEntry = record
    Node: PVirtualNode;
    AbsoluteTop: Cardinal;
  end;

  TCache = array of TCacheEntry;

You need of course to modify the cache entry to:

  TCacheEntry = record
    Name: String;
    Size: Cardinal;
  end;

Now the binary search whether a string is in the cache is:

function TBaseVirtualTree.FindInNodeCache(Name: String; var Index: Integer): Boolean;

// Looks through the node cache and returns the index of the found entry in Index. When successful then the result
// is True else False (in which case Index contains a value as nearest as possible to the given node).

var
  L, H, I, C: Integer;

begin
  Result := False;
  L := 0;
  H := High(FNodeCache);
  while L <= H do
  begin
    I := (L + H) shr 1;
    // compare not case sensitive
    C := CompareText(FNodeCache[I].Name, Name);
    if C < 0 then L := I + 1 else
    begin
      H := I - 1;
      if C = 0 then
      begin
        Result := True;
        L := I;
      end;
    end;
  end;
  Index := L;
end;


This code is extremely efficient, as I already pointed out. The worst case costs ld N (at most 20 comparations for 1,000,000 entries). But the average cost is of course much lesser.

But now comes the hard stuff to insert new entries as fast as possible into the cache, but this one will you excellent response times:

procedure TBaseVirtualTree.InternalAddToSelection(const NewItems: TNodeArray; NewLength: Integer;
  ForceInsert: Boolean);

// Internal version of method AddToSelection which does not trigger OnChange events

var
  I, J: Integer;
  CurrentEnd: Integer;

begin
  // The idea behind this code is to use a kind of reverse merge sort. QuickSort is quite fast
  // and would do the job here too but has a serious problem with already sorted lists like FSelection.

  // 1) Remove already selected items, mark all other as being selected
  // here I have snipped some code not related to your problem
  // PackArray removes all nil entries from NewItems (if there are any) and returns the number of remaining entries  
  I := PackArray(NewItems, NewLength);
  if I > -1 then NewLength := I;
  if NewLength > 0 then
  begin
    // 2) Sort the new item list so we can easily traverse it.
    if NewLength > 1 then QuickSort(NewItems, 0, NewLength - 1);
    // 3) Make room in FSelection for the new items.
    if FSelectionCount + NewLength >= Length(FSelection) then
    begin
      SetLength(FSelection, FSelectionCount + NewLength);
    end;

    // 4) Merge in new items
    J := NewLength - 1;
    CurrentEnd := FSelectionCount - 1;

    while J >= 0 do
    begin
      // First insert all new entries which are greater than the greatest entry in the old list.
      // If the current end marker is < 0 then there's nothing more to move in the selection
      // array and only the remaining new items must be inserted.
      if CurrentEnd >= 0 then
      begin
        while (J >= 0) and (Cardinal(NewItems[J]) > Cardinal(FSelection[CurrentEnd])) do
        begin
          FSelection[CurrentEnd + J + 1] := NewItems[J];
          Dec(J);
        end;
        // early out if nothing more needs to be copied
        if J < 0 then Break;
      end
      else
      begin
        // insert remaining new entries at position 0
        Move(NewItems[0], FSelection[0], (J + 1) * SizeOf(Pointer));
        // nothing more to do so exit main loop
        Break;
      end;

      // find the last entry in the remaining selection list which is smaller then the largest
      // entry in the remaining new items list
      FindNodeInSelection(NewItems[J], I, 0, CurrentEnd);
      Dec(I);
      // move all entries which are greater than the greatest entry in the new items list up
      // so the remaining gap travels down to where new items must be inserted
      Move(FSelection[I + 1], FSelection[I + J + 2], (CurrentEnd - I) * SizeOf(Pointer));
      CurrentEnd := I;
    end;

    Inc(FSelectionCount, NewLength);
  end;
end;

The Quicksort algotithm (as well as the binary search) are used in a different manner in the VCL too (see TStringList). For you I post my code here:

procedure QuickSort(const TheArray: TCache; L, R: Integer);

var
  I, J, M: Integer;
  T: TCacheEntry;

begin
  repeat
    I := L;
    J := R;
    M := (L + R) shr 1;
    repeat
      while CompareText(TheArray[I].Name, TheArray[M].Name) < 0 do Inc(I);
      while Cardinal(TheArray[J].Name, TheArray[P].Name) > 0 do Dec(J);
      if I <= J then
      begin
        T := TheArray[I];
        TheArray[I] := TheArray[J];
        TheArray[J] := T;
        Inc(I);
        Dec(J);
      end;
    until I > J;
    if L < J then QuickSort(TheArray, L, J);
    L := I;
  until I >= R;
end;

Is this now fast enough?

Ciao, Mike
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
Oops, the second line should of course be:

while CompareText(TheArray[J].Name, TheArray[P].Name) > 0 do Dec(J);

Ciao, Mike
0
 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
hi mike

What is FNodeCache? in FindInNodeCache
What is FindNodeInSelection? in InternalAddToSelection
What is PackArray? in InternalAddToSelection
The code is just to confusing since there are many missing declarations,
could you please give some example on how to use your source.


bye LR
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
Have you read all the comments in the text? This stuff is indeed not easy (to understand). I told it to you initially. On the other hand it is the fastest code I could come up with (after quite long research, I even have asked some "algorithm peoples" here at E-E which gave me the final idea).

FNodeCache is of type TCache which is defined as (copy from above):

  TCacheEntry = record
    Node: PVirtualNode;
    AbsoluteTop: Cardinal;
  end;

  TCache = array of TCacheEntry;

You need of course to modify the cache entry to:

  TCacheEntry = record
    Name: String;
    Size: Cardinal;
  end;

Both Find* functions are just binary searches in their caches.

PackArray is also described in the text. It is used to quickly remove items too. This is for your case not relevant. Just ignore the related lines.

The code itself (InternalAddToSelection) is quite complex, I know, but nonetheless fast. Keep in mind it is ONLY to quickly add a range of items into a local array, nothing else. Once you have managed this you can very quick compare two lists.

Ciao, Mike
0
 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
Mike

I probably did made myself clear enough.
Have you tried to run the code i gave me?
Some types and functions are not declared eg: PVirtualNode ,FindNodeInSelection
I would like to have those types and functions.

bye
        LR
0
 
LVL 10

Accepted Solution

by:
Lischke earned 200 total points
Comment Utility
I think I understood you perfectly, but the problem is you cannot run the code as it is. There are too many things missing. It is to show you the way to do what you want. My contribution here is to give you the ideas and show you the direction. The work still has to be done by yourself.

Ok, I gave some working code too (Quicksort, the compare function etc.) but you should consider this as bonus. From your request to give you declarations like PVirtualNode I conclude that you have not yet tried to understand what I gave you nor have you even read the code, otherwise you would have seen that I told you twice how you must modify the array entry (replacing PVirtualNode by a string).

Well, let me try it a last time. You probably don't need such a fast solution as I posted here. You can do it easier with some speed penalty:

1) Fill List1 and List2 with values
2) sort both lists using the QuickSort algorithm from above
3) Use the following modified compare function to determine whether files in both lists:

type
  TCacheEntry = record
    Name: String;
    Size: Cardinal;
  end;

  TCache = array of TCacheEntry;

function ContainsAnyFile(List1, List2: TCache): Boolean;

// Determines whether files from List2 are in List1.
// Returns True if at least one file of List2 is in List1, otherwise False.
// Note: Both lists must be sorted alphabetically with shorter names placed before longer names!

var
  I, J,
  Len1, Len2: Cardinal;
 
begin
  // make sure we don't try to access empty lists
  Result := Assigned(List1) and Assigned(List2);
  if Result then
  begin
    I := 0;
    J := 0;
    Len1 := Length(List1);
    Len2 := Length(List2);
    repeat
      // loop until we found one entry in both lists
      Result := List1[I].Name = List2[J].Name;
      if Result then Break;
      // skip entries in List1 which are smaller than the first entry in List2
      while (I < Len1) and (List1[I].Name < List2[J].Name) do Inc(I);
      // Result is False at this point
      if I = Len1 then Break;
      // skip entries in List2 which are smaller than the first entry in List1
      while (J < Len2) and (List2[J].Name < List1[I].Name) do Inc(J);
    until J = Len2;
  end;
end;


Ciao, Mike
0
 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
Yes you are correct i really didn't spent more than 5 min loking at the source. I just thought that for 200 points you would bother to give me a full working solution. I really need this i don't have the time to understand the source and make the changes. Well i will acept you anwser anyway.

Thanks
              LR
0
 
LVL 10

Expert Comment

by:Lischke
Comment Utility
Well, at last I gave you working code. Perhaps next time it would be wise to tell already in the question that you want fully working source code. I'm not a friend to make it standard for an acceptable answer to provide full source code with each and every bit worked out. But if you make the code a condition explicitly I can in advance decide how much time I willing to spend on your question. Usually writing the code is only a small excercise once you actually understood the problem and the provided solution. If you do not understand it then you will never be able to fix bugs or enhance it...

Ciao, Mike
0
 
LVL 1

Author Comment

by:lfrodrigues
Comment Utility
Don't worry i've been exeminig the source for the last 30 min and i find it easy enought to understant. Thanks for everything

Bye
       LR
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Suggested Solutions

Objective: - This article will help user in how to convert their numeric value become words. How to use 1. You can copy this code in your Unit as function 2. than you can perform your function by type this code The Code   (CODE) The Im…
Introduction I have seen many questions in this Delphi topic area where queries in threads are needed or suggested. I know bumped into a similar need. This article will address some of the concepts when dealing with a multithreaded delphi database…
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now