Solved

Very large string building in Delphi

Posted on 2008-11-14
Last Modified: 2012-05-05
In one of my Delphi applications (Turbo Delphi 2006), I have a loop that builds a very large string.
It looks like this:

.../...
for I := 1 to N do
  begin
    .../...
    LargeString := LargeString + SomethingNotTooSmall;
    .../...
  end;
.../...

I need LargeString for some other processing later on (too long to explain here).
At the end of this loop, LargeString may be several megabytes long.
That in itself is not a problem, but I am afraid that on some occasions the memory manager could run into trouble because of the repeated statement "LargeString := LargeString + SomethingNotTooSmall;".
LargeString grows larger and larger.
I suppose that, to execute this statement, the memory manager is first asked to allocate a block whose length is the total length of the resulting string, and only afterwards can it free the previous one.
But the freed block will not be reused, because it is too small the next time around, and so on.
So in the end, the application could run short of memory.
How can I avoid this? Is there alternative code? Which one?
Thanks

 
Question by:LeTay
 
LVL 46

Accepted Solution

by:
aikimark earned 1200 total points
ID: 22960592
* Use a TMemoryStream.
* Size the string to its final length up front (for example with SetLength) and change the contents of the string in place rather than using concatenation operations.
* Add your strings to a TStringList and then combine the results without concatenation.
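To illustrate the first suggestion, here is a minimal sketch of appending through a TMemoryStream and converting to a string once at the end. The helper name BuildWithStream and the stand-in data are assumptions made for illustration; they are not from the original post.

```pascal
program StreamAppendDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils, Classes;

// Hypothetical helper: builds one string from N pieces via a TMemoryStream.
function BuildWithStream(N: Integer): string;
var
  Stream: TMemoryStream;
  Piece: string;
  I: Integer;
begin
  Stream := TMemoryStream.Create;
  try
    for I := 1 to N do
    begin
      Piece := IntToStr(I) + ';';   // stand-in for the real data
      // Write the raw characters of the piece to the end of the stream.
      Stream.WriteBuffer(Pointer(Piece)^, Length(Piece) * SizeOf(Char));
    end;
    // Copy the accumulated characters into the result in one allocation.
    SetString(Result, PChar(Stream.Memory),
      Integer(Stream.Size) div SizeOf(Char));
  finally
    Stream.Free;
  end;
end;

begin
  WriteLn('Built a string of ', Length(BuildWithStream(1000)), ' characters');
end.
```

The stream grows its own buffer as needed, so the loop never allocates a new string of the full length on each pass.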
 
LVL 46

Expert Comment

by:aikimark
ID: 22960604
and...
* use any number of available classes/units that make string appending efficient and fast.
 

Author Comment

by:LeTay
ID: 22960823
Does TStringList offer a method that returns everything as one single string without CR LF?
I am interested in the Size() suggestion, as I can calculate the final length needed.
Once the string is initialized, how do I replace part of it with some other string? With a loop? In one shot?
 
LVL 27

Expert Comment

by:BigRat
ID: 22960974
>>too long to explain here)

Well, if it is anything like my RatScript interpreter, I solved it by using a sort of variant for the data variables, in which the string type has alloccount, maxlength, length, and pointer-to-memory fields. I allocate the memory first depending on the string length, then reallocate and copy each time I need more space. The first reallocation is 128 bytes, the second 5,000, and the third 50,000, always rounded up from the required length. The effect is that longstring = longstring + shortstring, particularly when length(shortstring) < 100, nearly always fits. Since my memory blocks have definite sizes, they get used up quite quickly. I use the interpreter with legacy code in application server mode, and although the program with code and data is 10MB, it rarely exceeds 50MB during execution. One must remember that memory is cheap, so wasting a bit to gain system performance is a good trade-off.
 

Author Comment

by:LeTay
ID: 22961727
Hello aikimark,
I will use the trick with SetLength(LargeString, Size).
Is there another way than the following to copy a string into another one?
for I := 1 to Length(SmallString) do
  begin
    P := P + 1;
    LargeString[P] := SmallString[I];
  end;
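One alternative to the character-by-character loop is the RTL procedure Move, which copies a whole block in one call. The sketch below is only an illustration: the helper name PutString is hypothetical, and it assumes LargeString has already been sized with SetLength.

```pascal
program MoveCopyDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical helper: copies Small into Large just after position P
// and advances P. Large must already be long enough (see SetLength below).
procedure PutString(var Large: string; var P: Integer; const Small: string);
begin
  if Small = '' then
    Exit;
  Assert(P + Length(Small) <= Length(Large), 'LargeString is too short');
  // One block copy instead of one assignment per character.
  Move(Pointer(Small)^, Large[P + 1], Length(Small) * SizeOf(Char));
  Inc(P, Length(Small));
end;

var
  LargeString: string;
  P: Integer;
begin
  SetLength(LargeString, 13);   // final length computed in advance
  P := 0;
  PutString(LargeString, P, 'Hello');
  PutString(LargeString, P, ', ');
  PutString(LargeString, P, 'Delphi');
  WriteLn(LargeString);         // prints "Hello, Delphi"
end.
```

Multiplying by SizeOf(Char) keeps the byte count right on both ANSI compilers such as Turbo Delphi 2006 and later Unicode ones.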
 
LVL 46

Expert Comment

by:aikimark
ID: 22965105
You could use TStringList.SaveToStream after setting the LineBreak property to ' '.
That should result in the strings being separated by a space instead of CR LF characters.

OR...

Use SaveToStream and then a replace function (such as StringReplace) to change all the CR LF pairs into ' '.
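As a rough sketch of the LineBreak idea: if the property is set to an empty string, the Text property should return the items joined with nothing between them. This assumes an RTL version in which TStrings has the LineBreak property; the list contents are made up for illustration.

```pascal
program JoinLinesDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils, Classes;

var
  Parts: TStringList;
  Joined: string;
begin
  Parts := TStringList.Create;
  try
    Parts.Add('alpha');
    Parts.Add('beta');
    Parts.Add('gamma');
    // With an empty LineBreak, Text places no separator between the items.
    Parts.LineBreak := '';
    Joined := Parts.Text;
    WriteLn(Joined);
  finally
    Parts.Free;
  end;
end.
```

Note that the list and the joined string both exist in memory at the same time, so this trades memory for convenience.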
 
LVL 26

Assisted Solution

by:Russell Libby
Russell Libby earned 800 total points
ID: 22965425
Take a look at the TStringAppend class that I put together; this should do the job you are looking for. In the test runs below the class runs the 100K loop in roughly 40 ms. The string concatenation in Delphi (s:=s+{random string}) runs in 754845 ms, or a little over 12 1/2 minutes.

Regards,
Russell

--

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

type
  TStringAppend     =  class(TObject)
  private
     // Private declarations
     FData:         String;
     FPosition:     Integer;
     FSize:         Integer;
  protected
     // Protected declarations
     function       GetData: String;
     procedure      SetSize(Value: Integer);
  public
     // Public declaration
     constructor    Create;
     procedure      Append(const S: String);
     property       Data: String read GetData;
     property       Size: Integer read FSize write SetSize;
  end;

var
  Form1: TForm1;

implementation
{$R *.DFM}

procedure TStringAppend.Append(const S: String);
var  dwLength:      Integer;

begin

  // Check length
  dwLength:=Length(S);

  // Not null?
  if (dwLength > 0) then
  begin
     // Check against size
     if ((dwLength + FPosition) > FSize) then SetSize(FSize * 2 + dwLength * 2);
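     // Append the string; Move copies exactly dwLength characters, so a #0
     // inside S does not cut the copy short as it would with StrCopy
     Move(Pointer(S)^, FData[Succ(FPosition)], dwLength * SizeOf(Char));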
     // Update position
     Inc(FPosition, dwLength);
  end;

end;

procedure TStringAppend.SetSize(Value: Integer);
begin

  // Only resize when the requested size actually differs
  if (Value <> FSize) then
  begin
     // Update size
     FSize:=Value;
     // Set string
     SetLength(FData, FSize);
     // Adjust position if needed
     if (FPosition > FSize) then FPosition:=FSize;
  end;

end;

function TStringAppend.GetData: String;
begin

  // Truncate to current position
  SetLength(FData, FPosition);

  // Update size
  FSize:=FPosition;

  // Return data
  result:=FData;

end;

constructor TStringAppend.Create;
begin

  // Perform inherited
  inherited Create;

  // Set defaults
  FPosition:=0;
  FSize:=0;
  SetLength(FData, FSize);

end;

// Test example
procedure TForm1.Button1Click(Sender: TObject);
var
  strAppend:     TStringAppend;
  dwMark:        LongWord;
  i:             Integer;
  s:             String;
begin

  // First test
  SetLength(s, 0);
  strAppend:=TStringAppend.Create;

  dwMark:=GetTickCount;
  for i:=1 to 100000 do
  begin
     strAppend.Append(StringOfChar(Chr(Random(32) + 65), Random(100)));
  end;
  dwMark:=GetTickCount - dwMark;
  Caption:=Format('String size is now: %d (%d ms)', [Length(strAppend.Data), dwMark]);
  strAppend.Free;

  ShowMessage('Next test...');

  // Second test
  SetLength(s, 0);
  dwMark:=GetTickCount;
  for i:=1 to 100000 do
  begin
     s:=s + StringOfChar(Chr(Random(32) + 65), Random(100));
  end;
  dwMark:=GetTickCount - dwMark;
  Caption:=Format('String size is now: %d (%d ms)', [Length(s), dwMark]);

end;

end.
 
LVL 46

Expert Comment

by:aikimark
ID: 22965599
@Russell

Nice class.  I do have a concern that your resizing formula might be a bit too greedy, doubling (and then some) the working buffer during every expansion.  Maybe I'm just being paranoid.

Out of curiosity, how well does a TStringList do against your blazingly fast TStringAppend class?
 
LVL 26

Expert Comment

by:Russell Libby
ID: 22967290
Well, given that the asker already accepted an answer, it's academic at this point. But I will share this...

The code I posted is faster than TMemoryStream, and faster than TStringList. My class also allows the size to be set (which I did not demonstrate); this performs a single allocation, and if the final string length is known ahead of time, no memory is wasted. In the example above, setting the size to 100K times the maximum random size (100) resulted in runtimes of 20 ms. That's almost 38,000 times faster than Delphi's string concatenation.

And memory usage when the size is not specified, as in the first example? Keep in mind that TStringList maintains a list of string pointers, so that's 1/2 MB of memory right there for maintenance. The final string obtained from a call to .Text is allocated separately, so you end up using the total string size * 2 + 500KB of memory. Something to think about...

TMemoryStream also uses a sliding allocation, but is too conservative (IMHO), resulting in less wasted memory but increased runtimes due to the number of reallocs that must be performed. And that's the real killer in this sort of problem: reallocations. The resulting speed correlates directly with the number of reallocations that must be made. Keep the reallocs down, and the speed goes up. All I did was try to provide a string class that is an all-around best performer; in a worst-case scenario (demo'd above) it uses extra memory, and when the size is known no memory is wasted.

Russell
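To make the point about reallocation counts concrete, here is a small simulation that only counts buffer growths and moves no real data. The doubling rule is the FSize * 2 + length * 2 formula from TStringAppend above; the fixed 8 KB step and the 50-character average piece length are arbitrary assumptions chosen for comparison, not the actual TMemoryStream policy.

```pascal
program GrowthPolicyDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

const
  PieceLen = 50;       // assumed average piece length
  Pieces   = 100000;   // same loop count as the test above

// Counts how often a buffer must grow while Pieces chunks are appended.
// Doubling = True uses the Capacity * 2 + PieceLen * 2 rule;
// Doubling = False grows by a fixed 8 KB step (an arbitrary assumption).
function CountGrowths(Doubling: Boolean): Integer;
var
  Capacity, Used, I: Integer;
begin
  Capacity := 0;
  Used := 0;
  Result := 0;
  for I := 1 to Pieces do
  begin
    while Used + PieceLen > Capacity do
    begin
      if Doubling then
        Capacity := Capacity * 2 + PieceLen * 2
      else
        Capacity := Capacity + 8192;
      Inc(Result);
    end;
    Inc(Used, PieceLen);
  end;
end;

begin
  WriteLn('Doubling growth:   ', CountGrowths(True), ' growths');
  WriteLn('Fixed 8 KB growth: ', CountGrowths(False), ' growths');
end.
```

Each growth may copy the whole buffer built so far, so a policy with far fewer growths does far less copying in total, at the price of more unused capacity at the end.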

 
 

Author Closing Comment

by:LeTay
ID: 31516804
I will use the SetLength function, as I can calculate the final string length.
 
LVL 38

Expert Comment

by:Geert Gruwez
ID: 22986007
Since it's academic anyway,
for completeness, shouldn't somebody mention the FastStrings unit by Peter Morris?

I didn't test it against Russell's class though
 
LVL 46

Expert Comment

by:aikimark
ID: 22990428
Thanks, Geert. That's an excellent example of what I mentioned in #22960604
 
LVL 26

Expert Comment

by:Russell Libby
ID: 22990623
@Aikimark
Thanks for the points. Not required, but appreciated.

@Geert
I didn't realize that FastStrings had a string append class, but it's been a while since I looked at it. Are you planning on posting an example of usage?

Russell

 
LVL 38

Expert Comment

by:Geert Gruwez
ID: 23002202
Russell,
you got me, no string append
it's only got a copy string routine and a move string routine

I was going to try to make an example of
L1 = length(string1);
L2 = length(string2);
N = number of times to append string2 to string1
setlength(s1, L1 + N * L2);
and then use copystr

but I can't get it to work
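For what it's worth, the idea sketched above can be made to work with the plain RTL Move procedure instead of the FastStrings routines. This is only a sketch under that substitution; the function name AppendRepeated is hypothetical.

```pascal
program RepeatAppendDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical helper: returns S1 followed by N copies of S2, using one
// SetLength and N Move calls instead of N concatenations.
function AppendRepeated(const S1, S2: string; N: Integer): string;
var
  L1, L2, I, P: Integer;
begin
  L1 := Length(S1);
  L2 := Length(S2);
  if N < 0 then
    N := 0;
  SetLength(Result, L1 + N * L2);   // single allocation of the final size
  if L1 > 0 then
    Move(Pointer(S1)^, Result[1], L1 * SizeOf(Char));
  if L2 = 0 then
    Exit;
  P := L1;                          // characters written so far
  for I := 1 to N do
  begin
    Move(Pointer(S2)^, Result[P + 1], L2 * SizeOf(Char));
    Inc(P, L2);
  end;
end;

begin
  WriteLn(AppendRepeated('ab', 'xyz', 3));   // prints "abxyzxyzxyz"
end.
```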
