Solved

Code Optimization

Posted on 2008-10-13
Medium Priority
412 Views
Last Modified: 2010-04-05
I was wondering if any of the other experts can use their eagle eyes to spot a speed flaw or a way to increase the speed. The usual route of increasing the block size isn't available here: it is limited to 2048 bytes and can't be changed.

I was thinking of:

A) Threading. Compress two blocks at the same time.
B) Cache data. Write to drive every 4 blocks.

The threading option is probably a must. But how exactly would I pull it off? Would I fire up two or more threads dealing with separate blocks, wait for them to return, combine the blocks and write them out? Or would I let the threads write to the file themselves, and if so, how would I maintain order?

I tried method B but it didn't give me a speed increase. What are efficient ways to move memory around without a large overhead?

Cheers- ThievingSix
function CompressCISO(FileIn, FileOut: TFileStream; Level: Integer = 1): Boolean;
var
  CISO : TCISOHeader;
  FileSize : Int64;
  WritePos : Int64;
  IndexSize : Integer;
  Block : Integer;
  Buffer : Array[0..63] Of Char;
  CmpSize : Integer;
  Status : Integer;
  CISOTotalBlocks : Integer;
  IndexBuffer : Array Of Cardinal;
  BlockBuffer1 : PChar;
  BlockBuffer2 : PChar;
  ZStream : TZStreamRec;
  Align,
  AlignB,
  AlignM : Integer;
begin
  Result := False;
  CISO := InitializeHeader(FileIn,CISOTotalBlocks);
  FileSize := CISO.TotalBytes;
  If FileSize < 0 Then
    begin
    Raise Exception.CreateFmt('%d',[INVALID_FILE]);
    Exit;
  end;
  IndexSize := (CISOTotalBlocks + 1) * 4;
  SetLength(IndexBuffer,IndexSize);
  BlockBuffer1 := AllocMem(CISO.BlockSize);
  BlockBuffer2 := AllocMem(CISO.BlockSize * 2);
  If (IndexBuffer = nil) Or (BlockBuffer1 = nil) Or (BlockBuffer2 = nil) Then
    begin
    Raise Exception.CreateFmt('%d',[NOT_ENOUGH_MEMORY]);
    Exit;
  end;
  FillChar(Buffer,64,0);
  ZStream.zalloc := nil;
  ZStream.zfree := nil;
  ZStream.opaque := nil;
  FileOut.Write(CISO,SizeOf(CISO));
  FileOut.Write(IndexBuffer[0],IndexSize);
  WritePos := SizeOf(CISO) + IndexSize;
  AlignB := 1 SHL CISO.Align;
  AlignM := AlignB - 1;
  For Block := 0 To CISOTotalBlocks - 1 Do
    begin
    If DeflateInit2(ZStream,Level,Z_DEFLATED,-15,8,Z_DEFAULT_STRATEGY) <> Z_OK Then
      begin
      Raise Exception.CreateFmt('%d:%s',[DEFLATEINIT_ERROR,ZStream.msg]);
      Exit;
    end;
    Align := WritePos And AlignM;
    If Align <> 0 Then
      begin
      Align := AlignB - Align;
      If FileOut.Write(Buffer[0],Align) <> Align Then
        begin
        Raise Exception.CreateFmt('%d',[INVALID_FILE]);
        Exit;
      end;
      Inc(WritePos,Align);
    end;
    IndexBuffer[Block] := WritePos SHR CISO.Align;
    ZStream.next_out := BlockBuffer2;
    ZStream.avail_out := CISO.BlockSize * 2;
    ZStream.next_in := BlockBuffer1;
    ZStream.avail_in := FileIn.Read(BlockBuffer1[0],CISO.BlockSize);
    If ZStream.avail_in <> CISO.BlockSize Then
      begin
      Raise Exception.CreateFmt('%d:%d',[INVALID_FILE,Block]);
      Exit;
    end;
    Status := Deflate(ZStream,Z_FINISH);
    If Status <> Z_STREAM_END Then
      begin
      Raise Exception.CreateFmt('%d:%d:%s',[DEFLATE_ERROR,Block,ZStream.msg]);
      Exit;
    end;
    CmpSize := (CISO.BlockSize * 2) - ZStream.avail_out;
    If CmpSize >= CISO.BlockSize Then
      begin
      CmpSize := CISO.BlockSize;
      CopyMemory(BlockBuffer2,BlockBuffer1,CmpSize);
      IndexBuffer[Block] := IndexBuffer[Block] Or $80000000;
    end;
    If FileOut.Write(BlockBuffer2[0],CmpSize) <> CmpSize Then
      begin
      Raise Exception.CreateFmt('%d:%d',[INVALID_FILE,Block]);
      Exit;
    end;
    Inc(WritePos,CmpSize);
    If DeflateEnd(ZStream) <> Z_OK Then
      begin
      Raise Exception.CreateFmt('%d:%s',[DEFLATE_END_ERROR,ZStream.msg]);
      Exit;
    end;
  end;
  IndexBuffer[CISOTotalBlocks] := WritePos SHR CISO.Align; // final index entry marks the end of the data (the loop variable is undefined after the for loop)
  FileOut.Position := SizeOf(CISO);
  FileOut.Write(IndexBuffer[0],IndexSize);
  Result := True;
end;

Question by:ThievingSix
26 Comments
 
LVL 17

Expert Comment

by:Shanmuga Sundaram
ID: 22701258
I am not a master of Delphi, so this is just a thought, but you should try threading; it should increase performance. You can create as many threads as you need. These sites will help you with threading:

http://delphi.about.com/od/kbthread/Threading_in_Delphi.htm
http://dn.codegear.com/article/22411
http://www.eonclash.com/Tutorials/Multithreading/MartinHarvey1.1/Ch2.html
http://www.delphifaq.com/faq/delphi/vcl/f269.shtml
 
LVL 28

Accepted Solution

by:
2266180 earned 1600 total points
ID: 22701427
the definite speed increase is using a file buffer. create a new class, internally using a file, which buffers reads and writes in blocks of at least 64 KB. that will speed up file access by a factor of 30 (which is not the total speed, just the file access speed)

the threading model only provides a speed gain for parallel solutions: doing the same work at the same time, or different work at the same time. the gain comes from the fact that while some threads are blocking on some resources, other threads continue to work on tasks that do not require the blocked resources.

your solution is linear, so threads are out of the question from an optimization point of view UNLESS there is more than one operation that is a speed consumer and they are independent (pretty important detail)

for example, your solution would be:

for each block do
  independent operation 1
  independent operation 2
  independent operation 3
  independent operation 4
end for each
all 4 operations are independent, take a lot of time to process and use different resources. you fire up 4 threads doing the operations and "waitfor" all of them.

from a first glance at your code, you don't have anything there that can be done in parallel except file reading and compressing, with the note that compression has to be done on the previously read file block, otherwise the 2 operations are dependent.

so, using this transformation model, you can create as many threads as you have semi-independent operations, with the restriction that some operations will depend on the previous result of another operation.

so you could do threading, the only question is: is it feasible? in order to answer this, you will have to measure the time of each thread-able operation.

with this "delayed" model in mind, I think you might be able to pull off some speed increase, provided that there are at least 10 blocks of data to process (10 is relative, could be 5, could be 15, but I think around 10 is an optimal minimum)
one other thing you will need to keep in mind with this scenario is that creating and destroying threads consumes time, so you will have to create and destroy them only once. and if you call the method many, many times, then you will have to consider using a thread pool, otherwise whatever you gain with threads, you will lose to bad thread management.
and also keep in mind that thread synchronization also costs time, which is why I keep saying that the operations must be independent, so that there will be as few locks as possible.
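
As a rough illustration of that "delayed" model (this is only a sketch with assumed names such as TBlockReaderThread and NextBlock, not the poster's code, and error handling is omitted): a reader thread fetches the next block from the input stream while the caller compresses the current one. The compression loop would call NextBlock() where it currently calls FileIn.Read(), so the disk read for block N+1 overlaps with Deflate on block N.

uses
  Windows, Classes, SysUtils, SyncObjs;

type
  // Reads one block ahead of its consumer.
  TBlockReaderThread = class(TThread)
  private
    fSource: TStream;
    fBlockSize: Integer;
    fBuffer: PChar;        // holds the block that was read ahead
    fBytesRead: Integer;
    fBlockReady: TEvent;   // signalled when fBuffer holds a fresh block
    fWantNext: TEvent;     // signalled when the consumer wants the next block
  protected
    procedure Execute; override;
  public
    constructor Create(Source: TStream; BlockSize: Integer);
    destructor Destroy; override;
    // Copies the block that was read ahead into Dest and starts the next read.
    function NextBlock(Dest: PChar): Integer;
  end;

constructor TBlockReaderThread.Create(Source: TStream; BlockSize: Integer);
begin
  inherited Create(True);
  fSource := Source;
  fBlockSize := BlockSize;
  GetMem(fBuffer, BlockSize);
  fBlockReady := TEvent.Create(nil, False, False, '');
  fWantNext := TEvent.Create(nil, False, True, ''); // start reading block 0 right away
  Resume;
end;

destructor TBlockReaderThread.Destroy;
begin
  Terminate;
  fWantNext.SetEvent; // wake the thread so it can notice Terminated
  WaitFor;
  fBlockReady.Free;
  fWantNext.Free;
  FreeMem(fBuffer);
  inherited Destroy;
end;

procedure TBlockReaderThread.Execute;
begin
  while not Terminated do
  begin
    fWantNext.WaitFor(INFINITE);
    if Terminated then Break;
    fBytesRead := fSource.Read(fBuffer^, fBlockSize); // read block N+1 ...
    fBlockReady.SetEvent;                             // ... while the caller compresses block N
  end;
end;

function TBlockReaderThread.NextBlock(Dest: PChar): Integer;
begin
  fBlockReady.WaitFor(INFINITE);
  Result := fBytesRead;
  Move(fBuffer^, Dest^, Result);
  fWantNext.SetEvent; // let the reader fetch the following block
end;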
 
LVL 21

Expert Comment

by:developmentguru
ID: 22704454
To optimize the disk I/O I would read the file into memory, compress it in memory, then write the results out to a file.  Of course, this only works for files that fit in memory.  Larger files could be broken into one-megabyte chunks and handled the same way (one meg at a time).  The more of it you can do in memory, and the more that can be read or written at one time, the faster it will be.
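
A minimal sketch of that idea (names assumed, no error handling): read a one-megabyte chunk into a TMemoryStream, produce the result in a second TMemoryStream, then flush it with a single write. The compression step is only stubbed out with a straight copy here; real code would compress InChunk into OutChunk, e.g. with zlib.

const
  ChunkSize = 1024 * 1024; // one megabyte at a time, as suggested above

procedure ProcessInChunks(FileIn, FileOut: TFileStream);
var
  InChunk, OutChunk: TMemoryStream;
  BytesLeft, ThisChunk: Int64;
begin
  InChunk := TMemoryStream.Create;
  OutChunk := TMemoryStream.Create;
  try
    BytesLeft := FileIn.Size - FileIn.Position;
    while BytesLeft > 0 do
    begin
      if BytesLeft < ChunkSize then
        ThisChunk := BytesLeft
      else
        ThisChunk := ChunkSize;
      InChunk.Size := ThisChunk;
      FileIn.ReadBuffer(InChunk.Memory^, ThisChunk);        // one large read
      OutChunk.Clear;
      OutChunk.CopyFrom(InChunk, 0);                        // placeholder: compress InChunk into OutChunk here
      FileOut.WriteBuffer(OutChunk.Memory^, OutChunk.Size); // one large write
      Dec(BytesLeft, ThisChunk);
    end;
  finally
    OutChunk.Free;
    InChunk.Free;
  end;
end;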
 
LVL 28

Expert Comment

by:2266180
ID: 22705132
that is called a file buffer :P
 
LVL 13

Author Comment

by:ThievingSix
ID: 22707419
Something like this?

constructor TBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(64 * 1024);
  fBufferSize := 0;
end;
 
destructor TBufferStream.Destroy;
begin
  inherited Destroy;
  FreeAndNil(fMemoryStream);
end;
 
procedure TBufferStream.DumpBuffer;
begin
  fMemoryStream.Position := 0;
  fFileStream.CopyFrom(fMemoryStream,fBufferSize);
end;
 
function TBufferStream.WriteBuffer(const Buffer; Count: Int64): Int64;
begin
  If Count + fMemoryStream.Position > fMemoryStream.Size Then
    begin
    DumpBuffer;
    fBufferSize := 0;
    fMemoryStream.Position := 0;
  end;
  Result := 0;
  Result := fMemoryStream.Write(Buffer,Count);
  Inc(fBufferSize,Result);
end;

 
LVL 28

Expert Comment

by:2266180
ID: 22709504
something like that, but move
    fBufferSize := 0;
    fMemoryStream.Position := 0;
into DumpBuffer, call DumpBuffer in Destroy, and have DumpBuffer check that fBufferSize > 0 before doing anything. and don't forget to buffer the reads from the file as well, if those aren't already done in big chunks.
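
Put together, those suggestions applied to the posted TBufferStream would look roughly like this (a sketch, not tested):

procedure TBufferStream.DumpBuffer;
begin
  if fBufferSize = 0 then Exit;        // nothing pending, nothing to flush
  fMemoryStream.Position := 0;
  fFileStream.CopyFrom(fMemoryStream, fBufferSize);
  fBufferSize := 0;                    // reset moved in here from WriteBuffer
  fMemoryStream.Position := 0;
end;

destructor TBufferStream.Destroy;
begin
  DumpBuffer;                          // flush whatever is still buffered
  FreeAndNil(fMemoryStream);
  inherited Destroy;
end;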
 
LVL 13

Author Comment

by:ThievingSix
ID: 22709582
Completely forgot about the read! Will do.
 
LVL 28

Expert Comment

by:2266180
ID: 22709630
one minor detail: you need to support writing (and reading) big chunks (it won't happen in your particular case, but you might want to use this code for other projects later on, and it might happen then ;)
sure, the code as it is will work just fine, with the side effect that your memory buffer will grow and thus the behaviour will change. it's what I call a logical error.

  If Count + fMemoryStream.Position > fMemoryStream.Size Then
   begin
    DumpBuffer;
    fBufferSize := 0;
    fMemoryStream.Position := 0;
    If Count + fMemoryStream.Position > fMemoryStream.Size Then
    begin
      Result := fFileStream.Write(Buffer,Count);
      exit;
    end;
  end;
 
LVL 13

Author Comment

by:ThievingSix
ID: 22709680
Took me about 5 re-reads of that code before I realized why you did what you did. =P But I see why now.
 
LVL 28

Expert Comment

by:2266180
ID: 22709741
:)

of course
    fBufferSize := 0;
    fMemoryStream.Position := 0;
shouldn't be there. copy-paste left-overs :P
 
LVL 13

Author Comment

by:ThievingSix
ID: 22715424
Hmm, how would I take advantage of dual-core and hyperthreading processors (such as mine)? Wouldn't that mean I would have to thread anyway? I never hit more than 50% CPU in this app, meaning I never touch my second virtual core.
 
LVL 28

Expert Comment

by:2266180
ID: 22715992
I never touched that subject so I don't know. but I can google :P
if you don't want to waste time, you can make a separate small test app with 2 threads and let them rip doing nothing. keep in mind that your system might freeze :)
next will be to set the cpu affinity. a few articles I found:
http://www.delphi3000.com/articles/article_2605.asp?SK=thread
http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_23039948.html
http://dn.codegear.com/article/27267

but this will only make sense if you have CPU-intensive operations. keep in mind that even if you write 100 threads, there is only one data bus leading to only one hard drive. so threads are not an answer for boosting the speed of a single IO operation, but they are an answer, as already explained, when you can break the work up into more IO operations and other operations which do not lock each other out.
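
For reference, restricting a thread to one logical CPU on Windows is a single Win32 call. A minimal sketch (TPinnedWorker is an assumed name, not from the articles above):

uses
  Windows, Classes;

type
  TPinnedWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TPinnedWorker.Execute;
begin
  // One bit per logical CPU: 1 = CPU 0, 2 = CPU 1, 4 = CPU 2, ...
  // GetCurrentThread returns a pseudo-handle for the calling thread.
  SetThreadAffinityMask(GetCurrentThread, 2); // run this worker on the second logical CPU only
  // ... CPU-intensive work (e.g. compressing a block) goes here ...
end;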
 
LVL 46

Assisted Solution

by:aikimark
aikimark earned 200 total points
ID: 22717935
@ThievingSix

1. Have you done any performance profiling of your code?
2. Is your code I/O bound or CPU bound?

benchmark reference:
http://www.softpedia.com/get/System/Hard-Disk-Utils/AssignFile-TFileStream-Speed-Test.shtml

3. What do DeflateInit2(), Deflate(), and DeflateEnd() functions do?
 
LVL 13

Author Comment

by:ThievingSix
ID: 22717964
This is a file compression routine. DeflateInit2(), Deflate(), and DeflateEnd() are part of the zlib compression/decompression library. No, I haven't done any performance profiling. Can you recommend a profiler for Delphi?
 
LVL 13

Author Comment

by:ThievingSix
ID: 22718206
Here is what I have for the reading side of it. It looks like I screwed up somewhere, because I either get a stream read error or a longer completion time (which isn't supposed to happen =P). I've been fiddling with it for a while and will continue to do so unless someone here spots the problem first.
constructor TReadBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(64 * 1024);
  fBufferSize := 0;
end;
 
destructor TReadBufferStream.Destroy;
begin
  FreeAndNil(fMemoryStream);
  inherited Destroy;
end;
 
procedure TReadBufferStream.FillBuffer;
begin
  If fBufferSize >= fMemoryStream.Size Then Exit;
  fMemoryStream.Position := fBufferSize;
  fBufferSize := fBufferSize + fMemoryStream.CopyFrom(fFileStream,fMemoryStream.Size - fBufferSize);
  fMemoryStream.Position := fMemoryStream.Position - fBufferSize;
end;
 
function TReadBufferStream.ReadBuffer(var Buffer; Count: Int64): Int64;
var
  BytesRead : Int64;
  TotalBytesRead : Int64;
  OldPosition : Int64;
  BigBlock : PChar;
begin
  If Count > fMemoryStream.Size Then
    begin
    OldPosition := Count;
    TotalBytesRead := 0;
    BytesRead := fMemoryStream.Read(Buffer,fBufferSize);
    Inc(TotalBytesRead,BytesRead);
    Dec(fBufferSize,BytesRead);
    Dec(OldPosition,TotalBytesRead);
    BigBlock := AllocMem(OldPosition);
    Result := fFileStream.Read(BigBlock[0],OldPosition);
    PChar(Buffer) := TotalBytesRead + PChar(Buffer);
    MoveMemory(PChar(Buffer) + TotalBytesRead,BigBlock,OldPosition);
    Result := Result + TotalBytesRead;
    fBufferSize := 0;
    FillBuffer;
    Exit;
  end;
  If fBufferSize - Count < 0 Then
    begin
    FillBuffer;
  end;
  Result := fMemoryStream.Read(Buffer,Count);
  Dec(fBufferSize,Result);
end;

 
LVL 28

Expert Comment

by:2266180
ID: 22719313
I don't understand why you are allocating memory. your algorithm is way off. no memory allocation is needed (other than what the buffer uses, and that is done by the memory stream).
keep in mind that reads and writes can happen alternately on the buffer.

I wanted to write the read function myself, but the whole thing needs to be changed. writing it the way it is done now will not work correctly. you need to keep a virtual position and size which correlate with the actual file position and size you would have if you were not using the buffer.
why?
because the buffer is a fixed 64 KB. when writing, this either overwrites a part of the file or extends it. when reading at the end of the file but still in the middle of the buffer, you must return 0, because even though the buffer has space, the file behind the buffered data does not.

I'm sure you can write everything from scratch yourself, but if you need help let me know. If you need me to write it for you, I think I can squeeze it into my schedule sometime tonight (that's about 5-6 hours from now).
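
To make the "virtual position" idea concrete, here is a sketch of the bookkeeping for the read side only (the method names GetPosition/GetSize/SetPosition are assumptions; the fields are the ones from the posted class): the position the caller sees lags the real file position by however many buffered bytes have not been handed out yet, and an explicit seek simply throws the buffer away.

// Sketch only, read side: fBufferSize = bytes fetched from the file but not
// yet returned to the caller.
function TReadBufferStream.GetPosition: Int64;
begin
  Result := fFileStream.Position - fBufferSize; // logical position lags the file position
end;

function TReadBufferStream.GetSize: Int64;
begin
  Result := fFileStream.Size; // reading never extends the file
end;

procedure TReadBufferStream.SetPosition(const Value: Int64);
begin
  fBufferSize := 0;              // an explicit seek invalidates the buffered data
  fMemoryStream.Position := 0;
  fFileStream.Position := Value;
end;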
 
LVL 46

Expert Comment

by:aikimark
ID: 22720441
What version of zlib are you using?  There have been performance boosts in some of the zlib wrappers.
Examples:
http://www.programmersheaven.com/download/24171/download.aspx
http://www.dellapasqua.com/delphizlib/

Once you allocate the correct output size, you should use stream-to-stream operations, bypassing intermediate variables.
Example:
http://coding.derkeiler.com/Archive/Delphi/borland.public.delphi.thirdpartytools.general/2004-01/0552.html
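
As an illustration of the stream-to-stream idea, assuming the ZLib unit that ships with Delphi (the wrappers linked above expose similar classes): TCompressionStream deflates everything written to it straight into a destination stream, so one CopyFrom moves the data without any intermediate buffer of your own. This is a generic zlib example, not the CISO block format from the question.

uses
  Classes, ZLib;

// Compress the whole of Source into Dest in one stream-to-stream operation.
procedure CompressStreamToStream(Source, Dest: TStream);
var
  Compressor: TCompressionStream;
begin
  Compressor := TCompressionStream.Create(clMax, Dest); // everything written here is deflated into Dest
  try
    Compressor.CopyFrom(Source, 0); // Count = 0 means "copy the entire source"
  finally
    Compressor.Free; // flushes the remaining compressed data
  end;
end;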
 
LVL 13

Author Comment

by:ThievingSix
ID: 22720701
I'm using an updated version of the second zlib you posted, so I'm not going to get much better there. I'll try the stream to stream option after I fix my messy TReadBufferStream =P
 
LVL 13

Author Comment

by:ThievingSix
ID: 22737610
My second attempt. It doesn't slow things down this time, but it doesn't speed them up either. Looks like there is only so much speed to be gained.
constructor TReadBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(64 * 1024);
  fBufferSize := 0;
  FillBuffer;
end;
 
destructor TReadBufferStream.Destroy;
begin
  FreeAndNil(fMemoryStream);
  inherited Destroy;
end;
 
procedure TReadBufferStream.FillBuffer;
var
  BytesToRead : Int64;
  BytesRead : Int64;
  Buffer : PChar;
begin
  If fBufferSize < fMemoryStream.Size Then
    begin
    BytesToRead := fMemoryStream.Size - fBufferSize;
    fMemoryStream.Position := fBufferSize;
    GetMem(Buffer,BytesToRead);
    Try
      BytesRead := fFileStream.Read(Buffer^,BytesToRead);
      fMemoryStream.Write(Buffer^,BytesRead);
      Inc(fBufferSize,BytesRead);
      fMemoryStream.Position := 0;
    Finally
      FreeMem(Buffer,BytesToRead);
    end;
  end;
end;
 
function TReadBufferStream.ReadBuffer(var Buffer; Count: Int64): Int64;
var
  BytesToRead : Int64;
begin
  BytesToRead := Count;
  If BytesToRead > fBufferSize Then
    begin
    FillBuffer;
    If BytesToRead > fBufferSize Then
      begin
      BytesToRead := fBufferSize;
    end;
  end;
  Result := fMemoryStream.Read(Buffer,BytesToRead);
  Dec(fBufferSize,Result);
end;

 
LVL 13

Author Comment

by:ThievingSix
ID: 22737686
Changed the buffer size to 128 KB, which offered a bit of a speed increase. Anything more didn't help much.
 
LVL 28

Expert Comment

by:2266180
ID: 22738744
that is because you are still doing memory allocation while filling the buffer. I told you not to do that.
do it once, in the constructor.

the second step will be to do some profiling, as aikimark suggested. measure the time/clock cycles/whatever of every major operation and see where you have big numbers. that's where you might be able to improve. no sense in wasting time improving something that takes under 1 ms when you might have operations that take 10 ms or more.
 
LVL 13

Author Comment

by:ThievingSix
ID: 22738836
     BytesRead := fFileStream.Read(Buffer^,BytesToRead);
      fMemoryStream.Write(Buffer^,BytesRead);
      Inc(fBufferSize,BytesRead);

That's the same thing TStream.CopyFrom does; is there a more efficient way to move the data from the file stream to the memory stream?

Also, I used ProDelphi to profile it. It seems the 100k iterations are simply going to take that much time. TReadBufferStream.FillBuffer() and TWriteBufferStream.DumpBuffer() take the most time as individual routines, but most of the time is spent in the main data loop itself.
 

Assisted Solution

by:birbilis
birbilis earned 200 total points
ID: 22738864
try moving all loop invariants outside of the for loop. For example, why calculate CISO.BlockSize * 2 several times if the CISO structure is initialized once, outside the for loop? Just define a local variable CISO_BlockSize2 := CISO.BlockSize * 2 before the for loop and, in the loop body, replace every "CISO.BlockSize * 2" with "CISO_BlockSize2" (select the body and replace within the selection only). Similarly, avoid even field accesses like CISO.BlockSize inside the for loop: assign such fields to local variables before the loop and use those instead (assuming the loop executes many times, not just a few).
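
A minimal illustration of the hoisting (TCISOHeader is the record used in the question; the empty loop body is just a placeholder for the existing per-block work):

procedure HoistExample(const CISO: TCISOHeader; BlockCount: Integer);
var
  Block, CISO_BlockSize, CISO_BlockSize2: Integer;
begin
  CISO_BlockSize  := CISO.BlockSize;       // hoisted loop invariant
  CISO_BlockSize2 := CISO_BlockSize * 2;   // hoisted loop invariant
  for Block := 0 to BlockCount - 1 do
  begin
    // inside the loop, use CISO_BlockSize / CISO_BlockSize2
    // instead of CISO.BlockSize and CISO.BlockSize * 2
  end;
end;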
 
LVL 13

Author Comment

by:ThievingSix
ID: 22738890
@cuily: I just realized what you were probably saying. Something like fFileStream.Read(PChar(fMemoryStream.Memory + fMemoryStream.Position)^,Count);?

@birbilis: Ah, makes sense. Will do.
 
LVL 28

Expert Comment

by:2266180
ID: 22739027
you got it all wrong. TStream.CopyFrom does it, but not in a speed-oriented way.

I was saying something more like this (copy-paste follows, with the needed modifications)

in ReadBuffer you will need to make a loop, in case Count is bigger than MaxBuffSize, so that you do multiple reads if needed.
then, a further optimization can be done for this case: if you need to read more than MaxBuffSize, you read once, directly from the file, but not before reading whatever is left in the memory buffer.
you will need to do pointer operations for this. in case you don't know how, briefly, it's like this:
Read(Pointer(PChar(@Buffer) + Delta)^, WhateverCountIsLeft);
const MaxBuffSize = 64 * 1024; // ---- constant
 
constructor TReadBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(MaxBuffSize);
  fBufferSize := 0;
  GetMem(Buffer,MaxBuffSize); // -------- this
  FillBuffer;
end;
 
destructor TReadBufferStream.Destroy;
begin
  FreeAndNil(fMemoryStream);
  FreeMem(Buffer,MaxBuffSize); //  ----------- this
  inherited Destroy;
end;
 
procedure TReadBufferStream.FillBuffer;
var
  BytesToRead : Int64;
  BytesRead : Int64;
  // note: Buffer is now a field of the class, allocated once in Create
  // (the local declaration that was here was a copy-paste leftover)
begin
  If fBufferSize < fMemoryStream.Size Then
    begin
    BytesToRead := fMemoryStream.Size - fBufferSize;
    fMemoryStream.Position := fBufferSize;
    BytesRead := fFileStream.Read(Buffer^,BytesToRead);
    fMemoryStream.Write(Buffer^,BytesRead);
    Inc(fBufferSize,BytesRead);
    fMemoryStream.Position := 0;
  end;
end;
 
function TReadBufferStream.ReadBuffer(var Buffer; Count: Int64): Int64;
var
  BytesToRead : Int64;
begin
  BytesToRead := Count;
  If BytesToRead > fBufferSize Then
    begin
    FillBuffer;
    If BytesToRead > fBufferSize Then
      begin
      BytesToRead := fBufferSize;
    end;
  end;
  Result := fMemoryStream.Read(Buffer,BytesToRead);
  Dec(fBufferSize,Result);
end;

 
LVL 28

Expert Comment

by:2266180
ID: 22739036
>> FileStream.Read(PChar(fMemoryStream.Memory + fMemoryStream.Position)^,Count)

that is actually a pretty good idea. this way you eliminate one memory copy. brilliant :) dunno how I didn't think of this :|
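
For completeness, a FillBuffer built around that trick might look like this (a sketch against the posted TReadBufferStream, keeping its existing buffer bookkeeping): TFileStream.Read writes straight into the TMemoryStream's own memory, so the intermediate GetMem buffer and the extra copy disappear.

procedure TReadBufferStream.FillBuffer;
var
  BytesToRead, BytesRead: Int64;
  Dest: PChar;
begin
  if fBufferSize >= fMemoryStream.Size then Exit;
  BytesToRead := fMemoryStream.Size - fBufferSize;
  // point Dest just past the data already held in the buffer
  Dest := PChar(fMemoryStream.Memory) + Integer(fBufferSize);
  // read from the file directly into the memory stream's buffer,
  // skipping the intermediate allocation and copy
  BytesRead := fFileStream.Read(Dest^, BytesToRead);
  Inc(fBufferSize, BytesRead);
  fMemoryStream.Position := 0; // the next ReadBuffer consumes from the start
end;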