Code Optimization

I was wondering if any of the other experts can use their eagle eyes to spot a speed flaw or a way to increase the speed. The usual approach of increasing the block size won't work here; the block size is fixed at 2048 bytes.

I was thinking of:

A) Threading. Compress two blocks at the same time.
B) Cache data. Write to drive every 4 blocks.

The threading option is probably a must, but how exactly would I pull it off? Would I fire up two or more threads dealing with separate blocks, wait for their return, combine the blocks, and write them? Or would I let the threads write to the file, and if so, how would I maintain order?

I tried method B, but it didn't give me a speed increase. What are the efficient ways to move memory without incurring a large overhead?

Cheers- ThievingSix
function CompressCISO(FileIn, FileOut: TFileStream; Level: Integer = 1): Boolean;
var
  CISO : TCISOHeader;
  FileSize : Int64;
  WritePos : Int64;
  IndexSize : Integer;
  Block : Integer;
  Buffer : Array[0..63] Of Char;
  CmpSize : Integer;
  Status : Integer;
  CISOTotalBlocks : Integer;
  IndexBuffer : Array Of Cardinal;
  BlockBuffer1 : PChar;
  BlockBuffer2 : PChar;
  ZStream : TZStreamRec;
  Align,
  AlignB,
  AlignM : Integer;
begin
  Result := False;
  CISO := InitializeHeader(FileIn,CISOTotalBlocks);
  FileSize := CISO.TotalBytes;
  If FileSize < 0 Then
    Raise Exception.CreateFmt('%d',[INVALID_FILE]);
  IndexSize := (CISOTotalBlocks + 1) * 4;  // index table size in bytes
  SetLength(IndexBuffer,CISOTotalBlocks + 1);  // one Cardinal per block plus a final end-of-file entry
  BlockBuffer1 := AllocMem(CISO.BlockSize);
  BlockBuffer2 := AllocMem(CISO.BlockSize * 2);  // deflate may expand incompressible input
  If (BlockBuffer1 = nil) Or (BlockBuffer2 = nil) Then
    Raise Exception.CreateFmt('%d',[NOT_ENOUGH_MEMORY]);
  FillChar(Buffer,64,0);
  ZStream.zalloc := nil;
  ZStream.zfree := nil;
  ZStream.opaque := nil;
  FileOut.Write(CISO,SizeOf(CISO));
  FileOut.Write(IndexBuffer[0],IndexSize);
  WritePos := SizeOf(CISO) + IndexSize;
  AlignB := 1 SHL CISO.Align;
  AlignM := AlignB - 1;
  For Block := 0 To CISOTotalBlocks - 1 Do
    begin
    If DeflateInit2(ZStream,Level,Z_DEFLATED,-15,8,Z_DEFAULT_STRATEGY) <> Z_OK Then
      Raise Exception.CreateFmt('%d:%s',[DEFLATEINIT_ERROR,ZStream.msg]);
    Align := WritePos And AlignM;
    If Align <> 0 Then
      begin
      Align := AlignB - Align;
      If FileOut.Write(Buffer[0],Align) <> Align Then
        Raise Exception.CreateFmt('%d',[INVALID_FILE]);
      Inc(WritePos,Align);
    end;
    IndexBuffer[Block] := WritePos SHR CISO.Align;
    ZStream.next_out := BlockBuffer2;
    ZStream.avail_out := CISO.BlockSize * 2;
    ZStream.next_in := BlockBuffer1;
    ZStream.avail_in := FileIn.Read(BlockBuffer1[0],CISO.BlockSize);
    If ZStream.avail_in <> CISO.BlockSize Then
      Raise Exception.CreateFmt('%d:%d',[INVALID_FILE,Block]);
    Status := Deflate(ZStream,Z_FINISH);
    If Status <> Z_STREAM_END Then
      Raise Exception.CreateFmt('%d:%d:%s',[DEFLATE_ERROR,Block,ZStream.msg]);
    CmpSize := (CISO.BlockSize * 2) - ZStream.avail_out;
    If CmpSize >= CISO.BlockSize Then
      begin
      // block didn't compress; store it raw and flag it via the high bit of the index entry
      CmpSize := CISO.BlockSize;
      CopyMemory(BlockBuffer2,BlockBuffer1,CmpSize);
      IndexBuffer[Block] := IndexBuffer[Block] Or $80000000;
    end;
    If FileOut.Write(BlockBuffer2[0],CmpSize) <> CmpSize Then
      Raise Exception.CreateFmt('%d:%d',[INVALID_FILE,Block]);
    Inc(WritePos,CmpSize);
    If DeflateEnd(ZStream) <> Z_OK Then
      Raise Exception.CreateFmt('%d:%s',[DEFLATE_END_ERROR,ZStream.msg]);
  end;
  IndexBuffer[CISOTotalBlocks] := WritePos SHR CISO.Align;  // final entry; the loop variable is undefined after a for loop in Delphi
  FileOut.Position := SizeOf(CISO);
  FileOut.Write(IndexBuffer[0],IndexSize);  // rewrite the now-filled index table
  FreeMem(BlockBuffer1);
  FreeMem(BlockBuffer2);
  Result := True;
end;


Asked by Dagan Hoover (Developer).
Shanmuga Sundaram (Director of Software Engineering) commented:
I am not a master in Delphi, so this is just a thought, but you should try threading; it will surely increase performance. You can create as many threads as you need. These sites will help you with threading:

http://delphi.about.com/od/kbthread/Threading_in_Delphi.htm
http://dn.codegear.com/article/22411
http://www.eonclash.com/Tutorials/Multithreading/MartinHarvey1.1/Ch2.html
http://www.delphifaq.com/faq/delphi/vcl/f269.shtml
2266180 commented:
The definite speed increase is using a file buffer. Create a new class, internally using a file, which buffers reads and writes in blocks of at least 64 KB. That will speed up file access by a factor of 30 (which is not the total speed-up, just the file-access speed-up).

The threading model only provides a speed gain for parallel solutions: doing the same work, or different work, at the same time. The gain comes from the fact that while some threads are blocking on resources, other threads continue to work on tasks that do not require the blocked resources.

Your solution is linear, so threads are out of the question from an optimization point of view UNLESS there is more than one speed-consuming operation and the operations are independent (a pretty important detail).

For example, your solution would be:

for each block do
  independent operation 1
  independent operation 2
  independent operation 3
  independent operation 4
end for each

If all 4 operations are independent, take a lot of time to process, and use different resources, you fire up 4 threads doing the operations and "WaitFor" all of them.

From a first glance at your code, you don't have anything there that can be done in parallel except file reading and compressing, with the note that compression must run on the previously read block; otherwise the two operations are dependent.

So, using this transformation model, you can create as many threads as you have semi-independent operations, with the restriction that some operations will depend on the previous result of another operation.

So you could do threading; the only question is, is it feasible? To answer that, you will have to measure the time of each thread-able operation.

With this "delayed" model in mind, I think you might be able to pull off some speed increase, provided there are at least 10 blocks of data to process (10 is relative; it could be 5, could be 15, but I think around 10 is an optimal minimum).
One other thing to keep in mind with this scenario is that creating and destroying threads consumes time, so you should create and destroy them only once. And if you call the method many, many times, consider using a thread pool; otherwise whatever you gain from threads, you lose to bad thread management.
Also keep in mind that thread synchronization costs time as well, which is why I keep saying the operations must be independent, so that there are as few locks as possible.
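A minimal sketch of that "delayed" model in Delphi, assuming a hypothetical TCompressThread class and CompressBlock routine (illustrative names, not part of the code above):

```delphi
type
  // Hypothetical worker: compresses one block while the main thread
  // reads the next one from disk.
  TCompressThread = class(TThread)
  public
    InBuf, OutBuf: PChar;
    InSize, OutSize: Integer;
  protected
    procedure Execute; override;
  end;

procedure TCompressThread.Execute;
begin
  // CompressBlock is an assumed wrapper around the deflate calls
  OutSize := CompressBlock(InBuf, InSize, OutBuf);
end;

{ Main loop outline: only one block is in flight at a time, and all
  writes stay on the main thread, so output order is preserved.

    read block 0 into BufA
    for each remaining block:
      start the worker on BufA            // compression
      read the next block into BufB       // overlaps the compression
      Worker.WaitFor                      // synchronize
      write Worker.OutBuf, swap BufA/BufB }
```

The design keeps ordering trivial at the cost of at most one block of overlap, which matches the read/compress pair identified above as the only independent operations.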
developmentguru (President) commented:
To optimize the disk I/O, I would read the file into memory, compress it in memory, then write the results out to a file. Of course, this only works for files that fit in memory. Larger files could be broken into one-megabyte chunks and handled the same way, one chunk at a time. The more you can do in memory, the more can be read or written at once, and the faster it will be.
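A rough sketch of that chunked variant, assuming a hypothetical ProcessChunk routine that compresses and writes one chunk from memory:

```delphi
const
  ChunkSize = 1024 * 1024;  // one megabyte per read
var
  Buf: PChar;
  BytesRead: Integer;
begin
  Buf := AllocMem(ChunkSize);
  try
    repeat
      // read up to 1 MB, process it entirely in memory, then write
      BytesRead := FileIn.Read(Buf^, ChunkSize);
      if BytesRead > 0 then
        ProcessChunk(Buf, BytesRead);  // hypothetical compress-and-write step
    until BytesRead < ChunkSize;
  finally
    FreeMem(Buf);
  end;
end;
```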
2266180 commented:
that is called a file buffer :P
Dagan Hoover (Developer, Author) commented:
Something like this?

constructor TBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(64 * 1024);
  fBufferSize := 0;
end;
 
destructor TBufferStream.Destroy;
begin
  FreeAndNil(fMemoryStream);  // free owned objects before the inherited destructor
  inherited Destroy;
end;
 
procedure TBufferStream.DumpBuffer;
begin
  fMemoryStream.Position := 0;
  fFileStream.CopyFrom(fMemoryStream,fBufferSize);
end;
 
function TBufferStream.WriteBuffer(const Buffer; Count: Int64): Int64;
begin
  If Count + fMemoryStream.Position > fMemoryStream.Size Then
    begin
    DumpBuffer;
    fBufferSize := 0;
    fMemoryStream.Position := 0;
  end;
  Result := fMemoryStream.Write(Buffer,Count);
  Inc(fBufferSize,Result);
end;

2266180 commented:
Something like that, but move
    fBufferSize := 0;
    fMemoryStream.Position := 0;
into DumpBuffer, call DumpBuffer in Destroy, and in DumpBuffer check whether fBufferSize > 0 before operating. And don't forget to buffer the reads from the file as well, if those aren't already done in big chunks.
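Applied to the TBufferStream above, that advice might look like this (a sketch only; untested):

```delphi
procedure TBufferStream.DumpBuffer;
begin
  if fBufferSize > 0 then  // nothing to flush otherwise
  begin
    fMemoryStream.Position := 0;
    fFileStream.CopyFrom(fMemoryStream, fBufferSize);
    fBufferSize := 0;            // reset moved here from WriteBuffer
    fMemoryStream.Position := 0;
  end;
end;

destructor TBufferStream.Destroy;
begin
  DumpBuffer;                    // flush whatever is still pending
  FreeAndNil(fMemoryStream);
  inherited Destroy;
end;
```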
Dagan Hoover (Developer, Author) commented:
Completely forgot about the read! Will do.
2266180 commented:
One minor detail: you need to support writing (and reading) big chunks. It won't happen in your particular case, but you might want to reuse this code in other projects later, and it might happen then ;)
Sure, the code as it is will work just fine, with the side effect that your memory buffer will grow and thus the behaviour will change. It's what I call a logical error.

  If Count + fMemoryStream.Position > fMemoryStream.Size Then
   begin
    DumpBuffer;
    fBufferSize := 0;
    fMemoryStream.Position := 0;
    If Count + fMemoryStream.Position > fMemoryStream.Size Then
    begin
      Result := fFileStream.Write(Buffer,Count);
      exit;
    end;
  end;
Dagan Hoover (Developer, Author) commented:
Took me about five re-reads of that code before I realized why you did what you did. =P But I see why now.
2266180 commented:
:)

of course
    fBufferSize := 0;
    fMemoryStream.Position := 0;
shouldn't be there. copy-paste left-overs :P
Dagan Hoover (Developer, Author) commented:
Hmm, how would I take advantage of dual-core and hyper-threading processors (such as mine)? Wouldn't that mean I would have to thread anyway? I never hit more than 50% CPU in this app, meaning I never touch my second virtual core.
2266180 commented:
I never touched that subject, so I don't know, but I can google :P
If you don't want to waste time, you can make a separate small test app with two threads and let them rip doing nothing. Keep in mind that your system might freeze :)
Next would be to set the CPU affinity. A few articles I found:
http://www.delphi3000.com/articles/article_2605.asp?SK=thread
http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_23039948.html
http://dn.codegear.com/article/27267

But this will only make sense if you have CPU-intensive operations. Keep in mind that even if you write 100 threads, there is only one data bus leading to only one hard drive. So threads are not an answer for boosting the speed of a single I/O operation, but they are an answer, as already explained, when you can break the work up into multiple I/O operations and other operations which do not lock each other out.
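For what it's worth, pinning a thread to one CPU on Windows can be sketched with the Win32 SetThreadAffinityMask call (Windows unit); bit 0 of the mask is the first CPU:

```delphi
uses Windows;

procedure PinCurrentThreadToFirstCpu;
begin
  // SetThreadAffinityMask returns the previous mask, or 0 on failure
  if SetThreadAffinityMask(GetCurrentThread, 1) = 0 then
    RaiseLastOSError;
end;
```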
aikimark commented:
@ThievingSix

1. Have you done any performance profiling of your code?
2. Is your code I/O bound or CPU bound?

benchmark reference:
http://www.softpedia.com/get/System/Hard-Disk-Utils/AssignFile-TFileStream-Speed-Test.shtml

3. What do DeflateInit2(), Deflate(), and DeflateEnd() functions do?
Dagan Hoover (Developer, Author) commented:
This is a file compression routine. DeflateInit2(), Deflate(), and DeflateEnd() are part of the zlib compression/decompression library. No, I haven't done any performance profiling. Can you recommend a profiler for Delphi?
Dagan Hoover (Developer, Author) commented:
Here is what I have for the reading side of it. Looks like I screwed up somewhere, because I either get a stream read error or a longer completion time (which isn't supposed to happen =P). I've been fiddling with it for a while and will continue to do so unless someone here spots the problem first.
constructor TReadBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(64 * 1024);
  fBufferSize := 0;
end;
 
destructor TReadBufferStream.Destroy;
begin
  FreeAndNil(fMemoryStream);
  inherited Destroy;
end;
 
procedure TReadBufferStream.FillBuffer;
begin
  If fBufferSize >= fMemoryStream.Size Then Exit;
  fMemoryStream.Position := fBufferSize;
  fBufferSize := fBufferSize + fMemoryStream.CopyFrom(fFileStream,fMemoryStream.Size - fBufferSize);
  fMemoryStream.Position := fMemoryStream.Position - fBufferSize;
end;
 
function TReadBufferStream.ReadBuffer(var Buffer; Count: Int64): Int64;
var
  BytesRead : Int64;
  TotalBytesRead : Int64;
  OldPosition : Int64;
  BigBlock : PChar;
begin
  If Count > fMemoryStream.Size Then
    begin
    OldPosition := Count;
    TotalBytesRead := 0;
    BytesRead := fMemoryStream.Read(Buffer,fBufferSize);
    Inc(TotalBytesRead,BytesRead);
    Dec(fBufferSize,BytesRead);
    Dec(OldPosition,TotalBytesRead);
    BigBlock := AllocMem(OldPosition);
    Result := fFileStream.Read(BigBlock[0],OldPosition);
    PChar(Buffer) := TotalBytesRead + PChar(Buffer);
    MoveMemory(PChar(Buffer) + TotalBytesRead,BigBlock,OldPosition);
    Result := Result + TotalBytesRead;
    fBufferSize := 0;
    FillBuffer;
    Exit;
  end;
  If fBufferSize - Count < 0 Then
    begin
    FillBuffer;
  end;
  Result := fMemoryStream.Read(Buffer,Count);
  Dec(fBufferSize,Result);
end;

2266180 commented:
I don't understand why you are allocating memory. Your algorithm is way off: no memory allocation is needed (other than what the buffer uses, and that is done by the memory stream).
Keep in mind that reads and writes can happen alternately in the buffer.

I wanted to write the read function myself, but the whole thing needs to be changed; writing it the way it is done now will not work correctly. You need to keep a virtual position and size which correlate with the actual file position and size as if you were not using the buffer.
Why?
Because the buffer is a fixed 64 KB. When writing, this either overwrites a part of the file or extends it. When reading from the end of the file but still in the middle of the buffer, you must return 0, because even though the buffer has space, the file backing the buffer does not.

I'm sure you can write everything from scratch yourself, but if you need help let me know. If you need me to write it for you, I think I can squeeze it into my schedule sometime tonight (that's about 5-6 hours from now).
aikimark commented:
What version of zlib are you using? There have been performance boosts in some of the zlib wrappers.
Examples:
http://www.programmersheaven.com/download/24171/download.aspx
http://www.dellapasqua.com/delphizlib/

Once you allocate the correct output size, you should use stream-to-stream operations, bypassing intermediate variables.
Example:
http://coding.derkeiler.com/Archive/Delphi/borland.public.delphi.thirdpartytools.general/2004-01/0552.html
Dagan Hoover (Developer, Author) commented:
I'm using an updated version of the second zlib you posted, so I'm not going to get much better there. I'll try the stream-to-stream option after I fix my messy TReadBufferStream =P
Dagan Hoover (Developer, Author) commented:
My second attempt. It doesn't slow things down this time, but it doesn't speed them up either. Looks like there is only so much speed to gain.
constructor TReadBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(64 * 1024);
  fBufferSize := 0;
  FillBuffer;
end;
 
destructor TReadBufferStream.Destroy;
begin
  FreeAndNil(fMemoryStream);
  inherited Destroy;
end;
 
procedure TReadBufferStream.FillBuffer;
var
  BytesToRead : Int64;
  BytesRead : Int64;
  Buffer : PChar;
begin
  If fBufferSize < fMemoryStream.Size Then
    begin
    BytesToRead := fMemoryStream.Size - fBufferSize;
    fMemoryStream.Position := fBufferSize;
    GetMem(Buffer,BytesToRead);
    Try
      BytesRead := fFileStream.Read(Buffer^,BytesToRead);
      fMemoryStream.Write(Buffer^,BytesRead);
      Inc(fBufferSize,BytesRead);
      fMemoryStream.Position := 0;
    Finally
      FreeMem(Buffer,BytesToRead);
    end;
  end;
end;
 
function TReadBufferStream.ReadBuffer(var Buffer; Count: Int64): Int64;
var
  BytesToRead : Int64;
begin
  BytesToRead := Count;
  If BytesToRead > fBufferSize Then
    begin
    FillBuffer;
    If BytesToRead > fBufferSize Then
      begin
      BytesToRead := fBufferSize;
    end;
  end;
  Result := fMemoryStream.Read(Buffer,BytesToRead);
  Dec(fBufferSize,Result);
end;

Dagan Hoover (Developer, Author) commented:
Changed the buffer sizes to 128 KB, which offered a bit of a speed increase. Anything more didn't help much.
2266180 commented:
That is because you are still doing memory allocation while filling the buffer. I told you not to do that:
do it once, in the constructor.

The second step will be to do some profiling, as aikimark suggested. Measure the time/clock cycles of every major operation and see where you have big numbers; that's where you might be able to improve. There is no sense wasting time improving something that takes under 1 ms when you might have operations that take 10 ms or more.
Dagan Hoover (Developer, Author) commented:
      BytesRead := fFileStream.Read(Buffer^,BytesToRead);
      fMemoryStream.Write(Buffer^,BytesRead);
      Inc(fBufferSize,BytesRead);

That's the same thing TStream.CopyFrom does. Is there a more efficient way to move the data from the file stream to the memory stream?

Also, I used ProDelphi to profile it. It seems the 100k iterations are just going to take that much time. TReadBufferStream.FillBuffer() and TWriteBufferStream.DumpBuffer() take the most time as individual routines, but the bulk of the time is in the main data loop.
birbilis commented:
Try moving all loop invariants outside of the for loop. For example, why calculate CISO.BlockSize * 2 several times when the CISO structure is initialized once, outside the loop? Just define a local variable CISO_BlockSize2 := CISO.BlockSize * 2 before the for loop and, in the loop body, replace all occurrences of "CISO.BlockSize * 2" with "CISO_BlockSize2" (select the body and replace within the selection only). Similarly, avoid even field accesses like CISO.BlockSize inside the loop: assign such fields to local variables and use those instead (assuming the loop executes many times, not just a few).
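Applied to the compression loop from the question, the hoisting might look like this (a sketch; only the affected lines are shown, and the local variable names are illustrative):

```delphi
var
  BlockSize, CISO_BlockSize2, AlignShift: Integer;
begin
  // loop invariants computed once, before the loop
  BlockSize       := CISO.BlockSize;
  CISO_BlockSize2 := BlockSize * 2;
  AlignShift      := CISO.Align;
  for Block := 0 to CISOTotalBlocks - 1 do
  begin
    // ...
    ZStream.avail_out := CISO_BlockSize2;          // was CISO.BlockSize * 2
    ZStream.avail_in  := FileIn.Read(BlockBuffer1[0], BlockSize);
    // ...
    IndexBuffer[Block] := WritePos shr AlignShift; // was WritePos SHR CISO.Align
  end;
end;
```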
Dagan Hoover (Developer, Author) commented:
@cuily: I just realized what you were probably saying. Something like fFileStream.Read(PChar(fMemoryStream.Memory + fMemoryStream.Position)^,Count);?

@birbilis: Ah, makes sense. Will do.
2266180 commented:
You got it all wrong: TStream.CopyFrom does it in a non-speedy fashion.

I was saying more like this (copy-paste follows, with the needed modifications):

In ReadBuffer you will need to make a loop, in case Count is bigger than MaxBuffSize, so that you do multiple reads if needed.
Further optimization can then be done for that case: if you need to read more than MaxBuffSize, you read once, directly from the file, but not before reading whatever is left in the memory buffer.
You will need pointer operations for this. In case you don't know how, it's roughly like this:
read(pointer((@Buffer+delta))^, whatever count is left);
const MaxBuffSize = 64 * 1024; // ---- constant
 
constructor TReadBufferStream.Create(InputStream: TFileStream);
begin
  fFileStream := InputStream;
  fMemoryStream := TMemoryStream.Create;
  fMemoryStream.SetSize(MaxBuffSize);
  fBufferSize := 0;
  GetMem(Buffer,MaxBuffSize); // -------- this
  FillBuffer;
end;
 
destructor TReadBufferStream.Destroy;
begin
  FreeAndNil(fMemoryStream);
  FreeMem(Buffer,MaxBuffSize); //  ----------- this
  inherited Destroy;
end;
 
procedure TReadBufferStream.FillBuffer;
var
  BytesToRead : Int64;
  BytesRead : Int64;
  // Buffer is the PChar field allocated once in the constructor
begin
  If fBufferSize < fMemoryStream.Size Then
    begin
    BytesToRead := fMemoryStream.Size - fBufferSize;
    fMemoryStream.Position := fBufferSize;
    BytesRead := fFileStream.Read(Buffer^,BytesToRead);
    fMemoryStream.Write(Buffer^,BytesRead);
    Inc(fBufferSize,BytesRead);
    fMemoryStream.Position := 0;
  end;
end;
 
function TReadBufferStream.ReadBuffer(var Buffer; Count: Int64): Int64;
var
  BytesToRead : Int64;
begin
  BytesToRead := Count;
  If BytesToRead > fBufferSize Then
    begin
    FillBuffer;
    If BytesToRead > fBufferSize Then
      begin
      BytesToRead := fBufferSize;
    end;
  end;
  Result := fMemoryStream.Read(Buffer,BytesToRead);
  Dec(fBufferSize,Result);
end;

2266180 commented:
>> fFileStream.Read(PChar(fMemoryStream.Memory + fMemoryStream.Position)^,Count)

That is actually a pretty good idea: this way you eliminate one memory copy. Brilliant :) Dunno how I didn't think of this :|
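That idea, folded into FillBuffer, might look like this (a sketch; it reads straight into the TMemoryStream's backing memory and skips the intermediate copy entirely):

```delphi
procedure TReadBufferStream.FillBuffer;
var
  BytesRead: Int64;
begin
  if fBufferSize < fMemoryStream.Size then
  begin
    // read directly into the memory stream's buffer at the fill point
    BytesRead := fFileStream.Read(
      (PChar(fMemoryStream.Memory) + fBufferSize)^,
      fMemoryStream.Size - fBufferSize);
    Inc(fBufferSize, BytesRead);
    fMemoryStream.Position := 0;
  end;
end;
```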