Solved

Efficient way to split a file into segments?

Posted on 2011-09-18
Last Modified: 2016-09-29
I retrieve a file's size from the internet; let's say the file is 17 MB. (The size will vary in future use, so the algorithm must handle any size.)

What is the most efficient way to split it into segments? I have a fixed number of threads, each of which downloads from AOffset to BOffset. My current algorithm divides the file size by the number of connections the user chose.

This is the code that does the split:
 
var
  nLoop,
  nEndDiv,
  nConnections      : Integer;

  bStartZero        : Boolean;

  i64FileSize,
  i64Start,
  i64End,
  i64End2           : Int64;

  nConnections := StrToInt( SpinBox1.Text );
  nEndDiv := nConnections;

  bStartZero := True;

  for nLoop := 1 to nConnections do
  begin
    // first pass only: the first segment starts at offset 0
    if bStartZero then i64Start := 0;
    bStartZero := False;

    // end offset is the file size divided by nEndDiv,
    // which counts down from nConnections to 1
    i64End := i64FileSize div nEndDiv;
    dec( nEndDiv );

    i64End2 := i64End;

    with TFetchDataThread.Create(
      alabel[nLoop], apbar[nLoop], hOpenFile[nLoop], hInetFile[nLoop], i64Start, i64End ) do
    begin
      Priority := tpNormal;
      Start;
    end;

    // the next segment starts right after this one ends
    i64Start := i64End2 + 1;
  end;


 

This is the code that downloads:
 
procedure TFetchDataThread.Execute;
type
  TypeByteArray = array [1..1024] of Byte;
var
  Buffer         : TypeByteArray;
  BytesToRead    : DWORD;
  BytesToWrite   : DWORD;

  BufferLen,
  BytesWritten   : DWORD;
  EndProgress    : Cardinal;
  i: Integer;
begin
  FProgressBar.Min := Extended( FStartOffset + 0.0 );
  FProgressBar.Max := Extended( FEndOffset   + 0.0 );

  InternetSetFilePointer( FInetFile, FStartOffset, nil, FILE_BEGIN, 0 );

  EndProgress := SetFilePointer( FDestFile, FEndOffset, nil, FILE_BEGIN );

  SetFilePointer( FDestFile, FStartOffset, nil, FILE_BEGIN );

  BytesToRead := SizeOf( Buffer );
  BytesToWrite := SizeOf(Buffer);

  try
    repeat

      InternetReadFile(
       FInetFile, @Buffer, BytesToRead, BufferLen );

      LockFile(
       FDestFile, FStartOffset, 0, BytesToRead, 0 );

      if ( FCurrentOffset > EndProgress ) then
      WriteFile(
       FDestFile, Buffer, BytesToWrite, BytesWritten, nil )
      else
      WriteFile(
       FDestFile, Buffer, BytesToWrite, BytesWritten, nil );

      UnlockFile(
       FDestFile, FStartOffset, 0, BytesToRead, 0 );

      FCurrentOffset :=
       SetFilePointer( FDestFile, 0, nil, FILE_CURRENT );

      FProgressBar.Value := FCurrentOffset;

      Synchronize( UpdateGUI );
    until FCurrentOffset >= EndProgress;
  finally
    CloseHandle( FDestFile );
    InternetCloseHandle( FInetFile );
  end;
end;

Question by:rotem156
1 Comment
 

Accepted Solution

by: epasquier (earned 500 total points)
ID: 36563416
Well, it's pretty obvious that each thread should handle its last block differently. In your current code you read 1024 bytes regardless of where you are in the file.

For the same example of 40,528,057 bytes per segment, that would mean
39,578 loops reading 1024 bytes
and one final loop reading 185 bytes.
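
The two figures above can be checked with integer division. This is only a small console sketch; the program and constant names are made up for illustration and are not part of the original code.

```pascal
program LastBlockCheck;
{$APPTYPE CONSOLE}

const
  // Segment size from the example above; BufferSize matches the 1024-byte buffer.
  SegmentSize : Int64 = 40528057;
  BufferSize  = 1024;

begin
  // Number of reads that fill the whole buffer.
  WriteLn('full reads      : ', SegmentSize div BufferSize);
  // Bytes left over for the final, shorter read.
  WriteLn('last read bytes : ', SegmentSize mod BufferSize);
end.
```

It should print 39578 full reads and 185 bytes for the last read.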

I suppose one quick fix would be:
    repeat
//== FIX: never ask for more bytes than are left in this thread's segment
      BytesToRead := EndProgress - FCurrentOffset;
      if BytesToRead > 1024 then BytesToRead := 1024;
//== END FIX
      InternetReadFile(
       FInetFile, @Buffer, BytesToRead, BufferLen );

      // zero bytes read means no more data: leave instead of looping forever
      if BufferLen = 0 then Break;

      LockFile(
       FDestFile, FStartOffset, 0, BytesToRead, 0 );

//== WHAT IS THAT ALL ABOUT ?? (both branches do exactly the same thing)
//      if ( FCurrentOffset > EndProgress ) then
//      WriteFile(
//       FDestFile, Buffer, BytesToWrite, BytesWritten, nil )
//      else
//== ???
      // write only the bytes actually received, not a fixed 1024
      WriteFile(
       FDestFile, Buffer, BufferLen, BytesWritten, nil );

      UnlockFile(
       FDestFile, FStartOffset, 0, BytesToRead, 0 );

      FCurrentOffset :=
       SetFilePointer( FDestFile, 0, nil, FILE_CURRENT );

      FProgressBar.Value := FCurrentOffset;

      Synchronize( UpdateGUI );
    until FCurrentOffset >= EndProgress;
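
As for the split itself, one possible approach (a sketch, not something taken from the code above) is to give every thread `FileSize div Count` bytes and hand the `FileSize mod Count` leftover bytes, one each, to the first segments. Segment sizes then differ by at most one byte. `SegmentRange` and its parameters are hypothetical names, and the ranges are assumed to be inclusive, as in the question's `i64Start := i64End2 + 1`.

```pascal
program EvenSplit;
{$APPTYPE CONSOLE}

// Hypothetical helper: computes the inclusive byte range of segment Index
// (0-based) when FileSize bytes are shared between Count threads.
procedure SegmentRange(FileSize: Int64; Count, Index: Integer;
  var FirstByte, LastByte: Int64);
var
  Base, Extra: Int64;
begin
  Base  := FileSize div Count;   // bytes every segment gets
  Extra := FileSize mod Count;   // leftover bytes, one for each of the first Extra segments
  if Index < Extra then
  begin
    FirstByte := Index * (Base + 1);
    LastByte  := FirstByte + Base;        // Base + 1 bytes in this segment
  end
  else
  begin
    FirstByte := Index * Base + Extra;
    LastByte  := FirstByte + Base - 1;    // Base bytes in this segment
  end;
end;

var
  i: Integer;
  A, B: Int64;
begin
  // 17 MB (17 * 1024 * 1024 bytes) split between 3 connections.
  for i := 0 to 2 do
  begin
    SegmentRange(17825792, 3, i, A, B);
    WriteLn('segment ', i, ': ', A, ' .. ', B);
  end;
end.
```

The last segment ends at `FileSize - 1`, the final valid offset. If `FileSize` is smaller than `Count`, some segments come out empty (`LastByte < FirstByte`), so the caller would want to cap the number of connections first.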
