Solved

single file size limitation?

Posted on 2002-06-24
Last Modified: 2012-06-21
Hello All...
   Is anyone aware of a single file size limitation for opening/reading a file in a Delphi app? I'm testing my app on Win 2000 and XP and it works correctly on reading and processing large files of up to around 4 gigs in size. Over and above that, it gives incorrect and wacky results (the results consistently show as a much smaller number than expected)... like some internal counter has been exceeded.

  I seem to be able to correctly process several files whose TOTAL exceeds 4 gigs, no problem... as long as each of those files is less than 4 gigs. But as soon as any one single file exceeds 4 gigs, something gets blown. Don't know whether to think this is an operating system limitation, or a Delphi limitation.

I'm using D3 Professional.

Thanks
   Shawn
Question by:aztec
28 Comments
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7106576
This is not a Delphi problem at all. The size limit of 4 Gig (2 Gig for FAT16 file systems) stems from the simple fact that the size is stored in a DWORD. It is a limit of many file systems.
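
To illustrate why an overflowing DWORD would show up as "a much smaller number than expected", here is a minimal sketch. It assumes a compiler with Int64 (Delphi 4 or later, or Free Pascal), and the helper name TruncateTo32 is made up for this example; it only demonstrates the arithmetic, not what any particular Delphi routine does internally.

```pascal
program WrapDemo;
{$APPTYPE CONSOLE}

// Keep only the low 32 bits of a 64-bit size, the way a DWORD-sized
// counter or API parameter would (hypothetical helper for illustration).
function TruncateTo32(Size: Int64): Cardinal;
begin
  Result := Cardinal(Size and $FFFFFFFF);
end;

var
  RealSize: Int64;
begin
  RealSize := Int64(5) * 1024 * 1024 * 1024;            // a 5 GiB file
  Writeln('Real size:      ', RealSize);                // 5368709120
  Writeln('32-bit reading: ', TruncateTo32(RealSize));  // 1073741824
end.
```

A 5 GiB file would therefore be reported as 1 GiB by anything that keeps the size in 32 bits, which matches the kind of symptom described in the question.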
 

Author Comment

by:aztec
ID: 7109225
I see... but my app can create OUTPUT files of sizes greater than 4 gigs...

How can this be?
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7109620
Then it simply may be that your program eats up 1000 % CPU. Show your code.
 

Author Comment

by:aztec
ID: 7109628
1000% ??

you mean 100%? How could using 100% of CPU result in a limitation of the file sizes I can read in?
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7109644
Sorry, posted to the wrong question.

NTFS does not have the size limit, but as long as you use the file functions with DWORD for position or size parameters you will be lost in your file. Sequential reading and writing works, but repositioning should be modulo 4 Gigs.
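
A rough sketch of repositioning past the 4 gig mark with the plain Win32 SetFilePointer call, which takes the offset as two 32-bit halves. This assumes a compiler with Int64 (so not Delphi 3) and the usual Windows unit declarations; SeekTo is a name invented for this example, and the file name comes from the command line.

```pascal
program Seek64;
{$APPTYPE CONSOLE}
uses
  Windows;

// Move the file pointer to an absolute 64-bit offset (hypothetical helper).
function SeekTo(h: THandle; Offset: Int64): Boolean;
var
  LowPart: DWORD;
  HighPart: Longint;
begin
  HighPart := Longint(Offset shr 32);          // upper 32 bits
  LowPart := SetFilePointer(h, Longint(Offset and $FFFFFFFF),
                            @HighPart, FILE_BEGIN);
  // $FFFFFFFF is also a valid low part, so failure needs GetLastError too
  Result := not ((LowPart = $FFFFFFFF) and (GetLastError <> NO_ERROR));
end;

var
  h: THandle;
begin
  h := CreateFile(PChar(ParamStr(1)), GENERIC_READ, FILE_SHARE_READ, nil,
                  OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if h = INVALID_HANDLE_VALUE then
    Halt(1);
  // Jump to the 5 GiB mark, which a single DWORD offset cannot express
  if SeekTo(h, Int64(5) * 1024 * 1024 * 1024) then
    Writeln('Seek succeeded')
  else
    Writeln('Seek failed, error ', GetLastError);
  CloseHandle(h);
end.
```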
 

Author Comment

by:aztec
ID: 7109673
ok, no problem!

Well, I am using 'BlockRead' to read in the data file. Maybe this could be the problem? If I switch to using maybe TFileStream instead it might solve this?
  Below is my code when I read in the file:

var buf : array[0..49151] of char;
    fromfile : file;

repeat
   buf:='';
   blockread(fromfile, buf, 1, numread);
   recsread:=recsread + 1;
   
   for z:=0 to 49151 do
   begin
     if ((ord(buf[z])>= 0) and (ord(buf[z]) <= 31)) or
        ((ord(buf[z])>= 127) and (ord(buf[z]) <= 255))  then
     begin
        if bigstring <> '' then find_valid;
     end
     else
     begin
       bigstring:=concat(bigstring, buf[z]);
       if length(bigstring) = 255 then find_valid;
     end;
   end;

   bytes_processed:=bytes_processed + 49152;
   if recsread mod 10 = 0 then
   begin
     hperc:=calc_status(totsizeoffile, bytes_processed);
     gauge14.progress:=hperc;
     Form1.update;
   end;

 until EOF(fromfile);


Thanks!
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7109758
You should drop down to the Win32 API. The function family CreateFile, ReadFile, WriteFile also contains functions for file size and seeking (GetFileSizeEx and SetFilePointerEx) which use Int64 to overcome the DWORD limit.
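
As a hedged sketch of getting a 64-bit file size through the Win32 API: the example below uses the older GetFileSize call with its high-DWORD output rather than GetFileSizeEx, on the assumption that GetFileSize is the one declared in older Delphi Windows units. It needs a compiler with Int64 (Delphi 4 or later, or Free Pascal); JoinSize is a made-up helper name, and the file name is taken from the command line.

```pascal
program BigFileSize;
{$APPTYPE CONSOLE}
uses
  Windows;

// Combine the two 32-bit halves reported by the API into one 64-bit value.
function JoinSize(LowPart, HighPart: DWORD): Int64;
begin
  Result := (Int64(HighPart) shl 32) or Int64(LowPart);
end;

var
  h: THandle;
  LowPart, HighPart: DWORD;
begin
  h := CreateFile(PChar(ParamStr(1)), GENERIC_READ, FILE_SHARE_READ, nil,
                  OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if h = INVALID_HANDLE_VALUE then
  begin
    Writeln('Cannot open file, error ', GetLastError);
    Halt(1);
  end;
  HighPart := 0;
  LowPart := GetFileSize(h, @HighPart);
  // $FFFFFFFF signals failure only when GetLastError is also non-zero
  if (LowPart = $FFFFFFFF) and (GetLastError <> NO_ERROR) then
    Writeln('GetFileSize failed, error ', GetLastError)
  else
    Writeln('Size in bytes: ', JoinSize(LowPart, HighPart));
  CloseHandle(h);
end.
```

Keeping the total size and any running byte counter in Int64 variables avoids the wraparound that a 32-bit counter would hit at 4 gigs.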
 

Author Comment

by:aztec
ID: 7109774
So TFileStream wouldn't help?

OK, how do I use these Win32 API file commands? Would you have some samples?

Are they faster/more efficient for reading/writing than the regular Read/Write statements for text files?
 

Author Comment

by:aztec
ID: 7109776
I see nothing for these commands in the Delphi 3 Help file...
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7109886
Get the Platform SDK from Microsoft. It contains all the help you need. The Delphi help is outdated because they have to license it from MS.

CreateFile, ReadFile etc. are the core functions for files. All Delphi file functions are ultimately built on them.
 

Author Comment

by:aztec
ID: 7109958
can I download this somewhere? Is it free?
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7110014
Yes, go to www.microsoft.com and search for "Platform SDK".
 
LVL 7

Accepted Solution

by:
Cynna earned 50 total points
ID: 7110539
aztec,

This free component (with source) enables you
to work with 64-bit file sizes:

http://17slon.com/gp/gp/gphugefile.htm

 

Author Comment

by:aztec
ID: 7112294
Robert - does the "Readfile" from the Platform SDK let you do like a "Readln"? Where it will read in one full line (ended by a CR/LF) at a time (without having to specify byte-count)? I am reading in text-files like this and this is essential for me.

Cynna - thanks for the component suggestion. But when I try to install it, it gives a compile error in the "GPHugeF" file - File Not Found "SysConst.dcu". I am using Delphi 3 Pro. Maybe I don't have this file?
  Also, will this component address the issues I mention above to Robert?

Thanks
   Shawn
 

Author Comment

by:aztec
ID: 7112376
Also Robert - there are like 7 different components of the Platform SDK... do I need them all, or is there only a specific one I need to install?
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7112708
It is always wise to have all possible help available.
ReadFile cannot read line by line; it only reads the number of bytes you ask for.
 

Author Comment

by:aztec
ID: 7112740
So can I do this then? How can I read text files > 4 gigs in a line by line fashion?
 
LVL 11

Expert Comment

by:robert_marquardt
ID: 7112872
1. Rethink your strategy. >4 Gigs line by line will take ages. Having unstructured textual information of that size indicates poor design.
2. Write a simple class with a big buffer where you pick the lines from. That is basic programming technique.
 

Author Comment

by:aztec
ID: 7113138
(1) But this is the structure of my data. Large files consisting of one email address on each line - which I must read in and process one at a time. I cannot change this.

(2) How do I do this and just pick out 1 address/line at a time (ie. up to the first CR/LF) ?
 
LVL 7

Expert Comment

by:Cynna
ID: 7113202
aztec,


> File Not Found "SysConst.dcu"

Well, D3 obviously doesn't have this unit. Try this:

1. Copy this to NotePad:

unit SysConst;

interface

resourcestring
  SUnknown = '<unknown>';
  SInvalidInteger = '''%s'' is not a valid integer value';
  SInvalidFloat = '''%s'' is not a valid floating point value';
  SInvalidDate = '''%s'' is not a valid date';
  SInvalidTime = '''%s'' is not a valid time';
  SInvalidDateTime = '''%s'' is not a valid date and time';
  STimeEncodeError = 'Invalid argument to time encode';
  SDateEncodeError = 'Invalid argument to date encode';
  SOutOfMemory = 'Out of memory';
  SInOutError = 'I/O error %d';
  SFileNotFound = 'File not found';
  SInvalidFilename = 'Invalid filename';
  STooManyOpenFiles = 'Too many open files';
  SAccessDenied = 'File access denied';
  SEndOfFile = 'Read beyond end of file';
  SDiskFull = 'Disk full';
  SInvalidInput = 'Invalid numeric input';
  SDivByZero = 'Division by zero';
  SRangeError = 'Range check error';
  SIntOverflow = 'Integer overflow';
  SInvalidOp = 'Invalid floating point operation';
  SZeroDivide = 'Floating point division by zero';
  SOverflow = 'Floating point overflow';
  SUnderflow = 'Floating point underflow';
  SInvalidPointer = 'Invalid pointer operation';
  SInvalidCast = 'Invalid class typecast';
  SAccessViolation = 'Access violation at address %p. %s of address %p';
  SStackOverflow = 'Stack overflow';
  SControlC = 'Control-C hit';
  SPrivilege = 'Privileged instruction';
  SOperationAborted = 'Operation aborted';
  SException = 'Exception %s in module %s at %p.'#$0A'%s%s';
  SExceptTitle = 'Application Error';
  SInvalidFormat = 'Format ''%s'' invalid or incompatible with argument';
  SArgumentMissing = 'No argument for format ''%s''';
  SInvalidVarCast = 'Invalid variant type conversion';
  SInvalidVarOp = 'Invalid variant operation';
  SDispatchError = 'Variant method calls not supported';
  SReadAccess = 'Read';
  SWriteAccess = 'Write';
  SResultTooLong = 'Format result longer than 4096 characters';
  SFormatTooLong = 'Format string too long';
  SVarArrayCreate = 'Error creating variant array';
  SVarNotArray = 'Variant is not an array';
  SVarArrayBounds = 'Variant array index out of bounds';
  SExternalException = 'External exception %x';
  SAssertionFailed = 'Assertion failed';
  SIntfCastError = 'Interface not supported';
  SSafecallException = 'Exception in safecall method';
  SAssertError = '%s (%s, line %d)';
  SAbstractError = 'Abstract Error';
  SModuleAccessViolation = 'Access violation at address %p in module ''%s''. %s of address %p';
  SCannotReadPackageInfo = 'Cannot access package information for package ''%s''';
  sErrorLoadingPackage = 'Can''t load package %s.'#13#10'%s';
  SInvalidPackageFile = 'Invalid package file ''%s''';
  SInvalidPackageHandle = 'Invalid package handle';
  SDuplicatePackageUnit = 'Cannot load package ''%s.''  It contains unit ''%s,''' +
    ';which is also contained in package ''%s''';
  SWin32Error = 'Win32 Error.  Code: %d.'#10'%s';
  SUnkWin32Error = 'A Win32 API function failed';
  SNL = 'Application is not licensed to use this feature';

  SShortMonthNameJan = 'Jan';
  SShortMonthNameFeb = 'Feb';
  SShortMonthNameMar = 'Mar';
  SShortMonthNameApr = 'Apr';
  SShortMonthNameMay = 'May';
  SShortMonthNameJun = 'Jun';
  SShortMonthNameJul = 'Jul';
  SShortMonthNameAug = 'Aug';
  SShortMonthNameSep = 'Sep';
  SShortMonthNameOct = 'Oct';
  SShortMonthNameNov = 'Nov';
  SShortMonthNameDec = 'Dec';

  SLongMonthNameJan = 'January';
  SLongMonthNameFeb = 'February';
  SLongMonthNameMar = 'March';
  SLongMonthNameApr = 'April';
  SLongMonthNameMay = 'May';
  SLongMonthNameJun = 'June';
  SLongMonthNameJul = 'July';
  SLongMonthNameAug = 'August';
  SLongMonthNameSep = 'September';
  SLongMonthNameOct = 'October';
  SLongMonthNameNov = 'November';
  SLongMonthNameDec = 'December';

  SShortDayNameSun = 'Sun';
  SShortDayNameMon = 'Mon';
  SShortDayNameTue = 'Tue';
  SShortDayNameWed = 'Wed';
  SShortDayNameThu = 'Thu';
  SShortDayNameFri = 'Fri';
  SShortDayNameSat = 'Sat';

  SLongDayNameSun = 'Sunday';
  SLongDayNameMon = 'Monday';
  SLongDayNameTue = 'Tuesday';
  SLongDayNameWed = 'Wednesday';
  SLongDayNameThu = 'Thursday';
  SLongDayNameFri = 'Friday';
  SLongDayNameSat = 'Saturday';

implementation

end.


2. Save that text as 'sysconst.pas' in the same folder
   you extracted GpHugeF.pas

3. Now it should compile OK.




>Also, will this component address the issues I mention above to Robert?

Yes it will.



> (2) How do I do this and just pick out 1 address/line at a time (ie. up to the first CR/LF) ?


Use BlockRead method of TGpHugeFile to read file chunk by chunk, and search for CR/LF inside every chunk.
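
A minimal sketch of the chunk-by-chunk idea, independent of which read routine supplies the data. The tricky part is a line that straddles two chunks, so the unfinished tail of each chunk is carried over to the next one. FeedChunk and its parameters are names made up for this example, and the chunks here are hard-coded strings standing in for whatever the file read returns.

```pascal
program ChunkLines;
{$APPTYPE CONSOLE}
uses
  Classes;

// Append one chunk of raw file data to Carry, move every complete line
// into Lines, and leave the unfinished tail in Carry for the next chunk.
procedure FeedChunk(const Chunk: string; var Carry: string; Lines: TStrings);
var
  i, LineStart: Integer;
begin
  Carry := Carry + Chunk;
  LineStart := 1;
  for i := 1 to Length(Carry) do
    if Carry[i] = #10 then
    begin
      // Drop the LF and, if present, the CR before it
      if (i > LineStart) and (Carry[i - 1] = #13) then
        Lines.Add(Copy(Carry, LineStart, i - LineStart - 1))
      else
        Lines.Add(Copy(Carry, LineStart, i - LineStart));
      LineStart := i + 1;
    end;
  Carry := Copy(Carry, LineStart, MaxInt);
end;

var
  Lines: TStringList;
  Carry: string;
  i: Integer;
begin
  Lines := TStringList.Create;
  Carry := '';
  // Two chunks; the second address is split across the chunk boundary
  FeedChunk('alice@example.com'#13#10'bob@exa', Carry, Lines);
  FeedChunk('mple.com'#13#10'carol', Carry, Lines);
  for i := 0 to Lines.Count - 1 do
    Writeln(Lines[i]);                  // alice@example.com, bob@example.com
  Writeln('Left over: ', Carry);        // Left over: carol
  Lines.Free;
end.
```

Whatever remains in Carry after the last chunk is the final line of a file that does not end with CR/LF, so it should be processed once more at end of file. In a real program you would probably handle each line as it is found instead of collecting them all in a list.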
 

Author Comment

by:aztec
ID: 7113224
ok Cynna, thanks.

For fastest file reading, what do you recommend as an efficient 'chunk-size'? 2 megs? 4 megs?

Thanks
 
LVL 7

Expert Comment

by:Cynna
ID: 7114067
aztec,

Well generally, the more - the better. It depends on
how much free memory you expect on your target system.
For example, I'd use 8 MB as a starting point on my system...

Anyway, you could very easily change several values and
measure how it impacts your performance, so you could
make best judgement for yourself.
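
One possible way to do that measuring, as a rough sketch: read the same file sequentially with a few buffer sizes and time each pass with GetTickCount. TimeRead and the list of sizes are made up for this example, the file name comes from the command line, and it assumes a Delphi with Int64 and TFileStream (or Free Pascal in Delphi mode).

```pascal
program ChunkTiming;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils, Classes;

// Read the whole file sequentially with the given buffer size.
// Returns elapsed milliseconds; TotalRead receives the byte count.
function TimeRead(const FileName: string; BufSize: Integer;
  var TotalRead: Int64): Cardinal;
var
  fs: TFileStream;
  Buf: Pointer;
  Start: Cardinal;
  n: Longint;
begin
  TotalRead := 0;
  GetMem(Buf, BufSize);
  try
    fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
    try
      Start := GetTickCount;
      repeat
        n := fs.Read(Buf^, BufSize);   // a real program would parse here
        TotalRead := TotalRead + n;
      until n = 0;
      Result := GetTickCount - Start;
    finally
      fs.Free;
    end;
  finally
    FreeMem(Buf);
  end;
end;

const
  Sizes: array[0..3] of Integer =
    (64 * 1024, 1024 * 1024, 4 * 1024 * 1024, 8 * 1024 * 1024);
var
  i: Integer;
  Total: Int64;
  Ms: Cardinal;
begin
  for i := Low(Sizes) to High(Sizes) do
  begin
    Ms := TimeRead(ParamStr(1), Sizes[i], Total);
    Writeln(Sizes[i] div 1024, ' KB buffer: ', Ms, ' ms for ', Total, ' bytes');
  end;
end.
```

Keep in mind that the operating system caches file data, so the first pass over a file can look slower than later ones; running the test more than once, or on a file much larger than RAM, gives a fairer comparison.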
 

Author Comment

by:aztec
ID: 7116814
OK Cynna, I did what you suggested for the sysconst.pas, now when installing the HugeFile component, I get a new error on this line in the GPHugeF.pas file:

var
  start : int64

"Error: Undeclared identifier: 'int64'"

Any suggestions? I am using Delphi 3 Pro.

Thanks
   Shawn
 
LVL 7

Expert Comment

by:Cynna
ID: 7117326
aztec,

D3 - well that's a kind of a problem...
Int64 wasn't introduced until D4, I think.

You might try replacing Int64 with Comp. But the
problem might be that Comp is float, while Int64 is
(surprise, surprise...) integer. This can create
problems if GPHugeF.pas uses operations like div
(specific to integers) on Int64 types. You might
try tweaking this code a bit, if you get any other
errors.

Other than that I'm afraid I can't think of anything
simple, considering your platform. Additional problem
is I have D5, and can't really test D3 specific code....

If you get stuck again, post it, and I'll try helping you
a bit later.
 

Author Comment

by:aztec
ID: 7117468
Thanks for the suggestion Cynna. I made the change of int64 to comp (for 2 variables - start and stop), but it gave a new compile error for this line:

stop := stop + $100000000;

"Integer expression too large"

Plus there were other compile errors everywhere "Win32Check" was mentioned. :-(

Am I stuck? Is it possible to email directly the creator of this component?
 
LVL 7

Expert Comment

by:Cynna
ID: 7121602
Well, aztec, you're pretty close to being stuck, as far as I can see...

But, try one more final idea:

procedure TForm1.Button1Click(Sender: TObject); // read file
const  cBufSize = 2048;
var Buffer   : Array [0..cBufSize] of Char;
    hRead    : THandle;
    FileName : String;
    read     : DWORD;
begin
   // Place your full file name here:
   FileName:='Unit1.pas';
   hRead := CreateFile( PChar(FileName),
                        GENERIC_READ,
                        FILE_SHARE_READ or FILE_SHARE_WRITE,
                        Nil,
                        OPEN_EXISTING,
                        FILE_ATTRIBUTE_NORMAL,
                        0 );
   // Stop early if the file could not be opened:
   if hRead = INVALID_HANDLE_VALUE then begin
      ShowMessage('Cannot open ' + FileName);
      Exit;
   end;
   // At end of file ReadFile still succeeds, but with read = 0:
   while ReadFile(hRead, Buffer, cBufSize, read, nil) and (read > 0) do begin
         // Zero-terminate right after the bytes actually read, so a
         // short final block does not drag along stale data:
         Buffer[read]:=Chr(0);
         // ----------- Process Buffer: ------------------
         // .... your block-parsing code here ...
         // ----------------------------------------------
         Memo1.Text:=StrPas(Buffer); // just a demo
         Application.ProcessMessages; Sleep(1);
   end;
   CloseHandle(hRead);
end;


This code uses WinAPI native file handling operations.
It relies on sequential read, so it doesn't touch
your D3 limitation.
I used it just as a demo. You should use your large
file name in var FileName.
If you see the end of your large file in Memo1, when it
finishes, it works OK. If it works OK, write your block
parsing code instead of Memo1.Text:=... part and you're done.
If it doesn't, I'm afraid I'm all out of quick-ideas for this one...


Good luck!

 

Author Comment

by:aztec
ID: 7122489
Thanks Cynna - I just went ahead and upgraded to the D6 Personal Edition ...everything works fine now!

(thanks for the final D3 suggestion however!)

Shawn
 
LVL 7

Expert Comment

by:Cynna
ID: 7123280
> ..I just went ahead and upgraded to the D6 Personal Edition...

Smart move.
:)