aztec
asked on
Delphi thinks it's EOF
Hi ...
My app reads plain text files. In one of these text files, there happened to be a weird-looking character; it looked like a vertical bar. It was near the beginning of the file, and the file was quite large. When my program reached this funny character, it stopped reading the file and appeared to take this character as EOF. Ever seen this before? How do I prevent something like this from happening?
Cheers
Shawn Halfpenny
drumme59@sprint.ca
If it's read as a plain text file, I think Ctrl+Z (character 26) means EOF. Maybe you can prevent this by reading it as a binary file.
Where did this specific file come from ?
If it's supposed to be plain text, why are there special characters in it?
ASKER
Hello..
Regarding Odie's comment...how do I read a text file as a binary file?
Regarding Mvz's answer, my file is not a file of Byte, but a text file. Will FileSize still work? And furthermore, how do I use FileSize to determine that I'm at end-of-file? Like this:
if FileSize(f) = FilePos(f)...
..is that correct?
Cheers
Shawn
FileSize(f) = FilePos(f) should do the trick.
If there are EOF chars in the middle of the file, it is not a text file. You should read it as a binary file.
Here's an example of working with binary files:
{ copy source file to destination file }
function FileCopy(var Source, Destination: String; EQuery: boolean): boolean;

implementation

uses
  SysUtils, Dialogs;

function FileCopy(var Source, Destination: String; EQuery: boolean): boolean;
var
  Src, Dest, len: Integer;
  size: LongInt;
  buffer: array [0..4095] of Byte;
begin
  if not FileExists(Source) then begin
    ShowMessage('Source File Not Found');
    Result := False;
    exit;
  end;
  if FileExists(Destination) then begin
    { dest file exists, if equery opt for rename }
    if EQuery then begin
      { future: rename destination file }
      Result := False;
      exit;
    end
    else begin
      { copy over old existing file }
      DeleteFile(Destination);
    end;
  end;
  { copy source to destination }
  Result := False;
  if Source <> Destination then begin
    Src := FileOpen(Source, fmOpenRead);
    if Src >= 0 then begin { successfully opened }
      { get file size }
      size := FileSeek(Src, 0, 2);
      FileSeek(Src, 0, 0);
      { create new file }
      Dest := FileCreate(Destination);
      if Dest >= 0 then begin { successful creation }
        while size > 0 do begin
          len := FileRead(Src, buffer, SizeOf(buffer));
          if len <= 0 then break; { read error: stop instead of looping forever }
          FileWrite(Dest, buffer, len);
          size := size - len;
        end;
        { keep date and attribute values }
        FileSetDate(Dest, FileGetDate(Src));
        FileClose(Dest);
        FileSetAttr(Destination, FileGetAttr(Source));
        Result := (size = 0); { True only if every byte was copied }
      end;
      FileClose(Src);
    end;
  end
  else begin
    { attempted to copy source to itself }
    ShowMessage('FILE COPY ERROR: Cannot Copy File to itself');
  end;
end;
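On the FileSize(f) = FilePos(f) question: as far as I know, FileSize and FilePos are only defined for typed and untyped files, not for Text variables, so the file has to be opened as something other than Text first. The following is only a sketch under that assumption. It opens the file as an untyped file with a record size of 1, so both functions count bytes, and it walks the file until the position reaches the size. WriteSample, CountCtrlZ and the file name sample.txt are made-up names for illustration.

```pascal
program EofByPosition;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Hypothetical helper: writes a small sample file with a Ctrl+Z in the middle. }
procedure WriteSample(const FileName: string);
var
  F: file;
  Data: AnsiString;
begin
  Data := 'Hi'#13#10#26'there'#13#10;
  AssignFile(F, FileName);
  Rewrite(F, 1);                         { record size 1 = plain bytes }
  try
    BlockWrite(F, Data[1], Length(Data));
  finally
    CloseFile(F);
  end;
end;

{ Hypothetical helper: counts the Ctrl+Z (26) bytes in a file. The loop ends
  when FilePos reaches FileSize, not when a Ctrl+Z is seen. }
function CountCtrlZ(const FileName: string): Integer;
var
  F: file;
  Buf: array[0..4095] of Byte;
  Got, I, OldMode: Integer;
begin
  Result := 0;
  AssignFile(F, FileName);
  OldMode := FileMode;
  FileMode := fmOpenRead;                { let Reset open the file read-only }
  Reset(F, 1);
  FileMode := OldMode;
  try
    while FilePos(F) < FileSize(F) do
    begin
      BlockRead(F, Buf, SizeOf(Buf), Got);
      if Got = 0 then
        Break;                           { nothing read: stop }
      for I := 0 to Got - 1 do
        if Buf[I] = 26 then
          Inc(Result);
    end;
  finally
    CloseFile(F);
  end;
end;

begin
  WriteSample('sample.txt');
  WriteLn('Ctrl+Z bytes found: ', CountCtrlZ('sample.txt'));
end.
```

The point is only that a Ctrl+Z in the data no longer ends the loop; the bytes after it are still read.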
ASKER
You guys leave a lot of unanswered questions:
(1) If I use the Fileseek and Filepos things to determine EOF, then my file has to be a File of Byte, not a text file, right? Then how do I read records from a File of Byte? Does "readln" work?
(2) If I convert my text file to binary - as mvz outlines - then what do I do then? How do I read it? With "readln" ?
Regards,
Shawn Halfpenny
drumme59@sprint.ca
I still don't understand why a text-file has EOF inside...
ASKER
Itamar... I know...it shouldn't, but sometimes some crap gets in text files for no apparent reason. I want to have my program be on the lookout for that.
Shawn
ASKER
mvz...are you there???
Shawn Halfpenny
The 'crap' you get in your text files is the reason why you DON'T have text files. In my opinion, the definition of a text file is a file with no bytes below value 32 or above 128.
Only 13 (carriage return) and 10 (line feed) are allowed.
If you have even one byte value below 32 (and you are talking about the EOF, which is 26), you have a binary file, and you should treat it as one.
A big disadvantage is that you cannot use readln; you have to use FileRead() instead. This function reads a predefined number of characters. Be aware that CR and LF characters are treated as 'normal' characters and could be in the middle of your buffer.
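One way around the "no readln" problem, sketched here under some assumptions: read the whole file as bytes, drop the control bytes, and only then split the result into lines, so each variable-length record still comes out whole. This assumes the file fits in memory and that TFileStream and TStringList from the Classes unit are available. LoadCleanText, WriteSample and sample.txt are hypothetical names, and keeping the tab character (9) is my own choice, not part of the definition above.

```pascal
program CleanLines;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

{ Hypothetical helper: writes a sample file with a Ctrl+Z inside a record. }
procedure WriteSample(const FileName: string);
var
  Stream: TFileStream;
  Data: AnsiString;
begin
  Data := 'first record'#13#10'second'#26' record'#13#10'third';
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.WriteBuffer(Data[1], Length(Data));
  finally
    Stream.Free;
  end;
end;

{ Hypothetical helper: returns the file contents with every byte below 32
  removed, except tab (9), line feed (10) and carriage return (13). }
function LoadCleanText(const FileName: string): AnsiString;
var
  Stream: TFileStream;
  Raw: AnsiString;
  I, Kept: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Raw, Stream.Size);
    if Length(Raw) > 0 then
      Stream.ReadBuffer(Raw[1], Length(Raw));   { binary read: 26 is just a byte }
  finally
    Stream.Free;
  end;
  SetLength(Result, Length(Raw));
  Kept := 0;
  for I := 1 to Length(Raw) do
    if (Raw[I] >= #32) or (Raw[I] in [#9, #10, #13]) then
    begin
      Inc(Kept);
      Result[Kept] := Raw[I];
    end;
  SetLength(Result, Kept);
end;

var
  Lines: TStringList;
  I: Integer;
begin
  WriteSample('sample.txt');
  Lines := TStringList.Create;
  try
    { Assigning Text splits the cleaned data at the CR/LF line breaks. }
    Lines.Text := string(LoadCleanText('sample.txt'));
    for I := 0 to Lines.Count - 1 do
      WriteLn(I + 1, ': ', Lines[I]);
  finally
    Lines.Free;
  end;
end.
```

For files too large to load at once, the same filtering would have to be done chunk by chunk, carrying any partial line over to the next chunk.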
ASKER
Couldn't I use the BlockRead procedure instead of the fileRead() function? Then I could read in one record at a time, analyze its contents checking for weird characters, clean it up if it contains any, then output it using BlockWrite?
ASKER
Am trying the BlockRead thing and am following the example in the on-line help faithfully, but still get a "type mismatch" error...here's my code:
procedure cleanfile(infilestr:string ; outfilestr:string);
label doagain, ckat;
var
  fromf : File;
  hinrec: array[1..2048] of Char;
  inrec: array[1..2048] of Char;
  numread, numwritten : word;
  x,i, startindex, endindex : integer;
  foundbad : boolean;
  hstr : string;
begin
  assignfile(fromf, infilestr);
  reset(fromf, 2048);
  assignfile(outfile, outfilestr);
  rewrite(outfile);
  repeat
    BlockRead(fromf, hinrec, 1, numread);
    inrec := strlower(hinrec);
.. this statement (inrec:=strlower(hinrec)) generates a type mismatch error. What am I doing or declaring wrong? I pretty well follow exactly what the "BlockRead" example says in the online help!
Cheers
Shawn
ASKER
mvz...are you there???
Shawn
I think you should not use BlockRead unless you are sure that your file sizes are always multiples of your block size.
Here's my cleanfile suggestion:
(It deletes EOF characters and converts uppercase to lowercase)
function cleanfile(infilestr: string; outfilestr: string): boolean;
var
  i, Src, Dest, size, len, Count: integer;
  bufferIn, bufferOut: array[0..2048] of Byte;
begin
  result := false;
  Src := FileOpen(infilestr, fmOpenRead);
  if Src >= 0 then begin
    try
      size := FileSeek(Src, 0, 2); // filesize
      FileSeek(Src, 0, 0); // goto top
      Dest := FileCreate(outfilestr);
      if Dest >= 0 then begin
        try
          while size > 0 do begin
            len := FileRead(Src, bufferIn, sizeof(bufferIn));
            if len <= 0 then break; // read error: stop instead of looping forever
            Count := 0;
            for i := 0 to len - 1 do begin
              if bufferIn[i] <> 26 then begin // skip EOF (Ctrl+Z) bytes
                if (Chr(bufferIn[i]) >= 'A') and (Chr(bufferIn[i]) <= 'Z') then
                  bufferOut[Count] := bufferIn[i] + 32
                else
                  bufferOut[Count] := bufferIn[i];
                Inc(Count);
              end;
            end;
            if Count > 0 then FileWrite(Dest, bufferOut, Count);
            size := size - len;
          end;
          result := (size = 0); // true only if the whole file was processed
        finally
          FileClose(Dest);
        end;
      end;
    finally
      FileClose(Src);
    end;
  end;
end;
ASKER
Thanks for the function, mvz, but I think I need to use BlockRead because I must read in one record at a time (the record length is variable), and BlockRead seems to allow for this (does FileRead?). If I use FileRead, I will wind up splitting records by reading in a fixed number of bytes (2048) at a time. Do you see what I mean? Can you please explain to me why I get a type mismatch error on the line:
inrec := strlower(hinrec);
..as I mentioned before when trying to use BlockRead?
FileRead does let you specify the number of bytes to read (third parameter).
It also reports back the exact number of bytes read. BlockRead only reports the number of complete records read, through its optional fourth parameter. I think this is important when reaching the end of the file.
StrLower expects a string as parameter, not an array of Char (which hinrec is). That is why Delphi reports an error.
Instead you can use something like:
for i := 1 to 2048 do
  if (hinrec[i] >= 'A') and (hinrec[i] <= 'Z') then
    hinrec[i] := Chr(Ord(hinrec[i]) + 32);
Greetings,
MvZ
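For what it's worth, here is a rough sketch of the read, clean, BlockWrite loop with BlockRead itself. It assumes the file is opened with Reset(f, 1), so the record size is one byte; the optional fourth parameter of BlockRead then gives the number of bytes actually read, and a file whose size is not a multiple of the buffer size is not a problem. Note that BlockRead still reads fixed-size blocks, not lines, so a variable-length record can straddle two blocks. StripCtrlZ, WriteSample and the file names are hypothetical.

```pascal
program BlockReadClean;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Hypothetical helper: writes a small sample file containing one Ctrl+Z. }
procedure WriteSample(const FileName: string);
var
  F: file;
  Data: AnsiString;
begin
  Data := 'line one'#13#10'li'#26'ne two'#13#10;
  AssignFile(F, FileName);
  Rewrite(F, 1);
  try
    BlockWrite(F, Data[1], Length(Data));
  finally
    CloseFile(F);
  end;
end;

{ Hypothetical helper: copies InName to OutName without the Ctrl+Z (26)
  bytes and returns how many bytes were dropped. }
function StripCtrlZ(const InName, OutName: string): Integer;
var
  Src, Dst: file;
  Buf: array[0..2047] of Byte;
  Got, Kept, I: Integer;
begin
  Result := 0;
  AssignFile(Src, InName);
  AssignFile(Dst, OutName);
  Reset(Src, 1);                    { record size 1: all counts are in bytes }
  try
    Rewrite(Dst, 1);
    try
      repeat
        { Got receives the number of bytes actually read (0 at end of file). }
        BlockRead(Src, Buf, SizeOf(Buf), Got);
        Kept := 0;
        for I := 0 to Got - 1 do
          if Buf[I] = 26 then
            Inc(Result)
          else
          begin
            Buf[Kept] := Buf[I];    { compact the buffer in place }
            Inc(Kept);
          end;
        if Kept > 0 then
          BlockWrite(Dst, Buf, Kept);
      until Got = 0;
    finally
      CloseFile(Dst);
    end;
  finally
    CloseFile(Src);
  end;
end;

begin
  WriteSample('dirty.txt');
  WriteLn('Bytes removed: ', StripCtrlZ('dirty.txt', 'clean.txt'));
end.
```

Once the cleaned copy exists, it can be opened as an ordinary text file and read with readln, one record per line.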
ASKER
you're not following me... I KNOW that FileRead has a parameter to read a set # of bytes...that's why I don't think it'll work for me. Reading a fixed # of bytes will split up my records which I MUST NOT do! Therefore I think I have to use BlockRead - which you can set to read 'count' records at a time. Do you see what I'm saying?
Can you please explain to me how to use BlockRead?
Also, you say the function 'strlower' expects a string variable. According to the Delphi on-line help, this is the definition of 'strlower':
StrLower converts a string to lowercase.
Unit
SysUtils
Category
string handling routines (null-terminated)
function StrLower(Str: PChar): PChar;
Description
The StrLower function converts Str to lowercase and returns Str.
.. according to that, Strlower uses a PChar array as input, not a string. Can you clarify?
ASKER
mvz...are you there????
Sorry for the delay,
I was wrong about the StrLower parameter.
The problem is in the return value: StrLower returns a PChar, which cannot be assigned to an array of Char. Use something like this:
StrCopy(inrec, StrLower(hinrec));
Be aware that inrec is still not a string. You can however let delphi convert it to one, by:
var cString:string;
cString:=inrec;
Now cString is a normal string;
Greetings
MvZ
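A small self-contained sketch of the same idea, with two assumptions that are mine and not from the code posted above: the buffer is declared zero-based (array[0..N] of Char), which is the form Delphi accepts where a PChar is expected, and it is null-terminated before StrLower sees it. Data read straight from a file with BlockRead has no terminating #0, so that would have to be added first. LowerViaBuffer is a hypothetical name.

```pascal
program StrLowerDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Hypothetical helper: lowercases S by way of a PChar-compatible buffer. }
function LowerViaBuffer(const S: string): string;
var
  Buf: array[0..255] of Char;      { zero-based, so it can be passed as a PChar }
begin
  StrPLCopy(Buf, S, High(Buf));    { copies at most 255 chars and adds the #0 }
  StrLower(Buf);                   { converts in place; the returned PChar is ignored }
  Result := Buf;                   { array of Char to string, up to the #0 }
end;

begin
  WriteLn(LowerViaBuffer('Hello WORLD'));   { prints: hello world }
end.
```

Because StrLower works in place and returns its argument, a second buffer and the StrCopy call are only needed if the original text has to be kept.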
ASKER
MVZ...still waiting for an explanation on how to use BlockRead to read in 'Count' records at a time. In my case, 'Count' will be 1.
Shawn
ASKER
mvz..you there????