MyersA
asked on
Appropriate Filestream buffer size for 78MB file?
I'm using a FileStream and a byte array to read an ASCII file that's 77,292 KB (FileStream.Length = 79146968). I'd like to know what the appropriate buffer size is for a file this big. I've tried 2048, 32767, 100000000, etc., and when executing fs.Read everything becomes really sluggish (almost freezes) and I get the system error:
"Insufficient system resources exist to complete the requested service."
This is the code I'm using to read the file one time:
using (FileStream fs = new FileStream(sFileName, FileMode.Open, FileAccess.Read, FileShare.None, 32767, true))
{
    byteData = new byte[fs.Length]; // fs.Length = 79146968
    fs.Read(byteData, 0, byteData.Length);
    fs.Close();
}
Any help is really appreciated.
An example: suppose your data is organized into 200-byte rows.
FileStream fs = new FileStream(sFileName, FileMode.Open, FileAccess.Read, FileShare.None, 32767, true);
byteData = new byte[200];
while (fs.Read(byteData, 0, 200) > 0)
{
    // do your operation on the data here
    Array.Copy(byteData, 0, tempBuff, 0, 20);
    // then write it to another file, whatever you want
}
The buffer comes into play behind the scenes: each iteration of the while loop reads 200 bytes from the buffer instead of going out to disk. You can increase or decrease the buffer size to tweak the performance for your specific system.
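If the rows aren't a fixed size, the same buffering still applies when you layer a StreamReader over the FileStream: the reader pulls lines out of the internal buffer without a separate disk hit per line. A minimal sketch (the file name and buffer size here are placeholders, not from the thread):

```csharp
using System;
using System.IO;

class ChunkedLineReader
{
    static void Main()
    {
        // 32 KB internal buffer; StreamReader.ReadLine() is served
        // from this buffer rather than from the disk each time.
        using (FileStream fs = new FileStream("audit.txt", FileMode.Open,
                   FileAccess.Read, FileShare.Read, 32768))
        using (StreamReader sr = new StreamReader(fs))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                // parse/process one variable-length row here
            }
        }
    }
}
```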
Good Luck
NTAC
here's a tutorial on reading and writing large files:
http://samples.gotdotnet.com/quickstart/howto/doc/LargeReadWrite.aspx
from this site:
http://samples.gotdotnet.com/quickstart/howto/
I've read that a buffer with a length of 2048 is optimal.
ASKER
Thanks for all the information.
I had initially written the Win app so that the UI would call this method (with the filename as a parameter), which would read each row of the file, do some minor parsing, and write it to the table; the method would then return the filled table. This was working well enough until the client told us that the files would contain over 400,000 records (78MB+). So now the user sees nothing until the table is filled and loaded into the grid, and that's not counting the slowness of the system from having the whole table in memory (it almost freezes). Since I have to redo this part before the client can use it, how would you recommend I do this? I was thinking of displaying a few rows in the grid (30-50 rows) and displaying more when the need arises (i.e. the user pages down), but I'm not sure what event to intercept so I can do that. Also, should I append to the existing table, or should I delete the previously viewed rows and add the new ones?
Any help would be appreciated.
you need to do all your processing in another thread, and notify the user of the progress.
Multi-threaded asynchronous callback.
load a buffer in one thread, display the buffer in another thread, GUI activities in the main thread.
Don't forget to lock your buffer object when displaying so you don't get a race condition.
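A minimal sketch of that pattern (the form, grid, and method names here are illustrative, not from the thread): the worker thread parses the file off the UI thread and marshals each batch of rows back to the UI thread via Control.Invoke, so the grid stays responsive and no cross-thread update occurs.

```csharp
using System;
using System.Data;
using System.Threading;
using System.Windows.Forms;

public class AuditForm : Form
{
    // illustrative members; the real app has its own grid and table
    DataGrid grid = new DataGrid();
    DataTable table = new DataTable();

    void StartLoad()
    {
        // do the heavy parsing off the UI thread
        Thread worker = new Thread(new ThreadStart(LoadRows));
        worker.IsBackground = true;
        worker.Start();
    }

    void LoadRows()
    {
        // ... read and parse the file row by row, batching rows up ...
        // every N rows, push the batch to the UI thread:
        grid.Invoke(new MethodInvoker(FlushBatch));
    }

    void FlushBatch()
    {
        // runs on the UI thread: append the queued rows to the
        // bound DataTable (lock the shared batch while copying it)
    }
}
```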
ASKER
Thanks for the info.
I can't use a fixed row size because the rows are all different sizes (the data is comma-delimited). So if I always read 200-byte chunks, a read may run into the following line.
Regarding the use of threads: I'm currently using a delegate to run this long process. It takes about 2-3 minutes to do the whole file, so I was wondering what code would be needed so that the user can see the data in the grid while the table is being filled? Here's the basic structure of the code:
delegate void loadAuditFileToTableDelegate();

private void btn_run_Click(object sender, System.EventArgs e)
{
    /* In Windows Form */
    loadAuditFileToTableDelegate LoadAuditFileToTable = new loadAuditFileToTableDelegate(loadAuditFileToTable);
    LoadAuditFileToTable.BeginInvoke(null, null);
}

private void loadAuditFileToTable()
{
    /* Also in Windows Form */
    ZMMatch.Audit zmAudit = new ZMMatch.Audit();
    table_auditAddress = zmAudit.OpenAuditAZMFileToView(_SFileName); /* long process */
    // datagrid.DataSource = table_auditAddress;
}

public DataTable OpenAuditAZMFileToView(string sFileName)
{
    // open file, create table/columns, etc...
    sAuditRecord = sr.ReadLine();
    while (sAuditRecord != null)
    {
        /* Parse/process line, fill row, and add row to table */
    }
    return myTable;
}
ASKER CERTIFIED SOLUTION
What the buffer does is give you a way to work on the data as you read it from the file. Right now you are just reading all of the data from the file and putting it in memory at once; no change of buffer size will help with that.
What are you trying to do with this data? I would recommend reading smaller sections of the data and doing work on them, then writing them back, or out to another file, when you are done with the processing.
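A minimal sketch of that read-process-write approach (the file names and the per-chunk processing are placeholders): only one small buffer is in memory at a time, regardless of how big the file is.

```csharp
using System.IO;

class StreamProcess
{
    static void Main()
    {
        byte[] buffer = new byte[32768]; // work on 32 KB at a time

        using (FileStream src = new FileStream("input.dat", FileMode.Open, FileAccess.Read))
        using (FileStream dst = new FileStream("output.dat", FileMode.Create, FileAccess.Write))
        {
            int read;
            // Read() returns the number of bytes actually read;
            // 0 means end of file.
            while ((read = src.Read(buffer, 0, buffer.Length)) > 0)
            {
                // process buffer[0..read) here, then write the result out
                dst.Write(buffer, 0, read);
            }
        }
    }
}
```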