jtrifts (United Kingdom)

asked on

Max file size to be read by UTL_FILE

I'm using UTL_FILE as an alternative to SQL*LOADER to load data into an Oracle database (8.1.5).

1) I am interested in finding out the largest permissible file size that can be opened (for reading only) by this package.  The file will be located on a Unix box.

Currently, I have a data extract flat file that is in excess of 700MB.  

2) Will I be able to read this?  
3) Will its size have an impact on the speed at which the lines are read?
4) Or does the file only get opened in chunks into a buffer?  
5) If the file is only read in chunks into a buffer, can you tell me what is the default buffer size and how one might change it?

50 points for getting the max size biz.
100 points for finding max size and the default buffer size.
200 points for answering all questions.
(I don't mind splitting the points around and posting additional questions to provide points for contributors.)

Thanks to all,

JT

urim

Before answering all your questions, can you explain why you're not using SQL*Loader?

Load the data into a temporary table, then run over the temporary table and apply whatever logic you want.
jtrifts

ASKER

Urim,

I am loading sequential rows of data that comes from a system that examines data in a physical manner (i.e. data is grouped into clusters).  Related data is not marked relationally (e.g. no foreign keys exist).  

Loading into a temporary table straight via SQL*LOADER causes my data to lose its physical relationship (!!!), as Oracle stores records logically.

To give an example, the first ten records might all belong to a single data set, then the next four, then the next seventeen.  There is no defined number of records in a set.  Thus I cannot say the first set starts with AA so the next ten records belong to AA.  The first AA record may be the only record in that cluster or there may be 1000 in the cluster.  Further, the positions of fields within each record vary depending on the cluster and the type of record contained within the cluster.  

Unfortunately, I have no control over the data extract, though I'm responsible for the import.

Any clearer?

Regards,

JT
ASKER CERTIFIED SOLUTION
urim
jtrifts

ASKER

Urim,

I wasn't aware you could embed proper SQL into SQL*LOADER.  Are you saying I can simply add into the WHEN clause:

sequence_name.nextval  POSITION (1:10)

Can you provide the proper syntax?

How about setting other types of counters (or is PL/SQL illegal)?

Cheers,
JT
Look at the book Oracle8 Utilities, pages 5-6. You can see there that you can put a SEQUENCE on a column; SQL*Loader will generate a sequence value for that column for each row that is loaded.
So you don't need to supply the nextval yourself, SQL*Loader does it for you.

If you don't have the book, you can view it at http://technet.oracle.com. You need to register, but it's free.
Then go to the documentation for Oracle 8 and choose the book Oracle8 Utilities.
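For illustration, a minimal control file along those lines might look like the sketch below. The table and column names, positions, and field sizes are all made up for the example; the SEQUENCE keyword is the part that generates the counter.

```sql
-- Hypothetical SQL*Loader control file: staging_extract, load_order,
-- and raw_record are placeholder names, not from the thread.
LOAD DATA
INFILE 'extract.dat'
APPEND
INTO TABLE staging_extract
(
  load_order  SEQUENCE(1,1),          -- 1, 2, 3, ... generated per input row
  raw_record  POSITION(1:2000) CHAR   -- whole raw line, parsed later in PL/SQL
)
```

With a conventional-path load of a single file, the generated sequence should follow the physical order of the records in the file, which is what preserves the cluster grouping you described.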
jtrifts

ASKER

OK Thanks Urim.  

To get back to the current topic, though, does anyone else have any input on the UTL_FILE side?
I've used utl_file on both NT and Unix for both reading and writing and have not encountered a file size limit.  If there is, I would expect it to be an O/S limit.  I am not aware of an Oracle limit on the size of the file for utl_file.

I would also expect the buffering, if any, to be O/S-dependent.  
You should not be concerned about a file size limit; there is none on the Oracle side. As markgeer indicated, the limit is the OS file system limit. However, there is a line size limit. In Oracle 7 it is 1024 bytes including the newline, which is OS-dependent (1 character for most OSs, 2 characters for others). Oracle 8 (you indicated you are using Oracle 8.1.5) provides an overloaded UTL_FILE.FOPEN function:

FUNCTION fopen(location     IN VARCHAR2,
               filename     IN VARCHAR2,
               open_mode    IN VARCHAR2,
               max_linesize IN BINARY_INTEGER) RETURN file_type;

where max_linesize is between 1 and 32767 (including New Line).

So, if you are loading values longer than 32767, you will have to split them into multiple lines.
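For instance, a minimal read loop using the overloaded FOPEN might look like the sketch below. The directory path and file name are hypothetical, and the directory would need to be listed in the UTL_FILE_DIR init parameter.

```sql
DECLARE
  f    UTL_FILE.FILE_TYPE;
  line VARCHAR2(32767);
BEGIN
  -- The 4th argument raises max_linesize from the old ~1K default
  -- to the 32767 maximum.
  f := UTL_FILE.FOPEN('/data/extracts', 'extract.dat', 'r', 32767);
  LOOP
    BEGIN
      UTL_FILE.GET_LINE(f, line);
      -- process the record here
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        EXIT;  -- GET_LINE raises NO_DATA_FOUND at end of file
    END;
  END LOOP;
  UTL_FILE.FCLOSE(f);
END;
/
```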

Solomon Yakobson.
jtrifts

ASKER

I appreciate your response syakobson.  Unfortunately, as it reiterates answers provided by others, I'm afraid the points will have to go to them.

I am familiar with the linesize issue, which I have already addressed; incidentally, that is what prompted me to ask the question regarding file size.

In light of everyone stating that the limit will be O/S-dependent, is anyone familiar with the actual value (on Unix)? I.e., under the circumstances noted in the question, what is the max file size?  What if the file is greater than the stated 700MB and closer to 2GB?

Many thanks for all responses thus far!

JT

I can't see a mention of file size limits for UTL_FILE in any of the documentation/white papers.

In general I am not a fan of large (2GB+) files because they become cumbersome.

Is there any way the people creating the files could split them up?

If not, and there are performance issues when loading, try to split them with a unix command prior to loading.
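For what it's worth, a line-based split keeps each record intact, since `split -l` cuts only on line boundaries. The file name and piece size below are made up for the example.

```shell
# Hypothetical: carve extract.dat into pieces of 500000 lines each.
# -l splits on whole lines, so no record is cut in half.
split -l 500000 extract.dat extract_part_
# Produces extract_part_aa, extract_part_ab, ... which can then be
# loaded one piece at a time.
```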



jtrifts

ASKER

It is possible for us to limit the data file size; however, the smaller the files, the more intervention will be necessary.  (Though not necessarily if we code to account for it -- but then that in itself is more work.)

That said, what we don't want to get into is a situation where related data is spread across these two (or more) files.

I have checked my system limitations, and bellyboy has come the closest to the limit -- 2GB.

Thus, I will accept URIM's comment as answer, and post points for Markgeer and bellyboy.

Thanks for the discussion folks!

regards, Jtrifts
jtrifts

ASKER

Thanks all