Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3056
  • Last Modified:

Max file size to be read by UTL_FILE

I'm using UTL_FILE as an alternative to SQL*LOADER to load in data to an Oracle Database (8.1.5).  

1) I am interested in finding out the largest permissible file size that can be opened (for reading only) by this package.  The file will be located on a Unix box.

Currently, I have a data extract flat file that is in excess of 700MB.  

2) Will I be able to read this?  
3) Will its size have an impact on the speed on which the lines are read?
4) Or does the file only get opened in chunks into a buffer?  
5) If the file is only read in chunks into a buffer, can you tell me what is the default buffer size and how one might change it?

50 points for getting the max size biz.
100 points for finding max size and the default buffer size.
200 points for answering all questions.
(I don't mind splitting the points up around and posting additional questions to provide points for contributors).

Thanks to all,

JT

0
jtrifts
Asked:
jtrifts
1 Solution
 
urimCommented:
Before answering all your questions, can you explain me why not using sql*loader ?

Load the data into temporary table and then run on the temporary table and do all the logic you want.
0
 
jtriftsMI and AutomationAuthor Commented:
Urim,

I am loading sequential rows of data that comes from a system that examines data in a physical manner (i.e. data is grouped into clusters).  Related data is not marked relationally (e.g. no foreign keys exist).  

Loading into a temporary table straigt via SQL*LOADER causes my data to lose its physical relationship (!!!) as Oracle stores records logically.  

To give an example, the first ten records might all belong to a single data set, then the next four, then the next seventeen.  There is no defined number of records in a set.  Thus I cannot say the first set starts with AA so the next ten records belong to AA.  The first AA record may be the only record in that cluster or there may be 1000 in the cluster.  Further, the positions of fields within each record vary depending on the cluster and the type of record contained within the cluster.  

Unfortunately, I have no control over the data extract, though I'm responsible for the import.

Any clearer?

Regards,

JT
0
 
urimCommented:
You can add a column which sql*loader will give it a sequence, hence yuo can do the same logic after the load, just fetch the rows order by the sequence.

What I ment was to load the data into temporary table that has two columns: the sequence generated by sql*loader and a varchar(...) that will hold the entire line.

And for yuor question, I read somewhere I don't remember where the the limit is os dependent, meaning you can erad the 700mb.

The rest of the questions I really don't know.
0
Get quick recovery of individual SharePoint items

Free tool – Veeam Explorer for Microsoft SharePoint, enables fast, easy restores of SharePoint sites, documents, libraries and lists — all with no agents to manage and no additional licenses to buy.

 
jtriftsMI and AutomationAuthor Commented:
Urim,

I wasn't aware you could embed proper SQL into SQL*LOADER.  Are you saying I can simply ad into the WHEN clause:

sequence_name.nextval  POSITION (1:10)

Can you provide the proper syntax?

How about setting other types of counters (or is PL/SQL illegal)?

Cheers,
JT
0
 
urimCommented:
loo at the book Oracle utilites at page 5-6. You can see there you can put sequence on a column. sql*loader will generate sequence for this column for each rows that been loaded.
So you don't need to put the nextval by yourself, sql*loader does it for you.

If you don't have the book, you can view it in http://technet.oracle.com. You need to register but it free.
Then go for documents and Oracle 8 and choose the book Oracle8 Utilites.
0
 
jtriftsMI and AutomationAuthor Commented:
OK Thanks Urim.  

To get back to the current topic, though, does anyone else have any input on the UTL_FILE side?
0
 
Mark GeerlingsDatabase AdministratorCommented:
I've used utl_file on both NT and Unix for both reading and writing and have not encountered a file size limit.  If there is, I would expect it to be an O/S limit.  I am not aware of an Oracle limit on the size of the file for utl_file.

I would also expect the buffering, if any, to be O/S-dependent.  
0
 
syakobsonCommented:
You should not be concerned by file size limit. There is none on Oracle side. As markgeer indicated, the limit is OS file system limit. However there is line size limit. In Oracle 7 it is 1024 including New Line which is OS dependent (1 character for most OS, 2 characters fore others). Oracle 8 (you indicated you are using Oracle 8.1.5) provides an overloading UTL_FILE.FOPEN function:

FUNCTION fopen(location     IN VARCHAR2,
                 filename     IN VARCHAR2,
                 open_mode    IN VARCHAR2,
                 max_linesize IN BINARY_INTEGER) RETURN file_type;

where max_linesize is between 1 and 32767 (including New Line).

So, if you are loading values longer than 32767, you will have to split them into multiple lines.

Solomon Yakobson.
0
 
jtriftsMI and AutomationAuthor Commented:
I appreciate your response syakobson.  Unfortunately as it re-iterates answers provided by others, I'm the points will have to go to them.  

I am familiar with the linesize issue, which I have already addressed and incidentally, that is what prompted to ask the question regarding file size.

In light of everyone stating that the limit will be o/s dependent, is anyone familiar with the actual value (on Unix)? I.E. under the circumstances noted in the question, what is the max file size?  What if the file is greater than the stated 700MB and closer to 2GB?

Many thanks for all responses thus far!

JT
0
 
bellyboyCommented:

I can't see a mention of file size limits for UTL_FILE in any of the documentation/white papaers.

In general I am not a fan of larges (2Gb + ) files because they become cumbersome.

Is there anyway the people creating the files could split them up ?

If not and there are performance issues when loading try and split them with a unix commmand prior to loading.



0
 
jtriftsMI and AutomationAuthor Commented:
It is possible for us to limit the data file size, however, the smaller the files, the more intervention will be necessary.  (Though not necessaily if we code to account for it -- but then that in itself is more work).

That said, what we don't want to get into is a situation where related data is spread across these two (or more) files.

Have checked my system limitations, and bellyboy has come the closest to the limit -- 2GB.

Thus, I will accept URIM's comment as answer, and post points for Markgeer and bellyboy.

Thanks for the discussion folks!

regards, Jtrifts
0
 
jtriftsMI and AutomationAuthor Commented:
Thanks all
0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now