Solved

sqlldr a .csv with variable length columns into an oracle(8.1) table

Posted on 2004-10-27
Medium Priority
771 Views
Last Modified: 2007-12-19
Hi,

I have a .csv file of the following format:

Item, Resource,20-10-2004 10:04,20-10-2004 11:00
BL00002BB      ,47    ,188,203

Here there are only 2 date/time columns, but there can be anywhere between 100 and 1000 of them.

I'm using Oracle 8.1 and would like to know the best way to load this into a table of:

Resource_Table
start_time                           End time                   Item               Resource     Qty
20-10-2004 10:04                20-10-2004  11:00     BL00002BB            47        188
20-10-2004 11:00                20-10-2004  12:00     BL00002BB            47        203

etc.

any ideas?
Question by:joehodge
18 Comments
 

Author Comment

by:joehodge
ID: 12421506
I've just increased the points as a solution is needed fairly urgently!
 

Author Comment

by:joehodge
ID: 12421819
Would utl_file be a better option?
 
LVL 2

Expert Comment

by:muskware77
ID: 12422309
You need to adapt your model to your input. So, let's think a little. When you have two date/time columns, you label them start_time and end_time. How would you label the columns if you had, say, four (4) date/time columns?
 

Author Comment

by:joehodge
ID: 12422573
Hi muskware77,

If we have four date/time columns, we still only have:

Resource_Table
start_time                           End time                   Item               Resource     Qty
20-10-2004 10:04                20-10-2004  11:00     BL00002BB            47        188
20-10-2004 11:00                20-10-2004  12:00     BL00002BB            47        203
20-10-2004 11:04                20-10-2004  11:00     BL00002BB            47        188
20-10-2004 12:00                20-10-2004  12:00     BL00002BB            47        203

thanks
 
LVL 2

Expert Comment

by:muskware77
ID: 12422848
A SQL*Loader session like this one may move towards the solution:

-- sqlldr userid=joehodge/yourpwd control=c:\load\sol01.ctl log=d:\load\sqlldr01.log
-- your control file (sol01.ctl) may look like this ('yourfile.csv' is a placeholder):

LOAD DATA
INFILE 'yourfile.csv'
INTO TABLE Resource_Table
FIELDS TERMINATED BY ','
(start_time DATE "DD-MM-YYYY HH24:MI",
 end_time   DATE "DD-MM-YYYY HH24:MI",
 Item, Resource, Qty)


 
LVL 7

Accepted Solution

by:
bvanderveen earned 2000 total points
ID: 12422850
If I understand your question, you are not using variable-length columns, but a variable number of columns:
>>there can be anywhere between 100-1000 date/time columns

That makes sqlldr unsuitable for this purpose.  What does the table you are loading into look like?  I can't understand how you would map such an animal.

But, if you have to,
  1. move the date/time columns to the end, use "trailing nullcols", which defaults null into columns not specified, or
  2. in creating the datafile, put in commas as placeholders for all columns that are null

I would probably use utl_file or a Java program to process this, but with the above you may be able to make it work with sqlldr.
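For option 2, the placeholder commas could be added by a small pre-processing script. A hypothetical Python sketch (the function name and column cap are illustrative, not from the thread):

```python
# Hypothetical sketch of option 2: pad every CSV record with empty
# fields up to a fixed maximum column count, so a static sqlldr
# control file can map all of them.

def pad_record(line, width):
    """Return `line` padded with trailing commas to `width` fields."""
    fields = line.rstrip("\n").split(",")
    return ",".join(fields + [""] * (width - len(fields)))

# e.g. 2 fixed fields + up to 4 quantity buckets in this toy example:
print(pad_record("BL00002BB      ,47    ,188,203", width=6))
# -> BL00002BB      ,47    ,188,203,,
```

sqlldr would then load the padded file into a staging table wide enough for the worst case.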
 
LVL 2

Expert Comment

by:muskware77
ID: 12423112
My time is up; maybe I'll see you guys tomorrow.
 

Author Comment

by:joehodge
ID: 12423163
Hi bvanderveen,

Apologies, I did mean a variable number of columns!

My idea was to:

1. Have a staging table with as many columns (col_1, col_2, etc.) as Oracle allows.

2. sqlldr the .csv into this table.

3. Write an Oracle procedure to take the first row (e.g. Item, Resource,20-10-2004 10:04,20-10-2004 11:00) and put it into a table called column_header_description:
col_number     col_description
1                        item
2                        Resource
3                        20-10-2004  10:00
4                        20-10-2004  11:00
etc

I could then write another procedure (already sounding a bit long-winded!) to insert into the Resource_Table:

Resource_Table
start_time                  End time                   Item                     Resource          Qty
3                                   4              col_1.value            col_1.value        col_3.value
5                                   6              col_1.value            col_1.value        col_4.value

3, 4, 5 and 6 would be taken from the column_header_description and col_1.value from the staging table

After writing all that out, surely there must be an easier way?
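The column_header_description step can be sketched outside PL/SQL to check the idea; a hypothetical Python version (names are illustrative):

```python
# Hypothetical sketch of the column_header_description idea: number
# each field of the header row so a later procedure can look up which
# staging column holds which timestamp.

def describe_header(header_line):
    return {i + 1: field.strip()
            for i, field in enumerate(header_line.split(","))}

desc = describe_header("Item, Resource,20-10-2004 10:04,20-10-2004 11:00")
# desc[1] == "Item", desc[3] == "20-10-2004 10:04", desc[4] == "20-10-2004 11:00"
# A Resource_Table row then pairs column n (start) with column n+1 (end),
# taking the quantity from staging column n.
```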
 
LVL 7

Expert Comment

by:bvanderveen
ID: 12423255
I think you will run into a limitation on the number of columns Oracle allows.  You could have multiple staging tables linked in a 1-to-1 relationship, but I think you would be better served trying to make the input data more like what your final data will be (I assume you have some control over this; if not, that changes the problem).

Look at your final design.  If you have a large number of times, I assume that there is more than start/end combinations, but some sort of intervals.  If you normalize this, you will probably have separate tables, and the program sending over the output can also normalize the data to be loaded.  You can do some wizardry with sqlldr or utl_file, but I think this approach would be simpler and easier to implement and maintain.

When solving a problem, I try to start with making sure I am solving the "right" problem.

Let me know if my assumptions are wrong.
 

Author Comment

by:joehodge
ID: 12423402
Hi,

Due to limitations in the source system, we have no control over the input.

The report can be run for any time frame, so this could be a month or a year (which supports your comment on the number of columns Oracle allows).

I didn't quite understand the comments in your second paragraph; could you kindly elaborate?
 
LVL 7

Expert Comment

by:bvanderveen
ID: 12423476
My second paragraph envisions at least two tables.  Perhaps I do not understand your data.  Is the final result going to be a table with four columns:

Resource_Table
start_time                  End time                   Item                     Resource          Qty
3                                   4              col_1.value            col_1.value        col_3.value
5                                   6              col_1.value            col_1.value        col_4.value

Or will there be additional time columns in this table?  If there will be additional date/time columns, what will they mean?  Will they be start_time2, end_time2, start_time3, etc.?
 
 

Author Comment

by:joehodge
ID: 12423503
There will only ever be 5 columns in this table.

Resource_Table
start_time                  End time                   Item                     Resource          Qty
3                                   4              col_1.value            col_1.value        col_3.value
5                                   6              col_1.value            col_1.value        col_4.value
6                                   7              col_1.value            col_1.value        col_5.value
 
LVL 7

Expert Comment

by:bvanderveen
ID: 12423782
So why will your source output file have more than 2 times?  Even if it is a month's worth of data, I would expect that there would be a separate record for each time/resource/quantity.

What I would probably have is something like 2 tables:

  RESOURCE_INFO
      RESOURCE_ID  NUMBER NOT NULL,
      DESCRIPTION VARCHAR2(200)

  PRODUCTION_DATA
     RESOURCE_ID   NUMBER NOT NULL (FK) ,
     QUANTITY        NUMBER,
     START_TIME    DATE,
     END_TIME        DATE

I think you need to pre-process your data file.  I am guessing that they are sending you (in a one-month example) a record with the resource, and then quantity/start/end data for each data point.  You should use Java, C, or whatever to split this up so that each record roughly matches your final data format.  Are my assumptions about the data file correct?


 
 

Author Comment

by:joehodge
ID: 12424041
Hi,

The first two lines of the file look like this. A user could also select an extra day's or month's worth of data:

Item, Resource,20-10-2004 10:04,20-10-2004 11:00,20-10-2004 12:00,20-10-2004 13:00,20-10-2004 14:00,20-10-2004 15:00,20-10-2004 16:00,20-10-2004 17:00,20-10-2004 18:00,20-10-2004 19:00,20-10-2004 20:00,20-10-2004 21:00,20-10-2004 22:00,20-10-2004 23:00,21-10-2004 00:00,21-10-2004 01:00,21-10-2004 02:00,21-10-2004 03:00,21-10-2004 04:00,21-10-2004 05:00,21-10-2004 06:00,21-10-2004 07:00,21-10-2004 08:00,21-10-2004 09:00,21-10-2004 10:00,21-10-2004 11:00,21-10-2004 12:00,21-10-2004 13:00,21-10-2004 14:00,21-10-2004 15:00,21-10-2004 16:00,21-10-2004 17:00,21-10-2004 18:00,21-10-2004 19:00,21-10-2004 20:00,21-10-2004 21:00,21-10-2004 22:00,21-10-2004 23:00,22-10-2004 00:00,22-10-2004 01:00,22-10-2004 02:00,22-10-2004 03:00,22-10-2004 04:00,22-10-2004 05:00,22-10-2004 06:00,22-10-2004 07:00,22-10-2004 08:00,22-10-2004 09:00,22-10-2004 10:00,22-10-2004 11:00,22-10-2004 12:00,22-10-2004 13:00,22-10-2004 14:00,22-10-2004 15:00,22-10-2004 16:00,22-10-2004 17:00,22-10-2004 18:00,22-10-2004 19:00,22-10-2004 20:00,22-10-2004 21:00,22-10-2004 22:00,22-10-2004 23:00,23-10-2004 00:00,23-10-2004 01:00,23-10-2004 02:00,23-10-2004 03:00,23-10-2004 04:00,23-10-2004 05:00,23-10-2004 06:00,23-10-2004 07:00,23-10-2004 08:00,23-10-2004 09:00,23-10-2004 10:00,23-10-2004 11:00,23-10-2004 12:00,23-10-2004 13:00,23-10-2004 14:00,23-10-2004 15:00,23-10-2004 16:00,23-10-2004 17:00,23-10-2004 18:00,23-10-2004 19:00,23-10-2004 20:00,23-10-2004 21:00,23-10-2004 22:00,23-10-2004 23:00,24-10-2004 00:00,24-10-2004 01:00,24-10-2004 02:00,24-10-2004 03:00,24-10-2004 04:00,24-10-2004 05:00,24-10-2004 06:00,24-10-2004 07:00,24-10-2004 08:00,24-10-2004 09:00,24-10-2004 10:00,24-10-2004 11:00,24-10-2004 12:00,24-10-2004 13:00,24-10-2004 14:00,24-10-2004 15:00,24-10-2004 16:00,24-10-2004 17:00,24-10-2004 18:00,24-10-2004 19:00,24-10-2004 20:00,24-10-2004 21:00,24-10-2004 22:00,24-10-2004 23:00,25-10-2004 00:00,25-10-2004 01:00,25-10-2004 02:00,25-10-2004 03:00,25-10-2004 04:00,25-10-2004 05:00,25-10-2004 
06:00,25-10-2004 07:00,25-10-2004 08:00,25-10-2004 09:00,25-10-2004 10:00,25-10-2004 11:00,25-10-2004 12:00,25-10-2004 13:00,25-10-2004 14:00,25-10-2004 15:00,25-10-2004 16:00,25-10-2004 17:00,25-10-2004 18:00,25-10-2004 19:00,25-10-2004 20:00,25-10-2004 21:00,25-10-2004 22:00,25-10-2004 23:00,26-10-2004 00:00,26-10-2004 01:00,26-10-2004 02:00,26-10-2004 03:00,26-10-2004 04:00,26-10-2004 05:00,26-10-2004 06:00,26-10-2004 07:00,26-10-2004 08:00,26-10-2004 09:00,26-10-2004 10:00,26-10-2004 11:00,26-10-2004 12:00,26-10-2004 13:00,26-10-2004 14:00,26-10-2004 15:00,26-10-2004 16:00,26-10-2004 17:00,26-10-2004 18:00,26-10-2004 19:00,26-10-2004 20:00,26-10-2004 21:00,26-10-2004 22:00,26-10-2004 23:00,27-10-2004 00:00,27-10-2004 01:00,27-10-2004 02:00,27-10-2004 03:00,27-10-2004 04:00,27-10-2004 05:00,27-10-2004 06:00,27-10-2004 07:00,27-10-2004 08:00,27-10-2004 09:00,27-10-2004 10:00,27-10-2004 11:00,27-10-2004 12:00,27-10-2004 13:00,27-10-2004 14:00,27-10-2004 15:00,27-10-2004 16:00,27-10-2004 17:00,27-10-2004 18:00,27-10-2004 19:00,27-10-2004 20:00,27-10-2004 21:00,27-10-2004 22:00,27-10-2004 23:00,28-10-2004 00:00,28-10-2004 01:00,28-10-2004 02:00,28-10-2004 03:00,28-10-2004 04:00,28-10-2004 05:00,28-10-2004 06:00,28-10-2004 07:00,28-10-2004 08:00,28-10-2004 09:00,28-10-2004 10:00,28-10-2004 11:00,28-10-2004 12:00,28-10-2004 13:00,28-10-2004 14:00,28-10-2004 15:00,28-10-2004 16:00,28-10-2004 17:00,28-10-2004 18:00,28-10-2004 19:00,28-10-2004 20:00,28-10-2004 21:00,28-10-2004 22:00,28-10-2004 23:00,29-10-2004 00:00,29-10-2004 01:00,29-10-2004 02:00,29-10-2004 03:00,29-10-2004 04:00,29-10-2004 05:00,29-10-2004 06:00,29-10-2004 07:00,29-10-2004 08:00,29-10-2004 09:00,29-10-2004 10:00,29-10-2004 11:00,29-10-2004 12:00,29-10-2004 13:00,29-10-2004 14:00,29-10-2004 15:00,29-10-2004 16:00,29-10-2004 17:00,29-10-2004 18:00,29-10-2004 19:00,29-10-2004 20:00,29-10-2004 21:00,29-10-2004 22:00,29-10-2004 23:00,30-10-2004 00:00,30-10-2004 01:00,30-10-2004 02:00,30-10-2004 
03:00,30-10-2004 04:00,30-10-2004 05:00,30-10-2004 06:00,30-10-2004 07:00,30-10-2004 08:00,30-10-2004 09:00,30-10-2004 10:00,30-10-2004 11:00,30-10-2004 12:00,30-10-2004 13:00,30-10-2004 14:00,30-10-2004 15:00,30-10-2004 16:00,30-10-2004 17:00,30-10-2004 18:00,30-10-2004 19:00,30-10-2004 20:00,30-10-2004 21:00,30-10-2004 22:00,30-10-2004 23:00,31-10-2004 00:00,31-10-2004 01:00,31-10-2004 01:00,31-10-2004 02:00,31-10-2004 03:00,31-10-2004 04:00,31-10-2004 05:00,31-10-2004 06:00,31-10-2004 07:00,31-10-2004 08:00,31-10-2004 09:00,31-10-2004 10:00,31-10-2004 11:00,31-10-2004 12:00,31-10-2004 13:00,31-10-2004 14:00,31-10-2004 15:00,31-10-2004 16:00,31-10-2004 17:00,31-10-2004 18:00,31-10-2004 19:00,31-10-2004 20:00,31-10-2004 21:00,31-10-2004 22:00,31-10-2004 23:00,01-11-2004 00:00,01-11-2004 01:00,01-11-2004 02:00,01-11-2004 03:00,01-11-2004 04:00,01-11-2004 05:00,01-11-2004 06:00,01-11-2004 07:00,01-11-2004 08:00
BL00002BB      ,35    ,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,82,111,111,111,111,111,111,111,111,111,111,111,55,9,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,107,71,74,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,86,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
 
LVL 7

Expert Comment

by:bvanderveen
ID: 12425395
That's kind of what I thought.  I think you need to parse this, and output it into another file that is in the format you want to load.
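That parsing step could be done in any language; a hypothetical Python sketch (the function name, and the one-hour length assumed for the last bucket, are not from the thread) that turns the wide report into the five-column Resource_Table shape:

```python
import csv
from datetime import datetime, timedelta

FMT = "%d-%m-%Y %H:%M"

def convert_lines(lines):
    """Turn the wide report (an iterable of CSV lines) into narrow
    (start_time, end_time, item, resource, qty) rows."""
    reader = csv.reader(lines)
    header = [f.strip() for f in next(reader)]
    starts = [datetime.strptime(t, FMT) for t in header[2:]]
    out = []
    for row in reader:
        item, resource = row[0].strip(), row[1].strip()
        for i, qty in enumerate(row[2:]):
            # bucket i runs from starts[i] to the next header time;
            # the last bucket is assumed to be one hour long
            end = starts[i + 1] if i + 1 < len(starts) \
                else starts[i] + timedelta(hours=1)
            out.append([starts[i].strftime(FMT), end.strftime(FMT),
                        item, resource, qty.strip()])
    return out

# Writing the result back out for sqlldr (paths are placeholders):
# with open("report.csv", newline="") as src, \
#      open("narrow.csv", "w", newline="") as dst:
#     csv.writer(dst).writerows(convert_lines(src))
```

On the sample data this reproduces the Resource_Table rows shown in the question.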
 

Author Comment

by:joehodge
ID: 12431320
When you say parse it, what approach would you use?

I haven't much Java experience but am fairly experienced in PL/SQL.

My latest idea was to have a staging table with two columns:

col_data         blob
col_number    number pk

and then sqlldr every line into the col_data field, putting a sequential number in col_number.

Then do some string manipulation on the blob.

Will this be incredibly slow, or should I use utl_file.get_line?
 

Author Comment

by:joehodge
ID: 12431813
Hi,

thanks for your help bvanderveen.

I've decided to use utl_file.get_line.
 
LVL 2

Expert Comment

by:muskware77
ID: 12432347
Hi,

Happy to see you've got a solution. However, you may wish to upgrade to
Oracle9i R2 and use external tables in the future. This will certainly improve
performance and reduce the time you spend on these ETL (Extraction, Transformation and Loading) efforts.

Please see: http://www.oracle.com/technology/products/oracle9i/daily/sept19.html

Good luck.
