Solved

PARSE DATA FILE WITH TWO SECTIONS TO LOAD INTO ORACLE TABLE

Posted on 2011-03-14
13
534 Views
Last Modified: 2013-12-19
I have several data files that will need to be uploaded into Oracle tables, which I have experience with.  These files are unique in that they contain data that will be loaded into 2 tables, there is a header section with header columns, followed by a row of data, then the details section, with detail column headers and data that will go into a second table.  I want to use perl to parse, and am hoping someone could offer code help with parsing the data file.  I have done much research to try to do it in sql_loader, but could not figure out of find a way to parse the data.  Thanks so much for any help you can give.  Below is what  the data table looks like.

FILENO|REPEAT|LOGDATE|SAM_TYPE|SAM_LOC|SAM_ROOM|SAM|INSTUSED|ANA|ANA_DATE|ANA_NO
33124999|0|02/10/2010 08:17|Air 2|48|1||Bert 2010/143|Gross|02/10/2010 15:10|999999|044444
SAMPLE_NO|HEAD_LOCATION|ISO|AL_DET_A|AL_DET_B
1|||FALSE|FALSE
2|||FALSE|FALSE
3|||FALSE|FALSE
4|||FALSE|FALSE
0
Comment
Question by:newtoperlpgm
  • 6
  • 5
  • 2
13 Comments
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35134012
What goes where and why? Table defs and the logic behind what goes where.

If you can define the logic do write it in perl, I'm thinking it still can be done in sqlloader.
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 35134253
Here is the perl code to parse the data.  I can't provide code to insert into Oracle since you don't give any details about the tables.
#!/usr/local/bin/perl

use strict;
use warnings;
use DBD::Oracle;

# change this to 0 if you just want an array of the data rows
my $want_hash = 1;

my (@summ, %summ, @dtl);
foreach my $fil (@ARGV) {
    open IN, $fil or die "could not open $fil: $!";
    my $hdr = <IN>;
    chomp $hdr;
    my $data = <IN>;
    chomp $data;
    if ($want_hash) {
        my @hdr = split /\|/, $hdr;
        my @data = split /\|/, $data;
        for my $i (0..@hdr-1) {
            $summ{$hdr[$i]} = $data[$i];
        }
    } else {
        @summ = split /\|/, $data;
    }
    $hdr = <IN>;
    my @hdr = split /\|/, $data if $want_hash;
    chomp $hdr;
    while (<IN>) {
        chomp;
        if ($want_hash) {
            my @data = split /\|/;
            my $row;
            for my $i (0..@hdr-1) {
                $row->{$hdr[$i]} = $data[$i];
            }
            push @dtl, $row;
        } else {
            push @dtl, [split /\|/];
        }
    }
    close IN;
}

# if $want_hash is true, then
# %summ = (
#   FILENO => 33124999,
#   REPEAT => 0,
#   LOGDATE => '02/10/2010 08:17',
#   SAM_TYPE => 'Air 2',
#   ...
# @dtl = (
#   { SAMPLE_NO => 1, HEAD_LOCATION => '', ISO => '', AL_DET_A => 'FALSE', AL_DET_B => 'FALSE' }
#   { SAMPLE_NO => 2, HEAD_LOCATION => '', ISO => '', AL_DET_A => 'FALSE', AL_DET_B => 'FALSE' }
#   ...

# if $want_hash is false, then
# @summ = (33124999, 0, '02/10/2010 08:17', 'Air 2', ...
# @dtl = (
#   [1, '', '', 'FALSE', 'FALSE']
#   [2, '', '', 'FALSE', 'FALSE']
#   ...

Open in new window

0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 35134260
Forgot...

Usage: script.pl file1 file2 file3 ...
0
 

Author Comment

by:newtoperlpgm
ID: 35135139
Thanks so much!  The first column data goes into table 1 and the second column data goes into table 2.  My Oracle tables are defined the same as my column headers in each of the two sections.  
--This is Oracle table 1, it is the header section--
FILENO|REPEAT|LOGDATE|SAM_TYPE|SAM_LOC|SAM_ROOM|SAM|INSTUSED|ANA|ANA_DATE|ANA_NO
33124999|0|02/10/2010 08:17|Air 2|48|1||Bert 2010/143|Gross|02/10/2010 15:10|999999|044444
--this is oracle table 2, it is the details section--
SAMPLE_NO|HEAD_LOCATION|ISO|AL_DET_A|AL_DET_B
1|||FALSE|FALSE
2|||FALSE|FALSE
3|||FALSE|FALSE
4|||FALSE|FALSE
0
 

Author Comment

by:newtoperlpgm
ID: 35135144
Oh, I should have mentioned, my data files are going to be different, this is not a one time deal, it will be an automated process, so the values will always change.  I realized this important point after looking over the proposed solution.
Thanks!
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35137492
So each datafile goes into it's own table?

That is really simple:  Straight sql loader.  Why use Perl?

in the control file to skip the header row: OPTIONS (SKIP=1)

Then it's a simple: FIELDS TERMINATED BY '|'

If you need a working sample, I just posted a detailed TAB separated example in:
http://www.experts-exchange.com/Programming/Automation/Q_26887301.html?#a35137412
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 

Author Comment

by:newtoperlpgm
ID: 35137623
Hi,  but how will that work with loading the second set of data into the second Oracle table, in the details section below, the second header row with column headers.  thanks again, here is my control file, sorry, I couldn't add it last night, this site went down for maintenance as I was trying.  Thanks again!

OPTIONS ( PARALLEL=TRUE)
LOAD DATA
INFILE ‘29124124.TXT'
BADFILE ‘29124124.bad'
DISCARDFILE ‘29124124_.dsc'
INTO TABLE "TBL_HEADER"
APPEND
--WHEN ((1) ='??????')
FIELDS TERMINATED BY '|'
(FILENO,
REPEAT,
LOGDATE,
SAM_TYPE,
SAM_LOC,
SAM_ROOM,
SAM,
INSTUSED,
ANA,
ANA_DATE,
ANA_NO terminated by whitespace)
 
INTO TABLE "TBL_DETAIL"
--WHEN  ???????
FIELDS TERMINATED BY '|'
(SAMPLE_NO,
HEAD_LOCATION,
ISO,
AL_DET_A,
AL_DET_B terminated by whitespace)

0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35138211
Just to clarify:  You have ONE file and need to load everything in the single file into two tables.
0
 

Author Comment

by:newtoperlpgm
ID: 35138458
Yes, I will have multiple one files, so multiple "INFILE" but for simplicity I only included one in my question.  Thanks!
0
 
LVL 76

Accepted Solution

by:
slightwv (䄆 Netminder) earned 500 total points
ID: 35139889
All this was tested using 10.2.0.3.


You need to figure out some way to determine which row is which.

Assuming the AL_DET_A columns can be either TRUE or FALSE and the REPEAT column can be either 0 or 1, check the following.

The WHEN clause is kind of limited in what it can do.  If you can figure our some 'AND' logic you won't need to repeat the INTO clauses.

Note:  You have an extra number that doesn't appear to match a heading for TAB1. So, I made up a column name.

Tables:
drop table tab1 purge;
create table tab1
(
FILENO number,
REPEAT number,
LOGDATE date,
SAM_TYPE varchar2(20),
SAM_LOC number,
SAM_ROOM number,
SAM varchar2(20),
INSTUSED varchar2(20),
ANA varchar2(20),
ANA_DATE date,
ANA_NO number,
someOtherNumber number
);

drop table tab2 purge;
create table tab2
(
SAMPLE_NO number,
HEAD_LOCATION number,
ISO number,
AL_DET_A varchar2(10),
AL_DET_B varchar2(10)
);



The controlfile is in the code window below.  I used INLINE data but you can specify your data files.  You might also need to mess with the error counts depending on the amounts of data you have.
load data
 infile *
 APPEND INTO TABLE Tab1
 WHEN REPEAT = '0'
 FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(
FILENO,
REPEAT,
LOGDATE date 'MM/DD/YYYY HH24:MI',
SAM_TYPE,
SAM_LOC,
SAM_ROOM,
SAM,
INSTUSED,
ANA,
ANA_DATE date 'MM/DD/YYYY HH24:MI',
ANA_NO,
someOtherNumber
) 
 INTO TABLE Tab1
 WHEN REPEAT = '1'
 FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(
FILENO position(1),
REPEAT,
LOGDATE date 'MM/DD/YYYY HH24:MI',
SAM_TYPE,
SAM_LOC,
SAM_ROOM,
SAM,
INSTUSED,
ANA,
ANA_DATE date 'MM/DD/YYYY HH24:MI',
ANA_NO,
someOtherNumber
) 
 INTO TABLE Tab2
 WHEN AL_DET_A = 'TRUE'
 FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(
SAMPLE_NO position(1),
HEAD_LOCATION,
ISO,
AL_DET_A,
AL_DET_B
)
 INTO TABLE Tab2
 WHEN AL_DET_A = 'FALSE'
 FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(
SAMPLE_NO position(1),
HEAD_LOCATION,
ISO,
AL_DET_A,
AL_DET_B
)
begindata
FILENO|REPEAT|LOGDATE|SAM_TYPE|SAM_LOC|SAM_ROOM|SAM|INSTUSED|ANA|ANA_DATE|ANA_NO
33124999|0|02/10/2010 08:17|Air 2|48|1||Bert 2010/143|Gross|02/10/2010 15:10|999999|044444
SAMPLE_NO|HEAD_LOCATION|ISO|AL_DET_A|AL_DET_B
1|||FALSE|FALSE
2|||FALSE|FALSE
3|||FALSE|FALSE
4|||FALSE|FALSE

Open in new window

0
 

Author Closing Comment

by:newtoperlpgm
ID: 35150445
Thanks, it worked perfectly.
0
 

Author Comment

by:newtoperlpgm
ID: 35152814
slightwv:  I have a problem in that records that should get inserted into table2 get discarded from the first table check and don't get input into the second table.  
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35152823
Why are they failing?

The log file should tell you why.
0

Featured Post

Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

Join & Write a Comment

I have been pestered over the years to produce and distribute regular data extracts, and often the request have explicitly requested the data be emailed as an Excel attachement; specifically Excel, as it appears: CSV files confuse (no Red or Green h…
This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
Via a live example show how to connect to RMAN, make basic configuration settings changes and then take a backup of a demo database
This video shows how to configure and send email from and Oracle database using both UTL_SMTP and UTL_MAIL, as well as comparing UTL_SMTP to a manual SMTP conversation with a mail server.

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now