sue Tons
asked on
Load files from a directory into two tables automatically.
Data.txt
I am trying to read the files from a directory and load the data into these two table and automate this process
I am trying to read the files from a directory and load the data into these two table and automate this process
Why do you want / have to use C# for the data load?! Databases do have neat built-in tools for this purpose ;-) Oracle for instance has the SQL Loader...
ASKER
So how can I automate it though? I don't want to load it manually. I want it to be able to load into the database every time I have a new file in that folder.
There may be several ways to Rome here ;-)
You could setup the scheduling wihtin your OS (Windows task scheduler or crontab on Linux) and call some sort of batch/shell/powershell script, similar to this one here:
https://www.dbasupport.com/forums/showthread.php?32572-automating-amp-scheduling-sqlloader-process
Or you may also setup this directly within the database. For example, Oracle is capable of scheduling jobs that call "external" commands....
You could setup the scheduling wihtin your OS (Windows task scheduler or crontab on Linux) and call some sort of batch/shell/powershell script, similar to this one here:
https://www.dbasupport.com/forums/showthread.php?32572-automating-amp-scheduling-sqlloader-process
Or you may also setup this directly within the database. For example, Oracle is capable of scheduling jobs that call "external" commands....
ASKER
That's what I am trying to figure out, how to do that Alex. Can you You mean like a stored proc? i don't know exactly what to do. Please?
Ok, this setup is untested (I simply don't have the time to set this up properly), but should be fixed with ease in case of problems or errors.
This setup assumes you have set ORACLE_HOME correctly and you have the following folders in addition:
C:\DMC\CFG
C:\DMC\DONE
Put the following 2 files into the CFG folder:
1. PARFILE for titles
The following batch files may be put "anywhere" you like (or you can combine them):
Keep in mind: in both files the part "username/password@db" has to be changed accordingly ;-)
1. for titles
This setup assumes you have set ORACLE_HOME correctly and you have the following folders in addition:
C:\DMC\CFG
C:\DMC\DONE
Put the following 2 files into the CFG folder:
1. PARFILE for titles
OPTIONS
(
SKIP=1,
ROWS=1000,
PARALLEL=true,
DIRECT=true,
SKIP_INDEX_MAINTENANCE=true
)
LOAD DATA
APPEND INTO TABLE title
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(OLD_TITLE_NO,
OLD_PARCEL_NO,
OLD_PARCEL_AREA,
TITLE_NO,
PARCEL_NO,
PARCEL_AREA,
INTEREST_REGISTER_NO,
INTEREST_EFFECTIVE_DATE "to_date(substr(:INTEREST_EFFECTIVE_DATE, 1, 10), 'MM/DD/YYYY')",
OLD_INT_NO,
INTEREST_NO,
INT_TYP_CODE,
INT_TYPE)
2. PARFILE for interestsOPTIONS
(
SKIP=1,
ROWS=1000,
PARALLEL=true,
DIRECT=true,
SKIP_INDEX_MAINTENANCE=true
)
LOAD DATA
APPEND INTO TABLE interest
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(INTEREST_TRN_TYP,
INT_TYPE,
INTEREST_NO,
INTEREST_REGISTER_NO,
TITLE_NO,
PARCEL_NO,
PARCEL_AREA,
INTEREST_EFFECTIVE_DATE "to_date(substr(:INTEREST_EFFECTIVE_DATE, 1, 10), 'MM/DD/YYYY')")
The following batch files may be put "anywhere" you like (or you can combine them):
Keep in mind: in both files the part "username/password@db" has to be changed accordingly ;-)
1. for titles
@ECHO OFF
ECHO Finding Data to be loaded...
CD C:\DMC
FOR %%V in (Title*) DO %ORACLE_HOME%\bin\sqlldr.exe username/password@db CONTROL=C:\DMC\CFG\titles.CTL DATA=%%V ERRORS=1500
MOVE Title* C:\DMC\DONE
2. for interests@ECHO OFF
ECHO Finding Data to be loaded...
CD C:\DMC
FOR %%V in (interest*) DO %ORACLE_HOME%\bin\sqlldr.exe username/password@db CONTROL=C:\DMC\CFG\interest.CTL DATA=%%V ERRORS=1500
MOVE interest* C:\DMC\DONE
These batch files can be used within the Windows Task Scheduler to perform these jobs on a daily base (or whatever window you need)...
ASKER
So on the PARFILES, instead of using Option, can I create or replace PROCEDURE Load Title etc. I am not understanding the Option part, not familiar with it. Alex I really appreciate your help, I don't understand what I am doing. So when I add these files into the C:\DMC\CFG , and run the batch file, then it logs into C:\DMC\DONE , right once completed? I need to understand the Option, I don't understand, I am lost. I have no clue to be honest.
The 2 PARFILES (parameter files) "just" contain the "code" from above. Create 2 files in the (new) folder
These 2 files contain processing information for the SQL Loader tool which is beuing called in conjunction with these later on...
The batch file will call the SQLLDR.EXE (SQL Loader) and moves the input files (CSV data) into the "DONE"-folder (to be created) after importing the data into the corresponding tables...
No need for any procedure or such ;-) SQL Loader is THE (built-in) tool for importing that kind of data into the database!
C:\DMC\CFG\
The 2 files will be "titles.CTL" and "interests.CTL"These 2 files contain processing information for the SQL Loader tool which is beuing called in conjunction with these later on...
The batch file will call the SQLLDR.EXE (SQL Loader) and moves the input files (CSV data) into the "DONE"-folder (to be created) after importing the data into the corresponding tables...
No need for any procedure or such ;-) SQL Loader is THE (built-in) tool for importing that kind of data into the database!
ASKER
I am so lost Alex I don't mean to be funny, This is too complicated for me as a new developer. Can you list in steps as to what is needed to be done first please, so i follow your steps. I am really lost to be honest. Walk me through please, please?
Ok, let's step back: what database are you trying to load that data into?
ASKER
ClientData is a database in Oracle and in that database I have two tables named title and interest. So I am trying to load data in those two tables. Interest files into Interest Table and title files into Title Table.
Excellent :-))
Then, the setup I mentioned above is exactly what you'll need!
Oracle's built-in tool "SQL Loader" is the weapon og choice when it comes to loading data from (flat) files like CSV or such!
It can be completely be command-line driven or can make use of a parameter file, as I suggested above. You just "tell" Oracle with the help of this tool to load some data in some specific tables.
I'm trying to break down some stuff:
PARFILE:
OPTIONS (
SKIP=1, -------------------------------------------- skip the first row (usually this contains the header names)
ROWS=1000, ------------------------------------ commit every 1000 rows (default is 2 I guess)
PARALLEL=true, -------------------------------- make use of parallelism (if applicable/possible)
DIRECT=true, ------------------------------------ "directly" write data into tables, avoiding the buffer cache (-> faster)
SKIP_INDEX_MAINTENANCE=true ------- make things even more faster ;-)
)
LOAD DATA
APPEND INTO TABLE title --------------------- where should I put the data & how (method): here APPEND
FIELDS ----------------------------------------------- fields in the file are:
TERMINATED BY "," ------------------------------ terminated/delimted by the character ","
OPTIONALLY ENCLOSED BY '"' -------------- could be enclosed with the quote character
TRAILING NULLCOLS --------------------------- and treat missing fields at the end as null
(OLD_TITLE_NO, OLD_PARCEL_NO, OLD_PARCEL_AREA, TITLE_NO, PARCEL_NO, PARCEL_AREA, INTEREST_REGISTER_NO, INTEREST_EFFECTIVE_DATE "to_date(substr(:INTEREST_EFFECTIVE_DATE, 1, 10), 'MM/DD/YYYY')",
OLD_INT_NO, INTEREST_NO, INT_TYP_CODE, INT_TYPE)
----------------------------------------------------------- the last part is the definition/mapping of the fields/columnn
BATCH FILE:
@ECHO OFF ------------------------------- suppress unwanted output
ECHO Finding Data to be loaded... -- display some text fyi ;-)
CD C:\DMC --------------------------------- change to the folder where the data (CSV files) reside
FOR %%V in (Title*) DO %ORACLE_HOME%\bin\sqlldr.exe username/password@db CONTROL=C:\DMC\CFG\titles.CTL DATA=%%V ERRORS=1500
-------------------------------------------------- "simple" for loop which iterates through all files that match "Title*" and call for each of them the sqlldr command
MOVE Title* C:\DMC\DONE ------------ when import is done, move the processed files into the DONE subfolder
Then, the setup I mentioned above is exactly what you'll need!
Oracle's built-in tool "SQL Loader" is the weapon og choice when it comes to loading data from (flat) files like CSV or such!
It can be completely be command-line driven or can make use of a parameter file, as I suggested above. You just "tell" Oracle with the help of this tool to load some data in some specific tables.
I'm trying to break down some stuff:
PARFILE:
OPTIONS (
SKIP=1, -------------------------------------------- skip the first row (usually this contains the header names)
ROWS=1000, ------------------------------------ commit every 1000 rows (default is 2 I guess)
PARALLEL=true, -------------------------------- make use of parallelism (if applicable/possible)
DIRECT=true, ------------------------------------ "directly" write data into tables, avoiding the buffer cache (-> faster)
SKIP_INDEX_MAINTENANCE=true ------- make things even more faster ;-)
)
LOAD DATA
APPEND INTO TABLE title --------------------- where should I put the data & how (method): here APPEND
FIELDS ----------------------------------------------- fields in the file are:
TERMINATED BY "," ------------------------------ terminated/delimted by the character ","
OPTIONALLY ENCLOSED BY '"' -------------- could be enclosed with the quote character
TRAILING NULLCOLS --------------------------- and treat missing fields at the end as null
(OLD_TITLE_NO, OLD_PARCEL_NO, OLD_PARCEL_AREA, TITLE_NO, PARCEL_NO, PARCEL_AREA, INTEREST_REGISTER_NO, INTEREST_EFFECTIVE_DATE "to_date(substr(:INTEREST_EFFECTIVE_DATE, 1, 10), 'MM/DD/YYYY')",
OLD_INT_NO, INTEREST_NO, INT_TYP_CODE, INT_TYPE)
----------------------------------------------------------- the last part is the definition/mapping of the fields/columnn
BATCH FILE:
@ECHO OFF ------------------------------- suppress unwanted output
ECHO Finding Data to be loaded... -- display some text fyi ;-)
CD C:\DMC --------------------------------- change to the folder where the data (CSV files) reside
FOR %%V in (Title*) DO %ORACLE_HOME%\bin\sqlldr.exe username/password@db CONTROL=C:\DMC\CFG\titles.CTL DATA=%%V ERRORS=1500
-------------------------------------------------- "simple" for loop which iterates through all files that match "Title*" and call for each of them the sqlldr command
MOVE Title* C:\DMC\DONE ------------ when import is done, move the processed files into the DONE subfolder
ASKER
I see, WOW thank you for the explanation, I greatly appreciate you so much. I have learned something I never knew.
Now the question is, I want to set up a job in task scheduler, so in the job scheduler do I add the PARFILE CODE: section with The Option(------- or the batch file. Now the question is where do I plug in the code? Do I run them in command prompt and store the batch file somewhere?
Now the question is, I want to set up a job in task scheduler, so in the job scheduler do I add the PARFILE CODE: section with The Option(------- or the batch file. Now the question is where do I plug in the code? Do I run them in command prompt and store the batch file somewhere?
I'm glad I could help or at least show you some directions ;-)
With this setup, you may just use the Windows Task Scheduler and create one or two simple tasks, let's say with daily or hourly schedules as trigger. For the "action", you'd have to tell the scheduler to call/execute the batch file from above, that's it.
With this setup you'll end up having 2 tasks, one for "Titles" and one for "Interests", though you could combine/merge them into just one if you like...
Using the Windows Task Scheduler is quite straight forward, so you should easily be able to apply all from above :-)
With this setup, you may just use the Windows Task Scheduler and create one or two simple tasks, let's say with daily or hourly schedules as trigger. For the "action", you'd have to tell the scheduler to call/execute the batch file from above, that's it.
With this setup you'll end up having 2 tasks, one for "Titles" and one for "Interests", though you could combine/merge them into just one if you like...
Using the Windows Task Scheduler is quite straight forward, so you should easily be able to apply all from above :-)
I tried to setup the whole stuff a bit ;-)
Please be cautions when extracting this as it might overwrite your stuff (but shouldn't as these files are all new)...
Due to some restrictions on this site regarding possible malicious file extensions, you need to rename the 2 ".rename2bat" to actual ".bat" files.
The 1 xml file is an export of the Task Scheduler just to show you, how you might setup this in the Scheduler...
DMC.zip
Please be cautions when extracting this as it might overwrite your stuff (but shouldn't as these files are all new)...
Due to some restrictions on this site regarding possible malicious file extensions, you need to rename the 2 ".rename2bat" to actual ".bat" files.
The 1 xml file is an export of the Task Scheduler just to show you, how you might setup this in the Scheduler...
DMC.zip
ASKER
Awesome, I will set it up and test it and see what happens. I will give you feedback.
ASKER
@Alex
I realized some of the data I am loading is not consistent for example
In the title table, I have these headings below
Old_title_no ,Old_parcel_no,Old_parcel_area,Title_no,Parcel_no,Parcel_area,Interest Register No,Interest Effective Date, Old_int_no, Interest_no, Int_typ_code, Int_type
152995607,146378391,581,153382833,146378391,10/26/2020 14:51:30,,,, --data is missing here on this one but I would like date to go under date no matter what is missing.
152995708,146378414,290,153382855,146378414,290,,10/26/2020 14:51:30,,,, -- this one is good
152075347,111827763,,153382237,111827763,,108661181,10/26/2020 14:13:14,185613666,189112035,43,"Mortgage " -- this one is good.
How can I automatically match the columns with the data I have despite the format of the data. , its one of the challenges I am now facing.
I realized some of the data I am loading is not consistent for example
In the title table, I have these headings below
Old_title_no ,Old_parcel_no,Old_parcel_area,Title_no,Parcel_no,Parcel_area,Interest Register No,Interest Effective Date, Old_int_no, Interest_no, Int_typ_code, Int_type
152995607,146378391,581,153382833,146378391,10/26/2020 14:51:30,,,, --data is missing here on this one but I would like date to go under date no matter what is missing.
152995708,146378414,290,153382855,146378414,290,,10/26/2020 14:51:30,,,, -- this one is good
152075347,111827763,,153382237,111827763,,108661181,10/26/2020 14:13:14,185613666,189112035,43,"Mortgage " -- this one is good.
How can I automatically match the columns with the data I have despite the format of the data. , its one of the challenges I am now facing.
152995607,146378391,581,153382833,146378391,10/26/2020 14:51:30,,,, --data is missing here on this one but I would like date to go under date no matter what is missing.This problem lies within the data itself! The line should read:
152995607,146378391,581,153382833,146378391,,10/26/2020 14:51:30,,,,
How can Oracle (or any other vendor) ever tell that there's a missing value?! Impossible ;-)
If you have a certain file specification, the files should comply with it!!
ASKER
what about C# where you can call column by column like Old_title_no is always index position 0 and has 9 characters, etc and date will always be date, When the data is inconsistent, its always a challenge. Not sure how go about it anymore.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.