
Data Load to MySQL Server

Loading CSV or other delimited data files into a MySQL database is a very common task and a frequent subject of questions, and almost every time LOAD DATA INFILE comes to the rescue.

Here we will try to understand some of the very common scenarios for loading data into a MySQL Database.

The Load Data Syntax:
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number LINES]
    [(col_name_or_user_var,...)]
    [SET col_name = expr,...]




Suppose we have to load a file with the following contents:
#File-name: example.csv
col-1,col-2,col-3
a,2,3
b,4,5
c,6,7



1. A simple comma-separated file with column header:

#table structure: example 
col-1	col-2	col-3



Assuming our MySQL table has the same column sequence as the CSV file above, we can issue the following SQL statement:
LOAD DATA INFILE 'path/to/example.csv' INTO TABLE example FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES ;



This is a very common and simple scenario.


Quick Notes:

Of course, if example.csv has no column header line (col-1,col-2,col-3), IGNORE 1 LINES is not required.
Note the file path. MySQL treats a backslash inside a string literal as an escape character, so make sure your slashes are written correctly.
On Windows you may give the path as C:\\path\\file.csv or C:/path/file.csv.
If the data file is stored on the client (not on the server), add the LOCAL keyword as shown in the syntax.

So, the command will become:
LOAD DATA LOCAL INFILE 'path/to/example.csv' INTO TABLE example FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES ;
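
To illustrate the path and LOCAL notes, here is a minimal Python sketch that converts a Windows path to the forward-slash form and assembles the statement text. The helper names mysql_path and load_statement are made up for this illustration, and the sketch assumes the path and table name contain no quote characters. It only builds a string and does not connect to a server.

```python
from pathlib import PureWindowsPath


def mysql_path(windows_path):
    """Return the path with forward slashes, e.g. C:/path/file.csv."""
    return PureWindowsPath(windows_path).as_posix()


def load_statement(path, table, local=False):
    """Build the basic comma-separated LOAD DATA statement as text."""
    local_kw = "LOCAL " if local else ""
    return (
        f"LOAD DATA {local_kw}INFILE '{mysql_path(path)}' "
        f"INTO TABLE {table} "
        # '\\n' keeps a literal backslash-n in the SQL text
        "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' IGNORE 1 LINES;"
    )


print(load_statement(r"C:\path\file.csv", "example", local=True))
```

Converting to forward slashes avoids having to double every backslash by hand.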



If we want rows from the file to replace existing rows that have the same unique key value, we add the REPLACE keyword before INTO TABLE.
Similarly, if we want input rows that duplicate an existing row on a unique key value to be skipped, we use the IGNORE keyword before INTO TABLE.
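
The difference between REPLACE and IGNORE can be modeled in a few lines of Python. This is only a sketch of the duplicate-key behavior, not of MySQL itself: the table is a plain dict keyed by the first field, and the function name load_rows and the sample rows are invented for the illustration. With neither keyword (and without LOCAL), a duplicate key raises an error, which the sketch models with an exception.

```python
def load_rows(table, rows, mode=None):
    """Model duplicate handling; table maps unique key -> row tuple."""
    for row in rows:
        key = row[0]  # assume the first field is the unique key
        if key in table:
            if mode == "IGNORE":
                continue  # keep the existing row, skip the input row
            if mode != "REPLACE":
                raise ValueError(f"Duplicate entry {key!r}")
        table[key] = row  # new key, or REPLACE overwriting the old row
    return table


existing = {"a": ("a", 0, 0)}
incoming = [("a", 2, 3), ("b", 4, 5)]

print(load_rows(dict(existing), incoming, "REPLACE"))
print(load_rows(dict(existing), incoming, "IGNORE"))
```

With REPLACE the row for key a ends up as ('a', 2, 3); with IGNORE it stays ('a', 0, 0). In both cases row b is added.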


2. Column sequence in file and table are different.

#table structure: example 
col-2	col-1	col-3



In this case we need to specify the column sequence of the CSV file so that the data is loaded into the proper columns. Column names that contain a hyphen must be quoted with backticks.

LOAD DATA INFILE 'path/to/example.csv' INTO TABLE example FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES (`col-1`,`col-2`,`col-3`);
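
If the statement is generated by a script, the column list can be assembled from the file's column order. The following Python sketch shows one way to do that, including backtick quoting of identifiers such as col-1 (a hyphen is not allowed in an unquoted MySQL identifier). The function names quote_ident and load_with_columns are assumptions made for this example, and the path is assumed to contain no quote characters.

```python
def quote_ident(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def load_with_columns(path, table, file_columns):
    """Build LOAD DATA text whose column list follows the file's order."""
    cols = ", ".join(quote_ident(c) for c in file_columns)
    return (
        f"LOAD DATA INFILE '{path}' INTO TABLE {quote_ident(table)} "
        "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' "
        f"IGNORE 1 LINES ({cols});"
    )


print(load_with_columns("path/to/example.csv", "example",
                        ["col-1", "col-2", "col-3"]))
```

The list passed in is the order of fields in the file, not the order of columns in the table; MySQL matches them by name.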



3. The CSV / data file has fewer columns than the target table

#table structure: example 
col-1	col-2	col-3	col-4


Suppose col-1 is an auto-increment column that is not provided in the CSV file.

LOAD DATA INFILE 'path/to/example.csv' INTO TABLE example FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES (`col-2`,`col-3`,`col-4`) SET `col-1` = NULL;


Assigning NULL to col-1 makes MySQL generate the next auto-increment value for it.
Using SET you can also assign values to columns that are not available in the CSV file but are declared NOT NULL.
You may also call a function to compute the value to set,
e.g., SET `col-x` = RAND();


4. Filling the extra date columns:

This is very similar to scenario 3. Here we need col-4 to be filled with the current timestamp. A very simple way to do this is to alter the table. :)
ALTER TABLE example CHANGE COLUMN `col-4` `col-4` TIMESTAMP DEFAULT CURRENT_TIMESTAMP;



And then:
LOAD DATA INFILE 'path/to/example.csv' INTO TABLE example FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES (`col-1`,`col-2`,`col-3`) SET `col-4` = CURRENT_TIMESTAMP;



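MySQL should then fill col-4 with the current timestamp for every loaded row. Note that assigning NULL to a TIMESTAMP column only produces the current timestamp on servers where explicit_defaults_for_timestamp is disabled; assigning CURRENT_TIMESTAMP, or leaving col-4 out of the statement so that the column default applies, works regardless of that setting.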


5. Loading data with calculated columns:

#table: example 
col-1	col-2	col-3	col-4




LOAD DATA INFILE 'path/to/example.csv' INTO TABLE example FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES (`col-1`,`col-2`,`col-3`, @var1)
  SET `col-4` = @var1/100;



Similarly, we can transform a string value by reading it into a user variable and modifying it in the SET clause:

SET `col-4` = REPLACE(@var1, 'find', 'replace')
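
What the user variable and SET clause do to each row can be pictured with a small Python model. This is a sketch under assumptions: the four-field sample data is invented (the fourth field plays the role of @var1), the function name load_with_set is hypothetical, and Python's float division and str.replace only stand in for MySQL's / operator and REPLACE() function.

```python
import csv
import io

# Hypothetical input with a fourth field that is read into @var1.
DATA = "col-1,col-2,col-3,raw\na,2,3,250\nb,4,5,1000\n"


def load_with_set(text, set_expr):
    """Read rows, skip the header, and compute col-4 from the fourth field."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # IGNORE 1 LINES
    rows = []
    for c1, c2, c3, var1 in reader:
        rows.append({"col-1": c1, "col-2": c2, "col-3": c3,
                     "col-4": set_expr(var1)})
    return rows


divided = load_with_set(DATA, lambda v: float(v) / 100)       # @var1/100
replaced = load_with_set(DATA, lambda v: v.replace("0", "x"))  # REPLACE()

print([r["col-4"] for r in divided])
print([r["col-4"] for r in replaced])
```

The raw fourth field never reaches the table directly; only the result of the expression is stored in col-4.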




6. Other ways of loading separated files to MySQL:

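One alternative is the CSV storage engine, which keeps a table's data in a plain comma-separated file.

CREATE TABLE csv_foo LIKE foo;

ALTER TABLE csv_foo MODIFY COLUMN id INT(10) UNSIGNED NOT NULL;
-- remove auto-increment

ALTER TABLE csv_foo DROP PRIMARY KEY;
-- drop the key, as the CSV storage engine does not support keys

ALTER TABLE csv_foo ENGINE = CSV;
-- switch the table to the CSV storage engine

Alternatively you may do:
CREATE TABLE csv_foo ENGINE = CSV AS SELECT * FROM foo LIMIT 0;
-- ignores key definitions and auto-increment
-- Make sure you don't have any nullable columns; the CSV engine requires every column to be NOT NULL.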

Now,
STOP MYSQL SERVER

In the database's directory under the data directory, replace the table's data file (csv_foo.CSV) with your data-file.csv, renamed to match the table's data file name.

START MYSQL SERVER

You may need to run: REPAIR TABLE csv_foo;

You're done.

Well, this is not a "good" way though.


7. Loading multiple files:

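A single LOAD DATA statement reads one file, so it cannot load several files for us in one go.
For that there is a separate client program, which is a command-line interface to LOAD DATA and accepts several file names at once.
Refer: mysqlimport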
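
Another option is to generate one LOAD DATA statement per file from a script. The Python sketch below only produces the statement text; how the statements are sent to the server is left open. It assumes each file should be loaded into a table named after the file without its extension, which mirrors the naming convention mysqlimport uses. The function name statements_for and the file names are hypothetical.

```python
from pathlib import PurePosixPath


def statements_for(files):
    """Return one LOAD DATA statement per file, table named after the file."""
    statements = []
    for f in files:
        path = PurePosixPath(f)
        table = path.stem  # orders.csv -> orders
        statements.append(
            f"LOAD DATA INFILE '{path}' INTO TABLE `{table}` "
            "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' "
            "IGNORE 1 LINES;"
        )
    return statements


for stmt in statements_for(["data/orders.csv", "data/customers.csv"]):
    print(stmt)
```

Because each statement names its table explicitly, the same loop could be adapted to load files whose names do not match their tables.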


Conclusion:
I hope we have covered the common scenarios that will help in most cases; anything else can always be answered on EE or here.
Finally, If you want to load data to MySQL Server, LOAD DATA
Author:theGhost_k8
Expert Comment
by: xathras1982
mysqlimport has a restriction on what you can load: the file name must match the name of the table being loaded.
Using the existing LOAD DATA command, you can have a simple shell script that loops over your files and executes the command for each one.