Avatar of KzKrew
KzKrewFlag for United States of America

asked on 

Python CSV Import into MySql

I am having issues importing a CSV file into a MySql Table with the python script below;  when attempting to import the file  101321_Odr.csv

User generated imageAs can be seen -- All the data fields imports correctly except for the Alloy Field --(Grabs data from the next line) - Note: Have same results if i delete the Index Field.

Table structure can be seen below:

Output from from my print statement  --- print (load_sql) is 

LOAD DATA LOCAL INFILE 'C:Test/101321_Odr.csv' INTO TABLE timerite.tbl_orders FIELDS TERMINATED BY ',' ENCLOSED BY '"' IGNORE 1 LINES;


Any thoughts -- let me know 

Thanks - Dan



User generated image


#import csv
import mysql.connector
import sys import pymysql from time import strftime import datetime import logging import subprocess #used to mapp drive import smtplib host = '10.X.X.X' user = 'user' password = 'password' def csv_to_mysql(load_sql, host, user, password):     '''     Function loads a csv file to MySQL table according to     '''     try:         con = pymysql.connect(host=host,                               user=user,                               password=password,                               autocommit=True,                               local_infile=1)         print('Connected to DB: {}'.format(host))         cursor = con.cursor()         cursor.execute(load_sql)         print('Succuessfully loaded the table from csv.')         con.close()     except Exception as e:         print('Error: {}'.format(str(e)))         # logger.debug('Error: {}'.format(str(e)))         sys.exit(1) # SET GLOBAL local_infile = true print ("Allow InFile Import  ----- SET GLOBAL local_infile = true") load_sql = "SET GLOBAL local_infile = true;" print (load_sql) csv_to_mysql(load_sql, host, user, password) # Delete Data from tables print ("Deleting Data from Tables ") load_sql = "DELETE FROM timerite.tbl_orders;" print (load_sql) csv_to_mysql(load_sql, host, user, password) print(("Table Data has been deleted")) # Import CSV File to table  strDate = strftime("%m%d%y") strDate = (datetime.datetime.now() - datetime.timedelta(days=1)).strftime("%m%d%y") print((strDate)) strFileOdr = "'" + "C:Test/" + strDate + "_Odr.csv'" print("Order File is: " + strFileOdr) load_sql = "LOAD DATA LOCAL INFILE " + strFileOdr +  " INTO TABLE timerite.tbl_orders FIELDS TERMINATED BY ',' ENCLOSED BY '\"' IGNORE 1 LINES;" print (load_sql) csv_to_mysql(load_sql, host, user, password) print("All Tables have been Updated")

Open in new window

PythonMySQL Server

Avatar of undefined
Last Comment
pepr
Avatar of arnold
arnold
Flag of United States of America image

The issue is , and delimiter.
To escape the commas in the field, the field is enclosed by quotes" in your case, a quote is messed up.

The entire cell should be enclosed, not a portion.
Avatar of ste5an
ste5an
Flag of Germany image

Verify the used line terminator locally, in the attachment it is CRLF.

Add LINES TERMINATED BY '\r\n' (using the correct value) to the import statement.
Avatar of KzKrew
KzKrew
Flag of United States of America image

ASKER

Arnold are you saying the data is messed up or my statement is incorrect? The ALLOY filed is enclosed by quotes

"JOB_ORDER","WORK_ORDER","JOB#","SHIP_2_NAME","BILL_2_NAME","PART#","PART_DESC","SHIP_DATE","ORDER_QTY","ALLOY"
80768,9635,"561405","INC.","INC.","252-550-02","1.5"" MM COVER",1211013,1344,"BIALLOY"
80774,9641,"516015","TECHNOLOGIES, LLC","TECHNOLOGIES","0750-9243ARC(HT)","DG OUTLET",1211027,330,"356T6"

ste5a 
How would i fix my load_sql statement?
load_sql = "LOAD DATA LOCAL INFILE " + strFileOdr +  " INTO TABLE timerite.tbl_orders FIELDS TERMINATED BY ',' ENCLOSED BY '\"' IGNORE 1 LINES;"
Avatar of arnold
arnold
Flag of United States of America image

The issue I see in the cell is the issue
Alloy.
"BIALLOY" 80774

Has a quote mismatch at sone point that the closing quote reflected above is seen as an opening quote instead to then be terminated (closing) the opening quote of the "516015"

The line is evaluated from ..
Potentially reformatting the lines with cells separated by tabs
To fix things by script you have to have a rules basis.
Number, number, word number combination.....

To then make sure the total cell count is correct.
Avatar of KzKrew
KzKrew
Flag of United States of America image

ASKER

Yes -- seeing that now -- many of our part descriptions have double quotes for Inch in the description. Data File is what it is.  I have found that if i take out the quotes on the header line that import works properly
User generated image
Not a solution as i do not know why -- any thoughts?
Avatar of arnold
arnold
Flag of United States of America image

that is always an issue/consideration
one is to mask the inch indicator by using a place holder &quote; that you translate back and forth or use the 0x (ASCII encoding

not sure what generates the data into CSV..... to account for the issue.
by stripping quotes in this case might not be a concern, but if you have one cell that has "you are making an interesting point, but it does not apply here"

The stripping of quotes on the header line is of little consequence to the rest as it should be evaluated a line at a time unless there is a quote mismatch on the first line that then cascades through the rest of them, i.e. a closing quote is by count an opening quote....

You have "1.5"" MM Cover"

Nightmare is always to try and track down that errand quote (single or double) open or close parenthesis, brackets or braces.
Avatar of ste5an
ste5an
Flag of Germany image

You need to check your local file first for the used line terminators. Cause the attached file maybe correct and yours not. E.g. the attached file looks like this in notepad++:

User generated image
Thus '\r\n'. E.g. as

from time import strftime
import datetime

strDate = strftime("%m%d%y")
strDate = (datetime.datetime.now() - datetime.timedelta(days=1)).strftime("%m%d%y")
strFileOdr = f"C:/Test/{strDate}_Odr.csv"

load_sql = f"""LOAD DATA 
    LOCAL INFILE '{strFileOdr}' 
    INTO TABLE timerite.tbl_orders 
    FIELDS TERMINATED BY ','
    ENCLOSED BY '\"' 
    LINES TERMINATED BY '\\r\\n' 
    IGNORE 1 LINES;
"""
print(f"Order File is:\n{strFileOdr}\n")
print(f"LOAD SQL:\n{load_sql}\n")

Open in new window

User generated image
Avatar of arnold
arnold
Flag of United States of America image

The termination is not really an issue of concern and will not be a cause for the misapplication of which entry is for which column.
Avatar of ste5an
ste5an
Flag of Germany image

Either it is the line terminator handled not correctly, cause the defaults are

LINES TERMINATED BY '\n' STARTING BY ''

Open in new window

or the double quote we see is not the default double quote, but some Unicode characters.
ASKER CERTIFIED SOLUTION
Avatar of pepr
pepr

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
MySQL Server
MySQL Server

MySQL is an open source, relational database management system that runs as a server providing multi-user access to a number of databases. Acquired by Oracle in 2009, it is frequently used in combination with PHP installations, powering most of the WordPress installations.

49K
Questions
--
Followers
--
Top Experts
Get a personalized solution from industry experts
Ask the experts
Read over 600 more reviews

TRUSTED BY

IBM logoIntel logoMicrosoft logoUbisoft logoSAP logo
Qualcomm logoCitrix Systems logoWorkday logoErnst & Young logo
High performer badgeUsers love us badge
LinkedIn logoFacebook logoX logoInstagram logoTikTok logoYouTube logo