Avatar of KzKrew
KzKrew
Flag for United States of America asked on

Python CSV Import into MySql

I am having issues importing a CSV file into a MySql Table with the python script below;  when attempting to import the file  101321_Odr.csv

As can be seen -- All the data fields imports correctly except for the Alloy Field --(Grabs data from the next line) - Note: Have same results if i delete the Index Field.

Table structure can be seen below:

Output from from my print statement  --- print (load_sql) is 

LOAD DATA LOCAL INFILE 'C:Test/101321_Odr.csv' INTO TABLE timerite.tbl_orders FIELDS TERMINATED BY ',' ENCLOSED BY '"' IGNORE 1 LINES;


Any thoughts -- let me know 

Thanks - Dan





#import csv
import mysql.connector
import sys import pymysql from time import strftime import datetime import logging import subprocess #used to mapp drive import smtplib host = '10.X.X.X' user = 'user' password = 'password' def csv_to_mysql(load_sql, host, user, password):     '''     Function loads a csv file to MySQL table according to     '''     try:         con = pymysql.connect(host=host,                               user=user,                               password=password,                               autocommit=True,                               local_infile=1)         print('Connected to DB: {}'.format(host))         cursor = con.cursor()         cursor.execute(load_sql)         print('Succuessfully loaded the table from csv.')         con.close()     except Exception as e:         print('Error: {}'.format(str(e)))         # logger.debug('Error: {}'.format(str(e)))         sys.exit(1) # SET GLOBAL local_infile = true print ("Allow InFile Import  ----- SET GLOBAL local_infile = true") load_sql = "SET GLOBAL local_infile = true;" print (load_sql) csv_to_mysql(load_sql, host, user, password) # Delete Data from tables print ("Deleting Data from Tables ") load_sql = "DELETE FROM timerite.tbl_orders;" print (load_sql) csv_to_mysql(load_sql, host, user, password) print(("Table Data has been deleted")) # Import CSV File to table  strDate = strftime("%m%d%y") strDate = (datetime.datetime.now() - datetime.timedelta(days=1)).strftime("%m%d%y") print((strDate)) strFileOdr = "'" + "C:Test/" + strDate + "_Odr.csv'" print("Order File is: " + strFileOdr) load_sql = "LOAD DATA LOCAL INFILE " + strFileOdr +  " INTO TABLE timerite.tbl_orders FIELDS TERMINATED BY ',' ENCLOSED BY '\"' IGNORE 1 LINES;" print (load_sql) csv_to_mysql(load_sql, host, user, password) print("All Tables have been Updated")

Open in new window

PythonMySQL Server

Avatar of undefined
Last Comment
pepr

8/22/2022 - Mon
arnold

The issue is , and delimiter.
To escape the commas in the field, the field is enclosed by quotes" in your case, a quote is messed up.

The entire cell should be enclosed, not a portion.
ste5an

Verify the used line terminator locally, in the attachment it is CRLF.

Add LINES TERMINATED BY '\r\n' (using the correct value) to the import statement.
KzKrew

ASKER
Arnold are you saying the data is messed up or my statement is incorrect? The ALLOY filed is enclosed by quotes

"JOB_ORDER","WORK_ORDER","JOB#","SHIP_2_NAME","BILL_2_NAME","PART#","PART_DESC","SHIP_DATE","ORDER_QTY","ALLOY"
80768,9635,"561405","INC.","INC.","252-550-02","1.5"" MM COVER",1211013,1344,"BIALLOY"
80774,9641,"516015","TECHNOLOGIES, LLC","TECHNOLOGIES","0750-9243ARC(HT)","DG OUTLET",1211027,330,"356T6"

ste5a 
How would i fix my load_sql statement?
load_sql = "LOAD DATA LOCAL INFILE " + strFileOdr +  " INTO TABLE timerite.tbl_orders FIELDS TERMINATED BY ',' ENCLOSED BY '\"' IGNORE 1 LINES;"
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
arnold

The issue I see in the cell is the issue
Alloy.
"BIALLOY" 80774

Has a quote mismatch at sone point that the closing quote reflected above is seen as an opening quote instead to then be terminated (closing) the opening quote of the "516015"

The line is evaluated from ..
Potentially reformatting the lines with cells separated by tabs
To fix things by script you have to have a rules basis.
Number, number, word number combination.....

To then make sure the total cell count is correct.
KzKrew

ASKER
Yes -- seeing that now -- many of our part descriptions have double quotes for Inch in the description. Data File is what it is.  I have found that if i take out the quotes on the header line that import works properly

Not a solution as i do not know why -- any thoughts?
arnold

that is always an issue/consideration
one is to mask the inch indicator by using a place holder &quote; that you translate back and forth or use the 0x (ASCII encoding

not sure what generates the data into CSV..... to account for the issue.
by stripping quotes in this case might not be a concern, but if you have one cell that has "you are making an interesting point, but it does not apply here"

The stripping of quotes on the header line is of little consequence to the rest as it should be evaluated a line at a time unless there is a quote mismatch on the first line that then cascades through the rest of them, i.e. a closing quote is by count an opening quote....

You have "1.5"" MM Cover"

Nightmare is always to try and track down that errand quote (single or double) open or close parenthesis, brackets or braces.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ste5an

You need to check your local file first for the used line terminators. Cause the attached file maybe correct and yours not. E.g. the attached file looks like this in notepad++:

Capture.PNG
Thus '\r\n'. E.g. as

from time import strftime
import datetime

strDate = strftime("%m%d%y")
strDate = (datetime.datetime.now() - datetime.timedelta(days=1)).strftime("%m%d%y")
strFileOdr = f"C:/Test/{strDate}_Odr.csv"

load_sql = f"""LOAD DATA 
    LOCAL INFILE '{strFileOdr}' 
    INTO TABLE timerite.tbl_orders 
    FIELDS TERMINATED BY ','
    ENCLOSED BY '\"' 
    LINES TERMINATED BY '\\r\\n' 
    IGNORE 1 LINES;
"""
print(f"Order File is:\n{strFileOdr}\n")
print(f"LOAD SQL:\n{load_sql}\n")

Open in new window

Capture.PNG
arnold

The termination is not really an issue of concern and will not be a cause for the misapplication of which entry is for which column.
ste5an

Either it is the line terminator handled not correctly, cause the defaults are

LINES TERMINATED BY '\n' STARTING BY ''

Open in new window

or the double quote we see is not the default double quote, but some Unicode characters.
Your help has saved me hundreds of hours of internet surfing.
fblack61
ASKER CERTIFIED SOLUTION
pepr

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question