Anthony Mellor asked:

PostgreSQL: constant errors importing CSV - how to fix?

I am using Navicat and I (finally) have PostgreSQL running.

I accept the defaults:
Record delimiter: LF
Text qualifier: " (there aren't any - maybe there should be)

Error Message: ERROR:  invalid input syntax for type timestamp: "01:28:00"
LINE 1: ...','180480','','','1','2','1','1','1987-01-05','2','01:28:00'...
                                                             ^

[Msg] [Imp] Processed: 49, Added: 0, Updated: 0, Deleted: 0, Error: 1
[Msg] [Imp] Finished - Unsuccessfully



So I make both the date and time fields varchar and try again:

Error Message: ERROR:  malformed array literal: "5/1/1987"
LINE 1: ...1AD00002','529950','180480','','','1','2','1','1','5/1/1987'...
                                                             ^
DETAIL:  Array value must start with "{" or dimension information.

[Msg] [Imp] Processed: 49, Added: 0, Updated: 0, Deleted: 0, Error: 1
[Msg] [Imp] Finished - Unsuccessfully



and the query is, I think, the following (not written by me, I hasten to add, but generated by Navicat):

[Msg] [Imp] Import start
[Msg] [Imp] Import type - CSV
[Msg] [Imp] Import from - /Users/ADM/Sync/6-Process01/accXtract.csv
[Msg] [Imp] Create table [accXtract]
[Msg] [Imp] Import table [accXtract]
[Err] [Imp] Cannot insert rows 1 - 49:
Query: insert into "public"."accXtract" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_District","Local_Authority_Highway","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values 



Bernard Savonet

It seems that you have a (typical, classical and awful) date-handling problem.

In your first example, the target field expects a timestamp while you deliver an hour-only string, but the date field seems fine.
In your second example, you have changed the date format to a string format different from your first example, and this creates the error.

My suggestion:
- keep the input fields unchanged
- keep the (export) date field as it was in the first example
- test the impact of changing the (export) hour format to a timestamp value instead of a date value (or see the sketch below)
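
Or, taking the guessing away from Navicat altogether: create the table yourself in PostgreSQL, with explicit column types, before running the import. A sketch only - just a few of the columns from your log are shown, and the types are my assumption about what the data contains:

-- Sketch: declare the date and time columns explicitly so the importer
-- cannot guess "timestamp" for a time-only column.
CREATE TABLE public."accXtract" (
    "Accident_Index" text,
    "Date"           date,     -- accepts '1987-01-05'
    "Time"           time,     -- accepts '01:28:00'
    "Day_of_Week"    integer
    -- the remaining columns from your log would follow the same pattern
);

-- If the dates ever arrive as UK-style '5/1/1987', tell the session to
-- read the day before the month:
SET datestyle = 'ISO, DMY';

With the table already in place and typed, the import only has to deliver values that fit.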
Anthony Mellor

ASKER
Can't I set the dates and times to some other type? I don't need them. What type though? I tried "varchar" (thinking that might be a sort of catch-all) and that made no difference. It sounds like the system is deciding and I have to comply properly, whereas I'd like to comply not at all.

I don't understand the bit where you write about "export" because I am trying to import only.
I did notice it didn't seem to like the seconds part of the time; I could delete those I think (AWK to the rescue)

Maybe the penny will drop over a cuppa tea.
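
Actually, thinking about it over that cuppa: would something like this (outside Navicat, in psql) let me comply not at all - stage everything as plain text and sort the types out later? Just a guess at the syntax on my part:

-- Guesswork: a staging table where every column is text, so nothing is
-- coerced during the import itself.
CREATE TABLE acc_staging (
    "Accident_Index" text,
    "Date"           text,
    "Time"           text
    -- ... every remaining CSV column, also declared as text ...
);

-- psql's \copy reads the file client-side; HEADER skips the first row.
-- The table's column list has to match the CSV column-for-column.
\copy acc_staging FROM '/Users/ADM/Sync/6-Process01/accXtract.csv' WITH (FORMAT csv, HEADER true)

The casts to proper date/time types could then happen in a later INSERT ... SELECT, once the data is safely in.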
ASKER CERTIFIED SOLUTION
earth man2

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
Anthony Mellor

ASKER
Ah! Thank you. It's being auto-set, so when it sees either a date OR a time it assumes both - I guess. I'll see.

wahay!

[Msg] [Imp] Import start
[Msg] [Imp] Import type - CSV
[Msg] [Imp] Import from - /Users/ADM/Sync//5-CombineBOTH/accEXT.csv
[Msg] [Imp] Create table [accEXT]
[Msg] [Imp] Import table [accEXT]
[Msg] [Imp] Processed: 585, Added: 585, Updated: 0, Deleted: 0, Error: 0
[Msg] [Imp] Finished - Successfully



That was on an extract of the file... but the one that failed also works. Will try the full-size file.
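
(Out of curiosity, to see what column types Navicat actually picked for me, I gather the catalog can be asked - more guesswork on my part:)

-- Guesswork: list the column types Navicat chose for the auto-created table.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'accEXT'
ORDER BY ordinal_position;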

Tables 1
Processed 5,434,484
Errors 150
Added 5,425,230
Update 0
Deleted 0
Time 1523.64s

and the extract from the end is:

insert into "public"."acc" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_(District)","Local_Authority_(Highway)","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values ('2011921200188','323460','870490','-3.286473','57.717724','92','3','3','2','2011-12-18','1','21:35:00','930','S12000020','3','941','6','30','6','4','6','0','0','0','4','1','2','0','0','2','1','','921200188','12','35','18','21','2011','24150','32346','87049');
insert into "public"."acc" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_(District)","Local_Authority_(Highway)","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values ('2011921200193','394230','806020','-2.096969','57.145101','92','3','1','1','2011-11-21','2','17:36:00','910','S12000033','3','93','3','30','0','-1','-1','0','0','0','4','1','1','0','0','1','1','','921200193','11','36','21','17','2011','24153','39423','80602')
Error Message: ERROR:  invalid input syntax for type time: ""
LINE 5: ...66','57.142932','92','3','2','1','2011-12-03','7','','910','...
                                                                  ^

[Msg] [Imp] Processed: 5434484, Added: 5425230, Updated: 0, Deleted: 0, Error: 150
[Msg] [Imp] Finished - Unsuccessfully

and this is the line from the CSV file:

"2011921200193","394230","806020","-2.096969","57.145101","92","3","1","1","21/11/2011","2","17:36","910","S12000033","3","93","3","30","0","-1","-1","0","0","0","4","1","1","0","0","1","1","","921200193","11","36","21","17","2011","24153","39423","80602"
SOLUTION
Anthony Mellor

ASKER
That's what I was thinking, but how has it accepted 5 million previously?

I'll just add the error line and the two before it here to compare:

"2011921200082","340210","853490","-3.001145","57.567589","92","2","2","1","20/12/2011","3","08:50","930","S12000020","4","9016","6","40","0","-1","-1","0","0","0","1","1","4","0","0","2","1","","921200082","12","50","20","08","2011","24152","34021","85349"
"2011921200188","323460","870490","-3.286473","57.717724","92","3","3","2","18/12/2011","1","21:35","930","S12000020","3","941","6","30","6","4","6","0","0","0","4","1","2","0","0","2","1","","921200188","12","35","18","21","2011","24150","32346","87049"
"2011921200193","394230","806020","-2.096969","57.145101","92","3","1","1","21/11/2011","2","17:36","910","S12000033","3","93","3","30","0","-1","-1","0","0","0","4","1","1","0","0","1","1","","921200193","11","36","21","17","2011","24153","39423","80602"
"2011921200318","387300","806150","-2.211498","57.146128","92","3","3","1","23/12/2011","6","14:00","910","S12000033","4","944","3","50","0","-1","-1","0","0","0","1","1","1","0","0","2","2","","921200318","12","00","23","14","2011","24155","38730","80615"



I cannot see any difference between lines 1 and 2 versus 3


Can the text as seen at the hex level have an impact? Looking at the file in hex I can't see any odd characters around the date or the time, but I have plagued myself with file compatibility errors during this work.


Comparison from the Navicat import log above (previous line accepted, second line rejected):

('2011921200188','323460','870490','-3.286473','57.717724','92','3','3','2','2011-12-18','1','21:35:00','930','S12000020','3','941','6','30','6','4','6','0','0','0','4','1','2','0','0','2','1','','921200188','12','35','18','21','2011','24150','32346','87049');
('2011921200193','394230','806020','-2.096969','57.145101','92','3','1','1','2011-11-21','2','17:36:00','910','S12000033','3','93','3','30','0','-1','-1','0','0','0','4','1','1','0','0','1','1','','921200193','11','36','21','17','2011','24153','39423','80602')



I'm stripping out the colons from the whole file and will try an import without specifying the time format.

Edit: 6 million "find and replaces" is taking a while...

Light dawns. Removing the colons reduces the errors from 150 to 44. There were 44 files of one type incorporated into this file. A quick search on their (old) headings reveals that I combined - i.e. did not exclude - line one of each file, so within my 6 million records there are 44 surplus headers. I suppose thanks are due to Navicat for spotting this.
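
If I had loaded everything into an all-text staging table first, I suppose those stray header rows could have been deleted afterwards with something like this (again just a sketch, assuming a staging table called acc_staging):

-- Sketch: an embedded header row repeats the column name in its own
-- field, so it is easy to find and remove after loading.
DELETE FROM acc_staging
WHERE "Accident_Index" = 'Accident_Index';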
Anthony Mellor

ASKER
In the end I deleted the date and time fields altogether, then the import worked.
Bernard Savonet

Glad you eventually solved the problem. The repeated / extra headers have probably not made things easier.

Note also that, depending on how you import such a huge file, problems might arise from various time limits: if possible, avoid making a single huge import; that will at least remove one possible source of trouble.
Anthony Mellor

ASKER
Thanks - it hadn't occurred to me that there might be timeouts.
I'm still perplexed about how the headers got in there, because I combined 52 files, all with headers, and there were only 44 duplicates... so either I set the "ignore row one" option or I didn't, and yet it seems I did both!