Avatar of Anthony Mellor
Anthony MellorFlag for United Kingdom of Great Britain and Northern Ireland

asked on 

Postgresql: Import csv constant errors, how to fix?

I am using Navicat and I (finally) have Postgresql running.

I accept defaults as
Record delimiter LF
Text qualifier " (there aren't any - maybe there should be)

Error Message: ERROR:  invalid input syntax for type timestamp: "01:28:00"
LINE 1: ...','180480','','','1','2','1','1','1987-01-05','2','01:28:00'...
                                                             ^

[Msg] [Imp] Processed: 49, Added: 0, Updated: 0, Deleted: 0, Error: 1
[Msg] [Imp] Finished - Unsuccessfully

Open in new window


So I make both date and time field varchar and try again:

Error Message: ERROR:  malformed array literal: "5/1/1987"
LINE 1: ...1AD00002','529950','180480','','','1','2','1','1','5/1/1987'...
                                                             ^
DETAIL:  Array value must start with "{" or dimension information.

[Msg] [Imp] Processed: 49, Added: 0, Updated: 0, Deleted: 0, Error: 1
[Msg] [Imp] Finished - Unsuccessfully

Open in new window


and the query is I think: (not written by me I hasten to add, by Navicat)

[Msg] [Imp] Import start
[Msg] [Imp] Import type - CSV
[Msg] [Imp] Import from - /Users/ADM/Sync/6-Process01/accXtract.csv
[Msg] [Imp] Create table [accXtract]
[Msg] [Imp] Import table [accXtract]
[Err] [Imp] Cannot insert rows 1 - 49:
Query: insert into "public"."accXtract" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_District","Local_Authority_Highway","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values 

Open in new window

PostgreSQLDatabasesMongoDB

Avatar of undefined
Last Comment
Anthony Mellor
Avatar of Bernard Savonet
Bernard Savonet
Flag of France image

It seems that you have a (typical, classical and awful) problem of dates handling.

On your first example, the target field expects a time stamp while you deliver a (hour) string, but the day field seems fine
On your second example, you have changed the day format to a string format different from your first example, and this creates the error.

My suggestion:
- keep the input fields unchanged
- keep the (export) day field as it was in the first example
- test the impact of changing the (export) hour format to a timestamp value insted of a date value
Avatar of Anthony Mellor

ASKER

Can't I set the dates and times to some other type? I don't need them. What type though? I tried "varchar" (thinking that might be a sort of catchall)  and that made no difference. It sounds like the system is deciding and I have to comply properly whereas I'd like to comply not at all.

I don't understand the bit where you write about "export" because I am trying to import only.
I did notice it didn't seem to like the seconds part of the time; I could delete those I think (AWK to the rescue)

Maybe the penny will drop over a cuppa tea.
ASKER CERTIFIED SOLUTION
Avatar of earth man2
earth man2
Flag of United Kingdom of Great Britain and Northern Ireland image

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
Avatar of Anthony Mellor

ASKER

ah! thank you. It's being auto set, so when it's seeing either a date OR a time it's assuming both - I guess. I'll see.

wahay!

[Msg] [Imp] Import start
[Msg] [Imp] Import type - CSV
[Msg] [Imp] Import from - /Users/ADM/Sync//5-CombineBOTH/accEXT.csv
[Msg] [Imp] Create table [accEXT]
[Msg] [Imp] Import table [accEXT]
[Msg] [Imp] Processed: 585, Added: 585, Updated: 0, Deleted: 0, Error: 0
[Msg] [Imp] Finished - Successfully

Open in new window


That was on an extract of the file.. but the one that failed also works. Will try the full size file.

Tables 1
Processed 5,434,484
Errors 150
Added 5,425,230
Update 0
Deleted 0
Time 1523.64s

and the extract form the end is:

insert into "public"."acc" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_(District)","Local_Authority_(Highway)","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values ('2011921200188','323460','870490','-3.286473','57.717724','92','3','3','2','2011-12-18','1','21:35:00','930','S12000020','3','941','6','30','6','4','6','0','0','0','4','1','2','0','0','2','1','','921200188','12','35','18','21','2011','24150','32346','87049');
insert into "public"."acc" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_(District)","Local_Authority_(Highway)","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values ('2011921200193','394230','806020','-2.096969','57.145101','92','3','1','1','2011-11-21','2','17:36:00','910','S12000033','3','93','3','30','0','-1','-1','0','0','0','4','1','1','0','0','1','1','','921200193','11','36','21','17','2011','24153','39423','80602')
Error Message: ERROR:  invalid input syntax for type time: ""
LINE 5: ...66','57.142932','92','3','2','1','2011-12-03','7','','910','...
                                                                  ^

[Msg] [Imp] Processed: 5434484, Added: 5425230, Updated: 0, Deleted: 0, Error: 150
[Msg] [Imp] Finished - Unsuccessfully

and this is the line from the csv file:

"2011921200193","394230","806020","-2.096969","57.145101","92","3","1","1","21/11/2011","2","17:36","910","S12000033","3","93","3","30","0","-1","-1","0","0","0","4","1","1","0","0","1","1","","921200193","11","36","21","17","2011","24153","39423","80602"
SOLUTION
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
Avatar of Anthony Mellor

ASKER

that's what I was thinking, but how has it accepted 5 million previously ?

I'll just add the error line and the two before it here to compare:

"2011921200082","340210","853490","-3.001145","57.567589","92","2","2","1","20/12/2011","3","08:50","930","S12000020","4","9016","6","40","0","-1","-1","0","0","0","1","1","4","0","0","2","1","","921200082","12","50","20","08","2011","24152","34021","85349"
"2011921200188","323460","870490","-3.286473","57.717724","92","3","3","2","18/12/2011","1","21:35","930","S12000020","3","941","6","30","6","4","6","0","0","0","4","1","2","0","0","2","1","","921200188","12","35","18","21","2011","24150","32346","87049"
"2011921200193","394230","806020","-2.096969","57.145101","92","3","1","1","21/11/2011","2","17:36","910","S12000033","3","93","3","30","0","-1","-1","0","0","0","4","1","1","0","0","1","1","","921200193","11","36","21","17","2011","24153","39423","80602"
"2011921200318","387300","806150","-2.211498","57.146128","92","3","3","1","23/12/2011","6","14:00","910","S12000033","4","944","3","50","0","-1","-1","0","0","0","1","1","1","0","0","2","2","","921200318","12","00","23","14","2011","24155","38730","80615"

Open in new window


I cannot see any difference between lines 1 and 2 versus 3


can the text as seen at the hex level have an impact? and looking at the file in hex I can't see any odd characters around the date of the time: but I have plagued myself with file compatibility errors during this work.


comparison from the navicat import log above

previous line accepted, second line rejected

('2011921200188','323460','870490','-3.286473','57.717724','92','3','3','2','2011-12-18','1','21:35:00','930','S12000020','3','941','6','30','6','4','6','0','0','0','4','1','2','0','0','2','1','','921200188','12','35','18','21','2011','24150','32346','87049');
('2011921200193','394230','806020','-2.096969','57.145101','92','3','1','1','2011-11-21','2','17:36:00','910','S12000033','3','93','3','30','0','-1','-1','0','0','0','4','1','1','0','0','1','1','','921200193','11','36','21','17','2011','24153','39423','80602')

Open in new window


I'm stripping out the colons from the whole file and will try an import without specifying the time format.

edit 6 million "find and replaces" is taking a while...

light dawns. Removing the colons reduces the errors from 150 to 44. There were 44 files of one type incorporated in to this file. A quick search on their (old) headings reveals that I have combined i.e. not excluded line one of each file and within my 6 million records are 44 surplus headers. I suppose thanks are to Navicat for spotting this.
Avatar of Anthony Mellor

ASKER

In the end I deleted the date and time fields altogether, then the import worked.
Glad you eventually solved the problem. The repeated / extra headers have probablu not made things easier.

Note also that depending on how you import such a huge file, problems might arise from various time limits: if possible, avoid making a single huge import, that will help you to have at least not that possible source of trouble.
Avatar of Anthony Mellor

ASKER

Thanks - hadn't occurred to me there may be timeouts.
I'm still perplexed about how the headers got in there, because I combined  52 files all with headers and there were only 44 duplicates.. so either I set the ignore row one or I didn't and yet it seems I did both!
Databases
Databases

Databases are organized collections of data, most commonly accessed through management systems including schemas, tables, queries and processes that allow users to enter and manipulate the information or utilize it in other fashions, such as with web applications or for reporting purposes.

62K
Questions
--
Followers
--
Top Experts
Get a personalized solution from industry experts
Ask the experts
Read over 600 more reviews

TRUSTED BY

IBM logoIntel logoMicrosoft logoUbisoft logoSAP logo
Qualcomm logoCitrix Systems logoWorkday logoErnst & Young logo
High performer badgeUsers love us badge
LinkedIn logoFacebook logoX logoInstagram logoTikTok logoYouTube logo