PostgreSQL: constant errors importing CSV, how to fix?

I am using Navicat and I (finally) have PostgreSQL running.

I accept the defaults:
Record delimiter: LF
Text qualifier: " (there aren't any - maybe there should be)

Error Message: ERROR:  invalid input syntax for type timestamp: "01:28:00"
LINE 1: ...','180480','','','1','2','1','1','1987-01-05','2','01:28:00'...
                                                             ^

[Msg] [Imp] Processed: 49, Added: 0, Updated: 0, Deleted: 0, Error: 1
[Msg] [Imp] Finished - Unsuccessfully

So I make both the date and time fields varchar and try again:

Error Message: ERROR:  malformed array literal: "5/1/1987"
LINE 1: ...1AD00002','529950','180480','','','1','2','1','1','5/1/1987'...
                                                             ^
DETAIL:  Array value must start with "{" or dimension information.

[Msg] [Imp] Processed: 49, Added: 0, Updated: 0, Deleted: 0, Error: 1
[Msg] [Imp] Finished - Unsuccessfully

And the query is, I think (not written by me, I hasten to add, but by Navicat):

[Msg] [Imp] Import start
[Msg] [Imp] Import type - CSV
[Msg] [Imp] Import from - /Users/ADM/Sync/6-Process01/accXtract.csv
[Msg] [Imp] Create table [accXtract]
[Msg] [Imp] Import table [accXtract]
[Err] [Imp] Cannot insert rows 1 - 49:
Query: insert into "public"."accXtract" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_District","Local_Authority_Highway","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values 

Asked by Anthony Mellor, Chartered Accountant.

Bernard S., CTO, commented:
It seems you have a (typical, classic, and awful) date-handling problem.

In your first example, the target field expects a timestamp while you deliver an (hour-only) string, though the day field seems fine.
In your second example, you have changed the day format to a string format different from the one in your first example, and this creates the error.

My suggestion:
- keep the input fields unchanged
- keep the (export) day field as it was in the first example
- test the impact of changing the (export) hour format to a timestamp value instead of a date value
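
For what it's worth, that second error message is the one PostgreSQL raises when the target column is an array type; a minimal illustration (unrelated to the author's table):

SELECT '5/1/1987'::text[];
-- ERROR:  malformed array literal: "5/1/1987"
-- DETAIL:  Array value must start with "{" or dimension information.

So the varchar attempt may have produced an array-typed column rather than plain varchar.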
Anthony Mellor, Chartered Accountant (Author), commented:
Can't I set the dates and times to some other type? I don't need them. What type, though? I tried "varchar" (thinking that might be a sort of catch-all) and that made no difference. It sounds like the system is deciding and I have to comply properly, whereas I'd like not to comply at all.

I don't understand the bit where you write about "export", because I am trying to import only.
I did notice it didn't seem to like the seconds part of the time; I could delete those, I think (AWK to the rescue).

Maybe the penny will drop over a cuppa tea.
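
The staging-as-text idea does work in principle; a minimal sketch (hypothetical staging table, only three of the real columns shown), keeping the dates and times as plain text so the import cannot fail on them, and casting later only if the values are ever needed:

-- Stage the awkward columns as plain text.
CREATE TABLE acc_stage (
    "Accident_Index" text,
    "Date"           text,
    "Time"           text
);

-- Cast on the way out, only if the real types are ever wanted.
SELECT to_date("Date", 'DD/MM/YYYY') AS accident_date,
       "Time"::time                  AS accident_time
FROM   acc_stage;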
earth man2 commented:
If a column is defined as a timestamp type then it is expecting both a date and a time.  There is a specific TIME type for time on its own.
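
The difference is easy to see with three casts (a minimal illustration, not the author's table):

SELECT '1987-01-05'::date;      -- fine: a bare day fits DATE
SELECT '01:28:00'::time;        -- fine: a bare clock time fits TIME
SELECT '01:28:00'::timestamp;   -- ERROR: invalid input syntax for type timestamp,
                                -- the same failure as the first import attempt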

Anthony Mellor, Chartered Accountant (Author), commented:
Ah! Thank you. It's being auto-set, so when it's seeing either a date OR a time it's assuming both, I guess. I'll see.

wahay!

[Msg] [Imp] Import start
[Msg] [Imp] Import type - CSV
[Msg] [Imp] Import from - /Users/ADM/Sync//5-CombineBOTH/accEXT.csv
[Msg] [Imp] Create table [accEXT]
[Msg] [Imp] Import table [accEXT]
[Msg] [Imp] Processed: 585, Added: 585, Updated: 0, Deleted: 0, Error: 0
[Msg] [Imp] Finished - Successfully

That was on an extract of the file... but the one that failed also works. Will try the full-size file.

Tables: 1
Processed: 5,434,484
Errors: 150
Added: 5,425,230
Updated: 0
Deleted: 0
Time: 1523.64s

and the extract from the end is:

insert into "public"."acc" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_(District)","Local_Authority_(Highway)","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values ('2011921200188','323460','870490','-3.286473','57.717724','92','3','3','2','2011-12-18','1','21:35:00','930','S12000020','3','941','6','30','6','4','6','0','0','0','4','1','2','0','0','2','1','','921200188','12','35','18','21','2011','24150','32346','87049');
insert into "public"."acc" ("Accident_Index","Location_Easting_OSGR","Location_Northing_OSGR","Longitude","Latitude","Police_Force","Accident_Severity","Number_of_Vehicles","Number_of_Casualties","Date","Day_of_Week","Time","Local_Authority_(District)","Local_Authority_(Highway)","1st_Road_Class","1st_Road_Number","Road_Type","Speed_limit","Junction_Detail","Junction_Control","2nd_Road_Class","2nd_Road_Number","Pedestrian_Crossing-Human_Control","Pedestrian_Crossing-Physical_Facilities","Light_Conditions","Weather_Conditions","Road_Surface_Conditions","Special_Conditions_at_Site","Carriageway_Hazards","Urban_or_Rural_Area","Did_Police_Officer_Attend_Scene_of_Accident","LSOA_of_Accident_Location","ACCREF","ACCMTH","A8M","ACCDAY","A8H","ACCYR","MONTH","Location_5Easting_OSGR","Location_5Northing_OSGR") values ('2011921200193','394230','806020','-2.096969','57.145101','92','3','1','1','2011-11-21','2','17:36:00','910','S12000033','3','93','3','30','0','-1','-1','0','0','0','4','1','1','0','0','1','1','','921200193','11','36','21','17','2011','24153','39423','80602')
Error Message: ERROR:  invalid input syntax for type time: ""
LINE 5: ...66','57.142932','92','3','2','1','2011-12-03','7','','910','...
                                                                  ^

[Msg] [Imp] Processed: 5434484, Added: 5425230, Updated: 0, Deleted: 0, Error: 150
[Msg] [Imp] Finished - Unsuccessfully

and this is the line from the CSV file:

"2011921200193","394230","806020","-2.096969","57.145101","92","3","1","1","21/11/2011","2","17:36","910","S12000033","3","93","3","30","0","-1","-1","0","0","0","4","1","1","0","0","1","1","","921200193","11","36","21","17","2011","24153","39423","80602"
Bernard S., CTO, commented:
The receiving column is expecting a time (not date) value at this position, and the empty string "" is not valid input for the TIME type...
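
If the import were run with PostgreSQL's own COPY instead of through Navicat, quoted empty strings can be loaded as NULL so they never reach the TIME parser; a sketch (table and column names taken from the log above; the file path is a placeholder):

-- FORCE_NULL turns a quoted empty string into NULL instead of
-- handing it to the TIME input routine.
-- (Run via psql's \copy if the file lives on the client machine.)
COPY public.acc
FROM '/path/to/accEXT.csv'
WITH (FORMAT csv, HEADER true, FORCE_NULL ("Time"));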
Anthony Mellor, Chartered Accountant (Author), commented:
That's what I was thinking, but how has it accepted 5 million previously?

I'll just add the error line and the two before it here to compare:

"2011921200082","340210","853490","-3.001145","57.567589","92","2","2","1","20/12/2011","3","08:50","930","S12000020","4","9016","6","40","0","-1","-1","0","0","0","1","1","4","0","0","2","1","","921200082","12","50","20","08","2011","24152","34021","85349"
"2011921200188","323460","870490","-3.286473","57.717724","92","3","3","2","18/12/2011","1","21:35","930","S12000020","3","941","6","30","6","4","6","0","0","0","4","1","2","0","0","2","1","","921200188","12","35","18","21","2011","24150","32346","87049"
"2011921200193","394230","806020","-2.096969","57.145101","92","3","1","1","21/11/2011","2","17:36","910","S12000033","3","93","3","30","0","-1","-1","0","0","0","4","1","1","0","0","1","1","","921200193","11","36","21","17","2011","24153","39423","80602"
"2011921200318","387300","806150","-2.211498","57.146128","92","3","3","1","23/12/2011","6","14:00","910","S12000033","4","944","3","50","0","-1","-1","0","0","0","1","1","1","0","0","2","2","","921200318","12","00","23","14","2011","24155","38730","80615"

I cannot see any difference between lines 1 and 2 versus line 3.


Can the text as seen at the hex level have an impact? Looking at the file in hex, I can't see any odd characters around the date or the time, but I have plagued myself with file-compatibility errors during this work.


Comparison from the Navicat import log above (previous line accepted, second line rejected):

('2011921200188','323460','870490','-3.286473','57.717724','92','3','3','2','2011-12-18','1','21:35:00','930','S12000020','3','941','6','30','6','4','6','0','0','0','4','1','2','0','0','2','1','','921200188','12','35','18','21','2011','24150','32346','87049');
('2011921200193','394230','806020','-2.096969','57.145101','92','3','1','1','2011-11-21','2','17:36:00','910','S12000033','3','93','3','30','0','-1','-1','0','0','0','4','1','1','0','0','1','1','','921200193','11','36','21','17','2011','24153','39423','80602')

I'm stripping out the colons from the whole file and will try an import without specifying the time format.

Edit: 6 million "find and replaces" are taking a while...

Light dawns. Removing the colons reduces the errors from 150 to 44. There were 44 files of one type incorporated into this file. A quick search on their (old) headings reveals that I combined, i.e. did not exclude, line one of each file, so within my 6 million records are 44 surplus header rows. I suppose thanks are due to Navicat for spotting this.
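
Had the file been staged as all-text first (as in the sketch earlier in the thread), stray header rows would also be easy to find and delete after the fact, since an imported header row carries the literal column name as its value (staging table name assumed as before):

-- "Accident_Index" is the first real column, so a header row that
-- slipped in holds the column name itself as data.
DELETE FROM acc_stage
WHERE "Accident_Index" = 'Accident_Index';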
Anthony Mellor, Chartered Accountant (Author), commented:
In the end I deleted the date and time fields altogether, then the import worked.
Bernard S., CTO, commented:
Glad you eventually solved the problem. The repeated / extra headers have probably not made things easier.

Note also that, depending on how you import such a huge file, problems might arise from various time limits: if possible, avoid making a single huge import; that will at least remove one possible source of trouble.
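
For instance, if a server-side statement timeout is what bites, it can be lifted for the session before a long COPY; a sketch (placeholder path, and assuming statement_timeout is the limit in play, not a known Navicat fix):

-- 0 disables the per-statement time limit for this session only.
SET statement_timeout = 0;
COPY public.acc
FROM '/path/to/full_file.csv'
WITH (FORMAT csv, HEADER true);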
Anthony Mellor, Chartered Accountant (Author), commented:
Thanks - it hadn't occurred to me there may be timeouts.
I'm still perplexed about how the headers got in there, because I combined 52 files, all with headers, and there were only 44 duplicates... so either I set the "ignore row one" option or I didn't, and yet it seems I did both!