Link to home
Start Free TrialLog in
Avatar of mirde
mirdeFlag for Canada

asked on

DateTime SSIS issue potential loss of data

Hello,

I am trying to convert my date column which is a varchar flat file source data in SSIS, however i get an error: The value could not be converted because of a potential loss of data.

The dates look like this:

8102010
8102010
8102010

8 being the month, I assume SQL expects 08102010? How could I accomplish this, how can I add the missing 0 using a Derived Column? What would be my expression?

I  have tried:

DT_DBTIME
DT_DBDATE
DT_DATE

Same error.
Avatar of jogos
jogos
Flag of Belgium image

case when len(mydate)=7 then '0'+mydate else mydate end
ASKER CERTIFIED SOLUTION
Avatar of 8080_Diver
8080_Diver
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of mirde

ASKER

Just what I was looking for, thank you.
IF not only the first '0' can be missing then you are in a very nasty position. When len()=6 you know that both month and day are missing a '0', so no problem.
But what about len()=7 how can you know if there is a zero missing in the month-part or in the day-parth?
1112009, is this november 1th  or january 11th ?  No way to tell!  
So if the additional code that was added for len()=6 is needed as you mentioned by accepting that sollution. The len=7-parth off it is not reliable.

jogos,
So, you're now saying that your offering was also not reliable? ;-)
Having dealt with date problems in the past, one has to make some assumptions at some point and then test the results.  If the dates have been stored based upon valid input (albeit input that passed teh dates as text), then the two most likely situations for "short" dates are single digit months and single digit days.  If, on the other hand the dates were entered by a human, all bets are off and one needs to code for various other errors and the process is best handled by creating a function that accepts the input string and returns a datetime datatype value.  Unfortunately, even then, one needs to decide how seemingly errorneous dates (i.e. the valid 6 or 7 digit dates) are to restored to "validity" and how totally invalid dates are to be handled.
Suppose the 7 digit date is 1232009.  Is that supposed to be 12/03/2009 or 01/23/2009?  Similarly, if the "date" is 112009, should that be inerpretted as 01/01/2009 or should the assumption be that the person entered the month and year but not the day of the month and, thus, the "date" should be interpretted as 11/??/2009?
@8080 diver,

It often occurs that by automatic conversion from something that looks like a numeric the prefix-zero disappears so this only affects the first part op your date and then you only have 2 options
length 8: month 10-12
length 7: month 01 to 09 where the prefix-zero disapeared

Your sollution added some more safety by dealing with 'unconventional lengths', but a when length = 6 is possible there is no way to reliably treath len=7  (november 1th / january 11th) .

I still don't know why the IT-world didn't took the year 2000 as an opportunity to make the YYYYMMDD-format as a reliable format to sort, convert  and exchange  worldwide.
And I don't care (that much) about the points.  But I care about something that can't be correct.
jogos,
Maybe you missed my point in my last post.  
I conceded that you may be correct with regard to the 6 character "date" issue and indicated a possible scenario (i.e. the month/year entry) that would invalidate my offered solution.  However, it is also possible that a 7-character "date" would be interpretted incorrectly by either of our offerings (as indicated by a date of 1172009 which might have a missing 0 in the month or a missing 0 in the day).
At some point, one has to have some knowledge with regard to the nature of the data being received and, based on that, make some assumptions.  If one cannot make any assumptions withregard to the data coming in, then one has to fall back on the alternative approach of failing anything that does not conform to the strictest standards (i.e. 8 characters and passing the IsDate test).
I currently have to deal with "dates" that are provided as a string of 8 characters, which sounds like it is an ideal data source.  However, these are zero-left-fill fields coming from COBOL programs that capture purpuorted "dates" as "additional data" (i.e. it is just baggage to those programs) and, therefore, there is no validation performed on the "dates."  In addition to valid dates, I have received dates like 02292009, 02312010, 00212009, 00123109, and 12009000.  We use a SQL Server UDF that validates the date and returns  NULL if it doesn't validate . . . but the date could still be wrong if someone initially entered 12/1/2009 as 1212009 because that would be stored as 01212009 . . . a valid but incorrect date. ;-)
I have also worked with date data that I knew was created by applications that had 3 separate fields, one each for Month, Day, and Year, but did validations on the resulting date (using the format (m)m/(d)d/yyyy) but then removing the slashes before storing the data in a string.  (Don't ask me why they didn't just use a datatime datatype. ;-)  We had to make some assumptions, based on the knowledge that the "date" was, at some point, validated.  That was when we used the routine I presented.  ;-)
In general, unless you know a lot about the origins of the data, it is a terrible idea to store dates as character strings.
As for the points . . . I tend to agree with your attitude.  
@8080 diver: it doesn't matter if I got your point (but I did), it matters that the anser is put in perspective: what it does and what it doesn't

PS: As European with international software+clients we can even have problems with switching month and day when automatic conversion and no rules :(