Link to home
Start Free TrialLog in
Avatar of mlcktmguy
mlcktmguyFlag for United States of America

asked on

Editing Input from a TransferSpreadsheet command

One of my clients is receving data from a business partner on a spreadhseet.  I have no say or control over this, it is a given.  

The first part of my app pops a file selection dialog that they use to select the spreadsheet to be imported.  All spreadhseets should have the same number of columns and column attributes (date, numeric, text  etc..)

As a result of my transferspreadsheet command, a table is created in my MDB.  I am trying to develop edit routines to run on the imported data.

Specifically I am want to edit a field that should contain a valid date.  I would like to point the client to any problem records as specifically as possible.

On the example spreadsheet the column 'DateOfOrder' is formatted as a date.  The contents of the first 3 rows of this column on the spreadhseet are valid dates, the 'DateOfOrder' cell on the 4th record contains the word "test".  apparently Excel doesn't make you enter a date into a cell formatted as date.

DateOfOrder
1/4/13
1/4/13
1/7/13
Test

When the new table is created as a result of the transferspreadsheet, the column "DateOfOrder' is set to type 'Text'.  The valid dates of the first three records appear as numbers (41278, 41278, 41281) instead of dates.

The issue:  Becuase of the column was imported as 'Text', even though only row 4 contains an invalid date, none of the first 3 columns are vaild dates either.

My edits would show "invalid date" errors on all 4 rows, when in reality only the fourth row is wrong.  This would be confusing to the client trying to research and correct any errors.

I am used to using import specs when importing comma delimited data to control the format that a particular column is imported as.  There is no such option, or at least none that I know of in the 'TrasnferSpreadsheet command.

Is there any solution to how this is being imported so I can be more specific in my edits to the client?  In reality the spreadsheets will contain hundreds of rows.  Telling the client that there is a date error 'somewhere' in first column without pointing them to the specific record would be just about worthless as far as helping them resolve the error.

Is there a different approach I can take to accomplish my objective of specifically pointing the client to invalid data?
ASKER CERTIFIED SOLUTION
Avatar of Hamed Nasr
Hamed Nasr
Flag of Oman image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of mlcktmguy

ASKER

In the above example data I supplied.  Because of the way the collumn was imported, won't every record in the recordset be identified as having a bad date since they were imported as number, even though it is really only the 4th record that had a bad date.

From the above post:

DateOfOrder
1/4/13
1/4/13
1/7/13
Test

When the new table is created as a result of the transferspreadsheet, the column "DateOfOrder' is set to type 'Text'.  The valid dates of the first three records appear as numbers (41278, 41278, 41281) instead of dates.

After they are imported they look like:
DateOfOrder
41278
41278
41281
Test
I always prefer csv format for data exchange between Excel and Access. You can read it line by line (in this case you can send client line numbers with wrong data) and check data or link it to your DB selecting type for each column (you can link it once and rename new file to some standard name) or import all columns as text and convert to proper values in append query.
Yes, the fields are changed to numbers as you mentioned. Here is a table tr1 I created from transferring a spreadsheet.

F1      F2
40909      2
40941      3
40939      4
test      5

 The code changes the value to date, if it is not valid as a date, error processing isolates such record. You can end up with 2 tables for good and bad records in addition to the origional table.
Note you can use IsDate to determine if the value in a field is a Date. So following with hansr's advice:

Dim i As Integer
Dim sErrs As String
Dim rst As DAO.Recordset
Set rst = Currentdb.OpenRecordset("SELECT * FROM YourImportTable")

Do Until rst.EOF
  i = i+1
  If Not IsDate(rst("YourDateField"))
    sErrs = sErrs & vbCrLf & "Date value in Row " & Cstr(i) & " is invalid"
  End If

  If Not IsNumeric("SomeOtherField")
    If Len(sErrs) > 0 Then sErrs = sErrs & vbCrLf
    sErrs = sErrs & "The value in Row " & Cstr(i) & " is not a valid number"
  End If

  rst.MoveNext

Loop

Msgbox "One or more Errors were found in the imported data:" & vbCrLf & sErrs
I never have understood why MS won't allow users to create import specifications for Excel files!

Whenever I am going to repeat a process like this, I create a permanent staging table in my database.  This table contains two additional fields, [Comments](text) and [InValid](boolean).

All of the other fields in the table are set as text datatype to ensure they will import whatever is in Excel.

After importing the file, I run through a series of checks to check the validity of data in each of the fields, and update the [Comments] and [InValid] fields for each field.  For example:

UPDATE [yourTable]
SET [Comments] = [Comments] & ", " & "Invalid DateOfOrder"
     , [Invalid] = -1
WHERE IsDate([DateOfOrder]) = False

UPDATE [yourTable]
SET [Comments] = [Comments] & ", " & "DateOfOrder outside acceptable range"
     , [Invalid] = -1
WHERE IsDate([DateOfOrder]) = True
AND (CDate([DateOfOrder]) < Date() - 30 OR CDate([DateOfOrder]) > Date())

In this case, with a date field, I would then update the values in that field where the values appear to be valid dates.

UPDATE [yourTable]
SET [DateOfOrder] = Format(Cdate([DateOfOrder]), "mm/dd/yy"))
WHERE IsDate([DateOfOrder])

Once I have completed all of the columns that need to be checked, I will present the user with a datasheet of those records that are invalid, to allow them to correct any invalid entries.  I generally provide a way to export or print these records as well, so they can be researched.

Then, finally, I provide the user the opportunity to upload the Valid records to the actual table the data is supposed to go to.  This can be a pain, but regardless of the format the data comes in, you should always run through a series of error checks similar to this before importing data into your production application.
In the current situation IsDate returns False because the field is stored as text numeric value.
"My edits would show "invalid date" errors on all 4 rows, when in reality only the fourth row "
Try this to convert the numeric field back to date format and apply your edits.

This applies for table tr1(f1, f2)
F1      F2
40909      2
40941      3
40939      4
test          5

Private Sub cmdCheck_Click()
    Dim rs1 As Recordset
    Dim df1 As Date
    Set rs1 = CurrentDb.OpenRecordset("tr1")
    On Error GoTo Err_cmdCheck_Click
    
    Do While Not (rs1.EOF)
        df1 = CDate(rs1("f1"))
        rs1.Edit
            rs1("f1") = df1
        rs1.Update
        Debug.Print rs1("f1"), df1
cmdCheck_Click_jump_1:
        rs1.MoveNext
    Loop
Exit Sub
    
Err_cmdCheck_Click:
Debug.Print rs1("f1"), rs1("f2")
    Resume cmdCheck_Click_jump_1:
Exit Sub

Open in new window

hnsar: Thank you, you are correct, the CDate() is the key.  In this case IsDate() doesn't work as desired.

When I posted my question I already had a loop in place to validate the records.  It was working for all fields except any date fields.  I was initially using IsDate() to validate the dates but that was flaging every date (even the ones that were valid on the spreadsheet) in the 'DateOfApproval' column as invalid because of the way it imported from the spreadsheet.

Executing the CDate function only threw an error on the date that wasn't a valid date ("Test") on the spreadsheet.
It converted "41278" into the valid date (1/4/2013) it started out as on the spreadsheet.

Perfect.
You are welcome!