We need to write a program or use a tool to validate a CSV file. Periodically we receive a CSV file of roughly 10 million rows that we need to upload into our data mart.
Prior to the upload we would like to validate that the file is formatted correctly and that the content appears to be valid. The validations would be:
1. That the column headings are correct and match the expected names
2. That the values in numeric columns are numeric and the values in date columns are valid dates
A report should be produced showing which rows and fields are in error.
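To make the two checks and the error report concrete, here is a rough sketch in Python using only the standard library. The column names, the type assigned to each column, and the `YYYY-MM-DD` date format are assumptions made up for illustration; they are not our real layout. It reads the file one record at a time, so a 10-million-row file should not need to fit in memory.

```python
import csv
import io
from datetime import datetime

# Hypothetical schema (column name -> expected type); not our real layout.
EXPECTED_COLUMNS = {"customer_id": "numeric", "amount": "numeric", "order_date": "date"}
DATE_FORMAT = "%Y-%m-%d"  # assumed date format


def is_numeric(value):
    # float() also accepts strings such as "nan" and "inf"; tighten if that matters.
    try:
        float(value)
        return True
    except ValueError:
        return False


def is_date(value):
    # strptime rejects impossible dates such as 2024-02-30.
    try:
        datetime.strptime(value, DATE_FORMAT)
        return True
    except ValueError:
        return False


CHECKS = {"numeric": is_numeric, "date": is_date}


def validate(lines):
    """Yield (record_number, column, message) for each problem found.

    `lines` is any iterable of text lines, such as an open file, so the
    input is streamed. The header counts as record 1.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    expected = list(EXPECTED_COLUMNS)
    if header != expected:
        yield (1, "", f"header mismatch: expected {expected}, got {header}")
        return  # column positions cannot be trusted, so stop here
    for record_number, row in enumerate(reader, start=2):
        if len(row) != len(expected):
            yield (record_number, "", f"expected {len(expected)} fields, got {len(row)}")
            continue
        for column, value in zip(expected, row):
            kind = EXPECTED_COLUMNS[column]
            if not CHECKS[kind](value):
                yield (record_number, column, f"not a valid {kind}: {value!r}")


# Small in-memory sample standing in for the real file.
sample = io.StringIO(
    "customer_id,amount,order_date\n"
    "1,19.99,2024-01-31\n"
    "2,abc,2024-02-30\n"
)
errors = list(validate(sample))
for record_number, column, message in errors:
    print(record_number, column, message)
```

In this sketch an empty field counts as an error, which may or may not be what we want. For a real file, the same function could be given a file opened with `open(path, newline="")`, and the yielded tuples could be written out with `csv.writer` so the report opens in Excel.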
The solution will be installed on a Windows server that we will dedicate to this process.
The program should be user-friendly enough that a typical office worker could run it and review the results. A network administrator will install the solution on the server. Speed is not critical, as this will run on a dedicated server and overnight processing is acceptable.
What's the best tool and/or language to use to get something like this developed and deployed?