Importing inconsistent excel data into a sql server table

Posted on 2014-07-24
Last Modified: 2014-11-16
I have a bunch of spreadsheets with info listed in multiple rows and columns, similar to the attached file.  Each sheet represents a box with multiple files in it.  The box number is listed 1 time in column A.  Then column D starts the specific data for a file.  Some rows have data in columns E, F and/or G. Some files have 1 or 2 rows of info, others may have 3, 4 or 5.  (See attached).  The last row of data for each file contains "Attorney Initial(s)" in column D and "Destroy Date" in column G .  

What I'm trying to do is develop a process that can read the info and import it into a table where each "File" is in one row.  Column Headers are:

Box #
Attorney Initials
Destroy Date
Client # (the data in column G with numbers)

Question by:RavenTim
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +2
LVL 22

Expert Comment

by:Nico Bontenbal
ID: 40218796
If someone asked me to do that I'dd write some VBA code in Excel to process the data. This macro can then loop though all the Excel files, open them one by one an loop through all the sheets in the Excel files. Then it loops though all the rows and columns of the sheet and extracts the data and inserts it into SQL server. When you have never created VBA code before this is quite a big task. You might consider hiring someone to do this for you. The code should do something like this:
Loop through all the .xls files in a folder
   Open the file
   Loop though all sheets in the file
      Get the box number from column A
      loop though all the rows
          if first row or new file 'remember' client number from column G
          if not add and 'remember' all the desciptions
          until column d starts with "Attorney Initials" then get the initials and the destoy date from the row and store all the data in SQL server
          start with the next file

Might not be 100% correct but you'll need some kind of rule set like this. Also keep in mind that a slight variation in the structure that would be totaly clear to a human (having 'DFG (Attorney Initials)' in a field somewhere, would create undesired results.
LVL 48

Expert Comment

by:Dale Fye (Access MVP)
ID: 40219056
Concur with Nicobo.

From your example, what you would like is 2 records or 3?  The inconsistency that row 6 does not have any initials but has text above it would be a complication, but could simply leave that field empty.

I assume that the "Box #" field would contain "Tub 7 - J Cool", "EPA", "Tub 8 - Kelly Hump"

But would you want to compile all of the info from Cells D8, E9, D9, D10, D11, D12 into a single description field?

Also, the 1721.002 is a bit confusing as a "Destroy Date".  That value might mean something in the legal profession, but is hard to fathom as a "date" field.
LVL 22

Expert Comment

by:Nico Bontenbal
ID: 40219098
And I agree with Dale. It's not totally clear how you would like to process the data. I think it would help if you described the exact result (so the data in the SQL table) that you would like from the Sample.xls. Don't forget that you might have been working with these sheets for hours/days/months/years but we see them for the first time.
Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

LVL 12

Expert Comment

by:James Elliott
ID: 40220266
I normally love this sort of thing, but I agree with the above, your data is a complete mess. I think you need at the very least (for a free solution) to provide an annonymised file of your data so we can see the full problem, and, a draft mock up of how you want it to look after.


Author Comment

ID: 40230037
Hey Guys, sorry, but I was on vacation and been catching up... I'll get some more info together & post it later today.


Author Comment

ID: 40230411
Ok.   I've attached two different samples showing the data as is (sheet 1) and the "Desired Result" (sheet 2).  I'm not sure how doable this is going to be as I have about 1500 separate files to process and there is little consistency.  Please see attached samples.

LVL 12

Accepted Solution

James Elliott earned 500 total points
ID: 40230498
I'm slightly confused by the first file. Are we to assume that rows 80 onwards should have been completed in the pro forms cells above? Is this how the files differ? Ie. People have chosen to adhere to the template at random?

if so i fear a quick generic approach is not going to be possible.

If u were able to provide a list of every partner initials and surnames then a generic routine might hit 70% accuracy but I'm guessing the final outcome needs to be better than that?

I'd suggest getting a fast-typing temp for a week :/

Author Comment

ID: 40230517
:-)  That's kind of what I thought...  

LVL 48

Expert Comment

by:Martin Liss
ID: 40445447
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.

Featured Post

Get 15 Days FREE Full-Featured Trial

Benefit from a mission critical IT monitoring with Monitis Premium or get it FREE for your entry level monitoring needs.
-Over 200,000 users
-More than 300,000 websites monitored
-Used in 197 countries
-Recommended by 98% of users

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Some code to ensure data integrity when using macros within Excel. Also included code that helps secure your data within an Excel workbook.
You need to know the location of the Office templates folder, so that when you create new templates, they are saved to that location, and thus are available for selection when creating new documents.  The steps to find the Templates folder path are …
This Micro Tutorial will demonstrate how to use longer labels with horizontal bar charts instead of the vertical column chart.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…

691 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question