
How do I normalise spreadsheet data with lots of rows and columns?

Finchie asked
Last Modified: 2010-04-21
I have imported a spreadsheet from Excel into Access that is not normalised. I wish to normalise the data (a reverse cross-tab query, or unpivot process). I am after a more efficient solution than I've come across so far (e.g. typing each field title, which is impractical). I am sure Visual Basic will provide the solution, but I've struggled.

My table is currently in the format below with PupilID in the first column and then exam marks in following columns.

PupilID Exam1 Exam2 Exam3 up to Exam300
1                  23
2
3            22

up to 300 pupils

My normalised data should have 3 fields
PupilID, Exam, Mark

CERTIFIED EXPERT
Top Expert 2016

Commented:


Place this code in a module:

Sub convertTable()
Dim rs As DAO.Recordset, rs1 As DAO.Recordset
Dim i As Integer, j As Integer, fldArr() As String

Set rs = CurrentDb.OpenRecordset("tblExam")  '<< your old table
Set rs1 = CurrentDb.OpenRecordset("tblExam2")  '<<< your new table

If rs.EOF Then
    MsgBox "no records"
    rs.Close
    rs1.Close
    Exit Sub
End If
rs.MoveFirst
'collect the field names of the old table
ReDim fldArr(rs.Fields.Count - 1)
For i = 0 To rs.Fields.Count - 1
    fldArr(i) = rs.Fields(i).Name
Next
'write one record to the new table per pupil and exam column
Do Until rs.EOF
    'j starts at 2, which skips fields 0 and 1; start at 1 instead
    'if pupilID is field 0 and the first exam column is field 1
    For j = 2 To UBound(fldArr)
        With rs1
            .AddNew
            !pupilID = rs("pupilId")
            !exam = fldArr(j)
            !mark = rs.Fields(fldArr(j))
            .Update
        End With
    Next
    rs.MoveNext
Loop
rs.Close
rs1.Close
End Sub
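
The routine above assumes the new table already exists. A minimal sketch for creating it, using the table and field names from the code (tblExam2, pupilID, exam, mark). The field types are assumptions, since the thread does not state them; createTargetTable is a hypothetical name.

```vba
Sub createTargetTable()
    ' Assumed field types: adjust them to match your data.
    ' exam is TEXT(64) because Access field names are at most 64 characters.
    ' mark is TEXT(255) so that text entries fit as well as numbers.
    CurrentDb.Execute _
        "CREATE TABLE tblExam2 (" & _
        "pupilID LONG, " & _
        "exam TEXT(64), " & _
        "mark TEXT(255))", dbFailOnError
End Sub
```

Run it once from the Immediate window (createTargetTable) before running convertTable.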
Scott McDaniel (EE MVE )Infotrakker Software
CERTIFIED EXPERT
Most Valuable Expert 2012
Top Expert 2014

Commented:
First: Since there is a 255 column limit in Access, I don't see how you could have that many Exam columns in a table ...

Seems to me you'd need 3 tables:

tPupils
---------------
lPupilID
sPupilName
etc etc

tExams
-----------
lExamID
sExamDescription
etc etc

tPupilExams
-------------
lExamID
lPupilID
dbMark
etc etc
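
A hedged sketch of how these three tables might be created with Jet DDL from VBA. The field types, the text lengths, the constraint names and the autonumber (COUNTER) key on tExams are all assumptions; createNormalisedTables is a hypothetical name.

```vba
Sub createNormalisedTables()
    Dim db As DAO.Database
    Set db = CurrentDb

    ' One row per pupil; lPupilID comes from the imported data.
    db.Execute "CREATE TABLE tPupils (" & _
        "lPupilID LONG CONSTRAINT pkPupils PRIMARY KEY, " & _
        "sPupilName TEXT(100))", dbFailOnError

    ' One row per exam; lExamID is assumed to be an autonumber.
    db.Execute "CREATE TABLE tExams (" & _
        "lExamID COUNTER CONSTRAINT pkExams PRIMARY KEY, " & _
        "sExamDescription TEXT(64))", dbFailOnError

    ' One row per pupil and exam; marks are assumed to be numeric.
    db.Execute "CREATE TABLE tPupilExams (" & _
        "lExamID LONG, lPupilID LONG, dbMark DOUBLE, " & _
        "CONSTRAINT pkPupilExams PRIMARY KEY (lExamID, lPupilID))", dbFailOnError
End Sub
```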

From here, you'd do something like this:

************ Start Copy Here

Function NormalizeMyData()

'/This would give you a record in tPupils for EACH pupil. You may need to flesh this out a bit, of course.
CurrentProject.Connection.Execute "INSERT INTO tPupils(lPupilID) SELECT DISTINCT PupilID FROM YourImportTable"

'/Now, to build your Exam table: Open the importtable, and loop through the Fields collection:

Dim rst As ADODB.Recordset
Dim i as integer

Set rst = New ADODB.Recordset
'/only need 1 record for this
rst.Open "SELECT TOP 1 * FROM YourImportTable", CurrentProject.Connection

For i = 0 to rst.Fields.Count - 1
  '/change this to suit the naming conventions of your table
  If Left(rst.Fields(i).Name,4) = "Exam" Then
    CurrentProject.Connection.Execute "INSERT INTO tExams(sExamDescription) VALUES('" & rst.Fields(i).Name & "')"
  End If
Next i

'/Finally, to get the marks:
Set rst = New ADODB.Recordset
rst.Open "SELECT * FROM YourImportTable", CurrentProject.Connection

Do Until rst.EOF
  For i = 0 To rst.Fields.Count - 1
    '/skip empty cells so the VALUES list is always complete; marks are assumed to be numeric
    If Left(rst.Fields(i).Name, 4) = "Exam" And Not IsNull(rst.Fields(i).Value) Then
      CurrentProject.Connection.Execute "INSERT INTO tPupilExams(lPupilID,lExamID,dbMark) VALUES(" & rst("PupilID") & "," & GetExamID(rst.Fields(i).Name) & "," & rst.Fields(i).Value & ")"
    End If
  Next i
  rst.MoveNext
Loop

Msgbox "Finished"

End Function

'/helper function:
Function GetExamID(ExamDesc As String) As Long
Dim rst As ADODB.Recordset

Set rst = New ADODB.Recordset
rst.Open "SELECT lExamID FROM tExams WHERE sExamDescription='" & ExamDesc & "'", CurrentProject.Connection
If Not (rst.EOF And rst.BOF) Then
  GetExamID = rst("lExamID")
End If
rst.Close
End Function

****** END COPY

You could add all these to a Standard Module and run them from the Immediate window like this:

?NormalizeMyData

Author

Commented:
Capricorn1,
I'm working on your solution initially. It makes a good start and produces correctly normalised data, but it only seems to process the first row/record.
I then get an error message.

"The field is too small to accept the amount of data you attempted to add. Try inserting or pasting less data. (Error 3163)
The field into which you tried to insert or paste data is not large enough to hold the data. Try inserting or pasting less data."

The debugger highlights this.
!mark = rs.Fields(fldArr(j))

Any thoughts, anyone?

I originally said Exam300 (thanks to LSM Consulting for pointing this out). This was an estimated figure; the actual number is 245 fields, which keeps it below the 255 maximum.
CERTIFIED EXPERT
Top Expert 2016

Commented:
Check the size of the field { mark } in design view of the table.

Also, post some actual sample data.
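
As an alternative to opening design view, the field sizes can be listed from code. This is a minimal sketch assuming the target table is named tblExam2 as in the code earlier in the thread; showFieldSizes is a hypothetical name.

```vba
Sub showFieldSizes()
    Dim db As DAO.Database
    Dim fld As DAO.Field

    ' Hold the database in a variable so the TableDef stays valid in the loop.
    Set db = CurrentDb
    For Each fld In db.TableDefs("tblExam2").Fields
        ' Prints the field name, its DAO type constant (10 = dbText) and its size.
        Debug.Print fld.Name, fld.Type, fld.Size
    Next fld
End Sub
```

If mark turns out to be a short text field, error 3163 is what Access raises when a value is longer than that size.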

Commented:
Fastest solution is to create a UNION query like:

select PupilID, "Exam1" as Exam, Exam1 as Mark from tblSpreadsheetName
UNION ALL
select PupilID, "Exam2" as Exam, Exam2 from tblSpreadsheetName
UNION ALL
select PupilID, "Exam3" as Exam, Exam3 from tblSpreadsheetName
etc.

This UNION query can be used in an Append query to save the data in the target table.
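
With a couple of hundred exam columns, typing every SELECT by hand is impractical. One way to keep the set-based idea is to let VBA run one append query per exam column. Note that this swaps the single UNION query for a series of appends. The target table tblNormalised, its field names and the "Exam" name prefix are assumptions, and appendExamColumns is a hypothetical name.

```vba
Sub appendExamColumns()
    Dim db As DAO.Database
    Dim fld As DAO.Field
    Dim sql As String

    Set db = CurrentDb
    For Each fld In db.TableDefs("tblSpreadsheetName").Fields
        ' Treat every field whose name starts with "Exam" as a mark column.
        If Left(fld.Name, 4) = "Exam" Then
            ' Append one row per pupil for this column, skipping empty cells.
            sql = "INSERT INTO tblNormalised (PupilID, Exam, Mark) " & _
                  "SELECT PupilID, '" & fld.Name & "', [" & fld.Name & "] " & _
                  "FROM tblSpreadsheetName " & _
                  "WHERE [" & fld.Name & "] IS NOT NULL"
            db.Execute sql, dbFailOnError
        End If
    Next fld
End Sub
```

Because the field names are read from the table, nothing needs retyping when the exams change. A field name containing an apostrophe would break the generated SQL and would need escaping first.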

Nic;o)

Author

Commented:
I put the data in a blank database to prepare it for upload, thought I'd give Capricorn1's solution another go, and it works!!
Thanks to everyone for their assistance (I will have a play with all the ideas suggested) and to Capricorn1 in particular for providing the solution I'll use.

Author

Commented:
Capricorn1, thanks for your effort. I found the solution well set out in an easy-to-follow format. It was exactly the sort of elegant and efficient solution I was looking for. It will save me and many other teachers valuable time. Previously we have had to type in marks manually. This will automate the process.

Commented:
Always amazed when people prefer a "complex" and slow VBA code solution instead of a simple and fast query solution...

But good luck with your application.

Nic;o)
CERTIFIED EXPERT
Top Expert 2016

Commented:
Finchie,

You're welcome! And thanks for the compliment.

Commented:
For the reference and the credits, the code came from:
http://support.microsoft.com/kb/202176

Nic;o)

Author

Commented:
Thanks for the Microsoft link, Nico5038. I will study the commentary. I decided against your union query solution because of the sheer number of fields (exams) involved in the original spreadsheet. I was also after a solution that didn't need modifying every time the exams changed. The Visual Basic solution was also quick enough. I do have a use planned for the union query solution, when I am starting off with a 4-column/177-row spreadsheet, so thanks for your help.

Commented:
Thanks for your response Finchie. I value people that are eager to learn and when you see the advantages of the UNION query, my goal has been achieved :-)

Nic;o)