How do I normalise spreadsheet data with lots of rows and columns?
I have imported a spreadsheet from Excel into Access that is not normalised. I want to normalise the data (a reverse cross-tab query, or unpivot). I am after a more efficient solution than those I've come across so far (e.g. typing each field title, which is impractical). I am sure Visual Basic will provide the solution, but I've struggled.
My table is currently in the format below, with PupilID in the first column and exam marks in the following columns.
PupilID Exam1 Exam2 Exam3 up to Exam300
1 23
2
3 22
up to 300 pupils
My normalised data should have 3 fields
PupilID, Exam, Mark
Microsoft Access
Last Comment
nico5038
8/22/2022 - Mon
Rey Obrero (Capricorn1)
Place this code in a module:
Sub convertTable()
    Dim rs As DAO.Recordset, rs1 As DAO.Recordset
    Dim i As Integer, j As Integer, fldArr() As String

    Set rs = CurrentDb.OpenRecordset("tblExam")   '<< your old table
    Set rs1 = CurrentDb.OpenRecordset("tblExam2") '<< your new table

    If rs.EOF Then
        MsgBox "no records"
        rs.Close
        rs1.Close
        Exit Sub
    End If

    ' Collect the field names of the old table
    ReDim fldArr(rs.Fields.Count - 1)
    For i = 0 To rs.Fields.Count - 1
        fldArr(i) = rs.Fields(i).Name
    Next

    rs.MoveFirst
    Do Until rs.EOF
        ' Field indexes are zero-based. Adjust the start value so that j
        ' points at the first exam column (1 if pupilID is the first field).
        For j = 2 To UBound(fldArr)
            With rs1
                .AddNew
                !pupilID = rs("pupilId")
                !exam = fldArr(j)
                !mark = rs.Fields(fldArr(j))
                .Update
            End With
        Next
        rs.MoveNext
    Loop

    rs.Close
    rs1.Close
End Sub
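The Sub above expects the new table to exist already, with the fields pupilID, exam and mark. As a minimal sketch, the table could be created from code as shown below. The procedure name CreateTargetTable and the field types are assumptions for illustration (numeric marks, exam names up to 50 characters); adjust them to suit the real data.

```vba
Sub CreateTargetTable()
    ' Hypothetical helper; the field types below are assumptions.
    ' LONG = Long Integer, TEXT(50) = text up to 50 characters,
    ' DOUBLE = double-precision number.
    CurrentDb.Execute _
        "CREATE TABLE tblExam2 (pupilID LONG, exam TEXT(50), mark DOUBLE)", _
        dbFailOnError
End Sub
```

If the marks can contain text (grades such as "A" or "absent"), a TEXT field for mark would be the safer choice.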
Function NormalizeMyData()
'/This would give you a record in tPupils for EACH pupil. You may need to flesh this out a bit, of course.
CurrentProject.Connection.Execute "INSERT INTO tPupils(lPupilID) SELECT DISTINCT PupilID FROM YourImportTable"
'/Now, to build your Exam table: Open the importtable, and loop through the Fields collection:
Dim rst As ADODB.Recordset
Dim i as integer
Set rst = New ADODB.Recordset
'/only need 1 record for this
rst.Open "SELECT TOP 1 * FROM YourImportTable", CurrentProject.Connection
For i = 0 to rst.Fields.Count - 1
'/change this to suit the naming conventions of your table
If Left(rst.Fields(i).Name,4) = "Exam" Then
CurrentProject.Connection.Execute "INSERT INTO tExams(sExamDescription) VALUES('" & rst.Fields(i).Name & "')"
End If
Next i
'/Finally, to get the marks:
Set rst = New ADODB.Recordset
rst.Open "SELECT * FROM YourImportTable", CurrentProject.Connection
Do Until rst.EOF
For i = 0 To rst.Fields.Count - 1
'/skip blank marks: a Null value would produce invalid SQL
If Left(rst.Fields(i).Name,4) = "Exam" And Not IsNull(rst.Fields(i).Value) Then
CurrentProject.Connection.Execute "INSERT INTO tPupilExams(lPupilID,lExamID,dbMark) VALUES(" & rst("PupilID") & "," & GetExamID(rst.Fields(i).Name) & "," & rst.Fields(i).Value & ")"
End If
Next i
rst.movenext
Loop
Msgbox "Finished"
End Function
'/helper function:
Function GetExamID(ExamDesc As String) As Long
Dim rst As ADODB.Recordset
Set rst = New ADODB.Recordset
rst.Open "SELECT lExamID FROM tExams WHERE sExamDescription='" & examdesc & "'", currentproject.connection
If Not (rst.EOF And rst.BOF) Then
GetExamID = rst("lExamID")
End If
rst.Close
End Function
****** END COPY
You could add all these to a Standard Module and run them from the Immediate window like this:
?NormalizeMyData
Finchie
ASKER
Capricorn1,
I'm working on your solution first. It makes a good start and produces correct normalised data, but it only seems to process the first row/record.
I then get an error message.
"The field is too small to accept the amount of data you attempted to add. Try inserting or pasting less data. (Error 3163)
The field into which you tried to insert or paste data is not large enough to hold the data. Try inserting or pasting less data."
The debugger highlights this.
!mark = rs.Fields(fldArr(j))
Any thoughts, anyone?
I originally said Exam300 (thanks to LSM consulting for pointing this out). This was an estimated figure; the actual number is 245 fields, which keeps it below the 255 maximum.
Check the size of the field { mark } in design view of the table.
Also, post a sample of the actual data.
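Error 3163 generally means a value is longer than the target field allows, for example a Text field with a small Field Size receiving a longer string. As a rough sketch, the loop below prints each field of the new table with its type and size to the Immediate window, which saves opening design view. The table name tblExam2 comes from the code earlier in the thread; the procedure name ListFieldSizes is made up for illustration.

```vba
Sub ListFieldSizes()
    ' Hypothetical helper: prints name, DAO type code and size of each field.
    Dim db As DAO.Database
    Dim fld As DAO.Field

    Set db = CurrentDb
    For Each fld In db.TableDefs("tblExam2").Fields
        ' fld.Type is a DAO DataTypeEnum value, e.g. dbLong = 4, dbText = 10
        Debug.Print fld.Name, fld.Type, fld.Size
    Next fld
    Set db = Nothing
End Sub
```

For a Text field, Size is the maximum number of characters, so a small value on the mark field would be consistent with the error above.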
nico5038
Fastest solution is to create a UNION query like:
select PupilID, "Exam1" as Exam, Exam1 as Mark from tblSpreadsheetName
UNION
select PupilID, "Exam2" as Exam, Exam2 from tblSpreadsheetName
UNION
select PupilID, "Exam3" as Exam, Exam3 from tblSpreadsheetName
etc.
This UNION query can be used in an Append query to save the data in the target table.
Nic;o)
Finchie
ASKER
I put the data in a blank database, preparing it for upload, and thought I'd give Capricorn1's solution another go, and it works!
Thanks to everyone for their assistance (I will have a play with all the ideas suggested) and to Capricorn1 in particular for providing the solution I'll use.
Capricorn1, thanks for your effort. I found the solution well set out in an easy-to-follow format. It was exactly the sort of elegant and efficient solution I was looking for. It will save me and many other teachers valuable time. Previously we have had to type in marks manually. This will automate the process.
nico5038
Always amazed when people prefer a "complex" and slow VBA code solution instead of a simple and fast query solution...
Thanks for the Microsoft link, Nico5038. I will study the commentary. I decided against your UNION query solution because of the sheer number of fields (exams) in the original spreadsheet. I was also after a solution that didn't need modifying every time the exams changed. The Visual Basic solution was also quick enough. I do have a use planned for the UNION query solution, when I start off with a 4 column/177 row spreadsheet, so thanks for your help.
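For anyone wanting a middle road between the two approaches, one possible sketch is to let VBA read the field names and run one append query per exam column, so nothing has to be typed or edited when the exams change. This is not from any of the posts above. It assumes the table and field names used in Capricorn1's code (tblExam, tblExam2, pupilID, exam, mark), that every exam column name starts with "Exam", and that no column name contains an apostrophe. The procedure name AppendExamColumns is made up.

```vba
Sub AppendExamColumns()
    ' Sketch only: table and field names follow the code earlier in the thread.
    Dim db As DAO.Database
    Dim fld As DAO.Field
    Dim sql As String

    Set db = CurrentDb
    For Each fld In db.TableDefs("tblExam").Fields
        ' Assumption: every exam column name starts with "Exam"
        If Left$(fld.Name, 4) = "Exam" Then
            ' One set-based append per exam column; blank marks are skipped
            sql = "INSERT INTO tblExam2 (pupilID, exam, mark) " & _
                  "SELECT pupilID, '" & fld.Name & "', [" & fld.Name & "] " & _
                  "FROM tblExam WHERE [" & fld.Name & "] Is Not Null"
            db.Execute sql, dbFailOnError
        End If
    Next fld
    Set db = Nothing
End Sub
```

Unlike the recordset loop, this version leaves out pupils with no mark for an exam. Drop the WHERE clause if those rows are wanted.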
nico5038
Thanks for your response Finchie. I value people that are eager to learn and when you see the advantages of the UNION query, my goal has been achieved :-)