Finchie
 asked on

How do I normalise spreadsheet data with lots of rows and columns?

I have imported a spreadsheet from Excel into Access that is not normalised. I wish to normalise the data (a reverse cross-tab query, or unpivot). I am after a more efficient solution than those I've come across so far (e.g. typing each field title, which is impractical). I am sure Visual Basic will provide the solution, but I've struggled.

My table is currently in the format below with PupilID in the first column and then exam marks in following columns.

PupilID Exam1 Exam2 Exam3 up to Exam300
1                  23
2
3            22

up to 300 pupils

My normalised data should have 3 fields
PupilID, Exam, Mark
Microsoft Access

Last Comment
nico5038

8/22/2022 - Mon
Rey Obrero (Capricorn1)



Place this code in a module:

Sub convertTable()
    Dim rs As DAO.Recordset, rs1 As DAO.Recordset
    Dim i As Integer, j As Integer, fldArr() As String

    Set rs = CurrentDb.OpenRecordset("tblExam")   '<< your old table
    Set rs1 = CurrentDb.OpenRecordset("tblExam2") '<<< your new table

    If rs.EOF Then
        MsgBox "no records"
        rs.Close
        rs1.Close
        Exit Sub
    End If
    rs.MoveFirst

    ' collect the field names of the old table
    ReDim fldArr(rs.Fields.Count - 1)
    For i = 0 To rs.Fields.Count - 1
        fldArr(i) = rs.Fields(i).Name
    Next

    ' write one row to the new table per pupil per exam column
    Do Until rs.EOF
        ' starting at 2 skips the first two fields (for example an ID column and PupilID);
        ' use 1 instead if PupilID is field 0 and the exams start at field 1
        For j = 2 To UBound(fldArr)
            With rs1
                .AddNew
                !pupilID = rs("pupilId")
                !exam = fldArr(j)
                !mark = rs.Fields(fldArr(j))
                .Update
            End With
        Next
        rs.MoveNext
    Loop
    rs.Close
    rs1.Close
End Sub
ASKER CERTIFIED SOLUTION
Rey Obrero (Capricorn1)

Scott McDaniel (EE MVE )

First: Since there is a 255 column limit in Access, I don't see how you could have that many Exam columns in a table ...

Seems to me you'd need 3 tables:

tPupils
---------------
lPupilID
sPupilName
etc etc

tExams
-----------
lExamID
sExamDescription
etc etc

tPupilExams
-------------
lExamID
lPupilID
dbExamScore
etc etc

From here, you'd do something like this:

************ Start Copy Here

Function NormalizeMyData()

'/This would give you a record in tPupils for EACH pupil. You may need to flesh this out a bit, of course.
CurrentProject.Connection.Execute "INSERT INTO tPupils(lPupilID) SELECT DISTINCT PupilID FROM YourImportTable"

'/Now, to build your Exam table: Open the importtable, and loop through the Fields collection:

Dim rst As ADODB.Recordset
Dim i as integer

Set rst = New ADODB.Recordset
'/only need 1 record for this
rst.Open "SELECT TOP 1 * FROM YourImportTable", CurrentProject.Connection

For i = 0 to rst.Fields.Count - 1
  '/change this to suit the naming conventions of your table
  If Left(rst.Fields(i).Name,4) = "Exam" Then
    CurrentProject.Connection.Execute "INSERT INTO tExams(sExamDescription) VALUES('" & rst.Fields(i).Name & "')"
  End If
Next i

'/Finally, to get the marks:
Set rst = New ADODB.Recordset
rst.Open "SELECT * FROM YourImportTable", CurrentProject.Connection

Do Until rst.EOF
  For i = 0 To rst.Fields.Count - 1
    '/skip empty cells: a Null mark would produce an invalid SQL statement
    If Left(rst.Fields(i).Name,4) = "Exam" And Not IsNull(rst.Fields(i).Value) Then
      CurrentProject.Connection.Execute "INSERT INTO tPupilExams(lPupilID,lExamID,dbExamScore) VALUES(" & rst("PupilID") & "," & GetExamID(rst.Fields(i).Name) & "," & rst.Fields(i).Value & ")"
    End If
  Next i
  rst.MoveNext
Loop

Msgbox "Finished"

End Function

'/helper function:
Function GetExamID(ExamDesc As String) As Long
Dim rst As ADODB.Recordset

Set rst = New ADODB.Recordset
rst.Open "SELECT lExamID FROM tExams WHERE sExamDescription='" & ExamDesc & "'", CurrentProject.Connection
If Not (rst.EOF And rst.BOF) Then
  GetExamID = rst("lExamID")
End If
rst.Close
End Function

****** END COPY

You could add all these to a Standard Module and run them from the Immediate window like this:

?NormalizeMyData
Finchie

ASKER
Capricorn1,
I'm working on your solution initially. It has a good go and produces correct normalized data but it only seems to do the first row/record.
I then get an error message.

"The field is too small to accept the amount of data you attempted to add. Try inserting or pasting less data. (Error 3163)
The field into which you tried to insert or paste data is not large enough to hold the data. Try inserting or pasting less data."

The debugger highlights this.
!mark = rs.Fields(fldArr(j))

Any thoughts, anyone.

I originally said Exam300 (thanks to LSM Consulting for pointing this out). That was an estimated figure; the actual number is 245 fields, which keeps it below the 255 maximum.
Rey Obrero (Capricorn1)

Check the size of the field { mark } in the design view of the table.

Also, post a sample of the actual data.
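Error 3163 is typically raised when a value is longer than the target field allows, so listing the declared type and size of each field in the new table can confirm the diagnosis without opening design view. This is a minimal sketch, assuming the new table is called tblExam2 as in the code above; ListFieldSizes is a hypothetical helper name, and the output goes to the Immediate window.

```vba
Sub ListFieldSizes()
    Dim db As DAO.Database
    Dim fld As DAO.Field

    ' Keep the Database object in a variable so the TableDef stays valid
    Set db = CurrentDb
    For Each fld In db.TableDefs("tblExam2").Fields
        ' Type is a DAO DataTypeEnum value (for example dbText = 10, dbLong = 4).
        ' Size is the maximum number of characters for Text fields,
        ' and the storage size in bytes for numeric fields.
        Debug.Print fld.Name, fld.Type, fld.Size
    Next fld
    Set db = Nothing
End Sub
```

If the mark field turns out to be a short Text field, widening it or changing it to a numeric type that fits the data is the likely fix.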

nico5038

Fastest solution is to create a UNION query like:

select PupilID, "Exam1" as Exam, Exam1 as Mark from tblSpreadsheetName
UNION
select PupilID, "Exam2" as Exam, Exam2 from tblSpreadsheetName
UNION
select PupilID, "Exam3" as Exam, Exam3 from tblSpreadsheetName
etc.

This UNION query can be used in an Append query to save the data in the target table.
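With hundreds of exam columns, the UNION statement does not have to be typed by hand; it can be generated from the table's field list. The following is a sketch only, assuming the exam columns all start with "Exam" and the key column is called PupilID. BuildUnionSQL is a hypothetical helper name. It uses UNION ALL rather than UNION, since each pupil and exam pair occurs once and no duplicate removal is needed. Whether Access accepts a single statement with 245 branches is untested here.

```vba
Function BuildUnionSQL(tblName As String) As String
    Dim db As DAO.Database
    Dim fld As DAO.Field
    Dim sql As String

    Set db = CurrentDb
    For Each fld In db.TableDefs(tblName).Fields
        ' Assumption: every exam column name starts with "Exam"
        If Left(fld.Name, 4) = "Exam" Then
            If Len(sql) > 0 Then sql = sql & vbCrLf & "UNION ALL" & vbCrLf
            sql = sql & "SELECT PupilID, '" & fld.Name & "' AS Exam, [" & _
                  fld.Name & "] AS Mark FROM [" & tblName & "]"
        End If
    Next fld
    Set db = Nothing

    BuildUnionSQL = sql
End Function
```

Running `?BuildUnionSQL("tblSpreadsheetName")` in the Immediate window prints the statement, which can then be pasted into the SQL view of a new query.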

Nic;o)
Finchie

ASKER
I put the data in a blank database, preparing it for upload and thought I'll give Capricorn 1's solution another go and it works!!
Thanks to everyone for their assistance (I will have a play with all the ideas suggested) and to Capricorn1 in particular for providing the solution I'll use.
Finchie

ASKER
Capricorn1, thanks for your effort. I found the solution well set out in an easy-to-follow format. It was exactly the sort of elegant and efficient solution I was looking for. It will save me and many other teachers valuable time. Previously we had to type in marks manually; this will automate the process.
nico5038

Always amazed when people prefer a "complex" and slow VBA code solution instead of a simple and fast query solution...

But success with your application.

Nic;o)
Rey Obrero (Capricorn1)

Finchie,

You're welcome! And thanks for the compliment.
nico5038

For the reference and the credits, the code came from:
http://support.microsoft.com/kb/202176

Nic;o)
Finchie

ASKER
Thanks for the Microsoft link, Nico5038. I will study the commentary. I decided against your UNION query solution because of the sheer number of fields (exams) involved in the original spreadsheet. I was also after a solution that didn't need modifying every time the exams changed. The Visual Basic solution was also quick enough. I do have a use planned for the UNION query solution, when I am starting off with a 4 column/177 row spreadsheet, so thanks for your help.
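The two approaches can also be combined: loop over the field names in VBA, but let one append query per exam column do the copying. This sketch assumes the table names tblExam and tblExam2 and the target fields pupilID, exam and mark from the code earlier in the thread. AppendMarksPerColumn is a hypothetical name. Unlike the recordset loop, it skips empty cells rather than writing Null marks.

```vba
Sub AppendMarksPerColumn()
    Dim db As DAO.Database
    Dim fld As DAO.Field
    Dim sql As String

    Set db = CurrentDb
    For Each fld In db.TableDefs("tblExam").Fields
        ' Assumption: every exam column name starts with "Exam"
        If Left(fld.Name, 4) = "Exam" Then
            sql = "INSERT INTO tblExam2 ([pupilID], [exam], [mark]) " & _
                  "SELECT PupilID, '" & fld.Name & "', [" & fld.Name & "] " & _
                  "FROM tblExam WHERE [" & fld.Name & "] Is Not Null"
            ' dbFailOnError raises a run-time error instead of failing silently
            db.Execute sql, dbFailOnError
        End If
    Next fld
    Set db = Nothing
End Sub
```

Because the column list is read from the table each time, nothing needs editing when exams are added or renamed, provided the naming prefix stays the same.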
nico5038

Thanks for your response Finchie. I value people that are eager to learn and when you see the advantages of the UNION query, my goal has been achieved :-)

Nic;o)