Solved

How to normalize data - sql - vb6

Posted on 2013-11-23
16
370 Views
Last Modified: 2013-12-02
I have to develop a table that contains the records of students and each day of the month that they had hours in attendance.  Here are the columns:


Last Name
First Name
1
2
3
....
30
31


When I pull up the data in a datagrid within VB6 I would like for a grid to pop up that shows all the days of the month as columns so that I can enter the hours for each student.

What is the proper to normalize the table so that the application can pull up the columns accordingly?
0
Comment
Question by:al4629740
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 4
  • 3
  • +3
16 Comments
 
LVL 21

Assisted Solution

by:oleggold
oleggold earned 50 total points
ID: 39672188
You have to design simple sql , just wanted to understand the task , actually looks like what You need is de-normalize the normalized data or rather pivot it to show each day of the month as column?
0
 
LVL 21

Expert Comment

by:oleggold
ID: 39672189
if that what You need to do , just use appropriate function in sql server.
0
 

Author Comment

by:al4629740
ID: 39672216
What is an example of pivoting the data?
0
Free eBook: Backup on AWS

Everything you need to know about backup and disaster recovery with AWS, for FREE!

 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 50 total points
ID: 39672668
for a normalised solution

i'd expect as a minimum you had the following tables

student   -  id,name.dob,....
attendance - id,date,hours,status
0
 
LVL 38

Assisted Solution

by:Jim P.
Jim P. earned 100 total points
ID: 39672713
You should also have a course table that ties to the attendance table. Then you will probably need an instructor table.

There is a decision point in the course table as well. Do you want duplicate courses with different instructors? Or do you want a single course name/ID and use a cross reference table to the instructors that teach it?

And how do you handle substitute instructors?
0
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 50 total points
ID: 39675275
students ( student_id int, last_name varchar(nnn), first_name varchar(nnn), ... )
-- student_id = unique key


attendance ( date, student_id, hours, ... )
-- ( date, student_id ) = unique key

OR, if you need to record different blocks of hours per student per day, then this:

attendance ( date, student_id, code, hours, ... )
-- ( date, student_id, code ) = unique key
0
 

Author Comment

by:al4629740
ID: 39679481
So if I had two tables,

student   -  id,name.dob,....
attendance - id,date,hours,status

Then could someone provide an sql statement that would take the data from the two tables and output them as a grid in my datagrid in vb6?  I am assuming I would use an innerjoin statement somehow.

As you can tell I am new to pivoting the tables to accomplish this task.
0
 
LVL 38

Assisted Solution

by:Jim P.
Jim P. earned 100 total points
ID: 39679619
The ID should really something like StudentID and then the Attendance would have an AttID as the PK, and the StudentID as a separate column. The reason is that when you start using just ID it gets confusing really quick.

But the query would be something like:
select  student.id, student.name, student.dob, attendance.date, attendance.hours, attendance.status
from student
left join attendance 
      on student.id = attendance.id

Open in new window


And just a side comment. This sounds like you are working out of a book. Probably the best method to get it all down faster is to try coding the query yourself and then show us what you're having issues with.
0
 
LVL 51

Accepted Solution

by:
Mark Wills earned 250 total points
ID: 39679922
Yeah, seems to be a missing column in attendance... You do need StudentID in there.

student   -  StudenttID,name.dob,....
attendance - attendanceID,StudentID,date,hours,status

Because there will be multiple dates for a Student, so, need StudentID in there (maybe that's what "ID" was, but lets clarify)

Could use StudentID + Date as the unique key, or, add in a separate unique key. I would be inclined to have the separate / surrogate key and just an index over StudentID + Date

Now, you don't mention what version of SQL you are using, so will assume minimum 2005.

To get the combined data...

Select *
From Student S
Inner Join Attendance A on S.studentID = A.studentID

Open in new window


BUT the problem with that is what about a new period ? Then the Attendance row wont exist, unless, you pre-populate, or, you have a "school calendar" table

So, would suggest you build a calendar table as well...

Calendar - Date, Day, Month, Year, term, ...

Then your join would look like :

select * 
from Student S
Cross join Calendar C
left join Attendance A on S.studentID = A.studentID AND C.date = S.date

Open in new window


And then just add in a WHERE clause to decide the month / year

select * 
from Student S
Cross join Calendar C
left join Attendance A on S.studentID = A.studentID AND A.date = C.date
where year = 2013
and month = 9

Open in new window


So, lets now start to construct some data so we can test out our query...

-- first Student and Attendance
create table student (StudentID int Identity primary key,Name Varchar(100),DOB datetime)
create table attendance (attendanceID int identity primary key,StudentID int,[date] datetime,[hours] int, [status] int)

-- now some values
 
insert student values ('Mark','19220607')
insert student values ('Jim','19200607')

Insert attendance values (1, '20130906',6,1)
Insert attendance values (1, '20130915',5,1)

-- note we don't have any attendance for Jim at this point

Open in new window


and now out calendar table which is going to be a bit boring because I am not doing any checks like holidays, school term, etc...

create table Calendar ([Date] datetime primary key, [Day] int, [Month] int, [Year] int, [WeekDay] Char(1))

declare @d datetime = '20130101'
While @d < '20140101'
begin

   insert calendar
   select @d, day(@d), month(@d), year(@d), case when left(datename(dw,@d),1) <> 'S' then 'Y' else 'N' end 
   set @d = @d +1

end

Open in new window


There is an article about building a calendar table : http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/A_12267-Date-Fun-Part-One-Build-your-own-SQL-calendar-table-to-perform-complex-date-expressions.html

So, now lets convert our select statement to get the "raw" info required...

select S.StudentID, S.Name, C.Day, isnull(a.hours,0) as attended 
from Student S
Cross join Calendar C
left join Attendance A on S.studentID = A.studentID AND A.date = C.date
where year = 2013
and month = 9

Open in new window


Now we can see that the cross join has created entries for each day, and attendance (hours) are appearing only where there is attendance hours. So, even Jim is appearing despite not having any attendance.

But we do want to pivot. So we use the above and pivot student and hours over days as columns

select * 
from 
     (select S.StudentID, S.Name, C.Day, isnull(a.hours,0) as attended 
      from Student S
      Cross join Calendar C
      left join Attendance A on S.studentID = A.studentID AND A.date = C.date
      where year = 2013
      and month = 9) src
PIVOT
     (sum(attended) for day in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],[31])) pvt

Open in new window


The above will give you 31 columns everytime. to get those columns to be date sensitive, you will need to get into dynamic SQL.

For that, please read : http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL-Server-2005/A_653-Dynamic-Pivot-Procedure-for-SQL-Server.html
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39681059
You do NOT need NOR WANT an "attendance id".  You have natural keys for that table.

More generally, for the sake of your database designs and performance, get away from the dopey, lazy notion that "every table should/must have an identity".
0
 
LVL 38

Expert Comment

by:Jim P.
ID: 39681236
Scott,

I agree that the composite key will eventually be StudentID, AttendanceDate, CourseID and maybe InstructorID depending on design.

I disagree that a simple index is not needed. I have had too many issues with doing bug fixes from programmers that I have had to get into a long and complicated queries building up what to update or delete. Where if I had an ID number then I could have dumped the rows into a temp table where StudentID = "####", then delete out non valid rows from the temp table, and then go back  into the production table and just delete where InexNum in (select InexNum from temptable).
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39681547
For this table, the composite key must itself be unique.  Thus, there is absolutely no need for an identity key.  An identity "crutch" added to this table will likely cause problems by allowing the natural key values to be repeated when they should not be (because when a nice, simple key is available, that gets used for the PK, which becomes the clustered index, and "messy", i.e. non-simple, indexes are then not created).
0
 
LVL 38

Expert Comment

by:Jim P.
ID: 39681903
(because when a nice, simple key is available, that gets used for the PK,

I'm not advocating for it to be the primary key. I'm advocating to have it available as an easier way for doing data manipulation.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 39681954
Scott,

I clearly said that StudentID + Date can serve as a natural key.

It is not a dopey / lazy "crutch" to add an Identity.

The reality is the design is very far from complete, and can well imagine additional requirements in Attendance such as Course and other potential unique identifiers needed for referential integrity (such s courseID).

You are quite wrong saying that a surrogate (automatically) leads to duplication of natural keys. I also mentioned adding in a unique key. There are constraints such as "UNIQUE" that is available. And if student + date is "nice and simple" as a PK, then that exact combo cannot become a "messy non-simple" (unique) index for no other reason than the introduction of an identity. It is still a "nice and simple" unique index.

Quite frankly, your reaction and tone with the bold comments was very disappointing to see and read, and assume the comments were targeted to discredit or raise doubts about my considered post.

While I understand and will often support the use of natural keys, when dates are involved and progressively being added to, by student, there is naturally a strong tendency of fragmentation and page shuffles due to dates being inserted across the range of students.

There is no real chronology other than the sequence in which rows are being added leading to built in inserts between pre-existing rows over time.

So, considering the page structure using natural keys, I was quite deliberate when being "inclined" to use an attendanceID as a surrogate key. And clearly said there was a choice. And clearly referred to uniqueness of Student + Date

To say it was "dopey, lazy" is simply wrong.

Let's also consider (or simply remember the old education days) that there could quite easily be additional requirements in "attendance" to reflect curriculum such as course, maybe year, electives, subject matter. We have yet to be told about any frequency of "marking the role" and who does that, or, if this requirement is a different kind of planner for student volumes.

Now, I will agree that making StudentID an identity was rather lazy, but also convenient for the example. The reality is that studentID has (more than likely) some other method of being created or assigned which becomes integral with institutional identity cards (library, travel etc) for that student.

My objective was two fold.

1) to show the pivot to present normalised data as 31 columns
2) to show the inclusion of the Calendar Table to provide the "template" regardless of actual attendance

That was my understanding of the brief in the question header.

If the question was more about designing a student system, then there is considerable more work that might well change the very scant design being presented.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 39682716
Apologies Scott, upon reflection it may not have been directed purely at my post.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39685312
I was very specific as well: what I said was "dopey and lazy" was the idea that essentially every table should have an identity, and that it should be the clustering key.  

Yes, it's certainly possible that one could add an identity column, say as a PK, but still properly define and use the natural keys as clustered and unique.  I've just seen in practice that rarely ever happens.  Once an identity appears in a table, it is usually very quickly designated the clustered PK and the more natural keys are ignored.  That can, of course, lead to duplicate natural key values where, from a business standpoint, none should be allowed.

I point this out so dramatically because I've seen it on literally hundreds of tables.  The identity-as-default-primary-clustering-key is the single most common mistake seen in table design.  And the worst performance loss is caused by clustering by identity instead of choosing the single best clustering key.
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Building JSON Results Table FROM DB 9 37
Logical Operator should return Integer value in SSIS 9 39
sql query 5 44
Need SSIS project 2 30
This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
I have a large data set and a SSIS package. How can I load this file in multi threading?
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

733 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question