Solved

best way to design table

Posted on 2013-11-05
10
252 Views
Last Modified: 2013-11-10
I have an excel spreadsheet that looks like this

                          January          February     March      etc....
John Smith             2                    2              2
Ann Smith             2                    0                2  
Joe Harvard           0                      2               2
etc...

I was thinking of making the months into columns and then making the rows according to the name and hours.  Any suggestions on how I should be designing this?
0
Comment
Question by:al4629740
  • 4
  • 3
  • 2
  • +1
10 Comments
 
LVL 45

Assisted Solution

by:Kent Olsen
Kent Olsen earned 167 total points
ID: 39625253
Hi Al,

While there are certain pros and cons to the differing styles, normalizing the date (month) into a single column is the easiest to use for most purposes.

CREATE TABLE xxx(
  user    varchar (30),
  period  date,
  hours    float
);

Open in new window


The alternative is something like this:

CREATE TABLE xxx(
  user    varchar (30),
  year    integer,
  January_hours float,
  February_hours float,
  March_hours  float,
  ... etc...
  December_hours float
);

Open in new window


A query to sum hours by year, quarter, etc. is trivial with the first.  It's a little more cumbersome with the second query.  Queries that want fiscal year hours (July to June) or any query across years can be particularly wordy with the second style.


Keep it simple!
Kent
0
 

Author Comment

by:al4629740
ID: 39625320
I would want to present the same format as above in a datagrid in vb6.  Would that pose a problem?
0
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 250 total points
ID: 39625597
You should normalize all data if possible.

So you should have "user_first" and "user_last" (and others, if needed), if possible, rather than just "user".  Frankly, I'd be more concerned with that normalization than the other.  I could live with denormalizing all months for the year in the same row if it would work much better for this specific table in this specific situation.

Another key thing is to cluster the table based on how it's most often going to be referenced:
( period, user_last, user_first )  OR
( user_last, user_first, period )  OR
etc..
Just don't cluster it on identity.
0
Ransomware: The New Cyber Threat & How to Stop It

This infographic explains ransomware, type of malware that blocks access to your files or your systems and holds them hostage until a ransom is paid. It also examines the different types of ransomware and explains what you can do to thwart this sinister online threat.  

 

Author Comment

by:al4629740
ID: 39625712
Thank you
0
 

Author Comment

by:al4629740
ID: 39625714
What's the main difference between setting it as a non cluster and cluster in this case?
0
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 250 total points
ID: 39625791
Cluster is the table itself, so will automatically all columns in the table.

Nonclustered is a separate, duplicate copy of some/all columns in the table; any column not in the index requires a (costly) lookup back to the original table.

A normal data table should never have a non-clustered index only -- first put the proper clustered index on the table, then add non-clustered index(es) as needed.
0
 
LVL 48

Assisted Solution

by:PortletPaul
PortletPaul earned 83 total points
ID: 39626302
Consider a report that compares this quarter with the corresponding quarter of last year (Quarter on Quarter).

These are far easier to furnish if the data is normalized as fieldnames don't change - just the date values. Also; Inserts and Updates are easier into normalized tables for similar reasons in my view.

>>I would want to present the same format as above in a datagrid in vb6.
It is possible to "pivot" the normalized data into the 12 column arrangement needed for your grid.

------
The alternative, to store in the 12 monthly columns, can be "unpivoted" into a normalized structure during the process of producing reports such as Quarter on Quarter.

Inserts and Updates are more complex in a de-normalized approach in my view. e.g. when updating a month's data you have to choose both the right row and the right column (fieldname) here.

------
i.e. It's a trade-off, either are possible. In general: normalize.
0
 

Author Comment

by:al4629740
ID: 39626414
Interesting.  So a cluster basically remakes the whole table then?
0
 
LVL 45

Assisted Solution

by:Kent Olsen
Kent Olsen earned 167 total points
ID: 39627194
Yes and no.

A clustered index ensures that rows with similar values (in the clustered key) are are stored "near" each other in the database.  The theory being that the designer wants a performance boost when data is accessed by the clustered key.

When the clustered key is created, the DBMS can do a lot of work to align the data according to the clustering rules.  During routine usage, things can get a bit tricky, and even a bit dicey.  When a new row is stored, or a row is modified so that the values in the clustered key change, the DBMS works to enforce the clustering.  The modified row can be particularly taxing on a busy DBMS as movement of the row to a page designated for the new clustered key value may occur when that happens, every index over the row must also be updated.

-  A clustered index can move a lot of data when it's created so that the data is aligned with the clustering.  
-  When a row is inserted, the overhead is similar to adding most any row.  
-  When a row is modified and the clustered key is not affected, the overhead is similar to modifying a row the does not have a clustered index.
-  When a row is modified and the value in the clustered key changes, the overhead may be significant the row may have to be relocated and all of the table indexes updated.


Kent
0
 
LVL 69

Accepted Solution

by:
Scott Pletcher earned 250 total points
ID: 39627461
>> So a cluster basically remakes the whole table then? <<

Yes, 100%.

Again, a clustered index is the table itself.  So adding a clus index to a table that doesn't have one can re-write the entire table; if any nonclus index(es) exist on the table, all of them must be rebuilt as well.

The clustered key should (almost) never change because of the overhead involved in changing it.  It's easy to pick a value that won't change, such as user name, employee id, item number, identity value (last resort, never as a default "key"), etc..
0

Featured Post

Ransomware: The New Cyber Threat & How to Stop It

This infographic explains ransomware, type of malware that blocks access to your files or your systems and holds them hostage until a ransom is paid. It also examines the different types of ransomware and explains what you can do to thwart this sinister online threat.  

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SQL Pivot with row total 5 30
Database Integrity 1 51
sql server query 12 26
job schedule 8 21
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Viewers will learn how the fundamental information of how to create a table.

820 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question