Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Database normalization

Posted on 2014-02-06
16
Medium Priority
?
235 Views
Last Modified: 2014-02-07
Just curious what the Guru's (and above) thoughts are on normalization whereby you separate TEXT fields from a table. I was brought up on where a field was an indeterminate length then you should separate it but is that the case with MySQL? Is it true within InnoDB and/or MyISAM?
Am I just wasting my time separating the data and joining tables?
Or would I be better off just putting those fields in the main table even where some of those fields maybe blank?
0
Comment
Question by:Gary
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 4
  • 3
  • +2
16 Comments
 
LVL 84

Assisted Solution

by:Dave Baldwin
Dave Baldwin earned 1332 total points
ID: 39841209
I thought you were supposed to store 'constant' and reused fields in their own separate tables so that any changes would be reflected in all the other places that the data is used.

The storage requirements for the different 'engines' in MySQL are a little confusing.  Large TEXT and BLOB fields are stored in 'hidden tables' outside the main table file.  Sometimes.  This link will take you to the beginning of the description of the storage requirements and the different 'engines'.

https://dev.mysql.com/doc/refman/5.5/en/storage-requirements.html
0
 
LVL 58
ID: 39841567
There are a couple of very fundamental thoughts behind relational design.  The base idea is that of a relation, which is a table.

A table should describe one "thing", or entity.  For example, customers.    All the data about that "thing" belongs in that table and that table only.   For example, customer name belongs here and not in an orders table.  By doing this, you avoid problems with data being stored in multiple places.

As part of forming the relation, all fields should be "atomic".  That is that should not be able to be split.  Example  "1 hammer, 2 nails"  has a qty and description, and has multiples of both.  Those are no-no's.

The other part of a relation is that each row needs to be uniquely identifiable.  If by combining every fact (column), you cannot uniquely identify the row, you don't have enough facts.

For a relational database to function well and properly, you do need to normalize.   However if it is not a relational DBMS, then there is no real requirement to normalize.

However since normalization is a process for organizing data in a mathematical and logical way, it makes sense to use it in many situations.   And if your using SQL with the DBMS, then it really makes sense to try and use it.

The mathematical model behind relational design stems from set theory, and SQL is designed specifically to work with sets of records.

Jim.
0
 
LVL 58

Author Comment

by:Gary
ID: 39841868
The question is specifically about TEXT/BLOB fields, not about normal normalisation to preventing duplication.
Dave...
Large TEXT and BLOB fields are stored in 'hidden tables' outside the main table file. Sometimes.
What is the sometimes?
I am reading conflicting advice about the best method, from makes no difference to memory storage for fixed length rows...etc
Also just reading up on https://dev.mysql.com/doc/innodb/1.1/en/innodb-other-changes-file-formats.html
0
 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

 
LVL 58
ID: 39841927
<<The question is specifically about TEXT/BLOB fields, not about normal normalisation to preventing duplication.>>

 Well you better clairfy the question then.  Doesn't seem like your talking about normalization at all.

 re-reading your question and your last comment, it seems more about data storage within a DBMS system, which is something an end user is typically not involved with.

 There no reason I'm aware of that one would ever want to break a text field from the rest of the record simply because it's variable length.  It's up to the DBMS to decide how something like that should be physically stored.

  With that said, there are times when you might want to poke a DBMS into doing something it ordnairly would not do, but you need a through understand of how the DBMS system works.

  A good example of that is with JET databases; you can get "record level" locking by understanding that JET never splits a record across multiple pages.  So if you pad the record past half a page size, you end up with one record per page and thus record level locking even though the DBMS is locking at page level.   To do that properly though, you also need to understand that the OLE and Memo field types are not stored with the rest of the record, but in another part of the database.

 It would help if you would calirfy exactly what your going after here.

Jim.
0
 
LVL 58

Author Comment

by:Gary
ID: 39841978
It is a shopping website.
I have a table with fixed length fields, about 5000 rows, which stores things like product name, weight, cost, price, etc
For every row there are two variable length Text fields - could be 100 characters, could be 1000's of characters. And some rows have nothing at all.
This is MySQL - not MSSQL or Access (you are mentioning Jet and Memo)

The question is specifically is there any advantage in InnoDB to seperating the Text fields to their own table.
0
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 39842131
Normalization is a term of art in data base work.  Make a Google search for the exact phrase, "Should I Normalize my Database" and read the very good ideas from several different viewpoints.

If you have a table of 5,000 rows, you are unlikely to see any difference from just separating out the text columns.  Here's the math...

Assume 6,000 text bytes per row
Assume 5,000 rows
Data Size is 29 MB

That fits in memory.

This would be a different answer if the numbers were substantially higher, in which case I would recommend "normalization" of these long text columns.
0
 
LVL 58

Author Comment

by:Gary
ID: 39842176
Would using Barracuda file format make any difference?
0
 
LVL 111

Assisted Solution

by:Ray Paseur
Ray Paseur earned 668 total points
ID: 39842398
I'm not familiar with Barracuda, but I can share a general idea that might be useful.  About twice a month somebody posts a question here at EE surrounding the topic of putting images or videos into a data base.  We try to discourage that because those sorts of things are usually very large, perhaps megabytes.   SQL-type data bases are not well-designed for that kind of storage.  The folks who do not understand that also do not understand why they would not want to use SELECT * when querying the data base.  The symptom that often sends them to EE is "my data base is so slow."

The usual advice is to remove the images from the data base and store the images in the file system.  Store the URL of the images in the data base instead of the images.

If you removed the large wide columns from your DB and stored them in the file system, with a file pointer in the DB, you would be following the same general idea.  It might be interesting to time the application to get before and after comparisons.
0
 
LVL 58

Author Comment

by:Gary
ID: 39842466
This is just text, no images et al
0
 
LVL 84

Accepted Solution

by:
Dave Baldwin earned 1332 total points
ID: 39842518
In the link I posted, some of the MySQL engines will store only the first 256 bytes of a TEXT or BLOB field in the row and anything over that in a 'hidden' table.  Others will store it inline in the row if it will fit along with the other data in the row limit of 65,536 bytes.  That is why I said 'Sometimes'.
0
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 39842588
This is just text, no images et al
Yes, I understand.  But I think the size of the text may make the same principles applicable.
0
 
LVL 58

Author Comment

by:Gary
ID: 39842626
@Dave
So basically storing it directly in the main table makes no difference as far as MySQL is concerned as it ends up being two tables anyway. This is the general concensus I found reading various sites.
0
 
LVL 84

Expert Comment

by:Dave Baldwin
ID: 39842659
That's what I understand except that the second table is not directly visible to you, it is automatically generated by the MySQL storage engine.  There are limits on what you can do with a TEXT or BLOB field.  I don't think you can index them.  But as Ray pointed out, if you "SELECT *" you will retrieve them which slow things down if you are not actually using the data.
0
 
LVL 58

Author Comment

by:Gary
ID: 39842670
They are only retrieved on one page and updated on another page, so no select * for every query
0
 
LVL 84

Expert Comment

by:Dave Baldwin
ID: 39842688
I usually set some limit on the text that can be entered thru a form.  I would say that if your data will fit in one row / 65535 bytes then I wouldn't worry about.  If it was image or PDF files, I would store them outside the database anyway and only put the file name in the database.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 39842704
Maybe you should ask yourself what benefit you would hope to achieve by storing the data directly in the database versus storing it in a file, with a pointer to the file being stored in the database. Cost-benefit analysis of either approach might have bearing on which way you go with your design.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
Originally, this post was published on Monitis Blog, you can check it here . In business circles, we sometimes hear that today is the “age of the customer.” And so it is. Thanks to the enormous advances over the past few years in consumer techno…
The viewer will learn how to create and use a small PHP class to apply a watermark to an image. This video shows the viewer the setup for the PHP watermark as well as important coding language. Continue to Part 2 to learn the core code used in creat…
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …
Suggested Courses

609 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question