Ramification of using nvarchar(4000) versus nvarchar(300) and Normalization overkill?

SQL Server 2014
Two questions:
Ramifications of using fields in the same table, with many being null, versus putting them in a relational table and only storing when there is data.
And what are the ramifications between using nvarchar(300) versus nvarchar(4000) as far as space differences?

I have a table to store optional data text fields. The length of most of these fields will not exceed 300. However, I have a description field that can be up to 4000 characters. I could create a separate table to store that field as an nvarchar(4000), but if I just create the ONE table and store them all as nvarchar(4000) are there any performance or size issues if I also use this field length for my 300 max strings?

I did have ONE table with these fields:
      [Label] [nvarchar](300) NULL,
      [Tooltip] [nvarchar](300) NULL,
      [ValidationMessage] [nvarchar](300) NULL,
      [Description] [nvarchar](1000) NULL,

But since from another post, I was told it's not good to store null columns, and many of these columns might be null, I'm going to put them in a separate table and use a code to indicate which one of the fields it will represent:
      [Data] [nvarchar](300) NULL,
      [FieldTypeCode] [int] NULL,

My query will ultimately look like this to pull them all together into one row as though they're in the same table.

select
label.Data as label,
tooltip.Data as tooltip,
validationMessage.Data as validationMessage,
Localization.* from Localization
left outer join LocalizationData as label on localization.localizationid = label.localizationId and label.localizationfieldtypecode = 1
left outer join LocalizationData as tooltip on localization.localizationid = tooltip.localizationId and tooltip.localizationfieldtypecode = 2
left outer join LocalizationData as validationMessage on localization.localizationid = validationMessage.localizationId and validationMessage.localizationfieldtypecode = 3

This isn't overkill is it? For instance, if the validation message doesn't exist, it will be null. So with the second table, it will never get stored as null and will never take up that space. With the first table design, many of the rows would be null. Possibly same with tooltip, and definitely the same for descriptions.
LVL 2
Starr DuskkASP.NET VB.NET DeveloperAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Vitor MontalvãoMSSQL Senior EngineerCommented:
I'm more worried about the use of nvarchar instead of varchar. Why do you need an Unicode data type? Are you going to use it to store special characters from for example Chinese or Arabic language? nvarchar requires the double of bytes of varchar datatype to store characters (2bytes against 1byte).

Also, being varchar (or nvarchar), it means that will only spend the necessary bytes to store the data, meaning if you only use 100 characters it won't fill the rest of 4000 characters with 3900 spaces, so you're already saving space. The same wouldn't be true for char or nchar datatypes.
0
Starr DuskkASP.NET VB.NET DeveloperAuthor Commented:
yes, these fields will be storing languages.

So the question still remains, is having the first table with the separate fields, many occasionally empty -- versus the second table only storing data if it is needed -- a good data model? or is it overkill?

And second, it seems you're telling me that if I have a 300 character text, and use nvarchar(4000) - it won't take up anymore space than if it were nvarchar(300). Correct?
0
Scott PletcherSenior DBACommented:
There's no problem per se with NULL values in a row.  Have you thought through the possible issues here?  What if a column that previously had mostly nulls gets mostly full -- will you move it back to the main table?  And vice versa?  Putting data in a different table does relate to size, but it also relates to how often that column is used.  If a column is large and almost never referenced, it makes sense to put it in a separate table to improve the performance of the main table.

Note that if the column were varchar(max), you'd have another option: define it in the main table, but force SQL to store it out-of-row, in an overflow page(s).

As to column size vs. space used, as noted, nvarchar(4000) won't take any more disk/buffer space than nvarchar(300), but it will cause SQL to pre-allocate more memory for queries using that column.  Therefore, it's definitely best not to oversize columns too much.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Protecting & Securing Your Critical Data

Considering 93 percent of companies file for bankruptcy within 12 months of a disaster that blocked access to their data for 10 days or more, planning for the worst is just smart business. Learn how Acronis Backup integrates security at every stage

Starr DuskkASP.NET VB.NET DeveloperAuthor Commented:
Thank you Scott.

As far as my original post query to get the data from the sub table, is there going to be a performance hit with that, as opposed to having all those fields in the same table? Notice all the left outer joins on the table and renaming and such.

And no, I wouldn't be moving data back into the main table. Just on the other post I had earlier, seems I was being told that you shouldn't have a table with a lot of null values. I'll definitely move my "content" field into a separate table for the nvarchar(max) that will rarely get used, but maybe it caused me to overkill and move my text fields out as well. That's what I'm wondering about.

thanks!
0
Scott PletcherSenior DBACommented:
Tables with a lot of columns often NULL typically indicates some type of design issue, but you can't fix that just by adjusting one table anyway.

There is certainly a performance hit having to go to another table, as additional separate I/O must be done.  In fact, if you go to a second table for every row in the main table, you're almost doubling your I/O.  That's a big hit.  That's why it's more common to do that only for columns that are exceptionally large or only rarely, or at least less frequently, referenced.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
I just came here to answer you but then I saw that Scott gave you a very good explanation and honestly, I don't have nothing more to add to what Scott told you.
0
Starr DuskkASP.NET VB.NET DeveloperAuthor Commented:
In my situation, this will be used for localization of all the form labels, tooltips, descriptions, validation message. Some may not have tooltips, most will not have descriptions, and many will not have validation messages. Each client will be able to put in their own names for labels to override the defaults, as well as these will be used by foreign languages also.

So i'm just trying to come up with the best solution for this scenario. I really can't imagine what I can do with the database design other than the two scenarios that I have proposed -- same table versus join table. If you have suggestions, I'm open to them.

The only other design would be to have the main table be just one field type at a time, but then the other fields would be super redundant, like the clientId and Language that is applicable to each field. So I prefer to keep them all together as a "label set" - each label has a value, validation, tooltip, and description.

Final verdict: I will keep the 4 types in the main table: label, tooltip, description, validationMessage.
I will move the nvarchar(max) Comment out of the main table into a separate table, because will rarely be used and will use the left outer join on it.

If you see a problem with that, please let me know.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
I would still keep a single table with all columns.
0
Scott PletcherSenior DBACommented:
As I noted above above, you can keep a single table for any [n]varchar(max) columns to go outside the main table storage area:

USE <db_name>
EXEC sys.sp_tableoption '<your_table_name>', 'large value types out of row', 'ON'
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
SQL

From novice to tech pro — start learning today.