asked on

database design books

hi,

I am going to buy this:

SQL and Relational Theory: How to Write Accurate SQL Code:

https://smile.amazon.com/gp/product/1491941170/ref=ox_sc_act_title_1?ie=UTF8&psc=1&smid=ATVPDKIKX0DER

but this one quite basic on database topic, but when I check amazon book I see related stuff from

Database Design and Relational Theory: Normal Forms and All That Jazz

https://smile.amazon.com/gp/product/1449328016/ref=pe_2537300_206633840_em_gn_t1_2p_2_ti

What is the next database design book for MS SQL server 2014/2016 should I buy ? assuming from C. J. Date ?

SStory

I guess it depends upon what you hope to achieve. Are you a Computer Science student? If so I'd go with the university recommended books. Are you just wanting to learn SQL Server? If so, why wouldn't you just look for a single book on the subject, like maybe:
Murach's SQL Server 2016 for Developers: https://www.amazon.com/Murachs-SQL-Server-2016-Developers/dp/1890774960/ref=sr_1_3?s=books&ie=UTF8&qid=1473256675&sr=1-3&keywords=sql+server+2016

SOLUTION

zephyr_hex (Megan)

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

SStory

I agree with zephyr_hex. Relational database concepts and theory are pretty much the same. There are SQL standards that keep the syntax relatively similar. Some vendors add a few proprietary things here and there and there are small differences in syntax from ACCESS to SQL Server to MySQL, etc. It is kind of like English in Britain, Austrailia, Southern US and Northern US. Understandable, but slightly differing. If you want to go specifically with SQL Server instead of something free like MariaDB(MySQL) or PostGres, then I would recommend a book on SQL Server specifically as there will be admin chores that are specific to it. But most comprehensive large books on the subject will cover darn near anything. I will say that using SQL Server in your product, unless small enough to use the free versions, does add an expense to it that can many times be avoided by MariaDB or Postgres (both free).

marrowyung

ASKER

SStory:

"Are you just wanting to learn SQL Server? If so, why wouldn't you just look for a single book on the subject"

I am a infrastruture side DBA and working more on T-SQL tuning and toward developer side, so that I can understand them more by using their world.

I just order this one:
1) https://www.amazon.com/gp/product/1491941170/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1
2) https://www.amazon.com/gp/product/150930200X/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1

I want to have books that describe to me what should do and what should not do in T-SQL world and why with example so that I can use this kind of skill to benchmark our own queries.

why that book you suggest ?

https://www.amazon.com/Murachs-SQL-Server-2016-Developers/dp/1890774960/ref=sr_1_3?s=books&ie=UTF8&qid=1473256675&sr=1-3&keywords=sql+server+2016

that one focus on good speed T-sQL queries ? page 100 talk about distinct and I am not sure if they know select distinct is not good! some advertiser will only see diff between select distinct and group by.

I heard that books from Ben-Gan, Itzik is good, how about your and I don't buy one more and I am waiting for more for SQL 2016, even SQL 2018 is on the way.

"I will say that using SQL Server in your product, unless small enough to use the free versions, does add an expense to it that can many times be avoided by MariaDB or Postgres (both free)."

i don't understand that. you mean MS SQL server is the best choice if we have to pay one ? I use MySQL before, I;d like to say very bad in terms of programming.

yeah, I;d like to learn more on primary on MS SQL and secondary on Oracle. these 2 x enough.

zephyr_hex,

"Database designs depends on its purpose. For example, if you were designing a data warehouse, your design would differ significantly than if you were designing a database for transactional data."

yeah, DW mostly focus on second normal form but transactional data like 3rd normal form.

all, I do some research, I see:
1) https://www.amazon.com/gp/product/1499367678/ref=ox_sc_act_title_4?ie=UTF8&psc=1&smid=ATVPDKIKX0DER
2) https://www.amazon.com/gp/product/0123735688/ref=ox_sc_sfl_title_29?ie=UTF8&psc=1&smid=ATVPDKIKX0DER
3) https://www.amazon.com/gp/product/0764599445/ref=ox_sc_sfl_title_30?ie=UTF8&psc=1&smid=ATVPDKIKX0DER
4) https://www.amazon.com/gp/product/1118530802/ref=ox_sc_sfl_title_31?ie=UTF8&psc=1&smid=ATVPDKIKX0DER
5)https://www.amazon.com/gp/product/0123693896/ref=ox_sc_sfl_title_32?ie=UTF8&psc=1&smid=ATVPDKIKX0DER

I am not sure which one is good but all of them should be good enough.

my objective is, read less theory but more practical example on make database application fast.

SStory

So you need to look for information on SQL Server best practices and SQL Server optimization.
I recommended the book I did by just looking inside at the table of contents. It seemed to be a well rounded book on the subject and from your original post I assumed you didn't know a lot about SQL Server and wanted to learn. If you already know how to use it for the most part, then you need a different type of book.

Why do you say DISTINCT is bad? I think it is neither good nor bad. There are times when you need to get DISTINCT results. That is what it is meant to do. Does it take longer? Probably so. If you want to speed up execution on SQL server, then you need a good understanding of indexes and what to put them on. You must decide if within a given field or fields that is/are searched upon frequently:

Are the values widely distributed enough to warrant an index? Is the benefit worth the overhead of having an index? Example: If the field holds only M for male or F for female and index on 10,000 records of those would not be helpful. If the field however hold M###### and followed by a unique number or if it is just a unique number, then it would benefit greatly from a index if that value is often used in Joins or where clauses, etc.

1.) Normalize DB
2.) No when to break normalization, if data warehousing and why and document that
3.) Use indexes appropriately. It may be that you have the live DB that replicates to a different DB that is used for decisions analysis, etc. It may be that you only want to overly use indexes on the second, non-live database.

This one looks to be good (I haven't read any of them). From the table of contents it looks good:

SQL Server Query Performance Tuning 4th ed. Edition
https://www.amazon.com/SQL-Server-Query-Performance-Tuning/dp/1430267437/ref=sr_1_fkmr2_3?s=books&ie=UTF8&qid=1473338609&sr=1-3-fkmr2&keywords=sql+server+2016+optimization

From your list, this one:
https://www.amazon.com/gp/product/1118530802/ref=ox_sc_sfl_title_31?ie=UTF8&psc=1&smid=ATVPDKIKX0DER
Has the most reader comments/reviews on Amazon.

Unless someone has read these books, it will be hard to give you a recommendation.

marrowyung

ASKER

SStory,

" from your original post I assumed you didn't know a lot about SQL Server and wanted to learn. "

No. I am not a new guy but I am a more on infrastructure DBA and I can pinpoint out quickly which queries has problems and why.

but I don't think it is enough as developer from time to time don't like my suggestion even later on they found I am right.

so I need more on developer's world concept and approach to learn T-SQL tunning, database architecture and then I will go to database warehouse using SQL 2014/2016.

I need more on from database design logical/phsically but more on T-SQL tunning in order to explain to devleoper why I suggest this.

", then you need a different type of book. "

good, then am my book list right? you can tell me which one is not that necessary as it is repeating.

"Why do you say DISTINCT is bad? I think it is neither good nor bad. T"

I have some post here around EE and many of the expertist here already told me the same thing, select distinct sort thing row by row and you can see from execution plan that there are a SORT operator, which is a CPU killer.

here, in peak hour, 2 x select distinct already make system very slow to response to all mouse click. I tried to kill 2 x select distinct queries everything returns to normal. just kill select distinct and change it to use group by.

we have an incident that one release, before developer knews select distinct is bad, create web click feature that keep generating a lot of select distinct. which totally kill MS SQL.

internet IIS error 503 appeared and company web system can't display anything anymore.

later on, even the junior customer service will tell developer DO NOT use select distinct or customer will call and complain.

the framework architecture also start to say do not select distinct anymore or CPU full loading.

the more the record it process, the higher the CPU as the SORT operator is a CPU killer.

I use whoisactive to check real time performance and select distinct will be always the one of the highest CPU cost.

" If you want to speed up execution on SQL server, then you need a good understanding of indexes and what to put them on.
You must decide if within a given field or fields that is/are searched upon frequently:
"

someone here told me one thing which is good to consider, the way to make thing good is:
1) carefully define primary key and therefore clustered index.
2) define unique constraint, which create unique index,which make no duplicate record,
3) the way to do T-SQL programm, get ride of subqueries by inner join and exists.
4) finally is the index.

I tried to add 3 x index suggested by optimizer and finally NO IMPROVEMENT at all. specially on the join colum, still no improvement .

it only improve search of the leaf node of the data loading operation but this is a small issue only.

if there are too much subqueries, sorry you queries was dead. too much nested loop join and eveything is slow (even nested loop join not necessary slow)

I tried to only convert on subqueries to inner join, already faster by 150%, I don't even have to touch index.

recently tried about index as on queries tuning point of view, it doesn't help that much.

but we review index join unnecessary index/ add missing index/ combine index to save page split and disk space.

"From your list, this one:
https://www.amazon.com/gp/product/1118530802/ref=ox_sc_sfl_title_31?ie=UTF8&psc=1&smid=ATVPDKIKX0DER"

is it focus on all type of database, like oracle and DB2 as well. ?

marrowyung

ASKER

sstory,

you didn't own a copy of the book at all ?

"From your list, this one:
https://www.amazon.com/gp/product/1118530802/ref=ox_sc_sfl_title_31?ie=UTF8&psc=1&smid=ATVPDKIKX0DER
Has the most reader comments/reviews on Amazon.
"

is this books more on DW 2.0 or even 3.0 ?

is 3.0 is more on big data and data scientist?

I find no books about DW 3.0 from Amazon.

ASKER CERTIFIED SOLUTION

SStory

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

marrowyung

ASKER

hi,

"I actually don't own specific books on SQL Server. "

I just order 2 x books but I also wondering why I need that as internet has a lot of information, but the books I order is highly recommended by everyone.

". There are times when I needed select distinct to get the job done."

yeah, I knew, programmer here do that too. but finally they really found it is not good for CPU resource.

" I learned tuning by reading docs online and looking at query execution plans."

so any good books to read the detail execution plan ? or good link to learn quickly the key to turn using execution plan ?

". I assume you are doing SHOW PLAN or something to see the execution plan and try to find the problems. Without a specific query to look at, it would be hard for us to help optimize it. Indexes on the correct fields should help. However if you are joining PK to FK, they are probably already indexed automatically. If you are doing WHERE statements with things that aren't PK or FK they would be great candidates for INDEXING if values are distributed widely enough to benefit."

tks. but here I tried many example that index can't help much so much, the optimizer suggest the missing index and 0% increase on speed.

"From a developer's point of view: They have data to return and are interested in getting the correct data somehow. Later if they are a good developer they may see how to optimize it and may even write it differently."

yes, I agree on this.

"Joins are better than subqueries sometimes. "

so what is the case subqueries i better than join ?

here finally they will use exists to replace join.

"i.e. SELECT * is almost always wrong. "

yes, when table very large.

"SQL Server is a memory and resource hog. "

I think most of the time, physical IO is, RAM is only second.

"I think functions are also costly, but not sure. "

you are right. e.g. XML function.

" That made querying it much better and more efficient. So sometimes you have to rearrange the way the data is accessed to improve querying offline data.

so this is about data architecture, right?

" In my case I took live data to the decision making side, and after import ran a query that created fields in table for each of these and put their values in it."

can you explain what is that mean ?

SStory

>>I just order 2 x books but I also wondering why I need that as internet has a lot of information, >>but the books I order is highly recommended by everyone.
Good books can avoid a lot of searching for the right opinions and guidance. Yes, you can Google things to find a lot of information. Example: "SQL Server" "Show plan" optimize
Or optimize "SQL SERVER" queries.

" I learned tuning by reading docs online and looking at query execution plans."

so any good books to read the detail execution plan ? or good link to learn quickly the key to turn using execution plan ?
Again, I learned to do this by searching Internet, reading books and someone else introducing me to the concept. However, Microsoft has books online:
https://technet.microsoft.com/en-us/library/ms130214%28v=sql.105%29.aspx
There may be some information in data mining that would be of interest.

>>tks. but here I tried many example that index can't help much so much, the optimizer suggest >>the missing index and 0% increase on speed.
It would be hard for anyone to help improve speed without knowing what you are doing exactly. Sometimes they query plan is already doing those things. If you have millions of rows, it is going to take a while.

>>so what is the case subqueries i better than join ?
I guess try them both and do the SHOW PLAN to see which yields better results

>>" That made querying it much better and more efficient. So sometimes you have to rearrange >>the way the data is accessed to improve querying offline data.
>>so this is about data architecture, right?
I'm just saying the original database may not be constructed in the most ideal format for querying. In my case of a string field where each char represents a status of a utility service
AACCDNACDA, this is called a compound field (I think), and should have been defined as one field per service, but it was brought over from non relational DB days of flat ISAM files and not split into separate fields. Therefore when I offloaded that data from the live system to a decision making database system ( an non-live, non-production RDBMS server), I used the SUBSTRING function of MySQL(in my case), but SQL Server has similar capabilities, in order to strip apart each character into its own field. I also indexed appropriate fields in tables after the import. An import was done regularly to refresh the data and after each time this script was run to make the structure better. After that we'd never use the original STATUS field, but rather individual ones. This avoided the function calls to substring in the middle of WHERE statements, etc and performed a LOT better. It avoided some full row scans (the worst). So if your DB is this sort and not live, you may have to improve the structure as I did to offer the programmers better options. Of course a good DBA would never have created such a field in the first placel

>" In my case I took live data to the decision making side, and after import ran a query that >created fields in table for each of these and put their values in it."
>can you explain what is that mean ?
(see above)

marrowyung

ASKER

"Good books can avoid a lot of searching for the right opinions and guidance."

yeah, that's why I still prefer books as it guide you from beginning instead of piece by piece information from the web.

marrowyung

ASKER

tks all