Database Design Question

I have a database design question. We have a database that holds delivery orders. Each order is associated with a company who is providing the delivery. So each order has an associated delivery company name,  and recipient phone number and recipient address. The add and edit forms have a text input field for the address, that uses ajax to query the existing addresses in the database based on the phone number entered, for autocomplete in a datalist associated with that text input field.

The Orders table will be huge. The concern is that the query of the database that gets the existing addresses from the Orders table  with each key press will get too slow. It has been suggested that I create a separate table for every company, in addition to the main Companies table and Orders table, that contains only the phone numbers and addresses that pertain to that company, so that the query becomes manageable and fast.

We expect to have 15,000 customers, so this will mean a few main tables, and then 15,000 separate tables that contain phone numbers and addresses that pertain only to each company.

Is this a good idea? 15,000 tables sounds odd to me, but maybe it is good. If not, what is the best way to handle this? Partitions of sorts?
LVL 9
BobHavertyComhAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Dale FyeOwner, Developing Solutions LLCCommented:
No, this is not a good idea.

1 table with indices on the Company and Phone number fields should work fine in SQL Server.
0
Guy Hengel [angelIII / a3]Billing EngineerCommented:
1 table with 15K records, 1 record per company/phone is just fine.
1 table for the orders, with a foreign key to the company.
0
Kent OlsenDBACommented:
Depending on how much data you'll really have and how you'll use it you might want to partition the orders table, but separate tables is an application and performance nightmare.  Avoid it at all costs.


Kent
0
Acronis Data Cloud 7.8 Enhances Cyber Protection

A closer look at five essential enhancements that benefit end-users and help MSPs take their cloud data protection business further.

Shaun KlineLead Software EngineerCommented:
Having separate tables per company will be a maintenance headache. If each company can have one or many "shipping" addresses/phone numbers, a better option would be to have one table for all companies and their shipping addresses/phone numbers, placing a non-clustered index across the company and the phone number columns. This index would then allow any query against company and phone number to be relatively fast.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
. It has been suggested that I create a separate table for every company
Didn't you misunderstand it? I think the idea is to create a separate table for all companies so each record of that table will belong to a single company then you just need to query the table.
0
BobHavertyComhAuthor Commented:
Hi guys. Thanks for the input. Really appreciated. Will divy points to all.

Didn't you misunderstand it? I think the idea is to create a separate table for all companies so each record of that table will belong to a single company then you just need to query the table.

What we had was one Orders table. Each order has a delivery company associated with it, and that value links to the Companies table. Each order also has a recipient phone number and address. We can possibly break the address and phone number out to a separate table and link it to the Orders table for better normal form.

But what is clearly being asked of me is to create a separate table for each company that stores the phone numbers and addresses for that order based on what company it is associated with. So instead of one phoneAddress table linking to the one Orders table, that phoneAddress value on the Orders table will link to a value on a phoneAddress table created just for that company.

I can see what they want, which is for less query time. They feel that it would be faster to only have to query a small table that is that company's own, rather than a larger one with a company name filter and a SELECT DISTINCT.
0
BobHavertyComhAuthor Commented:
Hi Guy
1 table with 15K records, 1 record per company/phone is just fine.
1 table for the orders, with a foreign key to the company.

I'm sorry, but I didn't explain everything correctly. The problem they have is that yes there will be 15,000 companies on the companies table, but each company will have a lot of orders on the Orders table, and hence a lot of duplicate addresses on that table and all the more values to search through. So we can split out phone number and address to another table and link it to orders, preventing duplicate phone and address info. The fear is that splitting out the phone number and address to a linked table with a company name field for filtering will still be too huge to query. So they want to be able to find the table by company name (or what really amounts to table name), and then query that, and it will be a much smaller query because it will only contain records pertaining to that company.

I'm wondering if anybody has ever heard of a technique like this. I have not. If all of this still doesn't sound like the right answer, what do I tell them?
0
Kent OlsenDBACommented:
Hi Bob,

The problem that you're going to have is when you have queries that span multiple companies.  That will have to be done with queries for each company.  At some point in the query a UNION ALL will have to be done, and depending on when that occurs performance will surely suffer.

It seems that partitioning the orders by company is probably what you want.  It isolates the overhead of a query by a particular company so you get about the same performance as a separate table but allows you to produce multi-company queries with the same SQL as if you had a monolithic table.


Kent
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
BobHavertyComhAuthor Commented:
Hi Shaun,
Having separate tables per company will be a maintenance headache.

Yes, if we change the schema, we have to have scripts change it for 15,000 tables rather than one. I don't have any idea how I would manage this is SQL Studio, so that's a tip off for me that this seems odd.

If each company can have one or many "shipping" addresses/phone numbers, a better option would be to have one table for all companies and their shipping addresses/phone numbers, placing a non-clustered index across the company and the phone number columns. This index would then allow any query against company and phone number to be relatively fast.

This sounds interesting, but I'm not sure if I quite got it. You did understand what I meant, because it's not the company's phone and address, it's the "shipping" address of their customer. This is why the phone and address will be linked to the orders table, not the companies table.
0
BobHavertyComhAuthor Commented:
Hi Kent,

Hi Bob,
The problem that you're going to have is when you have queries that span multiple companies. That will have to be done with queries for each company.

I think the way they see it, is that you search for a phoneAddress table that has the same name as the company, since it's the company's own private phoneAddresses table with each company having one, and then you query that table and it's as small of a query as possible at that point. I understand what they are thinking, but I've never heard of anything like creating 15,000 tables, so I'm immediately suspicious that this is not the way it's done. And if you have to search from 15,000 table names to get to the smaller table, are you really gaining anything?


  At some point in the query a UNION ALL will have to be done, and depending on when that occurs performance will surely suffer.

I'm not sure where this comes into play with their idea as explained above.


It seems that partitioning the orders by company is probably what you want.  It isolates the overhead of a query by a particular company so you get about the same performance as a separate table but allows you to produce multi-company queries with the same SQL as if you had a monolithic table.

Kent

This makes a lot of sense to me, and this is probably how people handle problems like this. As I said, I've never heard of creating a table for every customer or company. So if partitioning is used, maintenance and design changes still remain singular, rather than running scripts on 15,000 tables to make schema changes and other things.
0
Dale FyeOwner, Developing Solutions LLCCommented:
I think what you may be looking for a "Recipients" table, which describes who the shipment is going to.  That table might have RecipientID, and details about the recipient.  

Since each Recipient might have more than one phone #, then you might want a RecipientPhones table so that you can lookup the RecipientID when the user types in the phone number; this table would have RecipientID and Phone, maybe PhoneType and could have multiple values for each RecipientID, but would be unique across the RecipientID and Phone columns.  

Then you would have a RecipientAddress table which would allow multiple shipping addresses for a particular RecipientID.
0
BobHavertyComhAuthor Commented:
Hi Dale
No, this is not a good idea.
1 table with indices on the Company and Phone number fields should work fine in SQL Server.

I suspect you are correct, but I need to explain to them why it's not a good idea and I'm having difficulties doing so because I am not an expert, I just have a reasonably solid understanding of database design. So my explanations as to why this is not good are not definitive and authoritative enough.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
But what is clearly being asked of me is to create a separate table for each company that stores the phone numbers and addresses for that order based on what company it is associated with.
Run away from that "solution".
0
Kent OlsenDBACommented:
Use a physical analogy to explain this.

Would the office install a minimal number of file cabinets to hold 15,000 folders or would they buy 15,000 file cabinets?

The database works a lot like those file cabinets.
0
BobHavertyComhAuthor Commented:
Hi Dale.

I think what you may be looking for a "Recipients" table, which describes who the shipment is going to.  That table might have RecipientID, and details about the recipient.  

Since each Recipient might have more than one phone #, then you might want a RecipientPhones table so that you can lookup the RecipientID when the user types in the phone number; this table would have RecipientID and Phone, maybe PhoneType and could have multiple values for each RecipientID, but would be unique across the RecipientID and Phone columns.  

Then you would have a RecipientAddress table which would allow multiple shipping addresses for a particular RecipientID.

Currently, I'm thinking that phone and address get taken off orders table and linked to a separate phone/Address table with no duplicate values. This is kind of what you are talking about. Addresses to populate the datalist will be found by the phone that has been entered in the form matched against the phone field of the phone/address table to find addresses associated with that phone. But not only can each address have more than one phone#, each phone# can have more than one address in our scenario. So it's a many to many.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
Why are you relating phone to address?
You should have the ORDER table and the COMPANY table with a CompanyID column that will be the Primary Key for the table. Then you'll need a link table (COMPANY_ADDRESS) for company deliver addresses where you'll have the CompanyID as Foreign Key. These 3 tables should be enough to accomplish what you want.
0
Dale FyeOwner, Developing Solutions LLCCommented:
Yes, that would be the proper scenario a table of phone #s and addresses.  You can determine what that table would look like with a simple query,

SELECT DISTINCT PHone, Add_Street1, Add_Street2,  Add_City, Add_State, Add_Zip
FROM Orders

This would tell you how many unique phone/address combinations you have.  If you put that into a new table, with an identity field, you could then store the identity field in the Orders table.  This new table, indexed on Phone# would be a very fast search.
0
Dale FyeOwner, Developing Solutions LLCCommented:
Victor,

I think Bob's initial description was confusing.  I believe the phone# and Company are unrelated fields.  The Company field is describing the delivery company.  The phone number and address are describing the recipient.
0
BobHavertyComhAuthor Commented:
Use a physical analogy to explain this.
Would the office install a minimal number of file cabinets to hold 15,000 folders or would they buy 15,000 file cabinets?
The database works a lot like those file cabinets.

Super analogy. There aren't any concerns about maintenance nightmares, i.e., having to maintain 15,000 file cabinets, whether there should be or not. Their only concern is to find the right folder as fast as possible, no matter what the cost. They seem to feel that having 15,000 file cabinets will allow them to find the info faster. To be more specific, consider each company having 1,000 deliveries (orders), and there are 15,000 companies.  It seems to me that they feel that if 1 table had 15 million records on it, it would be slower to query that table for results than it would be if they first search through 15,000 table names for the correct table name (cabinet) and then queried only that table/cabinet. They feel the search for the table name is not expensive at all.
0
BobHavertyComhAuthor Commented:
Victor,
I think Bob's initial description was confusing.  I believe the phone# and Company are unrelated fields.  The Company field is describing the delivery company.  The phone number and address are describing the recipient.

That is correct. I apologize. By recipient phone  and address I meant the customer, not the company. Sorry.
0
BobHavertyComhAuthor Commented:
SELECT DISTINCT PHone, Add_Street1, Add_Street2,  Add_City, Add_State, Add_Zip
FROM Orders

That's what I was using. That got rejected.

If you put that into a new table, with an identity field, you could then store the identity field in the Orders table.  This new table, indexed on Phone# would be a very fast search.

Agreed, but they don't feel it would be fast enough because you would still be searching through ALL of the phone numbers and addresses of ALL delivery companies and filtering out by company name. They seem to feel that separate tables are more efficient than large table query filtering. So while I might agree with you, why would they be wrong? Unfortunately, I don't have the luxury of just saying something is wrong, I have to prove it and if I don't, I have to implement it.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
They feel the search for the table name is not expensive at all.
You should question them why. How they'll know which table is it? Then they need to build the query dynamically which means no use of cache and need to be compiled each time that the query will run.
Plus, they need to maintain more than 15K tables. Need to check if exist don't create. If not, create. And if in the future they need to add one more column? They need to add it in 15K tables!!!
0
BobHavertyComhAuthor Commented:
The issue of splitting out the address and phone fields into another table for 3rd normal form is kind of a separate issue that I will do one way or the other.

I guess the really simple question comes down to, is it any faster to first find the correct table from 15,000 tables and then query the table having to only search 1000 records, or is it faster to simply use the correct filters and query one table with 15,000,000 records on it?
0
Dale FyeOwner, Developing Solutions LLCCommented:
What do you mean by "that got rejected"?

Rejected by "they"?

If none of the shipping companies overlap jurisdictions, then you might add the Company field to the PhoneAddress table and index by Company and Phone# (as was indicated early on).  But I think that is overkill.  You mentioned in one of your posts that you want some form of type-ahead feature in your application, is that correct?

The creation and maintenance of 15000 tables (I don't even know whether that is feasible) alone would be a nightmare.

Who is this "They" you keep talking about?  Do they have any technical background at all?
0
BobHavertyComhAuthor Commented:
You should question them why. How they'll know which table is it? Then they need to build the query dynamically which means no use of cache and need to be compiled each time that the query will run.
Plus, they need to maintain more than 15K tables. Need to check if exist don't create. If not, create. And if in the future they need to add one more column? They need to add it in 15K tables!!!

Yup. I'm scared. but they don't care about those issues, it's worth it to them. As to why it would be easier to search for tables, rather than search within a table? It's an assumption they are making. I have no idea whether that's true, and I'm not sure if they do either. They mentioned that tables can be "indexed", and therefore searching for them is not expensive at all. I have no idea whether that is actually true or not and I have no reason to assume it is. But if I can't prove it to be wrong, I'll have to implement it. That's the way it works in the situation I am in.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
They mentioned that tables can be "indexed", and therefore searching for them is not expensive at all.
As company names, telephone numbers or addresses. So having the correct index set, searching a row in 15K rows should not be expensive at all.
Like Dale wrote: Do they have any technical background at all?
0
BobHavertyComhAuthor Commented:
What do you mean by "that got rejected"?

Rejected by "they"?

Yes, sorry for not being clear. I mean the way you are talking about, is the way that I originally did things and it got rejected, even though I used what I consider to be standard, textbook procedures

If none of the shipping companies overlap jurisdictions, then you might add the Company field to the PhoneAddress table and index by Company and Phone# (as was indicated early on).  But I think that is overkill.  You mentioned in one of your posts that you want some form of type-ahead feature in your application, is that correct?

We have a company table representing companies that make deliveries. We have an orders table that records all deliveries that all companies make, with a company field linking to the company table. Every order also has a recipient address and phone number.

As we agreed, address and phone number will go on a separate table called "PhoneNumbers" linked to the orders table, avoiding duplicate phone and address info on the orders table. Also included on the PhoneNumbers table will be a company name field for filtering of results. This field will link to the companies table

New order and edit order forms have a phone field and address field. After phone field is filled out, onFocus the address field goes out and gets all the addresses from the PhoneNumbers table. SELECT DISTINCT won't be necessary anymore as all values will be unique. The query is filtered by company name. All good so far, right? Wrong. It got rejected in favor of removing the company name field from the PhoneNumbers table that was acting as a filter, and instead creating a new PhoneNumbers table for each company.


The creation and maintenance of 15000 tables (I don't even know whether that is feasible) alone would be a nightmare.

They say it's easy and they don't care even if it's not.


Who is this "They" you keep talking about?  Do they have any technical background at all?

The main decision maker has 30 years of software development experience, has acknowledged not specifically knowing all about databases and has said that the software experience is more important than specific database experience when deciding information management. So as you might suspect, this is not a database person asking me to do this. The real problem is that my opinion does not have enough credibility with them. That's really what is comes down to, so they decided to guess it their way until someone they respect tells them otherwise. I'm just double checking to make sure I'm not crazy.
0
BobHavertyComhAuthor Commented:
Like Dale wrote: Do they have any technical background at all?

More in software development, not databases. Well, I think I'm going to have to insist upon having an expert come in and talk to them, as they just don't have any confidence in my recommendations if those recommendations don't agree with what they are thinking. I guess I just wanted to first check with you guys to make sure that calling an expert in would be the right idea, and I think that everyone here would be unanimous on that decision, given what is being proposed. I think it would avoid a very large potential disaster.
0
BobHavertyComhAuthor Commented:
Thanks guys. It's pretty clear and everyone seems to be in agreement. I'm going to ask to have an expert come in.
0
Dale FyeOwner, Developing Solutions LLCCommented:
Bob,

Although I write software, my expertise is in data analysis and database design.  Unfortunately, most software developers have very little technical knowledge of how databases work.  they are comfortable writing their SQL strings in their code, which allows them to write code like:

strSQL = "SELECT * FROM [SomeTable " & CompanyID & "]"

instead of

strSQL = "SELECT * FROM [SomeTable] WHERE [CompanyID] = " & CompanyID

They have no concept of the amount of effort required to create and maintain tables and to optimize queries.  They should leave that to the database experts.

Dale
0
BobHavertyComhAuthor Commented:
Unfortunately, most software developers have very little technical knowledge of how databases work
Precisely Dale. I've said this to them, and I was shot down as if I know nothing. I have done a reasonable amount of both web application design and database design, and the two are indeed two different animals with different overall methodologies and philosophies for managing information. So if they have no respect for my recommendations, then perhaps they will listen to a certified database expert in person (on their nickel). That's all I can do to prevent a potential disaster. It's going to be difficult to move forward with their plan knowing that I will eventually have to tear it down and rebuild it. But I'll just plan accordingly for that eventuality. Thanks again.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.