Solved

Overflow of database volume to multiple DB servers

Posted on 2013-12-27
5
260 Views
Last Modified: 2013-12-31
I am looking to understand generally how SQL, MySQL and Oracle maintain identity field uniqueness where the database overflows the capacity of a single physical or virtual database server.  If I have a "bigint" ID field that is the identity field and start running into a volume or performance issues, what happens?

Any guidance or pointers to articles on this would be appreciated.

Thanks!
0
Comment
Question by:codequest
  • 3
5 Comments
 
LVL 33

Accepted Solution

by:
snoyes_jw earned 250 total points
ID: 39742872
You can use a GUID. It generally includes some identifier of the host or hardware that it's running on to guarantee uniqueness across systems. It's not a bigint anymore, though - generally some kind of string.
0
 
LVL 2

Author Comment

by:codequest
ID: 39743067
Thanks for the input.  I think I get the general idea about GUIDs.  I may have not phrased my question well, being essentially a newbie in this area.  So...

The application I'm focusing on is in ASP.NET.  The data is highly compartmentalized, i.e. consists of many multi-table data sets, ranging from approximately 1MB to 10 MB each, organized under a subscriber ID, with no relationship between subscribers (or maybe they would be related through some "company" table at some point;  I'm making allowances for that).  Only about 100KB of each subscriber data set would be active at any one time, and each of these 100KB units might be getting an average of 1 write per second.  Reads would be almost entirely from cache memory;  full 100KB loads to cache would happen only once an hour or so.    Few higher volume reads would occur other than for application maintenance.

Suppose I'm hosting this application on some large DB infrastructure, such as provided by a service like AWS or Azure.   I'm assuming that long before I run out of bigint identity values for subscriber ID, at some point I might run into performance and capacity problem running off a single copy of an .mdf file, and have to transfer operation to multiple .mdf files, using a variety of strategies.  

I'm assuming that for such services, the size of the single .mdf file, and number of accesses that could occur against that file without performance degradation, could be very large, perhaps supporting thousands of subscribers (10GB, 1000s of writes per second).

In this framework, it seems like my alternatives are:
A) make DB design choices now about problems that I might run into if (should I be so fortunate) the application has hundreds of thousands of subscribers (1TB, 100,000s of writes per second).  Like for example (I now invent things) make the subscriber ID a GUID and maybe add it to all the tables in a subscriber data set, not always as  foreign key, but as a means of physical partitioning.   Maybe do the same for Company ID.

OR

B)  Use bigint for my subscriber identity ID, let the AWS/Azure type service give me as much volume and performance scaling as possible on a single .mdf file, and hope that I can pay for the more complex DB redesign to meet scaling needs when I hit the capacity issues later.

In either case the application logic for handling multiple .mdf files would be written later.

So may the real question is, how difficult is it going to be for a standing start newbie like myself to do an effective job on alternative A?

Which reduces to, what are some sources for learning more about alternative A?

Because right now this is the only major stumbling block in proceeding with my detailed DB design.  So I'm sore tempted to put it off if I can get away with it.  However, depending on how difficult and important it is, I'll bite the bullet and do the best I can with it.

Any guidance on this would be appreciated.

Thanks!
0
 
LVL 16

Assisted Solution

by:Surendra Nath
Surendra Nath earned 250 total points
ID: 39743364
OK, there are few pointers I can help you with...

1) Instead of GUID, if you are already using SQL Server 2012 or Oracle, start using Sequence every where....
     A sequence is an object which will give a unique number for your application code when requested and it will not generate the number again.

oracle : http://docs.oracle.com/cd/B12037_01/server.101/b10759/statements_6014.htm
sql server : http://technet.microsoft.com/en-us/library/ff878091.aspx

2) once you create the sequence, then use the data paritioning in SQL Server to get multiple .MDF files.
0
 
LVL 2

Author Comment

by:codequest
ID: 39743776
Thanks for the input.   At this point I'm pretty sure I need to study the question of "DB design for scaling" in more detail.

Any pointers to articles on this subject, or even the key search terms, would be appreciated.
0
 
LVL 2

Author Comment

by:codequest
ID: 39749308
I've discovered that this is a vast and complex subject, so I'm going to close this out with points for pointers.

Thanks!
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
This video shows how to Export data from an Oracle database using the Original Export Utility.  The corresponding Import utility, which works the same way is referenced, but not demonstrated.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

21 Experts available now in Live!

Get 1:1 Help Now