Warehouse system design

Posted on 1998-04-25
Last Modified: 2010-04-06
First, please note this is a *design* question; i don't need code - not yet at least :)

Schematically, in my warehouse, items (goods) are charged upon receipt of "charge receipts", and are discharged upon issue of "discharge receipts".

In an RDBMS, to keep track of how many items in "warehouse" have been charged, discharged, and unsold (the difference), i could - on the fly - run proper queries on the "receipts" tables.

For performance purposes, i could instead store those values in the "warehouse" table directly, and update them each time a new receipt is created.

But, the question is: apart from performance, which is the best way to guarantee - or to enforce - consistency among "warehouse" and the two "receipts" tables?

I'm looking for *acquired* practices.

TIA, julio.
Question by:julio011597

Expert Comment

ID: 1336942
It seems that you want realtime processing. In a multi-user environment like your warehouse, all data has to stay consistent even when multiple users work with the same data. So the first thing is to distinguish between read and write processes on the DB. Writes always have to be verified, i.e. checked for whether the data about to be written has been modified by another user in the meantime. This is basic to transaction-oriented systems. A transaction environment buffers all the data it needs in a transaction buffer (say, a snapshot of part of the DB at a particular point in time). When the transaction finishes, the transaction environment verifies the data to be written or, if necessary, locks it against other users.
If you use such an environment, you would not need to track the status in the DB, because the transaction keeps track of the current state by requerying the status when the transaction buffer is filled. This is more flexible, but more complex to code.
Your choice !


Author Comment

ID: 1336943
Thanks dd,

BUT, i've quite a different situation.

My application is actually an Intranet, where i have a bunch of CGIs accessing a DB.
Transaction issues are solved in a very simple manner - please, don't kill me :) -: all my tables are set exclusive! (i have a "max-tries-to-attempt" system to avoid deadlocks)

My concern though remains: what if the "charge receipt" process adds a receipt, then (somehow!?) there's a crash BEFORE the "charged" value in "warehouse" gets incremented?

I bet there's some sort of *general* approach to this kind of potential inconsistency.

(i keep stressing *acquired* approaches because, while i'm able to imagine a few ways to go, i'm not sure they are the best human beings have ever invented :)
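
One commonly used answer to the crash scenario described above is to put both writes in a single database transaction, so that either both happen or neither does. The following is only a minimal sketch, assuming Python's sqlite3 module in place of the real DB, and hypothetical table and column names (warehouse.charged, charge_receipts.qty) that are not taken from the actual schema. The crash flag exists purely to simulate a failure between the two writes.

```python
import sqlite3

def make_db():
    # Hypothetical schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE warehouse (
            item    TEXT PRIMARY KEY,
            charged INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE charge_receipts (
            id   INTEGER PRIMARY KEY,
            item TEXT NOT NULL,
            qty  INTEGER NOT NULL
        );
        INSERT INTO warehouse (item) VALUES ('widget');
    """)
    return conn

def add_charge_receipt(conn, item, qty, crash=False):
    # "with conn" commits if the block completes and rolls back if it raises,
    # so the receipt and the counter update are applied together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO charge_receipts (item, qty) VALUES (?, ?)", (item, qty)
        )
        if crash:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute(
            "UPDATE warehouse SET charged = charged + ? WHERE item = ?", (qty, item)
        )
```

With this shape, a failure after the insert leaves no orphan receipt behind: the insert is rolled back together with the missing update.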


Author Comment

ID: 1336944
Hey d003303, is this question too dumb?... please, be frank.

BTW, there's a question in the Web Authoring area you might be interested in.

Title: "compress HTML"


Expert Comment

ID: 1336945
Hi julio,
I gave a comment on the thread you pointed to. Let's see what's to do there.
No, your question is not too dumb. I think I misunderstood your goal. So, if I'm right, you want to make all billings safe against a crash in the billing process. What about calling a stored procedure that sets all the values accordingly on the DB-server side? Or, another option: if your program crashes, send an administrative email and log the program's current state to a file, so that the billing process can be recovered. What do you think?
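
To make the "set the values on the DB-server side" idea concrete, here is a rough sketch that uses a trigger rather than a stored procedure (a swapped-in technique: SQLite, used here via Python's sqlite3 module, has no stored procedures). The table, column, and trigger names are all hypothetical. The point is only that the counter update lives inside the database, so a client cannot insert a receipt without the matching update.

```python
import sqlite3

def make_db_with_triggers():
    # Hypothetical schema: the triggers keep warehouse counters in step
    # with the two receipt tables, inside the same statement/transaction.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE warehouse (
            item       TEXT PRIMARY KEY,
            charged    INTEGER NOT NULL DEFAULT 0,
            discharged INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE charge_receipts (
            id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL
        );
        CREATE TABLE discharge_receipts (
            id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL
        );
        CREATE TRIGGER charge_after_insert AFTER INSERT ON charge_receipts
        BEGIN
            UPDATE warehouse SET charged = charged + NEW.qty
            WHERE item = NEW.item;
        END;
        CREATE TRIGGER discharge_after_insert AFTER INSERT ON discharge_receipts
        BEGIN
            UPDATE warehouse SET discharged = discharged + NEW.qty
            WHERE item = NEW.item;
        END;
        INSERT INTO warehouse (item) VALUES ('widget');
    """)
    return conn

def stock_counts(conn, item):
    # Returns (charged, discharged, unsold) for one item.
    charged, discharged = conn.execute(
        "SELECT charged, discharged FROM warehouse WHERE item = ?", (item,)
    ).fetchone()
    return charged, discharged, charged - discharged
```

The CGIs would then only ever insert receipts; the "unsold" figure is still derived, but from two counters rather than from a scan of the receipt tables.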


Accepted Solution

cymbolic earned 100 total points
ID: 1336946
Generally, operational (transaction based) databases are kept separate from a data warehouse database, and an extract/load process is used to carefully update the data warehouse from the operational database.  One reason is that essential data structure and content vary.  A true data warehouse is designed for quick lookup and summary of essential data via multiple attributes.  Many modern data warehouses are stored in a star based design, with one central fact table and multiple dimension tables related back to the fact table.  

On the other hand, your operational database is structured for rapid access and high volume update/insert operations.  Much more detailed, and often long text data is saved in the database.

Your operational data is constantly changing and has an "accretion edge" of accumulating, unstable and changing data that you do not want in your data warehouse until it has stabilized and reached a point of valid content.

Typically, you would wait for a decent interval before loading the accumulated operational data into your warehouse.  In the process you might "scrub" the data to get, say, the twelve different spellings of Cleveland down to one correct version.

Data warehouses need to have reliable, unchanging historical data so that your users can have a degree of confidence in the validity of the information, hence the time lag and careful loading process.  Generally, you would not go back and change data in your warehouse, only add to it as more operational data is gathered and time intervals pass.

For a good source, read "The Data Warehouse Toolkit" by Ralph Kimball, published by Wiley.
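
As a rough illustration of the star design mentioned above, the sketch below builds one central fact table and two dimension tables, then summarizes the facts by an attribute of a dimension. Everything here is an assumption made for the example: the table names (fact_movement, dim_item, dim_date), the columns, and the sample rows are invented, and Python's sqlite3 module stands in for whatever warehouse DB would really be used.

```python
import sqlite3

def build_star():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: descriptive attributes to look up and group by.
        CREATE TABLE dim_item (
            item_key INTEGER PRIMARY KEY, name TEXT, category TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER
        );
        -- Fact table: one row of numeric measures per item and period,
        -- pointing back at the dimensions.
        CREATE TABLE fact_movement (
            item_key       INTEGER REFERENCES dim_item(item_key),
            date_key       INTEGER REFERENCES dim_date(date_key),
            qty_charged    INTEGER NOT NULL,
            qty_discharged INTEGER NOT NULL
        );
        INSERT INTO dim_item VALUES
            (1, 'bolt', 'hardware'), (2, 'nut', 'hardware'), (3, 'glue', 'chemicals');
        INSERT INTO dim_date VALUES (199804, 1998, 4), (199805, 1998, 5);
        INSERT INTO fact_movement VALUES
            (1, 199804, 100, 30), (2, 199804, 50, 10),
            (3, 199805, 20, 5),   (1, 199805, 10, 40);
    """)
    return conn

def unsold_by_category(conn):
    # Summarize the fact table through an attribute of the item dimension.
    rows = conn.execute("""
        SELECT i.category, SUM(f.qty_charged - f.qty_discharged)
        FROM fact_movement AS f
        JOIN dim_item AS i ON i.item_key = f.item_key
        GROUP BY i.category
        ORDER BY i.category
    """).fetchall()
    return dict(rows)
```

The same fact rows could be grouped by year or month through dim_date instead, which is the "summary via multiple attributes" idea.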

Author Comment

ID: 1336947
Hello back,

sorry for this long silence: the fact is that there are a few things not yet clear to me, and i've tried to inquire further on the net before coming back.

First, given the original question, cymbolic's answer sounds already satisfactory. d003303's contribution is also much appreciated, but falls a bit too much outside the domain of *pure* DB design.

Anyway, i'm a beginner in this field, and some additional clarifications (examples? online white papers?) are needed. Basically, i'm missing the "operational DB" concept altogether. A few sample questions follow:

> extract/load process is used to carefully update the
> data warehouse from the operational database

Is there any scheme (method) for deciding what to extract from the operational DB?
And, is Structured Analysis (ERD and DFD) enough to model such an architecture?

> Many modern data warehouses are stored in a star based
> design, with one central fact table and multiple dimension
> tables related back to the fact table

What are "dimension tables"?

> On the other hand, your operational database is structured
> for rapid access and high volume update/insert operations.
> Much more detailed, and often long text data is saved in
> the database

The overall doubt: given usual charge/discharge operations - by far, the most accessed in my system -, i cannot see a way to separate what should be rapidly accessed from what is to be considered "more details"... am i completely missing the point?

And, a side question: are Delphi's transaction facilities to be considered equivalent to an operational "bridge"?

BTW, i've posted this question to comp.databases.theory, as well.
Here is part of what they answered:

Q: For performance purposes, i could instead store those values in the "warehouse" table directly.
A: I don't think anyone would recommend you follow the "don't store derived data" rule if it would mean that your database manager would crash or be unable to process user requests in an acceptable time. I agree with your solution using "warehouse tables".

Any comment?

Ok, a bit long, but i'll be happy to increase the points for your additional efforts - if any ;)

TIA, julio.
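
If the derived values are stored in "warehouse" as discussed in the newsgroup reply quoted above, one complementary safeguard is a periodic reconciliation query that recomputes the totals from the receipts and reports any item where the stored counter disagrees. This is just a sketch under assumptions: the schema, the names, and the sample rows are hypothetical, and Python's sqlite3 module is used only to keep the example self-contained.

```python
import sqlite3

def make_demo_db():
    # Hypothetical data: 'gadget' has a receipt that never reached its counter.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE warehouse (
            item TEXT PRIMARY KEY, charged INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE charge_receipts (
            id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL
        );
        INSERT INTO warehouse (item, charged) VALUES ('widget', 5), ('gadget', 0);
        INSERT INTO charge_receipts (item, qty) VALUES
            ('widget', 2), ('widget', 3), ('gadget', 3);
    """)
    return conn

def find_drift(conn):
    # Returns (item, stored_charged, derived_charged) for every item whose
    # stored counter differs from the sum of its receipts.
    return conn.execute("""
        SELECT w.item, w.charged, COALESCE(SUM(r.qty), 0) AS derived
        FROM warehouse AS w
        LEFT JOIN charge_receipts AS r ON r.item = w.item
        GROUP BY w.item, w.charged
        HAVING w.charged <> COALESCE(SUM(r.qty), 0)
        ORDER BY w.item
    """).fetchall()
```

An empty result means the stored counters and the receipts agree; any row returned points at an item whose counter should be rebuilt from the receipts.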

Author Comment

ID: 1336948
cymbolic, i'm going to grade you in a couple of days if nothing new happens.
Anyway, please note that, while your answer is very knowledgeable, i still lack something like a *mechanism* to apply the method. That is, the said "way to separate what should be rapidly accessed from what is to be considered more details". Without any indication - even a simple example - i won't be able to go from theory to practice... or am i just missing the point?

Thanks in any case, julio

Question has a verified solution.
