Warehouse system design

First, please note this is a *design* question; i don't need code - not yet at least :)

Schematically, in my warehouse, items (goods) are charged upon receipt of "charge receipts", and are discharged upon issue of "discharge receipts".

In an RDBMS, to keep track of how many items in "warehouse" have been charged, discharged, and unsold (the difference), i could - on the fly - run proper queries on the "receipts" tables.

For performance purposes, i could instead store those values in the "warehouse" table directly, and update them each time a new receipt is created.

But, the question is: apart from performance, which is the best way to guarantee - or to enforce - consistency among "warehouse" and the two "receipts" tables?

I'm looking for *acquired* practices.

TIA, julio.

It seems that you want real-time processing. In a multi-user environment like your warehouse, all data has to stay consistent, even when multiple users work with the same data. So the first thing is to distinguish between read and write processes on the DB. Writes always have to be verified, i.e. checked for whether the data to be written has already been modified by another user in the meantime. This is basic to transaction-oriented systems. A transaction environment buffers all the data a transaction needs in a transaction buffer (say, a snapshot of part of the DB at a specific point in time). When the transaction finishes, the transaction environment verifies the data to be written or, if necessary, locks it against other users.
If you use such an environment, you would not need to track the status in the DB, because the transaction keeps track of the current state by requerying the status whenever the transaction buffer is filled. This is more flexible, but more complex to code.
Your choice!

julio011597 (Author) commented:
Thanks dd,

BUT, i've quite a different situation.

My application is actually an Intranet, where i have a bunch of CGIs accessing a DB.
Transaction issues are solved in a very simple manner - please, don't kill me :) -: all my tables are set exclusive! (i have a "max-tries-to-attempt" system to avoid deadlocks)

My concern though remains: what if the "charge receipt" process adds a receipt, then (somehow!?) there's a crash BEFORE the "charged" value in "warehouse" gets incremented?

I bet there's some sort of *general* approach to this kind of potential inconsistency.

(i keep stressing *acquired* approaches because, while i'm able to imagine a few ways to go, i'm not sure they are the best human beings have ever invented :)

julio011597 (Author) commented:
Hey d003303, is this question too dumb?... please, be frank.

BTW, there's a question in the Web Authoring area you might be interested in.

Title: "compress HTML"

Hi julio,
I left a comment on the thread you pointed to. Let's see what can be done there.
No, your question is not too dumb. I think I misunderstood your goal. So, if I'm right, you want to make all billings safe against any crash in the billing process. What about calling a stored procedure that sets all the values on the DB-server side? Or, alternatively, if your program crashes, send an administrative email and log the program's current state to a file, so that the billing process can be recovered. What do you think?
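To make the "set all values in one place" idea concrete, here is a minimal sketch of the usual approach: put the receipt insert and the "warehouse" update in a single transaction, so a crash between the two leaves neither behind. It uses Python's built-in sqlite3 module only for illustration; the table and column names (warehouse, charge_receipts, charged, qty) and the crash flag are assumptions, not taken from the actual application.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory DB with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE warehouse (
            item_id    INTEGER PRIMARY KEY,
            charged    INTEGER NOT NULL DEFAULT 0,
            discharged INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE charge_receipts (
            id      INTEGER PRIMARY KEY,
            item_id INTEGER NOT NULL REFERENCES warehouse(item_id),
            qty     INTEGER NOT NULL
        );
        INSERT INTO warehouse (item_id) VALUES (1);
    """)
    return conn


def add_charge_receipt(conn, item_id, qty, crash=False):
    """Insert a receipt and bump the stored total as one atomic unit."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "INSERT INTO charge_receipts (item_id, qty) VALUES (?, ?)",
            (item_id, qty),
        )
        if crash:
            # Simulated failure between the two writes (illustration only).
            raise RuntimeError("simulated crash between the two writes")
        conn.execute(
            "UPDATE warehouse SET charged = charged + ? WHERE item_id = ?",
            (qty, item_id),
        )
```

Calling add_charge_receipt(conn, 1, 3, crash=True) raises, and the half-written receipt is rolled back, so the receipts table and the stored total still agree. A stored procedure on the DB server would wrap the same two statements in the same way.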

Generally, operational (transaction based) databases are kept separate from a data warehouse database, and an extract/load process is used to carefully update the data warehouse from the operational database.  One reason is that essential data structure and content vary.  A true data warehouse is designed for quick lookup and summary of essential data via multiple attributes.  Many modern data warehouses are stored in a star based design, with one central fact table and multiple dimension tables related back to the fact table.  
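As a rough illustration of the star layout described above, the sketch below builds one fact table and two dimension tables with Python's built-in sqlite3 module. All names (dim_item, dim_date, fact_movement) and the sample rows are made up for the example. The dimension tables hold the descriptive attributes you summarize by; the fact table holds only keys and numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes used to slice the facts.
    CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact table: one row per item per period, keys plus numeric measures.
    CREATE TABLE fact_movement (
        item_key       INTEGER REFERENCES dim_item(item_key),
        date_key       INTEGER REFERENCES dim_date(date_key),
        qty_charged    INTEGER,
        qty_discharged INTEGER
    );

    -- Hypothetical sample data.
    INSERT INTO dim_item VALUES (1, 'bolt', 'hardware'),
                                (2, 'nut',  'hardware'),
                                (3, 'glue', 'chemicals');
    INSERT INTO dim_date VALUES (199801, 1998, 1), (199802, 1998, 2);
    INSERT INTO fact_movement VALUES (1, 199801, 100, 40),
                                     (2, 199801,  50, 10),
                                     (3, 199802,  20,  5),
                                     (1, 199802,  30, 60);
""")


def unsold_by_category(conn):
    """Summarize the fact table by an attribute of the item dimension."""
    rows = conn.execute("""
        SELECT i.category, SUM(f.qty_charged - f.qty_discharged)
        FROM fact_movement AS f
        JOIN dim_item AS i ON i.item_key = f.item_key
        GROUP BY i.category
        ORDER BY i.category
    """).fetchall()
    return dict(rows)
```

With the sample rows above, unsold_by_category(conn) gives {'chemicals': 15, 'hardware': 70}. Swapping dim_item for dim_date in the join would give the same totals per month instead.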

On the other hand, your operational database is structured for rapid access and high volume update/insert operations.  Much more detailed, and often long text data is saved in the database.

Your operational data is constantly changing and has an "accretion edge" of accumulating, unstable and changing data that you do not want in your data warehouse until it has stabilized and reached a point of valid content.

Typically, you would wait for a decent interval before loading the accumulated operational data into your warehouse. In the process you might "scrub" the data to get, say, the twelve different spellings of Cleveland down to one correct version.
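A delayed load with scrubbing could look roughly like the sketch below. It is only an illustration in Python with sqlite3: the tables receipts and receipts_history, the created column (ISO date strings), the cutoff parameter, and the spelling table are all assumptions made for the example.

```python
import sqlite3

# Hypothetical lookup of known misspellings, keyed by a normalized form.
CITY_FIXES = {
    "cleveland": "Cleveland",
    "cleavland": "Cleveland",
    "clevland": "Cleveland",
    "cleveland, oh": "Cleveland",
}


def scrub_city(raw):
    """Map known variant spellings to one canonical name."""
    key = " ".join(raw.strip().lower().split())
    return CITY_FIXES.get(key, raw.strip())


def load_stable_rows(op, dw, cutoff):
    """Copy receipts older than `cutoff` from the operational DB (op)
    into the warehouse DB (dw), scrubbing city names on the way."""
    rows = op.execute(
        "SELECT id, city, qty FROM receipts WHERE created < ? ORDER BY id",
        (cutoff,),
    ).fetchall()
    # One transaction: either the whole batch is loaded or none of it.
    with dw:
        dw.executemany(
            "INSERT INTO receipts_history (receipt_id, city, qty) VALUES (?, ?, ?)",
            [(rid, scrub_city(city), qty) for rid, city, qty in rows],
        )
    return len(rows)
```

Rows newer than the cutoff stay in the operational database only, which is the "wait until it has stabilized" part. A real load would also need to remember what has already been copied so that a second run does not try to insert the same receipts again.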

Data warehouses need to have reliable, unchanging historical data so that your users can have a degree of confidence in the validity of the information, hence the time lag and careful loading process.  Generally, you would not go back and change data in your warehouse, only add to it as more operational data is gathered and time intervals pass.

For a good source, read "The Data Warehouse Toolkit" by Ralph Kimball, Wiley.

julio011597 (Author) commented:
Hello back,

sorry for this long silence: the fact is that there are a few things not yet clear to me, and i've tried to inquire further on the net before coming back.

First, given the original question, cymbolic's answer sounds already satisfactory. d003303's contribution is also much appreciated, but falls a bit too much outside the domain of *pure* DB design.

Anyway, i'm a beginner in this field, and some additional clarifications (examples? online white papers?) are needed. Basically, i'm missing the "operational DB" concept entirely. A few sample questions follow:

> extract/load process is used to carefully update the
> data warehouse from the operational database

Is there any scheme (method) for deciding what to extract from the operational DB?
And, is Structured Analysis (ERD and DFD) enough to model such an architecture?

> Many modern data warehouses are stored in a star based
> design, with one central fact table and multiple dimension
> tables related back to the fact table

What are "dimension tables"?

> On the other hand, your operational database is structured
> for rapid access and high volume update/insert operations.
> Much more detailed, and often long text data is saved in
> the database

The overall doubt: given the usual charge/discharge operations (by far the most accessed in my system), i cannot see a way to separate what should be rapidly accessed from what is to be considered "more details"... am i completely missing the point?

And, a side question: are Delphi's transaction facilities to be considered equivalent to an operational "bridge"?

BTW, i've posted this question to comp.databases.theory, as well.
Here is part of what they answered:

Q: For performance purposes, i could instead store those values in the "warehouse" table directly.
A: I don't think anyone would recommend you follow the "don't store derived data" rule if it would mean that your database manager would crash or be unable to process user requests in an acceptable time. I agree with your solution using "warehouse tables".
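If the derived values are stored in "warehouse", one common way to keep them in step is to let the database maintain them with triggers, and to run a periodic check that recomputes the totals from the receipts. The sketch below shows both with Python's built-in sqlite3 module; the table, column and trigger names are hypothetical, and only inserts are covered (updates or deletes of receipts would need their own triggers).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse (
        item_id    INTEGER PRIMARY KEY,
        charged    INTEGER NOT NULL DEFAULT 0,
        discharged INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE charge_receipts (
        id INTEGER PRIMARY KEY, item_id INTEGER NOT NULL, qty INTEGER NOT NULL
    );
    CREATE TABLE discharge_receipts (
        id INTEGER PRIMARY KEY, item_id INTEGER NOT NULL, qty INTEGER NOT NULL
    );

    -- The DB itself updates the stored totals whenever a receipt is added.
    CREATE TRIGGER trg_charge AFTER INSERT ON charge_receipts
    BEGIN
        UPDATE warehouse SET charged = charged + NEW.qty
        WHERE item_id = NEW.item_id;
    END;
    CREATE TRIGGER trg_discharge AFTER INSERT ON discharge_receipts
    BEGIN
        UPDATE warehouse SET discharged = discharged + NEW.qty
        WHERE item_id = NEW.item_id;
    END;

    INSERT INTO warehouse (item_id) VALUES (1);
""")


def mismatches(conn):
    """Return item_ids whose stored totals disagree with the receipts."""
    rows = conn.execute("""
        SELECT w.item_id
        FROM warehouse AS w
        WHERE w.charged <> (SELECT COALESCE(SUM(c.qty), 0)
                            FROM charge_receipts AS c
                            WHERE c.item_id = w.item_id)
           OR w.discharged <> (SELECT COALESCE(SUM(d.qty), 0)
                               FROM discharge_receipts AS d
                               WHERE d.item_id = w.item_id)
        ORDER BY w.item_id
    """).fetchall()
    return [r[0] for r in rows]


# Adding receipts is all the application has to do.
with conn:
    conn.execute("INSERT INTO charge_receipts (item_id, qty) VALUES (1, 10)")
    conn.execute("INSERT INTO discharge_receipts (item_id, qty) VALUES (1, 4)")
```

Because a trigger runs inside the same transaction as the insert that fired it, the receipt and the stored total are written together or not at all. The mismatches() query is the on-the-fly calculation from the original question, reused as an audit rather than as the everyday lookup.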

Any comment?

Ok, a bit long, but i'll be happy to increase the points for your additional efforts - if any ;)

TIA, julio.
julio011597 (Author) commented:
cymbolic, i'm going to grade you in a couple of days if nothing new happens.
Anyway, please note that, while your answer is very knowledgeable, i still lack something like a *mechanism* to apply the method. That is, the said "way to separate what should be rapidly accessed from what is to be considered more details". Without any indication - even a simple example - i won't be able to go from theory to practice... or am i just missing the point?

Thanks in any case, julio