

Warehouse system design

Posted on 1998-04-25
Medium Priority
Last Modified: 2010-04-06
First, please note this is a *design* question; i don't need code - not yet at least :)

Schematically, in my warehouse, items (goods)  are charged upon receipt of "charge receipts", and are discharged upon issue of "discharge receipts".

In an RDBMS, to keep track of how many items in "warehouse" have been charged, discharged, and unsold (the difference), i could - on the fly - run proper queries on the "receipts" tables.

For performance purposes, i could instead store those values in the "warehouse" table directly, and update them each time a new receipt is created.

But, the question is: apart from performance, which is the best way to guarantee - or to enforce - consistency among "warehouse" and the two "receipts" tables?

I'm looking for *acquired* practices.

TIA, julio.
Question by:julio011597

Expert Comment

ID: 1336942
It seems that you want real-time processing. In a multi-user environment like your warehouse, all data has to be consistent, even if multiple users work with the same data. So the first thing is to distinguish between read and write processes on the DB. Write processes always have to be verified, that is, checked for whether the data to be written has already been modified by another user in the meantime. This is basic for transaction-oriented systems. A transaction environment buffers all the data that is needed in a transaction buffer (say, a snapshot of part of the DB at a particular point in time). When the transaction is finished, the data to be written is verified by the transaction environment or, if necessary, locked against other users.
If you use such an environment, you would not need to track the status in the DB, because the transaction will keep track of the current state by requerying the status when the transaction buffer is filled. This is more flexible, but more complex to code.
Your choice!
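
The "verify before write" idea above can be sketched with a version column, a common way to detect that another user changed a row in the meantime. This is only an illustration, assuming SQLite through Python's sqlite3 module (the thread does not name an RDBMS); the table layout and the names read_item and write_if_unchanged are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per item, plus a version counter.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse (
    item    TEXT PRIMARY KEY,
    unsold  INTEGER NOT NULL,
    version INTEGER NOT NULL DEFAULT 0
);
INSERT INTO warehouse (item, unsold) VALUES ('bolt', 100);
""")

def read_item(conn, item):
    """Return (unsold, version) as seen at read time."""
    return conn.execute(
        "SELECT unsold, version FROM warehouse WHERE item = ?", (item,)
    ).fetchone()

def write_if_unchanged(conn, item, new_unsold, seen_version):
    """Write only if nobody bumped the version since we read the row."""
    with conn:  # commit on success, roll back on exception
        cur = conn.execute(
            "UPDATE warehouse SET unsold = ?, version = version + 1 "
            "WHERE item = ? AND version = ?",
            (new_unsold, item, seen_version),
        )
    return cur.rowcount == 1  # False means the row changed; re-read and retry
```

If two users read the same row and both try to write, the second write_if_unchanged call returns False, so that user has to re-read and retry.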


Author Comment

ID: 1336943
Thanks dd,

BUT, i've quite a different situation.

My application is actually an Intranet, where i have a bunch of CGIs accessing a DB.
Transaction issues are solved in a very simple manner - please, don't kill me :) -: all my tables are set exclusive! (i have a "max-tries-to-attempt" system to avoid deadlocks)

My concern though remains: what if the "charge receipt" process adds a receipt, then (somehow!?) there's a crash BEFORE the "charged" value in "warehouse" gets incremented?
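
For reference, the usual answer to this exact failure mode is to put both writes in one transaction, so that either both happen or neither does. A minimal sketch, assuming SQLite through Python's sqlite3 module rather than the DB actually used here; the schema, the function add_charge_receipt and its fail_midway flag (which simulates the crash) are hypothetical.

```python
import sqlite3

# Hypothetical schema: the stored "charged" total plus the receipts it derives from.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse (item TEXT PRIMARY KEY, charged INTEGER NOT NULL DEFAULT 0);
CREATE TABLE charge_receipts (
    id   INTEGER PRIMARY KEY,
    item TEXT NOT NULL,
    qty  INTEGER NOT NULL
);
INSERT INTO warehouse (item, charged) VALUES ('bolt', 0);
""")

def add_charge_receipt(conn, item, qty, fail_midway=False):
    """Insert the receipt and bump the total as one atomic unit."""
    with conn:  # commits if the block finishes, rolls back if an exception escapes
        conn.execute(
            "INSERT INTO charge_receipts (item, qty) VALUES (?, ?)", (item, qty)
        )
        if fail_midway:
            # Stand-in for a crash between the two writes.
            raise RuntimeError("simulated crash")
        conn.execute(
            "UPDATE warehouse SET charged = charged + ? WHERE item = ?", (qty, item)
        )
```

When the simulated crash fires, the receipt insert is rolled back as well, so "warehouse" and "charge_receipts" never disagree.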

I bet there's some sort of *general* approach to this kind of potential inconsistency.

(i keep stressing *acquired* approaches because, while i'm able to imagine a few ways to go, i'm not sure they are the best human beings have ever invented :)


Author Comment

ID: 1336944
Hey d003303, is this question too dumb?... please, be frank.

BTW, there's a question in the Web Authoring area you might be interested in.

Title: "compress HTML"



Expert Comment

ID: 1336945
Hi julio,
I gave a comment on the thread you pointed to. Let's see what's to do there.
No, your question is not too dumb. I think I misunderstood your goal. So, if I'm right, you want to make all billings safe against any crashes in the billing process. What about calling a stored procedure that sets all values accordingly on the DB-server side? Or, another thing: if your prog crashes, send an administrative email and log the current state of the prog into a file, so that the billing process can be recovered. What do you think?
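
One way to picture the server-side idea: let the database itself update the totals whenever a receipt is inserted. The sketch below uses triggers instead of a stored procedure, because it assumes SQLite (through Python's sqlite3 module), which has no stored procedures; the table and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse (
    item       TEXT PRIMARY KEY,
    charged    INTEGER NOT NULL DEFAULT 0,
    discharged INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE charge_receipts (
    id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL
);
CREATE TABLE discharge_receipts (
    id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL
);

-- The DB keeps the totals in step with the receipts, not the CGI code.
CREATE TRIGGER trg_charge AFTER INSERT ON charge_receipts
BEGIN
    UPDATE warehouse SET charged = charged + NEW.qty WHERE item = NEW.item;
END;

CREATE TRIGGER trg_discharge AFTER INSERT ON discharge_receipts
BEGIN
    UPDATE warehouse SET discharged = discharged + NEW.qty WHERE item = NEW.item;
END;

INSERT INTO warehouse (item) VALUES ('bolt');
""")

# The application only inserts receipts; it never touches the totals directly.
with conn:
    conn.execute("INSERT INTO charge_receipts (item, qty) VALUES ('bolt', 100)")
    conn.execute("INSERT INTO discharge_receipts (item, qty) VALUES ('bolt', 40)")
```

In SQLite the trigger runs inside the same statement as the insert, so a receipt cannot be stored without its total being updated.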


Accepted Solution

cymbolic earned 300 total points
ID: 1336946
Generally, operational (transaction based) databases are kept separate from a data warehouse database, and an extract/load process is used to carefully update the data warehouse from the operational database.  One reason is that essential data structure and content vary.  A true data warehouse is designed for quick lookup and summary of essential data via multiple attributes.  Many modern data warehouses are stored in a star based design, with one central fact table and multiple dimension tables related back to the fact table.  
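
To make the star design concrete, here is a small sketch with one fact table and two dimension tables, again assuming SQLite through Python's sqlite3 module. All table names, column names and sample rows are invented for illustration. The dimension tables describe the "who/what/when" attributes, and the fact table holds the numbers to be summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes you filter and group by.
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

-- Fact table: one row per item per period, holding the measures.
CREATE TABLE fact_movement (
    item_id        INTEGER REFERENCES dim_item(item_id),
    date_id        INTEGER REFERENCES dim_date(date_id),
    qty_charged    INTEGER NOT NULL,
    qty_discharged INTEGER NOT NULL
);

INSERT INTO dim_item VALUES (1, 'bolt', 'fasteners'), (2, 'nut', 'fasteners'),
                            (3, 'hammer', 'tools');
INSERT INTO dim_date VALUES (1, 1998, 3), (2, 1998, 4);
INSERT INTO fact_movement VALUES (1, 1, 100, 40), (2, 1, 30, 0),
                                 (3, 2, 5, 2), (1, 2, 50, 10);
""")

def charged_by_category(conn, year):
    """Summarise the fact table by an attribute taken from a dimension table."""
    rows = conn.execute("""
        SELECT i.category, SUM(f.qty_charged)
        FROM fact_movement AS f
        JOIN dim_item AS i ON i.item_id = f.item_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        WHERE d.year = ?
        GROUP BY i.category
        ORDER BY i.category
    """, (year,)).fetchall()
    return dict(rows)
```

Every summary query has the same shape: join the fact table to whichever dimensions are needed, filter on their attributes, and aggregate the measures.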

On the other hand, your operational database is structured for rapid access and high volume update/insert operations.  Much more detailed, and often long text data is saved in the database.

Your operational data is constantly changing and has an "accretion edge" of accumulating, unstable and changing data that you do not want in your data warehouse until it has stabilized and reached a point of valid content.

Typically, you would wait for a decent interval before loading the accumulated operational data into your warehouse.  In the process you might "scrub" the data to get, say, the twelve different spellings of Cleveland down to one correct version.
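
A toy version of such a scrub step, in Python, might look like the following. The lookup table and the function name scrub_city are made up for the example; a real load process would maintain a much larger mapping.

```python
# Hypothetical lookup of known misspellings, keyed by a normalised form.
CANONICAL_CITIES = {
    "cleveland": "Cleveland",
    "cleaveland": "Cleveland",
    "clevland": "Cleveland",
    "cleveland, oh": "Cleveland",
}

def scrub_city(raw):
    """Map a raw city string to its canonical spelling, if one is known."""
    # Normalise case and whitespace before the lookup.
    key = " ".join(raw.strip().lower().split())
    # Unknown values pass through (trimmed) rather than being guessed at.
    return CANONICAL_CITIES.get(key, raw.strip())
```

Values that are not in the table are left alone, so the load can flag them for review instead of silently changing them.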

Data warehouses need to have reliable, unchanging historical data so that your users can have a degree of confidence in the validity of the information, hence the time lag and careful loading process.  Generally, you would not go back and change data in your warehouse, only add to it as more operational data is gathered and time intervals pass.

For a good source, read "The Data Warehouse Toolkit" by Ralph Kimball, Wiley.

Author Comment

ID: 1336947
Hello back,

sorry for this long silence: the fact is that there are a few things not yet clear to me, and i've tried to inquire further on the net before coming back.

First, given the original question, cymbolic's answer sounds already satisfactory. d003303's contribution is also much appreciated, but falls a bit too much outside the domain of *pure* DB design.

Anyway, i'm a beginner in this field, and some additional clarifications (examples? online white papers?) are needed. Basically, i'm missing the "operational DB" concept entirely. A few sample questions follow:

> extract/load process is used to carefully update the
> data warehouse from the operational database

Any scheme (method) in order to decide what to extract to the operational DB?
And, is Structured Analysis (ERD and DFD) enough to model such an architecture?

> Many modern data warehouses are stored in a star based
> design, with one central fact table and multiple dimension
> tables related back to the fact table

What are "dimension tables"?

> On the other hand, your operational database is structured
> for rapid access and high volume update/insert operations.
> Much more detailed, and often long text data is saved in
> the database

The overall doubt: given usual charge/discharge operations - by far, the most accessed in my system -, i cannot see a way to separate what should be rapidly accessed from what is to be considered "more details"... am i completely missing the point?

And, a side question: are Delphi's transaction facilities to be considered equivalent to an operational "bridge"?

BTW, i've posted this question to comp.databases.theory, as well.
Here is part of what they answered:

Q: For performance purposes, i could instead store those values in the "warehouse" table directly.
A: I don't think anyone would recommend you follow the "don't store derived data" rule if it would mean that your database manager would crash or be unable to process user requests in an acceptable time. I agree with your solution using "warehouse tables".
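
If the derived values are stored, one way to keep an eye on consistency is a periodic check that recomputes the totals from the receipts and reports any item where the stored value has drifted. A rough sketch, assuming SQLite through Python's sqlite3 module; the schema, sample rows and the function find_mismatches are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse (item TEXT PRIMARY KEY, charged INTEGER NOT NULL);
CREATE TABLE charge_receipts (
    id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL
);
-- 'nut' is deliberately out of step: stored 20, receipts say 30.
INSERT INTO warehouse VALUES ('bolt', 150), ('nut', 20);
INSERT INTO charge_receipts (item, qty) VALUES ('bolt', 100), ('bolt', 50), ('nut', 30);
""")

def find_mismatches(conn):
    """Return (item, stored_total, recomputed_total) for every drifted item."""
    return conn.execute("""
        SELECT w.item, w.charged, COALESCE(SUM(r.qty), 0) AS actual
        FROM warehouse AS w
        LEFT JOIN charge_receipts AS r ON r.item = w.item
        GROUP BY w.item, w.charged
        HAVING w.charged <> COALESCE(SUM(r.qty), 0)
        ORDER BY w.item
    """).fetchall()
```

The receipts stay the source of truth here; the stored totals are treated as a cache that can be rebuilt whenever the check reports a difference.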

Any comment?

Ok, a bit long, but i'll be happy to increase the points for your additional efforts - if any ;)

TIA, julio.

Author Comment

ID: 1336948
cymbolic, i'm going to grade you in a couple of days if nothing new happens.
Anyway, please note that, while your answer is very knowledgeable, i still lack something like a *mechanism* to apply the method. That is, the said "way to separate what should be rapidly accessed from what is to be considered more details". Without any indication - even a simple example - i won't be able to go from theory to practice... or am i just missing the point?

Thanks in any case, julio


Question has a verified solution.
