Improve company productivity with a Business Account.Sign Up

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 203
  • Last Modified:

Warehouse system design

First, please note this is a *design* question; i don't need code - not yet at least :)

Schematically, in my warehouse, items (goods)  are charged upon receipt of "charge receipts", and are discharged upon issue of "discharge receipts".

In an RDBMS, to keep track of how many items in "warehouse" have been charged, discharged, and unsold (the difference), i could - on the fly - run proper queries on the "receipts" tables.

For performance purposes, i could instead store those values in the "warehouse" table directly, and update them each time a new receipt is created.

But, the question is: apart from performance, which is the best way to guarantee - or to enforce - consistency among "warehouse" and the two "receipts" tables?

I'm looking for *acquired* practices.

TIA, julio.
0
julio011597
Asked:
julio011597
  • 4
  • 2
1 Solution
 
d003303Commented:
Yo,
it seems that you want realtime processing. In a multi-user environment, like your warehouse, all data has to be consistant, even if multiple users work with the same data. So the first thing is to differ between read and write processes on the DB. Write processes have always to be verified, so if the data that is to be written has already been modified by another user in the meantime. This is a basic for transaction oriented systems. A transaction environment buffers all the data that is needed in a transaction buffer (say, a snapshot of a part of the DB on a special point of time). When the transaction is finished, the data to be written is verified by the transaction environment, or, if necessary, locked to other users.
If you use such an environment, you would not need to track the status in the DB, because the transaction will keep track of the current states by requerying the status when the transaction buffer is filled. This is more flexible, but more complex to code.
Your choice !

Slash/d003303
0
 
julio011597Author Commented:
Thanks dd,

BUT, i've quite a different situation.

My application is actually an Intranet, where i have a bunch of CGIs accessing a DB.
Transaction issues are solved in a very simple manner - please, don't kill me :) -: all my tables are set exclusive! (i have a "max-tries-to-attempt" system to avoid deadlocks)

My concern though remains: what if the "charge receipt" process adds a receipt, then (someway!?) there's a crash BEFORE the "charged" value into "warehouse" gets incremented?

I bet there's some sort of *general* approach to this kind of potential inconsistencies.

(i keep stressing on *acquired* approaches because, while i'm able to image a few ways to go, i'm not sure they are the best human beings have ever invented :)

Cheers.
0
 
julio011597Author Commented:
Hey d003303, is this question too dumb?... please, be frank.

BTW, there's a question in the Web Authoring area you might be interested in.

Title: "compress HTML"
Url: http://207.114.128.129/topics/comp/www/authoring/Q.10049765
0
Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
d003303Commented:
Hi julio,
I gave a comment on the tread you pointed to. Let's see what's to do there.
No, your question is not too dumb. I think I misunderstood your goal. So, if I'm right, you want to make all billings safe against any crashes in the billing process. What about calling a stored procedure that is setting all values accordingly on the DB-server side ? Or, another thing, if your prog crashes, send an administrative EMail and log the current state of the prog into a file that a billing process can be recovered. What do you think ?

Slash/d003303
0
 
cymbolicCommented:
Generally, operational (transaction based) databases are kept separate from a data warehouse database, and an extract/load process is used to carefully update the data warehouse from the operational database.  One reason is that essential data structure and content vary.  A true data warehouse is designed for quick lookup and summary of essential data via multiple attributes.  Many modern data warehouses are stored in a star based design, with one central fact table and multiple dimension tables related back to the fact table.  

On the other hand, your operational database is structured for rapid access and high volume update/insert operations.  Much more detailed, and often long text data is saved in the database.

Your operational data is constantly changing and has an "accretion edge" of accumulating, unstable and changing data that you do not want into your data warehouse until it has stabilized and reached a point of valid content.

Typically, you would wait for a decent interval before loading the accumulated operational data into your warehouse.  In the process you might "scrub" the data to get say, the twelve different spellings of Cleveland down to one correct version.

Data warehouses need to have reliable, unchanging historical data so that your users can have a degree of confidence in the validity of the information, hence the time lag and careful loading process.  Generally, you would not go back and change data in your warehouse, only add to it as more operational data is gathered and time intervals pass.

For a good source, read "The Data Warehouse Toolkit" by Ralph Kimball, WIley press.  
0
 
julio011597Author Commented:
Hello back,

sorry for this long silence: the fact is that there are few things not yet clear to me, and i've tried to inquire further on the net before coming back.

First, given the original question, cymbolic's answer sounds already satisfactory. d003303's contribution is also much appreciated, but falls a bit too much outside the domain of *pure* DB design.

Anyway, i'm a beginner in this field, and some additional clarifications (examples? online white papers?) are needed. Basically, i miss the "operational DB" concept at all. A few sample questions follow:

> extract/load process is used to carefully update the
> data warehouse from the operational database

Any scheme (method) in order to decide what to extract to the operational DB?
And, is Structured Analysis (ERD and DFD) enough to model such an architecture?

> Many modern data warehouses are stored in a star based
> design, with one central fact table and multiple dimension
> tables related back to the fact table

What "dimension tables" are?

> On the other hand, your operational database is structured
> for rapid access and high volume update/insert operations.
> Much more detailed, and often long text data is saved in
> the database

The overall doubt: given usual charge/discharge operations - by far, the most accessed in my system -, i cannot see a way to separate what should be rapidly accessed from what is to be considered "more details"... am i completely missing the point?

And, a side question: are Delphi's transaction facilities to be considered equivalent to an operational "bridge"?


BTW, i've posted this question to comp.databeses.theory, as well.
Here is part of what they answered:

Q: For performance purposes, i could instead store those values in the "warehouse" table directly.
A: I don't think anyone would recommend you follow the "don't store derived data" rule if it would mean that your database manager would crash or be unable to process user requests in an acceptable time. I agree with your solution using "warehouse tables".

Any comment?


Ok, a bit long, but i'll be happy to increase the points for your additional efforts - if any ;)

TIA, julio.
0
 
julio011597Author Commented:
cymbolic, i'm going to grade you in a couple of days if nothing new happens.
Anyway, please not that, while your answer is very knowledgable, i still lack something like a *mechanism* to apply the method. That is, the said "way to separate what should be rapidly accessed from what is to be considered more details". Without any indication - even a simple example - i won't be able to go from theory to practice... or am i just missing the point?

Thanks in any case, julio
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

  • 4
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now