Solved

Warehouse system design

Posted on 1998-04-25
Last Modified: 2010-04-06
First, please note this is a *design* question; i don't need code - not yet at least :)

Schematically, in my warehouse, items (goods)  are charged upon receipt of "charge receipts", and are discharged upon issue of "discharge receipts".

In an RDBMS, to keep track of how many items in "warehouse" have been charged, discharged, and unsold (the difference), i could - on the fly - run proper queries on the "receipts" tables.
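
A minimal sketch of the on-the-fly option, assuming hypothetical tables `charge_receipts` and `discharge_receipts` with `item` and `qty` columns (these names are not from the original system) and using SQLite only so that the example is self-contained:

```python
import sqlite3

# Hypothetical schema: one row per receipt line, quantities are positive integers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE charge_receipts    (id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL);
    CREATE TABLE discharge_receipts (id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL);
    INSERT INTO charge_receipts (item, qty) VALUES ('bolt', 100), ('bolt', 50), ('nut', 30);
    INSERT INTO discharge_receipts (item, qty) VALUES ('bolt', 40);
""")

def stock_levels(conn):
    """Return {item: (charged, discharged, unsold)}, derived from the receipts only."""
    rows = conn.execute("""
        SELECT c.item,
               c.charged,
               COALESCE(d.discharged, 0),
               c.charged - COALESCE(d.discharged, 0)
        FROM (SELECT item, SUM(qty) AS charged
              FROM charge_receipts GROUP BY item) AS c
        LEFT JOIN (SELECT item, SUM(qty) AS discharged
                   FROM discharge_receipts GROUP BY item) AS d
               ON d.item = c.item
        ORDER BY c.item
    """).fetchall()
    return {item: (charged, discharged, unsold)
            for item, charged, discharged, unsold in rows}

levels = stock_levels(conn)
print(levels)
```

Nothing is stored twice here, so the totals cannot drift away from the receipts; the cost is that every lookup re-aggregates the receipt tables.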

For performance purposes, i could instead store those values in the "warehouse" table directly, and update them each time a new receipt is created.

But, the question is: apart from performance, which is the best way to guarantee - or to enforce - consistency among "warehouse" and the two "receipts" tables?

I'm looking for *acquired* practices.

TIA, julio.
Question by:julio011597
 
LVL 4

Expert Comment

by:d003303
ID: 1336942
Yo,
it seems that you want real-time processing. In a multi-user environment like your warehouse, all data has to stay consistent, even when multiple users work with the same data. So the first thing is to distinguish between read and write processes on the DB. Writes always have to be verified, i.e. checked for whether the data to be written has already been modified by another user in the meantime. This is basic for transaction-oriented systems. A transaction environment buffers all the data that is needed in a transaction buffer (say, a snapshot of a part of the DB at a particular point in time). When the transaction is finished, the data to be written is verified by the transaction environment or, if necessary, locked against other users.
If you use such an environment, you would not need to track the status in the DB, because the transaction keeps track of the current state by requerying the status when the transaction buffer is filled. This is more flexible, but more complex to code.
Your choice !

Slash/d003303
 
LVL 5

Author Comment

by:julio011597
ID: 1336943
Thanks dd,

BUT, i've quite a different situation.

My application is actually an Intranet, where i have a bunch of CGIs accessing a DB.
Transaction issues are solved in a very simple manner - please, don't kill me :) -: all my tables are set exclusive! (i have a "max-tries-to-attempt" system to avoid deadlocks)

My concern though remains: what if the "charge receipt" process adds a receipt, then (somehow!?) there's a crash BEFORE the "charged" value in "warehouse" gets incremented?
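
One common way to rule out this particular half-done state is to put both writes in a single database transaction, so that either both are committed or neither is. The sketch below assumes a hypothetical `warehouse` table with a `charged` column and simulates the crash with an exception; in a real crash the process would simply die and the DBMS would discard the uncommitted work on recovery.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so BEGIN / COMMIT / ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE warehouse (item TEXT PRIMARY KEY, charged INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE charge_receipts (id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL);
    INSERT INTO warehouse (item, charged) VALUES ('bolt', 0);
""")

def add_charge_receipt(conn, item, qty, crash=False):
    """Insert the receipt and bump the stored total as one atomic unit."""
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO charge_receipts (item, qty) VALUES (?, ?)", (item, qty))
        if crash:
            # Stand-in for a failure between the two writes.
            raise RuntimeError("simulated crash between the two writes")
        conn.execute("UPDATE warehouse SET charged = charged + ? WHERE item = ?", (qty, item))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # the receipt insert is undone as well
        raise

add_charge_receipt(conn, "bolt", 100)
try:
    add_charge_receipt(conn, "bolt", 50, crash=True)
except RuntimeError:
    pass

receipts_total = conn.execute("SELECT COALESCE(SUM(qty), 0) FROM charge_receipts").fetchone()[0]
charged = conn.execute("SELECT charged FROM warehouse WHERE item = 'bolt'").fetchone()[0]
print(receipts_total, charged)
```

After the failed second call the receipt table and the stored total still agree, because the half-finished transaction left no trace.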

I bet there's some sort of *general* approach to this kind of potential inconsistency.

(i keep stressing *acquired* approaches because, while i'm able to imagine a few ways to go, i'm not sure they are the best human beings have ever invented :)

Cheers.
 
LVL 5

Author Comment

by:julio011597
ID: 1336944
Hey d003303, is this question too dumb?... please, be frank.

BTW, there's a question in the Web Authoring area you might be interested in.

Title: "compress HTML"
Url: http://207.114.128.129/topics/comp/www/authoring/Q.10049765
 
LVL 4

Expert Comment

by:d003303
ID: 1336945
Hi julio,
I gave a comment on the thread you pointed to. Let's see what's to do there.
No, your question is not too dumb. I think I misunderstood your goal. So, if I'm right, you want to make all billings safe against any crashes in the billing process. What about calling a stored procedure that sets all values accordingly on the DB-server side? Or, another thing: if your prog crashes, send an administrative e-mail and log the current state of the prog to a file, so that the billing process can be recovered. What do you think?
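
A rough illustration of the server-side idea. SQLite has no stored procedures, so this sketch swaps in triggers as a stand-in for "logic that runs inside the DB server"; the table and trigger names are hypothetical. The point is the same: the client only inserts a receipt, and the database itself updates the stored totals as part of the same statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse (
        item       TEXT PRIMARY KEY,
        charged    INTEGER NOT NULL DEFAULT 0,
        discharged INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE charge_receipts    (id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL);
    CREATE TABLE discharge_receipts (id INTEGER PRIMARY KEY, item TEXT NOT NULL, qty INTEGER NOT NULL);

    -- Each trigger fires inside the INSERT that caused it,
    -- so the receipt and the total are written together or not at all.
    CREATE TRIGGER trg_charge AFTER INSERT ON charge_receipts
    BEGIN
        UPDATE warehouse SET charged = charged + NEW.qty WHERE item = NEW.item;
    END;

    CREATE TRIGGER trg_discharge AFTER INSERT ON discharge_receipts
    BEGIN
        UPDATE warehouse SET discharged = discharged + NEW.qty WHERE item = NEW.item;
    END;

    INSERT INTO warehouse (item) VALUES ('bolt');
""")

# The client code never touches the warehouse totals directly.
conn.execute("INSERT INTO charge_receipts (item, qty) VALUES ('bolt', 100)")
conn.execute("INSERT INTO discharge_receipts (item, qty) VALUES ('bolt', 40)")
conn.commit()

row = conn.execute(
    "SELECT charged, discharged, charged - discharged FROM warehouse WHERE item = 'bolt'"
).fetchone()
print(row)
```

On a server DBMS the same effect could be had with a stored procedure that performs both writes in one transaction; which of the two is preferable depends on the product in use.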

Slash/d003303
 
LVL 9

Accepted Solution

by:
cymbolic earned 100 total points
ID: 1336946
Generally, operational (transaction based) databases are kept separate from a data warehouse database, and an extract/load process is used to carefully update the data warehouse from the operational database.  One reason is that essential data structure and content vary.  A true data warehouse is designed for quick lookup and summary of essential data via multiple attributes.  Many modern data warehouses are stored in a star based design, with one central fact table and multiple dimension tables related back to the fact table.  
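
To make the star layout concrete, here is a small sketch with one fact table and two dimension tables. All table and column names (`fact_movement`, `dim_item`, `dim_date`) are made up for illustration and SQLite is used only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes you summarize BY.
    CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);

    -- Fact table: the numeric measures, keyed by the dimensions.
    CREATE TABLE fact_movement (
        item_id        INTEGER REFERENCES dim_item(item_id),
        date_id        INTEGER REFERENCES dim_date(date_id),
        qty_charged    INTEGER NOT NULL,
        qty_discharged INTEGER NOT NULL
    );

    INSERT INTO dim_item VALUES (1, 'bolt', 'fasteners'), (2, 'nut', 'fasteners'), (3, 'drill', 'tools');
    INSERT INTO dim_date VALUES (1, '1998-04-01', '1998-04'), (2, '1998-05-01', '1998-05');
    INSERT INTO fact_movement VALUES (1, 1, 100, 40), (2, 1, 30, 0), (3, 1, 5, 1), (1, 2, 50, 20);
""")

# Summarize the facts by two attributes taken from the dimension tables.
summary = conn.execute("""
    SELECT i.category, d.month, SUM(f.qty_charged), SUM(f.qty_discharged)
    FROM fact_movement AS f
    JOIN dim_item AS i ON i.item_id = f.item_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY i.category, d.month
    ORDER BY i.category, d.month
""").fetchall()
print(summary)
```

Every summary query has the same shape: join the fact table to whichever dimensions are of interest and group by their attributes.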

On the other hand, your operational database is structured for rapid access and high volume update/insert operations.  Much more detailed, and often long text data is saved in the database.

Your operational data is constantly changing and has an "accretion edge" of accumulating, unstable and changing data that you do not want in your data warehouse until it has stabilized and reached a point of valid content.

Typically, you would wait for a decent interval before loading the accumulated operational data into your warehouse.  In the process you might "scrub" the data to get, say, the twelve different spellings of Cleveland down to one correct version.
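
A scrubbing step can be as simple as a lookup table of known variants. The sketch below is only an illustration: the alias list and the sample rows are invented, and a real load would need a far larger mapping plus a way to flag unknown values for review.

```python
# Hypothetical alias table: normalized variant -> canonical spelling.
CITY_ALIASES = {
    "cleveland": "Cleveland",
    "cleavland": "Cleveland",
    "clevland": "Cleveland",
    "cleveland oh": "Cleveland",
}

def scrub_city(raw):
    """Map a raw city string to its canonical spelling, if a variant is known."""
    # Normalize: drop commas and periods, lower-case, collapse whitespace.
    key = " ".join(raw.replace(",", " ").replace(".", " ").lower().split())
    # Unknown values pass through unchanged (apart from trimming).
    return CITY_ALIASES.get(key, raw.strip())

operational_rows = [
    ("r1", "Cleveland"),
    ("r2", " cleavland "),
    ("r3", "CLEVLAND"),
    ("r4", "Cleveland, OH."),
    ("r5", "Akron"),
]

scrubbed = [(receipt_id, scrub_city(city)) for receipt_id, city in operational_rows]
print(scrubbed)
```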

Data warehouses need to have reliable, unchanging historical data so that your users can have a degree of confidence in the validity of the information, hence the time lag and careful loading process.  Generally, you would not go back and change data in your warehouse, only add to it as more operational data is gathered and time intervals pass.
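
The time lag and the add-only rule can be sketched as a periodic load job. Everything here is an assumption made for illustration: the table names `op_receipts` and `wh_receipts`, the 30-day lag, and the use of SQLite for both sides. The job copies only rows older than the cutoff, and never rewrites rows that were loaded earlier.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE op_receipts (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER, issued_on TEXT);
    CREATE TABLE wh_receipts (receipt_id INTEGER PRIMARY KEY, item TEXT, qty INTEGER, issued_on TEXT);
    INSERT INTO op_receipts VALUES
        (1, 'bolt', 100, '1998-03-01'),
        (2, 'nut',   30, '1998-03-20'),
        (3, 'bolt',  50, '1998-04-24');
""")

def load_stable_rows(conn, today, lag_days=30):
    """Append operational rows older than the lag to the warehouse; return how many were added."""
    cutoff = (today - timedelta(days=lag_days)).isoformat()  # ISO dates compare correctly as text
    with conn:  # commit on success, roll back on error
        cur = conn.execute("""
            INSERT INTO wh_receipts (receipt_id, item, qty, issued_on)
            SELECT id, item, qty, issued_on
            FROM op_receipts
            WHERE issued_on <= ?
              AND id NOT IN (SELECT receipt_id FROM wh_receipts)
        """, (cutoff,))
    return cur.rowcount

first_run = load_stable_rows(conn, date(1998, 4, 25))
second_run = load_stable_rows(conn, date(1998, 4, 25))  # nothing new is old enough yet
loaded_ids = [r[0] for r in conn.execute("SELECT receipt_id FROM wh_receipts ORDER BY receipt_id")]
print(first_run, second_run, loaded_ids)
```

Running the job twice is harmless: rows already in the warehouse are skipped, and the most recent receipt stays in the operational table until it has aged past the cutoff.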

For a good source, read "The Data Warehouse Toolkit" by Ralph Kimball (Wiley).
 
LVL 5

Author Comment

by:julio011597
ID: 1336947
Hello back,

sorry for this long silence: the fact is that there are a few things not yet clear to me, and i've tried to inquire further on the net before coming back.

First, given the original question, cymbolic's answer sounds already satisfactory. d003303's contribution is also much appreciated, but falls a bit too much outside the domain of *pure* DB design.

Anyway, i'm a beginner in this field, and some additional clarifications (examples? online white papers?) are needed. Basically, i don't get the "operational DB" concept at all. A few sample questions follow:

> extract/load process is used to carefully update the
> data warehouse from the operational database

Any scheme (method) in order to decide what to extract from the operational DB?
And, is Structured Analysis (ERD and DFD) enough to model such an architecture?

> Many modern data warehouses are stored in a star based
> design, with one central fact table and multiple dimension
> tables related back to the fact table

What are "dimension tables"?

> On the other hand, your operational database is structured
> for rapid access and high volume update/insert operations.
> Much more detailed, and often long text data is saved in
> the database

The overall doubt: given usual charge/discharge operations - by far, the most accessed in my system -, i cannot see a way to separate what should be rapidly accessed from what is to be considered "more details"... am i completely missing the point?

And, a side question: are Delphi's transaction facilities to be considered equivalent to an operational "bridge"?


BTW, i've posted this question to comp.databases.theory, as well.
Here is part of what they answered:

Q: For performance purposes, i could instead store those values in the "warehouse" table directly.
A: I don't think anyone would recommend you follow the "don't store derived data" rule if it would mean that your database manager would crash or be unable to process user requests in an acceptable time. I agree with your solution using "warehouse tables".

Any comment?


Ok, a bit long, but i'll be happy to increase the points for your additional efforts - if any ;)

TIA, julio.
 
LVL 5

Author Comment

by:julio011597
ID: 1336948
cymbolic, i'm going to grade you in a couple of days if nothing new happens.
Anyway, please note that, while your answer is very knowledgeable, i still lack something like a *mechanism* to apply the method. That is, the said "way to separate what should be rapidly accessed from what is to be considered more details". Without any indication - even a simple example - i won't be able to go from theory to practice... or am i just missing the point?

Thanks in any case, julio