Solved

Info about Hash method

Posted on 1997-09-11
6
169 Views
Last Modified: 2010-08-05
I need use VB 4 to managment a big database (maybe a 800 hundred to one million records), and someone said me that is better use the hash method than a access database.
What is the hash method, where I can get info about it, and why and must select it before a access format.
0
Comment
Question by:cano091197
  • 4
6 Comments
 

Author Comment

by:cano091197
ID: 1434794
Edited text of question
0
 

Author Comment

by:cano091197
ID: 1434795
Adjusted points to 200
0
 

Author Comment

by:cano091197
ID: 1434796
Edited text of question
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 2

Expert Comment

by:Sinclair
ID: 1434797
If you wish to make your own "database engine", you can use a hash table. A hash table uses some sort of a hash function to assign a number to every record in the database. Then, you can make a hash table which is an array whose indices correspond to the hash numbers, and whose elements contain pointers/file offsets/whatever which point to the records.
Example: suppose you have a database of strings, and your hash function assigns a number to every letter of the alphabet (A=1,B=2, etc.), taking the first character of every string and returning the number. So, if your database contains the following records:

File offset   Record contents
   1             Apple
   20            Book
   63            Doom
   48            Zargon

your hash table will look like this:
Array Index    Contents   (Implied record)
  [1]            1           Apple
  [2]           20           Book
  [3]         Nothing      
  [4]           63           Doom
  [5]         Nothing
  ...
 [26]           48           Zargon

Now, if the user wants to find "Zargon" in the database, all you have to do is run it through the hash function (which will give you 26), and go to the file offset specified in the 26th position of the array.
Problems arise when "collisions" occur, that is, your hash function returns the same number for two different records (in my example, this would occur if the database contained both "Book" and "Bread").
This is really all I know. A good, cheap hash method would be to compute a checksum of every record (XORing it in some way), but I have read that this is not very efficient... You will have to ask the real experts for more info.
0
 
LVL 9

Accepted Solution

by:
cymbolic earned 200 total points
ID: 1434798
Sinclair is correct.  Hashing algorithms are merely methods of quickly calculating record offsets by using some of the data in each record as input to any of a numbe of algoritms, but all of which are based on uniqueness in your data, and customizing the algorithm upon that uniqeness to get the minimum of collisions.  Pick up any basic computer science textbook to get a more complete description, with examples.

However, speed is traded off with space in thes implementations.  That is your saved space must be much larger than your actuall needed space to avoid colissions wen your hashing algoritm calculates the same record number for two differrent records.  also, you usually need some link pinter method to resolve duplicate record number calculations.

However, the picture is not that bad because your original advice is flawed.  Access will still work well for you, you just need to take care in how you define your keys and indexes.
He's right in that it is better not to assign sequential keys if/when doing mass inserts inot the database...but...how often do you do that?  The problem is that Access tries to allocate records contiguously on the same block or page based on the primary key for your table.  Since is locks by page, if you are updateing consecutive primary keys, you will lock many adjacent rows as well.  These potential limitations are really based upon you methods of adding, updating, and reading the records.  If you are selecting based upon criteria other than your primary key, these limitations won't be a problem.  SOOoo...it all depends!

Also, you can use a form of hashing algorithm in Access as well.
Just calculate primary keys (many experts suggest using large floating point number results because the possibility of duplicates is smaler with a larger number range) instead of assigning sequential numbers.  

This presupposes that you are using primary keys that are not also data items.  Hope this helps.
0
 

Author Comment

by:cano091197
ID: 1434799
I ask for source information and nobody wrote about it
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Since upgrading to Office 2013 or higher installing the Smart Indenter addin will fail. This article will explain how to install it so it will work regardless of the Office version installed.
If you need to start windows update installation remotely or as a scheduled task you will find this very helpful.
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…

867 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

23 Experts available now in Live!

Get 1:1 Help Now