Solved

Info about Hash method

Posted on 1997-09-11
6
171 Views
Last Modified: 2010-08-05
I need use VB 4 to managment a big database (maybe a 800 hundred to one million records), and someone said me that is better use the hash method than a access database.
What is the hash method, where I can get info about it, and why and must select it before a access format.
0
Comment
Question by:cano091197
  • 4
6 Comments
 

Author Comment

by:cano091197
ID: 1434794
Edited text of question
0
 

Author Comment

by:cano091197
ID: 1434795
Adjusted points to 200
0
 

Author Comment

by:cano091197
ID: 1434796
Edited text of question
0
Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
LVL 2

Expert Comment

by:Sinclair
ID: 1434797
If you wish to make your own "database engine", you can use a hash table. A hash table uses some sort of a hash function to assign a number to every record in the database. Then, you can make a hash table which is an array whose indices correspond to the hash numbers, and whose elements contain pointers/file offsets/whatever which point to the records.
Example: suppose you have a database of strings, and your hash function assigns a number to every letter of the alphabet (A=1,B=2, etc.), taking the first character of every string and returning the number. So, if your database contains the following records:

File offset   Record contents
   1             Apple
   20            Book
   63            Doom
   48            Zargon

your hash table will look like this:
Array Index    Contents   (Implied record)
  [1]            1           Apple
  [2]           20           Book
  [3]         Nothing      
  [4]           63           Doom
  [5]         Nothing
  ...
 [26]           48           Zargon

Now, if the user wants to find "Zargon" in the database, all you have to do is run it through the hash function (which will give you 26), and go to the file offset specified in the 26th position of the array.
Problems arise when "collisions" occur, that is, your hash function returns the same number for two different records (in my example, this would occur if the database contained both "Book" and "Bread").
This is really all I know. A good, cheap hash method would be to compute a checksum of every record (XORing it in some way), but I have read that this is not very efficient... You will have to ask the real experts for more info.
0
 
LVL 9

Accepted Solution

by:
cymbolic earned 200 total points
ID: 1434798
Sinclair is correct.  Hashing algorithms are merely methods of quickly calculating record offsets by using some of the data in each record as input to any of a numbe of algoritms, but all of which are based on uniqueness in your data, and customizing the algorithm upon that uniqeness to get the minimum of collisions.  Pick up any basic computer science textbook to get a more complete description, with examples.

However, speed is traded off with space in thes implementations.  That is your saved space must be much larger than your actuall needed space to avoid colissions wen your hashing algoritm calculates the same record number for two differrent records.  also, you usually need some link pinter method to resolve duplicate record number calculations.

However, the picture is not that bad because your original advice is flawed.  Access will still work well for you, you just need to take care in how you define your keys and indexes.
He's right in that it is better not to assign sequential keys if/when doing mass inserts inot the database...but...how often do you do that?  The problem is that Access tries to allocate records contiguously on the same block or page based on the primary key for your table.  Since is locks by page, if you are updateing consecutive primary keys, you will lock many adjacent rows as well.  These potential limitations are really based upon you methods of adding, updating, and reading the records.  If you are selecting based upon criteria other than your primary key, these limitations won't be a problem.  SOOoo...it all depends!

Also, you can use a form of hashing algorithm in Access as well.
Just calculate primary keys (many experts suggest using large floating point number results because the possibility of duplicates is smaler with a larger number range) instead of assigning sequential numbers.  

This presupposes that you are using primary keys that are not also data items.  Hope this helps.
0
 

Author Comment

by:cano091197
ID: 1434799
I ask for source information and nobody wrote about it
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Introduction I needed to skip over some file processing within a For...Next loop in some old production code and wished that VB (classic) had a statement that would drop down to the end of the current iteration, bypassing the statements that were c…
If you have ever used Microsoft Word then you know that it has a good spell checker and it may have occurred to you that the ability to check spelling might be a nice piece of functionality to add to certain applications of yours. Well the code that…
Get people started with the process of using Access VBA to control Outlook using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Microsoft Outlook. Using automation, an Access applic…
Get people started with the process of using Access VBA to control Excel using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Excel. Using automation, an Access application can laun…

828 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question