Solved

Info about Hash method

Posted on 1997-09-11
Last Modified: 2010-08-05
I need to use VB 4 to manage a big database (maybe 800 hundred to one million records), and someone told me that it is better to use the hash method than an Access database.
What is the hash method, where can I get information about it, and why should I choose it over the Access format?
Question by:cano091197
 

 
LVL 2

Expert Comment

by:Sinclair
ID: 1434797
If you wish to make your own "database engine", you can use a hash table. A hash table uses some sort of a hash function to assign a number to every record in the database. Then, you can make a hash table which is an array whose indices correspond to the hash numbers, and whose elements contain pointers/file offsets/whatever which point to the records.
Example: suppose you have a database of strings, and your hash function assigns a number to every letter of the alphabet (A=1,B=2, etc.), taking the first character of every string and returning the number. So, if your database contains the following records:

File offset   Record contents
   1             Apple
   20            Book
   63            Doom
   48            Zargon

your hash table will look like this:
Array Index    Contents   (Implied record)
  [1]            1           Apple
  [2]           20           Book
  [3]         Nothing      
  [4]           63           Doom
  [5]         Nothing
  ...
 [26]           48           Zargon

Now, if the user wants to find "Zargon" in the database, all you have to do is run it through the hash function (which will give you 26), and go to the file offset specified in the 26th position of the array.
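The lookup described above can be sketched in a few lines. This is only an illustration in Python rather than VB 4; the names `first_letter_hash`, `table`, and `lookup` are made up for the sketch, and the records are the four from the example.

```python
# Sketch of the first-letter hash table from the example (Python, not VB 4).
TABLE_SIZE = 26


def first_letter_hash(key):
    """Map A/a -> 1, B/b -> 2, ..., Z/z -> 26 using the first character."""
    code = ord(key[0].upper()) - ord("A") + 1
    if not 1 <= code <= TABLE_SIZE:
        raise ValueError("key must start with a letter A-Z")
    return code


# Records as (file offset, contents), taken from the example above.
records = [(1, "Apple"), (20, "Book"), (63, "Doom"), (48, "Zargon")]

# Index 0 is left unused so that indices match the 1-based hash values.
table = [None] * (TABLE_SIZE + 1)
for offset, contents in records:
    table[first_letter_hash(contents)] = offset


def lookup(key):
    """Return the file offset stored in the key's hash slot, or None."""
    return table[first_letter_hash(key)]


print(lookup("Zargon"))  # 48
print(lookup("Cat"))     # None (slot 3 is empty)
```

Note that `lookup` returns whatever offset sits in the slot; a real implementation would still read the record at that offset and compare it with the key, because different keys can share a slot.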
Problems arise when "collisions" occur, that is, your hash function returns the same number for two different records (in my example, this would occur if the database contained both "Book" and "Bread").
This is really all I know. A good, cheap hash method would be to compute a checksum of every record (XORing it in some way), but I have read that this is not very efficient... You will have to ask the real experts for more info.
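One possible reading of "XORing it in some way" is sketched below in Python. The function name `xor_checksum_hash` and the byte-wise folding are assumptions for illustration, not a description of any particular product.

```python
def xor_checksum_hash(record, table_size):
    """Fold every byte of the record together with XOR, then reduce to a slot."""
    checksum = 0
    for byte in record.encode("utf-8"):
        checksum ^= byte
    return checksum % table_size


# XOR ignores byte order, so records with the same letters collide.
print(xor_checksum_hash("ab", 256) == xor_checksum_hash("ba", 256))  # True
```

The sketch also hints at why such a checksum may spread records poorly: a single-byte XOR can only produce values from 0 to 255, so a larger table would never use most of its slots.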
 
LVL 9

Accepted Solution

by:
cymbolic earned 200 total points
ID: 1434798
Sinclair is correct.  Hashing algorithms are merely methods of quickly calculating record offsets by using some of the data in each record as input to any of a number of algorithms, all of which rely on uniqueness in your data; you customize the algorithm to that uniqueness to keep collisions to a minimum.  Pick up any basic computer science textbook to get a more complete description, with examples.

However, these implementations trade space for speed.  That is, the space you reserve must be much larger than the space you actually need, in order to avoid collisions, which occur when your hashing algorithm calculates the same record number for two different records.  Also, you usually need some link-pointer method to resolve duplicate record number calculations.
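A common form of the link-pointer idea is chaining, where each slot holds a short list of entries instead of a single one. The sketch below is in Python and reuses the first-letter hash from Sinclair's example; the class name `ChainedHashTable` and the file offset for "Bread" are hypothetical.

```python
class ChainedHashTable:
    """Hash table whose slots hold lists, so colliding keys can coexist."""

    def __init__(self, slots):
        self.slots = [[] for _ in range(slots)]

    def _index(self, key):
        # First-letter hash as in Sinclair's example, 0-based here.
        return (ord(key[0].upper()) - ord("A")) % len(self.slots)

    def insert(self, key, offset):
        bucket = self.slots[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, offset)  # overwrite an existing entry
                return
        bucket.append((key, offset))

    def find(self, key):
        # Walk the chain and compare keys, since several may share a slot.
        for existing_key, offset in self.slots[self._index(key)]:
            if existing_key == key:
                return offset
        return None


table = ChainedHashTable(26)
table.insert("Book", 20)
table.insert("Bread", 75)   # hypothetical offset; collides with "Book"
print(table.find("Book"))   # 20
print(table.find("Bread"))  # 75
```

The longer the chains get, the more comparisons each lookup needs, which is why a table with plenty of spare slots tends to be faster.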

However, the picture is not that bad, because the advice you were given is flawed.  Access will still work well for you; you just need to take care in how you define your keys and indexes.
Your adviser is right that it is better not to assign sequential keys when doing mass inserts into the database... but how often do you do that?  The problem is that Access tries to allocate records contiguously on the same block or page based on the primary key for your table.  Since it locks by page, if you are updating consecutive primary keys, you will lock many adjacent rows as well.  These potential limitations really depend on your methods of adding, updating, and reading the records.  If you are selecting based upon criteria other than your primary key, these limitations won't be a problem.  So it all depends!

Also, you can use a form of hashing algorithm in Access as well.
Just calculate primary keys (many experts suggest using large floating-point results, because the possibility of duplicates is smaller with a larger number range) instead of assigning sequential numbers.

This presupposes that you are using primary keys that are not also data items.  Hope this helps.
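As a rough sketch of calculated keys, the Python snippet below derives a key from some record data instead of counting upward. It uses a 32-bit CRC from the standard `zlib` module rather than the floating-point results mentioned above; that choice, the function name `hashed_key`, and the sample customer codes are all assumptions made for illustration.

```python
import zlib


def hashed_key(natural_key):
    """Derive a 32-bit integer key from record data (CRC-32 is an assumed choice)."""
    return zlib.crc32(natural_key.encode("utf-8"))


# Consecutive inputs produce scattered keys rather than adjacent ones.
for name in ["CUST-000001", "CUST-000002", "CUST-000003"]:
    print(name, hashed_key(name))
```

Duplicates are still possible with any calculated key, so the table would need a unique index on the key column and some way to pick a different key when an insert is rejected.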
 

Author Comment

by:cano091197
ID: 1434799
I asked where I could find information about it, and nobody wrote about that.