Solved

String to Number

Posted on 2002-03-05
15
353 Views
Last Modified: 2010-04-05
http://www.experts-exchange.com/jsp/qManageQuestion.jsp?ta=delphi&qid=20205546

Same question however let's simplify it.

I have wrote a chat server and i would like to store information on every user using a hashing table, so all i need to do is take a username (any characters) and hash it, seek to that record number, write out the record and get out...

so craig_capel username for example = 33233 so i seek to that record and write the data, the routine i have is ok, but i know for a fact it's not good for it's size, can anyone come up with a better method which does not use 1.2gig  (my only limit is that the usernames HAVE to be <32 chars in length)

So HashStringToNumber(Username: String): Integer;

OpenMyDataBase;
SeekToRecordNumber(HashToString('craig_capel'));
WriteDataOut;
CloseMyDataBase;

Thanks People (P.S if any admin are reading, please delete that thread at the top so i can finally free it)

-

Craig C.
0
Comment
Question by:craig_capel
15 Comments
 
LVL 1

Expert Comment

by:martin_g
Comment Utility
If it may help, there are some great examples of hashing string tables and using them as keys at:
http://homepages.borland.com/efg2lab/Library/Delphi/Algorithms/
Look under "Hashing"...
0
 
LVL 2

Expert Comment

by:DelphiArchitect
Comment Utility
Well, the first problem I see is what happens if you have a hash collision?  It is very likely that at some point you will have 2 users who hash to the same value (unless you use an incredibly huge table, which is a waste of space, and you still have a chance of this).

When you refer to "database" what are you using?  If your using any existing database then it should already support indicies which would be much better than implementing your own hash table.  If your not using an existing database, why not?  At the very least use the BDE which comes with Delphi or Access which is common.  You'll be vastly better off going this way.

If you insist on implementing your own database look into B+ trees which is what many database indicies are based on.

The other option is to manually implement your own in-memory index using a hash table.  I'd recommend a more typical hash implementation that uses overflow buckets or similar to handle hash collisions.  At startup read in the user name for each database record.  Add these to your hash table using the record number as the value stored.  Then you can check the hash table to get the record number quickly.  The plus side here is that hash codes don't directly equate to record numbers, thus you don't have a huge problem with hash collisions and you will save a ton of space (since you only have to have records for users, not for every possible hash value).

Hope this helps.

Delphi Architect
James Higgins
0
 
LVL 9

Expert Comment

by:ITugay
Comment Utility
Hi craig_capel,

I saw your previous question. There is another solution, without any database engines.
As I understood, you need to have fast access to user's data. So, split your data into  two files.

First file keeps all users specific data excluding username. Second file keeps username and position of associated data in first file. When your application started, just load second file into TStringList, it able to keep strings and some associated data, offset of user's data in your case. Searching in stringlist is fast enought, you can even sort your strings and use some fast  algorithm. E.g. half-division method let you search in 1,000,000 records using only 20 string comparision operations.

-----
Igor.
0
 
LVL 14

Expert Comment

by:AvonWyss
Comment Utility
I agree with Igor.

Also, how many users are you expecting, or how many do you want to support?
0
 
LVL 2

Author Comment

by:craig_capel
Comment Utility
well before it was for my own server, now i am writing a Client based chat program for Yahoo - it would encounter around 200 - 300 users (different daily) i am actually using my old routine at the moment

Function NameToInt(Name: String): Longint;
 var
  num: integer;
  tmp: string;
  gennum: longint;
begin
 Name:=LowerCase(Name);
 tmp:=name;
 gennum:=0;
for num:=1 to length(tmp) do
 begin
   gennum:=gennum+((ord(tmp[num]))*num);
 end;
   Result:=Gennum;
end;

so if this is the best i am going to get, then i shall use this :)

Thanks
0
 
LVL 14

Expert Comment

by:AvonWyss
Comment Utility
With your routine, the users "Z3", "X4", "V5", "T6", "R7", "P8", "N9" will have the same value.Many, many other combinations will also result in the same result of your NameToInt.

So, why aren't you implementing the method proposed by Igor? It would guarantee unique numbering and only use as much space as needed. For some 300 Users, it wouldn't be a problem to hold the information completely in memory anyways, eliminating any need for hashing because a binary search would be just perfect.

If you are after a better hashing method, I can provide one, but the problem of multiple names yialding the same key will always persis as long as you don't want to use a really large hash (which would again be useless because it will not be small enough to serve as index in a file). Assuming that you support 44 different chars in your names (that is, the alphabet plus digits plus a very few special chars) and names up to 32 chars length, you'll get about 39'853'812'061'028'300'000'000'000'000'000'000'000'000'000'000'000'000 different possible combinations (give or take a few). If your hash is any smaller than that, there will be double identifiers. And that's really why hashing is NOT the solution for what you want to do (gettig an unique index for any valid name).
0
 
LVL 2

Author Comment

by:craig_capel
Comment Utility
Well ok - i have on offer 300 points, is 300 points worth an example of how to sort the TStringList;

i know how to use them, but i have never sorted the information - so if you can show me how to do that, i will award the points.

- Thanks


Craig C.
0
Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

 
LVL 14

Expert Comment

by:AvonWyss
Comment Utility
var
    List: TStringList;

List:=TStringList.Create;
List.Sorted:=True;

Adding:
List.AddObject(UpperCase(NickToAdd),PointerToYourInformation);

Searching:
List.IndexOf(UpperCase(NickToFind));

As long as the list is marked as Sorted, the IndexOf will be VERY quick (binary search).
0
 
LVL 2

Author Comment

by:craig_capel
Comment Utility
List.AddObject(UpperCase(NickToAdd),PointerToYourInformation);

how does the Tobject work when used in conjunction with List.AddObject?

TObject is the ultimate ancestor of all VCL objects and components. - how do i declare my own information, it's just confusing with me, I mainly use straight pascal, Object Pascal is still confusing.

Thanks -


Craig C.
0
 
LVL 2

Author Comment

by:craig_capel
Comment Utility
List.AddObject(UpperCase(NickToAdd),PointerToYourInformation);

how does the Tobject work when used in conjunction with List.AddObject?

TObject is the ultimate ancestor of all VCL objects and components. - how do i declare my own information, it's just confusing with me, I mainly use straight pascal, Object Pascal is still confusing.

Thanks -


Craig C.
0
 
LVL 14

Expert Comment

by:AvonWyss
Comment Utility
Okay. I assume that your information is storen in a class, that is, inherits from TObject. This allows you to just insert you calss members to that list uaing AddObject, and you can read and write the values using the Objects[] array-style property.

List.AddObject(UpperCase(NickToAdd),YourInformationClassInstance);

This is the same code as before for adding. Let's assume that you're storeing your information in a class called TYourInformationClass.

YourInformationClassInstance:=List.Objects[List.IndexOf(NickToAdd)] as TYourInformationClass;

This gets an instance back, makes sure that it is a TYourInformationClass instance and not some other class type and then assigns it to the YourInformationClassInstance variable.
0
 
LVL 2

Author Comment

by:craig_capel
Comment Utility
Nope, i was using  Type MyRecord = Record
                                        SomeData: String[30];
                                        Username: String[30];
                                        Ignored: Boolean;
                                   End;

And So on. i don't think it would be too hard to convert it to an Object?
0
 
LVL 14

Expert Comment

by:AvonWyss
Comment Utility
Using objects would not be a problem, but it's not even really necessary here. You can use pointers and cast them to and from TObject. To simplify the identification of types, pointer types and variables (instances), I'd suggest that you use T as type prefix and P as pointer prefix. This is widely accepted as standard.

type
    PMyRecord = ^TMyRecord;
    TMyRecord = record
        SomeData: String[30];
        Username: String[30];
        Ignored: Boolean;
    end;

var
    MyRecord: PMyRecord;
    Index: Integer;

// adding
New(MyRecord);
MyRecord.SomeData:='Use The Record here';
MyRecord.Username:='Nickname';
List.Add(MyRecord.Username,TObject(MyRecord));

// searching
Index:=List.IndexOf(NickToFind);
MyRecord:=PMyRecord(List.Objects[Index]);

// deleting
Index:=List.IndexOf(NickToFind);
Dispose(PMyRecord(List.Objects[Index]));
List.Delete(Index);
0
 
LVL 2

Author Comment

by:craig_capel
Comment Utility
Why would i add the Record, if all i am going to do is search for the String to return the data to the other File which contains all the records...

Type UserData = Record //to new db
   Loggedin : String[30];
   Loggedout : String[30];
   StartedLogging: String[30];
   UserName: String[50];
   UserAccess: integer;
   Ignored: Boolean;
   IgnoredWhen: String[30];
   Ignoredby: String[30];
   SponsoredBy: string[45];
   ReasonIgnored: String[50];
   CustomName: String[15];
   ShoutCount: Integer;
   SecondsOnline: LongInt;
   PrivateUser: Boolean;
   LastUrl: String[155];
   SecondsInRoom: Integer;
   LastThingSaid: String[30];
   ThingsSaid: LongInt;
   LastRoom: String[40];
   UsingClient: String[30];
   ClientsAge: Integer;
   MostUsedLanguage: String[10];
   XB,Perl,vb,c,delphi,java,cpp,csharp,php: integer;
   Location: String[50];
   Gender: String[10];
   Marital: String[30];
   Occupation: String[60];
   Sweared: Integer;
   AutoFriendsList: Boolean;
   RecPos: Longint;
End;


at the moment i have a File Of UserData, and i seek, i can still continue to seek to get the information out, surely if i add that whole record to TStringList object, then it''s going to be far too slow?

esp if all it is going to do is pull out the Index?
0
 
LVL 14

Accepted Solution

by:
AvonWyss earned 300 total points
Comment Utility
Well, the technique shows how to store and retrieve a record in a quick fashion. The contents of the record does not matter. For instance, you can make a record where you store only very little information, such as the index to seek to in your file.

Note that keeping data in memory is always faster than readin from the disk, even for large quantities, as long as the data fits the memory. With an estimated 2000 bytes per user you're using, you can keep about 32000 users in 64mb of memory. Using the sorted TStringList, a search for *any* user by his nick will not take any more than about 15 steps (comparisions), which is done in virtually no time in memory. SO you may want to load all users into memory on startup, and save the users when necessary (dor instance, when a user information has changed or if a new user it to be added). Since you can also keep the poition of the user inside the file in memory, you'll have very quick access to the records in the file.
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

A lot of questions regard threads in Delphi.   One of the more specific questions is how to show progress of the thread.   Updating a progressbar from inside a thread is a mistake. A solution to this would be to send a synchronized message to the…
In my programming career I have only very rarely run into situations where operator overloading would be of any use in my work.  Normally those situations involved math with either overly large numbers (hundreds of thousands of digits or accuracy re…
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…
When you create an app prototype with Adobe XD, you can insert system screens -- sharing or Control Center, for example -- with just a few clicks. This video shows you how. You can take the full course on Experts Exchange at http://bit.ly/XDcourse.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now