• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 261
  • Last Modified:

Looking for suggestions on in-memory searching...


I currently am using binary searching on an index which is completely maintained in memory.   This is for our intranet daemon which keeps an active list of logged on users.  This handles around 500 requests a second, and has a average size of 750,000 entries (or users logged in).

Basically we have a queue every time a user logs in, and every 10 seconds we "reindex" the main index with the queue.  We did it this way because 500 mutex locks a second on the index + time to search completely killed the CPU's and memory.  We have "markers" throughout the index to help find out where the users information is stored.

This isn't cutting it for us anymore, and I am looking for a solution that will be able to handle REAL-TIME inserts/deletes/searches.

Every 10 minutes we'll delete users who haven't been active in 24 hours.  Every second there will be 500 searches, 250 users will be already found in the index, and 250 will be added.

Basically our userinfo structure looks something like this:

struct _userinfo {
       unsigned uid;
       int position;
       char country[3];
       char *data;

Yes, small indeed.  But having quick lookups and real-time additions, deletion, and searching is a must, like I said before.

Can anyone suggest a library that can handle this?  We would rather use something tested and being used in the real world right now.  Developing an in-house library for this would take months to get something perfect. :)

Thank you
  • 7
  • 5
1 Solution
I think a small change in data structures will go a long way to help you

>struct _userinfo {
>      unsigned uid;
>      int position;
>      char country[3];
>      char *data;
Your struct will be 15 bytes on most platforms ... If you are malloc'ing structs, then on most common platforms, minimum memory malloc'ed is 64 bytes ... So you lose 49 bytes per struct you malloc ....

Since you are going to use around >750,000 entries <, why not have an array of structs with around 750,000 elements ... let the index of the struct in the array be the uid (so you save another 4 bytes per struct)
Moreover, this will make your search direct access instead of binay

We had a discussion about managing similar situation a bit differently, ... go though this link
rulerrAuthor Commented:

Thanks for the quick response, but like I said, I would much rather have a library that implements this all safely.  And also, this is a completely dynamic user list.  Sometimes it's 250,000, sometimes its 1,500,000.  It's unpredictable.

We need to fix this pretty fast, and that's why we really don't have time to develop an implementation ourselves.

Again, thanks for the response.
see if this helps
Get Cisco Certified in IT Security

There’s a high demand for IT security experts and network administrators who can safeguard the data that individuals, corporations, and governments rely on every day. Pursue your B.S. in Network Operations and Security and gain the credentials you need for this high-growth field.

If the ideas there appeal to you, then here is a library
rulerrAuthor Commented:

Seems that library is for Pascal ;)

Do you really think the best approach is to use an array of structs?
>Do you really think the best approach is to use an array of structs?
Depends on how much memory and processing power you have to spare ...
with binary search, in 750,000 elements, you need 14 comparisons in worst case ... sme for BSTs and other such data organizations ...

array of structs will reduce it to constant (direct access) ... but price paid is large amounts of memory required ... as you need a struct even for data that you do not have to store (unused ids)

B+ trees are a tradeoff that will reduce number of comparisons while not taking up so much memory

So it all boils down to
1. how much resources you have to spare
2. how much more efficient you want it to be

>Seems that library is for Pascal
My bad ... I am sorry, but I am sure you can find similar one easily using google or at codeproject.com or programmersheaven.com or sourceforge.net ..... list is endless
rulerrAuthor Commented:

So what "other" way besides array of structs would you suggest for real-time inserting, deleting, etc, unlike the queue system we are using now?

Basically this queue thing is really messing things up when someone comes back within 10 seconds (search returns nothing because it didn't get indexed yet)

I believe we tried using mutex locks and it completely crapped out.  Can't remember though, it was a long time ago.  If you can persuade me otherwise I'll go ahead and write the code again for that and see how it works.   Wouldn't be too hard to convert what we currently have to use that.

Thanks again, Sunny.
>So what "other" way besides array of structs would you suggest for real-time inserting, deleting, etc, unlike the queue system
>we are using now?
If you have memory to spare and want to make it real fast, use array of structs...

If you want to save some memory and can compromise a bit with the speed, something like B+ trees is in order ... if you do not have range searches and sequential access requirements, B trees or a variant will be better than B+ tree
Take a look at the organization of data in these trees and the algorithms (insertion/deletion) to get an idea of how things work with it...

btw ... were you malloc'ing the structs ?
rulerrAuthor Commented:
Yes we are currently mallocing the structs.
Then I guess you were using up a lot of memory due to internal fragmentation anyway ... An array of structs should be affordable
rulerrAuthor Commented:
Heh, okay.

Let's see what I can do now.

Thank you for the quick responses ;)
glad to help :o)
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Cloud Class® Course: Microsoft Windows 7 Basic

This introductory course to Windows 7 environment will teach you about working with the Windows operating system. You will learn about basic functions including start menu; the desktop; managing files, folders, and libraries.

  • 7
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now