[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now


Dbase indexed files/CGI

Posted on 1997-03-26
Medium Priority
Last Modified: 2013-12-25
BACKGROUND:  I am trying to create a database driven web page that accesses data from dbase files (on a Unix box)and puts it into reports.  To do this I am using perl CGI scripts.  I have found a perl module called Xbase.pm that allows me to open and access dbase files which is great.  The problem is that I can't open them by index and seek to particular records, and without this capability, I have to search through the files which is slow because the files are very large.

QUESTION:  How do I access .dbf files by index and be able to seek through them using a perl cgi script? (Preferably without buying comercial software such as WebBase or DBWeb, but those suggestions are welcome!)
Question by:webslider
  • 8
  • 4
  • 2
  • +1

Expert Comment

ID: 1827841
Then make a small C program to search the field. I think that programming only this moudle is not too hard though you don't know how to make C program. And you can ask others to do favor for you. It's not too hard with format of dbf file.
 Then perl calls it. And it returns record numbers of it.... Perl use them.
 I also can make the program for you if you tell me DBF format. or simply send me the "Xbase.prn". But I think there need more points to do it ;)


Author Comment

ID: 1827842
Adjusted points to 200

Author Comment

ID: 1827843
This is all the points I have to give!!  Sorry.
The two functions that need to be adjusted in Xbase.pm are open_dbf() and seek().  The Xbase.pm module can be downloaded at:

and helpful explanation of it can be found at:


An example of a file I would use looks like this (called data.dbf):
County       2         Numeric          (County #: 1-87)
MCD          3         Num.     (Cities in corresponding county)
Geo_type     1         Character      (S=State,C=County,M=mcd)
POP70        8         Num.  (actual data for population in 1970)
HOU70        8         Num.     (actual data for housing in 1970)

In using this in our current setting, a Dbase program opens it up on an index tagged "area", which is a complilation of county+mcd. Therefore, if you said SEEK "8712", it would go to the record where county=87 and mcd=12.  I need to be able to do this instead of sequentially searching a file that's not in any order.  Hope this helps.
Upgrade your Question Security!

Add Premium security features to your question to ensure its privacy or anonymity. Learn more about your ability to control Question Security today.


Expert Comment

ID: 1827844
Hey webslider, before making the program, I have a question of it. Are datas in file is sorted by County and MCD? Then we do not have to seek sequentially but very fast using Binary-search algorithm.


Author Comment

ID: 1827845
No the data in the file is not ordered by county or mcd.  Data gets added to the end of the file every year and things can be out of place.  That's why we have the index on the files so when using it, we can see the data in logical order, not its actual order.

Expert Comment

ID: 1827846
You cannot seek() because you don't have the corresponding IDX file?

If so, you say data gets updated on a yearly base, so you could just build up a new file from the DBF, getting rid of all the 'header' stuff (DBF has some header information in it), and storing records in increasing order of keys (the keys you need).
On the new file you may go with binary search, as Fordream suggests - which can't be as fast as IDX searching BTW.

Other chance: build up an index yourself:
scan the DBF file;
for each record, put in a array (or a structure) the 'area' information and the offset in the DBF file;
save in an 'idx' file the information collected, SORTED by 'area'.
When you need to seek, issue a binary search in the 'idx' file on the 'area' field, get the offset and do a standard seek in the DBF file.

If you can program C, i could give you some samples, but you may accomplish this in Perl too.
If you need samples, please tell where you are lost.

Cheers, julio

Author Comment

ID: 1827847
I'm  not totally sure what you are trying to say, Julio.  Maybe my knowledge of the interworkings of dbf files is limited.

The speed of the index doesn't matter.  What I'm trying to get away from is the amount of code I need to write to do a seek.  Right now I'm able to do most of what I need but the code is 700 lines long.  The seek that I was able to do in one line before, now takes me 30 lines.  

I've come apon another problem anyway.  I don't think I can have more than one file open at a time using the Xbase.pm and I need to be able to open more than one.

For the time being, we are going to try using CodeBase 6.2.  I haven't heard much about it but it seems to be able to do what we want.  But in case that doesn't work, I'd like to have a solution to my original problem to fall back on.  All of your help has been great so far.

Author Comment

ID: 1827848
Adjusted points to 235

Expert Comment

ID: 1827849
I've read Xbase.html.

First i think you can open more than one database with something like:
$db1 = new Xbase;
$db2 = new Xbase;

$db1->open_dbf($dbf_1_name, $idx_1_name);
$db2->open_dbf($dbf_2_name, $idx_2_name);

And, you didn't answer my question: "don't you have the corresponding IDX file?". I.e., do you just issue: "$database->open_dbf($dbf_name, (null));"?

The simpler way to solve your problem would be to ask the people who mantain the original DBF to build up a corresponding IDX for you, based on 'area' informations. They CAN, very easily (just a couple of actions in their dbf application environment).

Think also that the only matter here should be speed, since you'll need more than 30 lines (and a hard work) in order to build up your own index informations.

Last (maybe not least): my previous comment was indeed an answer to your original question. Would you let me submit that as an answer even if you're going to give up with DBF (maybe lowering the points before grading)? Think that answer may be of help to other people.

Ok, that's enough... good luck

Author Comment

ID: 1827850
I'd say feel free to submit your answer but Fordream has been a big help as well.  I guess, if you could tell me exactly what you mean and how to create an idx I'd be more than happy to let you have the points!!!

But before you do, I'll answer a couple of your questions.
Yes, right now I leave the idx part blank so I say:


As far as the index goes, I don't know what you mean by build up an IDX file. I didn't realize you could do that with a dbf file, atleast I haven't seen anything relating to it in the books I have.  Right now, we use the files in other Unix based dbase programs so using the indexes is easy.  But in transfering the data to a web page, I can't utilize the indexes that exist on the file.  So, I'm not sure what you mean by building my own index information.  If you want to, my points keep accumulating and as I get them I can add them to the total points if you want to wait another day or two.  


Oh yeah, where can I get those samples you were talking about.  Maybe that might help me understand more.

Expert Comment

ID: 1827851
Hi Stephanie,
it's not a matter of points, you're already giving enough (IMO).

Before starting working on some code, just a few notes:

> Right now, we use the files in other Unix based dbase programs so using the indexes is easy. But in transfering the data to a web page, I can't utilize the indexes that exist on the file.

WHY NOT? I still cannot get this!
Into the 'dbase program' (what i called a 'dbf application environment'), you may create all the indexes you need very easily.
So, why not create an 'area' index by the dbase program, then get both the DBF and the 'area' IDX and issue a:
$database->open_dbf($dbf_name, $area_index_name);

This said, if you still think you need some code (but please let me understand why), i must add that i can hardly program Perl, so probably will give more a sample code than a real working code.
That should be a good starting point for you anyway, but do you think it is enough?

Cheers, julio

Author Comment

ID: 1827852
Into the 'dbase program' (what i called a 'dbf application environment'), you may create all the indexes you need
very easily.

THIS IS TRUE!!  I've have an index on area for the dbf file.  But the statement:
    $database->open_dbf($dbf_name, $area_index_name);
DOESN'T WORK!!  This is what I initially tried.  The Xbase module only handles FoxPro indexed files.  Dbase creates a .mdx file that has information about the indexes associated with the .dbf file.

What we are going to attempt to do is write some c/c++(?) code to access the indexed files somehow.  We still need to find out what is in the .mdx file though.  So, unless you can come up with something about this topic, I'm not sure there is anything left to answer.  But if you do think of something that works, don't hesitate to tell me.  Thanks sooooooo much for all of your help.


Expert Comment

ID: 1827853
Ok, finally i've gotten the problem.

Writing some code to access dbf and mdx files is behond the scope of this forum: it would be like rewriting an Xbase lib in C able to handle mdx indexes: this is really involving, and should be paid for.

What i can give you is some Perl/C code which translates an original dbf archive to a bunch of dbm tables (1 for each field + 1 for the index), and a set of C functions to access those tables allowing the same queries as Xbase does.

If your system, as most unicies, has got the 'ndbm' (or 'dbm', in which case please tell me) library, i could write the code for you in a couple of days or so. Performance should be good enough, since dbm works very fast.

And, we may be running on different platforms (i'm on Digital Unix); will you be able to handle the (very little i think) porting issues?

If you want the code, please also post the full table structure and the index formula. (Writing an all purpose code could go with your next question ;)

Ok, waiting to hear from you...

Author Comment

ID: 1827854
As far as I know, we don't have either dbm or ndbm libraries so I don't know if that is going to affect what you can do.  

As far as platforms go, we have many.  Solaris, Sun OS, DG Unix, and Linux.  So far no porting problems have occured.  Its important to note that our web server is running Linux and the data is on a Sun OS machine.  These are the two that will be used.

Do you have any idea where I could find the format of the .mdx files?  I've been looking for a month and not come up with anything.

One of the files would look like this:

Field Name    Type     Width
County         Numer   2
Mcd            N       3
Tract          Char    6
Geo_type       C       1
Pop70          N       9
Pop80          (all the rest
Pop90          are N 9)
P2_1...P2_5 (eg. P2_1,P2_2,P2_3...)
Totals -- Fields=32, Width=256

The index formula is county+mcd+tract.BUT!!!...this is a string.
So if county=12, mcd=103, tract=231092 the seek command would be:
 seek "12103231092"
If county=1, the seek string would be " 1103230192".
If mcd=4, the seek string would be " 1004230192".

Hope this makes sense.  I've got one request...please comment your code well (if you can still do it)!!

Thanks so much for all the time you've spent!

Accepted Solution

richroth earned 480 total points
ID: 1827855
I'm going to ignore the perl part of the question based on
your various comments - you will find a full Xbase compiler
for unix with cgi support at www.on-the-net.com/x2c 

I've not updated the
linux version so don't download it unless you are using an
Xout Linux -- I do have one that we are using.

Take a look at www.trendonline.com -- that system has 2 gig of
DBF files that are searched as you are for info.  It uses it's
own indes format.  If this is of use, email me and I'll get the
latest linux version uploaded.  If you can nfs mount the
sun box or can recompile the libs on it, you will be set.

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this tutorial I will focus on how to use WhizBase as a tool for sending ICQ messages to ICQ. Here I will use a new technology in WhizBase, published in WhizBase 5.1 version. In this tutorial I will use 3 files, pager.wbsp for the processing, e…
A quick Powershell script I wrote to find old program installations and check versions of a specific file across the network.
The viewer will learn how to create and use a small PHP class to apply a watermark to an image. This video shows the viewer the setup for the PHP watermark as well as important coding language. Continue to Part 2 to learn the core code used in creat…
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…
Suggested Courses
Course of the Month18 days, 21 hours left to enroll

834 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question