Solved

Dbase indexed files/CGI

Posted on 1997-03-26
15
523 Views
Last Modified: 2013-12-25
BACKGROUND:  I am trying to create a database driven web page that accesses data from dbase files (on a Unix box)and puts it into reports.  To do this I am using perl CGI scripts.  I have found a perl module called Xbase.pm that allows me to open and access dbase files which is great.  The problem is that I can't open them by index and seek to particular records, and without this capability, I have to search through the files which is slow because the files are very large.

QUESTION:  How do I access .dbf files by index and be able to seek through them using a perl cgi script? (Preferably without buying comercial software such as WebBase or DBWeb, but those suggestions are welcome!)
0
Comment
Question by:webslider
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 4
  • 2
  • +1
15 Comments
 
LVL 1

Expert Comment

by:Fordream
ID: 1827841
Then make a small C program to search the field. I think that programming only this moudle is not too hard though you don't know how to make C program. And you can ask others to do favor for you. It's not too hard with format of dbf file.
 Then perl calls it. And it returns record numbers of it.... Perl use them.
 I also can make the program for you if you tell me DBF format. or simply send me the "Xbase.prn". But I think there need more points to do it ;)
 Bye!

0
 

Author Comment

by:webslider
ID: 1827842
Adjusted points to 200
0
 

Author Comment

by:webslider
ID: 1827843
This is all the points I have to give!!  Sorry.
The two functions that need to be adjusted in Xbase.pm are open_dbf() and seek().  The Xbase.pm module can be downloaded at:
 
http://www.csuchico.edu/prnt/perl-lib/Xbase.pm

and helpful explanation of it can be found at:

http://ils.unc.edu/crenshaw/Xbase.html

An example of a file I would use looks like this (called data.dbf):
FIELD      LENGTH     TYPE
County       2         Numeric          (County #: 1-87)
MCD          3         Num.     (Cities in corresponding county)
Geo_type     1         Character      (S=State,C=County,M=mcd)
POP70        8         Num.  (actual data for population in 1970)
HOU70        8         Num.     (actual data for housing in 1970)

In using this in our current setting, a Dbase program opens it up on an index tagged "area", which is a complilation of county+mcd. Therefore, if you said SEEK "8712", it would go to the record where county=87 and mcd=12.  I need to be able to do this instead of sequentially searching a file that's not in any order.  Hope this helps.
0
How Do You Stack Up Against Your Peers?

With today’s modern enterprise so dependent on digital infrastructures, the impact of major incidents has increased dramatically. Grab the report now to gain insight into how your organization ranks against your peers and learn best-in-class strategies to resolve incidents.

 
LVL 1

Expert Comment

by:Fordream
ID: 1827844
Hey webslider, before making the program, I have a question of it. Are datas in file is sorted by County and MCD? Then we do not have to seek sequentially but very fast using Binary-search algorithm.

0
 

Author Comment

by:webslider
ID: 1827845
No the data in the file is not ordered by county or mcd.  Data gets added to the end of the file every year and things can be out of place.  That's why we have the index on the files so when using it, we can see the data in logical order, not its actual order.
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827846
You cannot seek() because you don't have the corresponding IDX file?

If so, you say data gets updated on a yearly base, so you could just build up a new file from the DBF, getting rid of all the 'header' stuff (DBF has some header information in it), and storing records in increasing order of keys (the keys you need).
On the new file you may go with binary search, as Fordream suggests - which can't be as fast as IDX searching BTW.

Other chance: build up an index yourself:
scan the DBF file;
for each record, put in a array (or a structure) the 'area' information and the offset in the DBF file;
save in an 'idx' file the information collected, SORTED by 'area'.
When you need to seek, issue a binary search in the 'idx' file on the 'area' field, get the offset and do a standard seek in the DBF file.

If you can program C, i could give you some samples, but you may accomplish this in Perl too.
If you need samples, please tell where you are lost.

Cheers, julio
0
 

Author Comment

by:webslider
ID: 1827847
I'm  not totally sure what you are trying to say, Julio.  Maybe my knowledge of the interworkings of dbf files is limited.

The speed of the index doesn't matter.  What I'm trying to get away from is the amount of code I need to write to do a seek.  Right now I'm able to do most of what I need but the code is 700 lines long.  The seek that I was able to do in one line before, now takes me 30 lines.  

I've come apon another problem anyway.  I don't think I can have more than one file open at a time using the Xbase.pm and I need to be able to open more than one.

For the time being, we are going to try using CodeBase 6.2.  I haven't heard much about it but it seems to be able to do what we want.  But in case that doesn't work, I'd like to have a solution to my original problem to fall back on.  All of your help has been great so far.
0
 

Author Comment

by:webslider
ID: 1827848
Adjusted points to 235
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827849
I've read Xbase.html.

First i think you can open more than one database with something like:
--//--
$db1 = new Xbase;
$db2 = new Xbase;

$db1->open_dbf($dbf_1_name, $idx_1_name);
$db2->open_dbf($dbf_2_name, $idx_2_name);
--//--

And, you didn't answer my question: "don't you have the corresponding IDX file?". I.e., do you just issue: "$database->open_dbf($dbf_name, (null));"?

The simpler way to solve your problem would be to ask the people who mantain the original DBF to build up a corresponding IDX for you, based on 'area' informations. They CAN, very easily (just a couple of actions in their dbf application environment).

Think also that the only matter here should be speed, since you'll need more than 30 lines (and a hard work) in order to build up your own index informations.

Last (maybe not least): my previous comment was indeed an answer to your original question. Would you let me submit that as an answer even if you're going to give up with DBF (maybe lowering the points before grading)? Think that answer may be of help to other people.

Ok, that's enough... good luck
-julio
0
 

Author Comment

by:webslider
ID: 1827850
Julio-
I'd say feel free to submit your answer but Fordream has been a big help as well.  I guess, if you could tell me exactly what you mean and how to create an idx I'd be more than happy to let you have the points!!!

But before you do, I'll answer a couple of your questions.
Yes, right now I leave the idx part blank so I say:

$database->open_dbf($file_name);

As far as the index goes, I don't know what you mean by build up an IDX file. I didn't realize you could do that with a dbf file, atleast I haven't seen anything relating to it in the books I have.  Right now, we use the files in other Unix based dbase programs so using the indexes is easy.  But in transfering the data to a web page, I can't utilize the indexes that exist on the file.  So, I'm not sure what you mean by building my own index information.  If you want to, my points keep accumulating and as I get them I can add them to the total points if you want to wait another day or two.  

Thanks,
Stephanie

Oh yeah, where can I get those samples you were talking about.  Maybe that might help me understand more.
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827851
Hi Stephanie,
it's not a matter of points, you're already giving enough (IMO).

Before starting working on some code, just a few notes:

> Right now, we use the files in other Unix based dbase programs so using the indexes is easy. But in transfering the data to a web page, I can't utilize the indexes that exist on the file.

WHY NOT? I still cannot get this!
Into the 'dbase program' (what i called a 'dbf application environment'), you may create all the indexes you need very easily.
So, why not create an 'area' index by the dbase program, then get both the DBF and the 'area' IDX and issue a:
$database->open_dbf($dbf_name, $area_index_name);
WHY NOT??

This said, if you still think you need some code (but please let me understand why), i must add that i can hardly program Perl, so probably will give more a sample code than a real working code.
That should be a good starting point for you anyway, but do you think it is enough?

Cheers, julio
0
 

Author Comment

by:webslider
ID: 1827852
Into the 'dbase program' (what i called a 'dbf application environment'), you may create all the indexes you need
very easily.

THIS IS TRUE!!  I've have an index on area for the dbf file.  But the statement:
                 
    $database->open_dbf($dbf_name, $area_index_name);
       
DOESN'T WORK!!  This is what I initially tried.  The Xbase module only handles FoxPro indexed files.  Dbase creates a .mdx file that has information about the indexes associated with the .dbf file.

What we are going to attempt to do is write some c/c++(?) code to access the indexed files somehow.  We still need to find out what is in the .mdx file though.  So, unless you can come up with something about this topic, I'm not sure there is anything left to answer.  But if you do think of something that works, don't hesitate to tell me.  Thanks sooooooo much for all of your help.

webslider
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827853
Ok, finally i've gotten the problem.

Writing some code to access dbf and mdx files is behond the scope of this forum: it would be like rewriting an Xbase lib in C able to handle mdx indexes: this is really involving, and should be paid for.

What i can give you is some Perl/C code which translates an original dbf archive to a bunch of dbm tables (1 for each field + 1 for the index), and a set of C functions to access those tables allowing the same queries as Xbase does.

If your system, as most unicies, has got the 'ndbm' (or 'dbm', in which case please tell me) library, i could write the code for you in a couple of days or so. Performance should be good enough, since dbm works very fast.

And, we may be running on different platforms (i'm on Digital Unix); will you be able to handle the (very little i think) porting issues?

If you want the code, please also post the full table structure and the index formula. (Writing an all purpose code could go with your next question ;)

Ok, waiting to hear from you...
0
 

Author Comment

by:webslider
ID: 1827854
As far as I know, we don't have either dbm or ndbm libraries so I don't know if that is going to affect what you can do.  

As far as platforms go, we have many.  Solaris, Sun OS, DG Unix, and Linux.  So far no porting problems have occured.  Its important to note that our web server is running Linux and the data is on a Sun OS machine.  These are the two that will be used.

Do you have any idea where I could find the format of the .mdx files?  I've been looking for a month and not come up with anything.

One of the files would look like this:

Field Name    Type     Width
----------------------------
County         Numer   2
Mcd            N       3
Tract          Char    6
Geo_type       C       1
Pop70          N       9
Pop80          (all the rest
Pop90          are N 9)
Hou70            
Hou80            
Hou90            
P2_1...P2_5 (eg. P2_1,P2_2,P2_3...)
P3_1...P3_5
P4_1...P4_6
P5_1...P5_6
Totals -- Fields=32, Width=256

The index formula is county+mcd+tract.BUT!!!...this is a string.
So if county=12, mcd=103, tract=231092 the seek command would be:
 seek "12103231092"
If county=1, the seek string would be " 1103230192".
If mcd=4, the seek string would be " 1004230192".

Hope this makes sense.  I've got one request...please comment your code well (if you can still do it)!!

Thanks so much for all the time you've spent!
-webslider
0
 

Accepted Solution

by:
richroth earned 240 total points
ID: 1827855
I'm going to ignore the perl part of the question based on
your various comments - you will find a full Xbase compiler
for unix with cgi support at www.on-the-net.com/x2c 

I've not updated the
linux version so don't download it unless you are using an
Xout Linux -- I do have one that we are using.

Take a look at www.trendonline.com -- that system has 2 gig of
DBF files that are searched as you are for info.  It uses it's
own indes format.  If this is of use, email me and I'll get the
latest linux version uploaded.  If you can nfs mount the
sun box or can recompile the libs on it, you will be set.
0

Featured Post

More Than Just A Video Library

Train for your certification. Learn the latest DevOps tools. Grow your skillset to do better work.

At Linux Academy, we release new training modules every week so you'll always be up to date on the latest tech.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

If you get a (Blue Screen of Death), your system writes a small file called a minidump. Your first step is to make certain your computer is setup to record memory dumps. Right click My Computer, choose properties. Click on the advanced tab, an…
Recently I have been answering a lot of questions like this in IT forums that I frequent. The question posed is usually something along the lines of "We have software X installed and need to uninstall it for reason Y" or some other variant of the sa…
In this fifth video of the Xpdf series, we discuss and demonstrate the PDFdetach utility, which is able to list and, more importantly, extract attachments that are embedded in PDF files. It does this via a command line interface, making it suitable …
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…

724 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question