Solved

Dbase indexed files/CGI

Posted on 1997-03-26
15
485 Views
Last Modified: 2013-12-25
BACKGROUND:  I am trying to create a database driven web page that accesses data from dbase files (on a Unix box)and puts it into reports.  To do this I am using perl CGI scripts.  I have found a perl module called Xbase.pm that allows me to open and access dbase files which is great.  The problem is that I can't open them by index and seek to particular records, and without this capability, I have to search through the files which is slow because the files are very large.

QUESTION:  How do I access .dbf files by index and be able to seek through them using a perl cgi script? (Preferably without buying comercial software such as WebBase or DBWeb, but those suggestions are welcome!)
0
Comment
Question by:webslider
  • 8
  • 4
  • 2
  • +1
15 Comments
 
LVL 1

Expert Comment

by:Fordream
ID: 1827841
Then make a small C program to search the field. I think that programming only this moudle is not too hard though you don't know how to make C program. And you can ask others to do favor for you. It's not too hard with format of dbf file.
 Then perl calls it. And it returns record numbers of it.... Perl use them.
 I also can make the program for you if you tell me DBF format. or simply send me the "Xbase.prn". But I think there need more points to do it ;)
 Bye!

0
 

Author Comment

by:webslider
ID: 1827842
Adjusted points to 200
0
 

Author Comment

by:webslider
ID: 1827843
This is all the points I have to give!!  Sorry.
The two functions that need to be adjusted in Xbase.pm are open_dbf() and seek().  The Xbase.pm module can be downloaded at:
 
http://www.csuchico.edu/prnt/perl-lib/Xbase.pm

and helpful explanation of it can be found at:

http://ils.unc.edu/crenshaw/Xbase.html

An example of a file I would use looks like this (called data.dbf):
FIELD      LENGTH     TYPE
County       2         Numeric          (County #: 1-87)
MCD          3         Num.     (Cities in corresponding county)
Geo_type     1         Character      (S=State,C=County,M=mcd)
POP70        8         Num.  (actual data for population in 1970)
HOU70        8         Num.     (actual data for housing in 1970)

In using this in our current setting, a Dbase program opens it up on an index tagged "area", which is a complilation of county+mcd. Therefore, if you said SEEK "8712", it would go to the record where county=87 and mcd=12.  I need to be able to do this instead of sequentially searching a file that's not in any order.  Hope this helps.
0
 
LVL 1

Expert Comment

by:Fordream
ID: 1827844
Hey webslider, before making the program, I have a question of it. Are datas in file is sorted by County and MCD? Then we do not have to seek sequentially but very fast using Binary-search algorithm.

0
 

Author Comment

by:webslider
ID: 1827845
No the data in the file is not ordered by county or mcd.  Data gets added to the end of the file every year and things can be out of place.  That's why we have the index on the files so when using it, we can see the data in logical order, not its actual order.
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827846
You cannot seek() because you don't have the corresponding IDX file?

If so, you say data gets updated on a yearly base, so you could just build up a new file from the DBF, getting rid of all the 'header' stuff (DBF has some header information in it), and storing records in increasing order of keys (the keys you need).
On the new file you may go with binary search, as Fordream suggests - which can't be as fast as IDX searching BTW.

Other chance: build up an index yourself:
scan the DBF file;
for each record, put in a array (or a structure) the 'area' information and the offset in the DBF file;
save in an 'idx' file the information collected, SORTED by 'area'.
When you need to seek, issue a binary search in the 'idx' file on the 'area' field, get the offset and do a standard seek in the DBF file.

If you can program C, i could give you some samples, but you may accomplish this in Perl too.
If you need samples, please tell where you are lost.

Cheers, julio
0
 

Author Comment

by:webslider
ID: 1827847
I'm  not totally sure what you are trying to say, Julio.  Maybe my knowledge of the interworkings of dbf files is limited.

The speed of the index doesn't matter.  What I'm trying to get away from is the amount of code I need to write to do a seek.  Right now I'm able to do most of what I need but the code is 700 lines long.  The seek that I was able to do in one line before, now takes me 30 lines.  

I've come apon another problem anyway.  I don't think I can have more than one file open at a time using the Xbase.pm and I need to be able to open more than one.

For the time being, we are going to try using CodeBase 6.2.  I haven't heard much about it but it seems to be able to do what we want.  But in case that doesn't work, I'd like to have a solution to my original problem to fall back on.  All of your help has been great so far.
0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 

Author Comment

by:webslider
ID: 1827848
Adjusted points to 235
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827849
I've read Xbase.html.

First i think you can open more than one database with something like:
--//--
$db1 = new Xbase;
$db2 = new Xbase;

$db1->open_dbf($dbf_1_name, $idx_1_name);
$db2->open_dbf($dbf_2_name, $idx_2_name);
--//--

And, you didn't answer my question: "don't you have the corresponding IDX file?". I.e., do you just issue: "$database->open_dbf($dbf_name, (null));"?

The simpler way to solve your problem would be to ask the people who mantain the original DBF to build up a corresponding IDX for you, based on 'area' informations. They CAN, very easily (just a couple of actions in their dbf application environment).

Think also that the only matter here should be speed, since you'll need more than 30 lines (and a hard work) in order to build up your own index informations.

Last (maybe not least): my previous comment was indeed an answer to your original question. Would you let me submit that as an answer even if you're going to give up with DBF (maybe lowering the points before grading)? Think that answer may be of help to other people.

Ok, that's enough... good luck
-julio
0
 

Author Comment

by:webslider
ID: 1827850
Julio-
I'd say feel free to submit your answer but Fordream has been a big help as well.  I guess, if you could tell me exactly what you mean and how to create an idx I'd be more than happy to let you have the points!!!

But before you do, I'll answer a couple of your questions.
Yes, right now I leave the idx part blank so I say:

$database->open_dbf($file_name);

As far as the index goes, I don't know what you mean by build up an IDX file. I didn't realize you could do that with a dbf file, atleast I haven't seen anything relating to it in the books I have.  Right now, we use the files in other Unix based dbase programs so using the indexes is easy.  But in transfering the data to a web page, I can't utilize the indexes that exist on the file.  So, I'm not sure what you mean by building my own index information.  If you want to, my points keep accumulating and as I get them I can add them to the total points if you want to wait another day or two.  

Thanks,
Stephanie

Oh yeah, where can I get those samples you were talking about.  Maybe that might help me understand more.
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827851
Hi Stephanie,
it's not a matter of points, you're already giving enough (IMO).

Before starting working on some code, just a few notes:

> Right now, we use the files in other Unix based dbase programs so using the indexes is easy. But in transfering the data to a web page, I can't utilize the indexes that exist on the file.

WHY NOT? I still cannot get this!
Into the 'dbase program' (what i called a 'dbf application environment'), you may create all the indexes you need very easily.
So, why not create an 'area' index by the dbase program, then get both the DBF and the 'area' IDX and issue a:
$database->open_dbf($dbf_name, $area_index_name);
WHY NOT??

This said, if you still think you need some code (but please let me understand why), i must add that i can hardly program Perl, so probably will give more a sample code than a real working code.
That should be a good starting point for you anyway, but do you think it is enough?

Cheers, julio
0
 

Author Comment

by:webslider
ID: 1827852
Into the 'dbase program' (what i called a 'dbf application environment'), you may create all the indexes you need
very easily.

THIS IS TRUE!!  I've have an index on area for the dbf file.  But the statement:
                 
    $database->open_dbf($dbf_name, $area_index_name);
       
DOESN'T WORK!!  This is what I initially tried.  The Xbase module only handles FoxPro indexed files.  Dbase creates a .mdx file that has information about the indexes associated with the .dbf file.

What we are going to attempt to do is write some c/c++(?) code to access the indexed files somehow.  We still need to find out what is in the .mdx file though.  So, unless you can come up with something about this topic, I'm not sure there is anything left to answer.  But if you do think of something that works, don't hesitate to tell me.  Thanks sooooooo much for all of your help.

webslider
0
 
LVL 5

Expert Comment

by:julio011597
ID: 1827853
Ok, finally i've gotten the problem.

Writing some code to access dbf and mdx files is behond the scope of this forum: it would be like rewriting an Xbase lib in C able to handle mdx indexes: this is really involving, and should be paid for.

What i can give you is some Perl/C code which translates an original dbf archive to a bunch of dbm tables (1 for each field + 1 for the index), and a set of C functions to access those tables allowing the same queries as Xbase does.

If your system, as most unicies, has got the 'ndbm' (or 'dbm', in which case please tell me) library, i could write the code for you in a couple of days or so. Performance should be good enough, since dbm works very fast.

And, we may be running on different platforms (i'm on Digital Unix); will you be able to handle the (very little i think) porting issues?

If you want the code, please also post the full table structure and the index formula. (Writing an all purpose code could go with your next question ;)

Ok, waiting to hear from you...
0
 

Author Comment

by:webslider
ID: 1827854
As far as I know, we don't have either dbm or ndbm libraries so I don't know if that is going to affect what you can do.  

As far as platforms go, we have many.  Solaris, Sun OS, DG Unix, and Linux.  So far no porting problems have occured.  Its important to note that our web server is running Linux and the data is on a Sun OS machine.  These are the two that will be used.

Do you have any idea where I could find the format of the .mdx files?  I've been looking for a month and not come up with anything.

One of the files would look like this:

Field Name    Type     Width
----------------------------
County         Numer   2
Mcd            N       3
Tract          Char    6
Geo_type       C       1
Pop70          N       9
Pop80          (all the rest
Pop90          are N 9)
Hou70            
Hou80            
Hou90            
P2_1...P2_5 (eg. P2_1,P2_2,P2_3...)
P3_1...P3_5
P4_1...P4_6
P5_1...P5_6
Totals -- Fields=32, Width=256

The index formula is county+mcd+tract.BUT!!!...this is a string.
So if county=12, mcd=103, tract=231092 the seek command would be:
 seek "12103231092"
If county=1, the seek string would be " 1103230192".
If mcd=4, the seek string would be " 1004230192".

Hope this makes sense.  I've got one request...please comment your code well (if you can still do it)!!

Thanks so much for all the time you've spent!
-webslider
0
 

Accepted Solution

by:
richroth earned 240 total points
ID: 1827855
I'm going to ignore the perl part of the question based on
your various comments - you will find a full Xbase compiler
for unix with cgi support at www.on-the-net.com/x2c

I've not updated the
linux version so don't download it unless you are using an
Xout Linux -- I do have one that we are using.

Take a look at www.trendonline.com -- that system has 2 gig of
DBF files that are searched as you are for info.  It uses it's
own indes format.  If this is of use, email me and I'll get the
latest linux version uploaded.  If you can nfs mount the
sun box or can recompile the libs on it, you will be set.
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

This tutorial will discuss fancy secure registration forms, with AJAX technology support. In this article I assume you already know HTML and some JS. I will write the code using WhizBase Server Pages, so you need to know some basics in WBSP (you mig…
Batch, VBS, and scripts in general are incredibly useful for repetitive tasks.  Some tasks can take a while to complete and it can be annoying to check back only to discover that your script finished 5 minutes ago.  Some scripts may complete nearly …
Learn the basics of lists in Python. Lists, as their name suggests, are a means for ordering and storing values. : Lists are declared using brackets; for example: t = [1, 2, 3]: Lists may contain a mix of data types; for example: t = ['string', 1, T…
Learn the basics of if, else, and elif statements in Python 2.7. Use "if" statements to test a specified condition.: The structure of an if statement is as follows: (CODE) Use "else" statements to allow the execution of an alternative, if the …

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now