MarcoDieleman (Netherlands) asked:

Use .txt file as database

Hello,
I want to use a .txt file as a database (only for reading and selecting records). Every record in the .txt file is on a single line and looks like this:

"A 00010",Description of the article,"50"

The first field is a unique ID, the second a text field, and the third a number (quantity).

Can I read and select records from this .txt file and use it like a database (if so, how?), or do I have to read the lines into a MySQL db first (if so, how?)


axis_img:

"Can I read and select records from this .txt file and use it like a database"

Not like a database, no. You can read from the text file, but matching records (selecting) is much more painful than using SQL on a true database.

"or do I have to read the lines into a MySQL db first (if so, how?) ??"

Do you have access to a MySQL database in your environment? If so, then I strongly suggest you use that method. Here are a few reasons why you should use a database over a flat file (.txt file).

1.) You stated: "The first field is a unique ID". What happens if you accidentally use the same ID twice? It happens, and it would be very difficult to monitor when using a text file. If you were using a database such as MySQL, you could just create a unique index on that field, and MySQL would _never_ allow you to duplicate a value. Much easier... eg: CREATE UNIQUE INDEX table_name_ux1 ON table_name(field_name);
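As a sketch, the table and unique index for the three fields in the question might look like this (all names here are placeholders):

CREATE TABLE articles (
    id          CHAR(7)      NOT NULL,
    description VARCHAR(255) NOT NULL,
    quantity    INT          NOT NULL
);
CREATE UNIQUE INDEX articles_ux1 ON articles (id);

INSERT INTO articles VALUES ('A 00010', 'Description of the article', 50);
-- this second insert fails with a duplicate-key error, so the ID stays unique:
INSERT INTO articles VALUES ('A 00010', 'Some other article', 10);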

2.) Selecting information from databases is very easy and very quick (especially when indexes are used properly). If you are using a text file, then you may have to scan the entire file every time you are looking for something. eg:

Say you wanted all of the records that contain the value "50" in quantity. For a text file, you would have to do something like this:

$fp = fopen("db.txt", "r");

// SCAN EVERY SINGLE LINE IN THE FILE, BECAUSE WE DONT KNOW IF THERE ARE MORE MATCHES
while(! feof($fp)) {

  // This is just an example based on some pseudo-format
  fscanf($fp, "%s,\"%s\",%d", &$id, &$desc, &$quantity);
}
fclose($fp);


Notice that you just scanned every single row of the text file looking for records that had a quantity of 50. If you wanted to do the same in a database such as MySQL, then you would do the following:

SELECT * FROM table_name WHERE quantity = 50;

If you have built proper indexes on the table, then it will not have to scan the entire table for the values. This is because indexes are sorted, so it looks something like this:

quantity
----------
47
47
48
49
50 --- SELECT statement found first match, it retrieves records from here
50
50
50
51 --- SELECT statement is done because it is no longer 50

The key to remember there is that once it finds a value that doesn't match anymore, it knows that it is no longer going to find a value that matches (since the index is sorted in that manner), so it can stop looking for matching values. This saves a _lot_ of time.
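To get that behaviour you would first build an index on the quantity column, e.g. (index name is illustrative, table as sketched above):

CREATE INDEX articles_ix1 ON articles (quantity);

-- EXPLAIN shows whether MySQL will use the index or fall back to a full scan
EXPLAIN SELECT * FROM articles WHERE quantity = 50;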


-----------------------------------------------------

So do you have access to MySQL? If so, then let us know. It is fairly trivial to import a text file into a database for use.

Regards,
Barry
MarcoDieleman (Asker):

Yes, I do have access to MySQL.

The reason I wanted to use a text file is that the database is maintained offline with an old DOS program, which can export the catalogue to a text file. The exported file can then be uploaded and used on the website.

If I use MySQL, I have to write some admin pages so the user can/must start the process of writing the text lines to a MySQL db. Or can I do this automatically, let's say once a day? (Because the database changes every day.)

I want the solution that is easiest for the user...

 
"When I want to use MySQL I have to write some admin pages so the user can/must start the proces of writing the text lines to a MySQL db. Or can I do this automaticly, let's say once a day?"

Sure you can... What server environment are you running with? Linux?

If so, you could just write an entry into the CRON script that automatically imports the text file every day, hour, whatever...

Give me a little more information regarding your server setup that contains mysql. Also, give a little more information regarding any specific needs for the import. When you upload the text file, will it always be the same name... or will it be a filename based on the current day, or what... Get the idea?

Thanks,
Barry
Hamlet081299:

If you want to stick with the current format (which appears to be CSV format), then you cannot use SQL, but there are some features of PHP which could be helpful.

For example "fgetcsv()".

Depending on the size of your file, you could load the whole thing into an array and then do whatever selecting you need (see the sketch after the sample output below), or alternatively you could read it line by line, selecting only those records you want.
Here's a simple example that outputs lines where 'cat' appears anywhere in the text, with a hyperlink to getarticle.php, which digs up the article.

<?php
$keyword = 'cat';
$fp = fopen("test.txt", "r");

// read the file line by line as CSV, printing only records whose
// description contains the keyword (case-insensitive)
while (($row = fgetcsv($fp, 1000, ",")) !== false) {
    list($id, $text, $qty) = $row;
    if (stristr($text, $keyword)) {
        echo "<p><a href='getarticle.php?id=$id'>$id</a><br>\n";
        echo "$text<br>\n";
        echo "(quantity: $qty)</p>\n";
    }
}
fclose($fp);
?>

Working off this file...

"A 00010",Boring article,"50"
"A 00020",Everything you wanted to know about cats,"45"
"A 00030",Dogs are better than cats,"65"
"A 00040",Sleeping dogs,"5"
"A 00050",Catapults of medieval Europe,"15"

Produces something like (without hyperlinks) ...

A 00020
Everything you wanted to know about cats
(quantity: 45)

A 00030
Dogs are better than cats
(quantity: 65)

A 00050
Catapults of medieval Europe
(quantity: 15)
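
The other approach mentioned earlier (loading the whole file into an array first) could look something like this; it is only a sketch, with the same file name and field order assumed as above:

<?php
// load every record into an array up front, then select from memory;
// reasonable for small files, but avoid it for very large ones
$records = array();
$fp = fopen("test.txt", "r");
while (($row = fgetcsv($fp, 1000, ",")) !== false) {
    $records[] = $row;
}
fclose($fp);

// example selection: every record with quantity 50
foreach ($records as $r) {
    list($id, $text, $qty) = $r;
    if ((int)$qty == 50) {
        echo "$id: $text\n";
    }
}
?>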
MarcoDieleman (Asker):

Sorry for the late reply, but EE was down yesterday...

For Barry:
I'm running PHP and MySQL on a Unix server (Cobalt). Don't ask for too many details because I'm not that familiar with Unix... It is a dedicated server...

The database (the text file) always has the same name.
The records always have the same fields. Nothing special.

For Hamlet:
The file is about 6000 lines long and about 300 KB. I think that could be a problem if you want to load the whole file into an array...
If the CSV format is a problem, it is possible to export the file to ASCII, but I would have to look at that first to see what format the records are in.
Hamlet081299:

How complex are the queries you would want to use?

Perhaps loading into memory is not the ideal solution (although, when you think about it, 300K is not a lot of memory).

If you can give us more of an idea of what things you need to do with the data, that would help.

* In what ways would the data be displayed?
* Do you need to search?
* Do you need to sort?
* Do you need to insert/update/delete?
(I just reread the original question, so I know you don't need to update.)
MarcoDieleman (Asker):

I want the records to be displayed by category:
A .....
B .....
C .....

and I want to do a text search.

No updating, deleting or inserting of records.
axis_img:

Use MySQL. Text files are for logs that are parsed, not searched.

I am writing up some info for you regarding how to do this. Let me know if this is what you are needing...

1.) You will manually upload the text file, which is comma-delimited, to the server on a daily basis.

2.) When it is uploaded, the data needs to be automatically imported into mysql. When it is imported, should the old data be totally removed, so that the newly imported data is the only data retrievable by user searches?



Basically, the way this would work is... You can write a PHP script that does a LOAD DATA INFILE query to import the text file into the database. You would then put the PHP script in the CRONTAB so that it runs automatically at specified periods (eg: every morning at 9am). Doing it within a PHP script also lets you make sure the text file has been modified before trying to import it (maybe you forgot to upload a new version, in which case it would be pointless to import the old one again).
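
A minimal sketch of such a script, assuming a table named articles, made-up credentials and paths, and a small side file to remember the last import time (note that LOAD DATA INFILE requires the FILE privilege):

<?php
// sketch only: import the uploaded catalogue if it changed since the last run
$datafile  = '/path/to/textfile.txt';
$stampfile = '/path/to/textfile.stamp';

// skip the import if the file has not been modified since the last import
$last = file_exists($stampfile) ? (int)file_get_contents($stampfile) : 0;
if (filemtime($datafile) <= $last) {
    exit;
}

$db = mysqli_connect('localhost', 'user', 'password', 'mydb');
mysqli_query($db, "DELETE FROM articles");  // remove yesterday's catalogue
mysqli_query($db, "LOAD DATA INFILE '$datafile' INTO TABLE articles
                   FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                   (id, description, quantity)");
file_put_contents($stampfile, time());
mysqli_close($db);
?>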

The other option is to add a CRONTAB entry that just runs mysqlimport directly. Note that mysqlimport takes the database name on the command line and derives the table name from the text file's name (so the file would need to be called table_name.txt). Something like:

mysqlimport --delete --fields-optionally-enclosed-by='"' --fields-terminated-by=',' db_name /path/to/table_name.txt

That command (although I have not tested the syntax yet) would go into the CRON. It would run at specified periods and import the text file into the database automatically, with --delete emptying the table first.
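
For illustration, the crontab entry itself might look like this (schedule and paths are made up):

# run the import every morning at 9am
0 9 * * * /usr/bin/mysqlimport --delete --fields-optionally-enclosed-by='"' --fields-terminated-by=',' db_name /path/to/table_name.txt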

Anyway... Let us know your thoughts, and we can tailor something more specific.

Regards,
Barry
Hamlet081299:

Will you be displaying ALL the records on one page?

If you could provide a small indicative sample of the data that might help.

As far as I can tell the example I gave is along the right lines.  Do you have any specific comments about the example?
axis_img:

Just a note on something I said:

"1.) You will manually upload the text file, which is comma-delimited, to the server on a daily basis."

If the text file that you need to upload is reachable by the web (eg: on a fileserver that is password-protected or something), then you would not even need to upload the file manually. You could also set up a CRON entry to retrieve that file automatically as well. Automation is a beautiful (and necessary) thing.
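
As a sketch, such a cron entry could fetch the file with wget (URL, credentials and schedule are all made up):

# fetch the exported catalogue every morning before the import runs
0 8 * * * wget -q -O /path/to/textfile.txt http://user:secret@fileserver.example.com/export.txt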
MarcoDieleman (Asker):

Hamlet:
I want to show 20 products on a page. Your example was good, I think. I'm going to do a test with my text file and see if I can make it show the right records.

Barry:
Retrieving the file automatically means that the server needs to have access to the user's computer? That could be a problem because they don't have cable or anything, just phone lines.


As far as I can see right now, I think Hamlet's solution is the best and easiest way to go. I will do some tests now and see if it works for me. Thanks for now, guys, and I'll be back later today.
Hamlet081299:

I thought that it might be useful to have an index file to help find a specific line in the file.

Here's a simple example that builds an index file (only if the index file is out of date), and uses it to jump directly to a specified line (with the first line being numbered 0).

The idea could be extended to index the file according to certain field values.

I hope it'll give you some ideas...

<?php

define('MAX_LINE', 1024);

// write a 4-byte little-endian file offset for the start of each line
function build_index($dataname, $indexname) {
    $fdata = fopen($dataname, 'r');
    $findex = fopen($indexname, 'wb');
    $i = 0;
    while (fgets($fdata, MAX_LINE)) {
        fwrite($findex, pack('V', $i));
        $i = ftell($fdata);
    }
    fclose($findex);
    fclose($fdata);
    return true;
}

// rebuild the index only when the data file is newer than the index
function update_index($dataname, $indexname) {
    if (!file_exists($indexname) ||
        filemtime($dataname) > filemtime($indexname)) {
        build_index($dataname, $indexname);
        return true;
    } else {
        return false;
    }
}

// seek the data file directly to the start of the given line (0-based)
function goto_line($fdata, $indexname, $line) {
    $findex = fopen($indexname, 'rb');
    fseek($findex, $line * 4, SEEK_SET);
    $i = unpack('Vx', fread($findex, 4));
    fseek($fdata, $i['x'], SEEK_SET);
    fclose($findex);
    return true;
}

$dataname = 'test.txt';
$indexname = 'test.idx';
update_index($dataname, $indexname);

// jump straight to line 20 and show ten records (lines 20-29)
$fdata = fopen($dataname, 'r');
$line = 20;
goto_line($fdata, $indexname, $line);
while (($line < 30) and (list($id, $text, $qty) = fgetcsv($fdata, 1000, ","))) {
    echo "<p><a href='getarticle.php?id=$id'>$id</a><br>\n";
    echo "$text<br>\n";
    echo "(quantity: $qty)</p>\n";
    $line++;
}
fclose($fdata);

?>
"As far as I can see right now I think Hamlets solution is the best and easiest way to go"

I agree that it may be the easiest, but I don't necessarily feel that it is the best. I am an advocate of looking at long-term issues rather than instant gratification. The life-cycle of a solution is severely overlooked by a lot of developers.

Hamlet just suggested how to create and use an index file that will help with searching through the text file more quickly. While it is a cool suggestion and a neat trick (nice work, Hamlet), it kind of makes my point: you are reinventing the wheel here. As this thing grows in complexity or size, you are just asking for long-term problems. Using a true DBMS (mysql, oracle, access, whatever...) is not just a solution I proposed because "that's the way it SHOULD be done...". I am not that much of a purist. I just know that in the long run, you will wish you had used a database. The SQL language was written specifically for this type of thing, because of the limitations that exist without it. I speak from experience: I wrote a shopping cart about 3 or 4 years ago using flat files (text files) for the product catalogue instead of a true database. It was not a fun ordeal in the least.

Regardless... It's your choice. You know the full logistics involved with your project. If it is a long-term project, that can possibly grow in size, then I still strongly suggest putting a little more time into it initially (and honestly, it does not take that long to set up). Anyway... Good luck to you, regardless of which method you choose.

Regards,
Barry
Hamlet081299:

I agree with most of what Barry says, and I generally try to look to longer-term solutions, but I was assuming that in this case the idea is to work with the existing data, which is produced by an old MSDOS app.

Hopefully one day the old app will be replaced?  
(I had the displeasure of a similar problem several years ago when writing a replacement for an old app. Every time the new app was started it loaded the entire old database into a new database, and then rewrote it on exit --- yeeeck!)

In the meantime though there are two ways to go...

1. Use the existing data as-is

This has the potential disadvantage of requiring more work to get the data out in a meaningful format - some wheel-reinventing required  ;-)

2. Transform/import the data into a database

This leads to data replication issues.  The database would need to be reloaded each time there were changes.

Depending on how often this happens that might be an acceptable solution, and it could be done in the PHP code itself reasonably efficiently (see the sketch after the steps below)...

1. Get timestamp of database (timestamp stored in simple file perhaps).

2. Check timestamp of data file.

3. If data file timestamp is later than database timestamp then ...
(a) Empty the database
(b) Read the data file and insert records into database
(c) Update the database timestamp

4. Use the data from the database

All of this is very do-able in PHP.
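
As a rough sketch of steps 1-4 (table name, credentials and file names are assumptions; the database timestamp lives in a small side file):

<?php
// sketch of the reload-on-change logic above; all names are placeholders
$datafile  = 'test.txt';
$stampfile = 'db.stamp';   // 1. timestamp of the last database load

$db = mysqli_connect('localhost', 'user', 'password', 'mydb');

// 2-3. reload only if the data file is newer than the last load
$last = file_exists($stampfile) ? (int)file_get_contents($stampfile) : 0;
if (filemtime($datafile) > $last) {
    mysqli_query($db, 'DELETE FROM articles');            // 3a. empty the table
    $fp = fopen($datafile, 'r');
    while (($row = fgetcsv($fp, 1000, ',')) !== false) {  // 3b. insert each record
        $id   = mysqli_real_escape_string($db, $row[0]);
        $text = mysqli_real_escape_string($db, $row[1]);
        $qty  = (int)$row[2];
        mysqli_query($db, "INSERT INTO articles (id, description, quantity)
                           VALUES ('$id', '$text', $qty)");
    }
    fclose($fp);
    file_put_contents($stampfile, time());                // 3c. update the timestamp
}

// 4. use the data from the database as usual
$res = mysqli_query($db, "SELECT id, description, quantity FROM articles ORDER BY id");
while ($r = mysqli_fetch_row($res)) {
    echo "$r[0]: $r[1] (quantity: $r[2])\n";
}
mysqli_close($db);
?>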

Barry's points are very valid, and I would also strongly advise that you think of the full scope of what is currently and may eventually be required.

If your work is anything like mine, you will provide a nice solution according to a rather loose spec, only to find there's a dozen other things that they want later.
MarcoDieleman (Asker):

You're both right. I think I have to go with the MySQL solution. I just heard the user wants pictures with some of the products, and when you list the products, the ones with pictures should be displayed first.

So I thought of building some admin pages to manually upload the text file and, after uploading, trigger the script to write the records to a MySQL db.
Is it possible to update only the changed records, so I can add a field for the picture name in the DB? Or is it easier to store this in a separate DB?

And now the final question: what is the script to do this???

You'll both get points because you helped me both very well.
ASKER CERTIFIED SOLUTION
Hamlet081299
No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:
Accept: axis_img
Please leave any comments here within the next seven days.
               
PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!
               
Sam Barnum
EE Cleanup Volunteer