loop_until


File storage suggestion

Hi everyone,

  I need some suggestions for a storage method in Java, and I have specific needs. For those who know it, this is for development on TINI (www.ibutton.com/TINI). Most JDK 1.2 code is portable to it, so if you don't know the platform, just give your suggestion anyway; it will most likely work ;).

  - It must not take much storage space or memory, as it is for an embedded device.
  - Speed is not critical, but it should not be too slow, as it runs on a small processor.
  - It must be searchable by a given key.
  - Each record will store multiple values.
  - My concern about databases is the resources they take, given how limited these are on an embedded device. I've seen some simplified DBs, but I think I would prefer working with flat files.
  - It must be flexible (adding a field without losing the meaning of existing data). This means a comma-separated text file is not that good ;).
  - It doesn't need to handle millions of records. 10000 would be a good maximum.

I'm open to any suggestion. I have my own ideas but want to compare and find better ones! Does anyone have an idea with a simple implementation?


Thanks in advance for your help, everyone!
CEHJ

Perhaps a serialized Hashtable? What would be in a record?
loop_until

ASKER

Something like:

  Unique location / Name / Description / Quantity / ...

Could you point me to a URL with an example or a simple implementation?

Thanks, seems interesting.  :-)
XML? Or is that too complex for the system?
Example:
<location id="1">
 <name>blah blah blah</name>
 <description>This is my description</description>
 <quantity>10</quantity>
</location>

You could maybe use SAX to search for the right location, so you don't have to load the whole document as you would with DOM.
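
To make the SAX idea concrete, here is a minimal sketch that streams through the file and keeps only the one matching record in memory. It assumes a standard JAXP parser is available (not confirmed for TINI) and that the records are wrapped in a single root element, called <locations> here; that root and the LocationFinder class are illustrative assumptions, not something from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers the fields of the <location> whose id matches.
final class LocationFinder extends DefaultHandler {
    private final String wantedId;
    private final StringBuffer text = new StringBuffer();
    private boolean inWanted = false; // true while inside the matching <location>
    boolean found = false;
    String name;
    String description;
    String quantity;

    LocationFinder(String wantedId) {
        this.wantedId = wantedId;
    }

    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("location".equals(qName)) {
            inWanted = wantedId.equals(atts.getValue("id"));
            if (inWanted) found = true;
        }
        text.setLength(0); // start collecting text for the new element
    }

    public void characters(char[] ch, int start, int length) {
        if (inWanted) text.append(ch, start, length);
    }

    public void endElement(String uri, String localName, String qName) {
        if (!inWanted) return;
        if ("name".equals(qName)) name = text.toString();
        else if ("description".equals(qName)) description = text.toString();
        else if ("quantity".equals(qName)) quantity = text.toString();
        else if ("location".equals(qName)) inWanted = false;
    }

    // Parses the stream once and returns the handler holding the result.
    static LocationFinder find(InputStream xml, String id) throws Exception {
        LocationFinder handler = new LocationFinder(id);
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<locations>"
            + "<location id=\"1\"><name>bolt</name><description>M4 bolt</description>"
            + "<quantity>10</quantity></location>"
            + "<location id=\"2\"><name>nut</name><description>M4 nut</description>"
            + "<quantity>25</quantity></location>"
            + "</locations>";
        LocationFinder hit = find(new ByteArrayInputStream(xml.getBytes("UTF-8")), "2");
        System.out.println(hit.found + " " + hit.name + " " + hit.quantity);
    }
}
```

Note that the parser still reads the whole file on every lookup; what SAX saves is memory, not time.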
A compressed serialization would give you less application overhead and your Hashtable would give you fast indexed look-ups.



Hashtable docs:
http://java.sun.com/products/jdk/1.2/docs/api/java/util/Hashtable.html

Serialization example:
http://developer.java.sun.com/developer/TechTips/2000/tt0229.html

Make your record a Serializable class.
CEHJ:

  OK, this is very interesting, but I'm missing something. I'm not an experienced Java programmer. I think I understand the hashtable principle and I understand serialization, but my concern is about gluing everything together. The second article, about serialization, was very good, but it is more about having a class representing "1 record" serialized in 1 file. What about having 5000 records and searching through them?


conick:

  This is very interesting too. Whichever I choose, I'll double the points and split them between both of you if you can explain your solutions a little more deeply. One of my ideas was to use http://castor.exolab.org/, as explained in http://www.onjava.com/pub/a/onjava/2001/10/24/xmldatabind.html. But you made me curious about SAX. Would you have a good example, like the link above, that explains the basics of XML with SAX? I would like it to be useful for multiple records.


Thanks both of you.
>>The second article about serialization was very good but it is more about having a class representing "1 record" serialized in 1 file. What about having 5000 records and searching thru them?
>>

It's the Hashtable *containing* the 'records' that you'd serialize. Searching through them would be done by their key.
Okay, got it, like it :-). You'll have the points. I'll just wait for more information from conick to know if I double the points or not.

I'll try to implement your solution right away.

As long as you can store the whole hashtable in memory, a serialized hashtable may be the way to go. Just make sure you read up on versioning (serialVersionUID) so you don't run into class versioning problems. I would only use a SAX implementation if I couldn't store all the data in memory and a DB was out of the question.
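
As a minimal sketch of the versioning point, assuming a platform where standard Java serialization is available: declaring serialVersionUID explicitly keeps previously saved data readable after a compatible change such as adding a field. The VersionedRecord class and its fields are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical record class; the field names are illustrative only.
final class VersionedRecord implements Serializable {
    // Declared explicitly so that recompiling the class, or adding a field
    // later, does not change the ID and invalidate data saved earlier.
    private static final long serialVersionUID = 1L;

    final int key;
    final String name;
    final int quantity;

    VersionedRecord(int key, String name, int quantity) {
        this.key = key;
        this.name = name;
        this.quantity = quantity;
    }

    // Serializes one record to a byte array.
    static byte[] toBytes(VersionedRecord record) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(buffer);
        out.writeObject(record);
        out.close();
        return buffer.toByteArray();
    }

    // Reads a record back from bytes produced by toBytes.
    static VersionedRecord fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
        try {
            return (VersionedRecord) in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        VersionedRecord copy = fromBytes(toBytes(new VersionedRecord(7, "bolt", 10)));
        System.out.println(copy.key + " " + copy.name + " " + copy.quantity);
    }
}
```

Without the explicit declaration, the ID is computed from the class structure, so an otherwise harmless edit to the class can make old files unreadable.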

A few questions I would ask myself about the project:
1) Can I store all data in memory? Is it prohibitively expensive to do so?
2) How often will I need to look up information? Should it lag on startup or take longer for each query?
3) What technologies am I familiar with? If you have to learn a new API, the learning curve can be steep.

Here's the SAX tutorial that I have bookmarked. There may be more recent ones; this one looks a couple of years old:
http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/index.html

Don't worry about points for me, I was just throwing out an idea. I really don't know enough about the system or project to comment much. If the hashtable works for the project, run with it.
Actually this interested me - I've thought about doing this before. Try this - maybe conick can suggest some performance enhancements:

import java.util.Hashtable;
import java.io.Serializable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileInputStream;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectInputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.ZipOutputStream;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;



public class DatabaseTest {
  final static int MAX_RECORDS = 100000;
  final static String DB_FILE_NAME = "db.zip";
  final static String DB_ENTRY_NAME = "db.ser";
  Database database;

  public DatabaseTest(){
    // This of course would not be hard-coded
    File databaseFile = new File(DB_FILE_NAME);
    if (databaseFile.exists()){
      // load from file
      //DEBUG
      System.out.println("Loading database from file...");
      database = loadDatabase();
      System.out.println("Done!");
    }
    else {
      //DEBUG
      System.out.println("Creating new database...");
      database = new Database(MAX_RECORDS);
      fillDatabase();
      System.out.println("Done!");
    }
  }

  void fillDatabase(){
    // Create MAX_RECORDS records and add them to the database
    Random rand = new Random();
    for(int i= 0;i < MAX_RECORDS;i++){
      int key = i;
      int field2 = rand.nextInt(Integer.MAX_VALUE);
      double field3 = rand.nextDouble();
      String field4 = "asdasdgaksjhdgkajhsdg0difsdjfnksj";
      Record record = new Record(key,field2,field3,field4);
      database.put(new Integer(key),record);
    }
  }
/*
  Database loadDatabase(){
    Database database = null;
    File databaseFile = new File(DB_FILE_NAME);
    ObjectInputStream in = null;
    try {
      in = new ObjectInputStream(new FileInputStream(databaseFile));
      database = (Database)in.readObject();
      in.close();
    }
    catch(IOException e){
      e.printStackTrace();
    }
    catch(ClassNotFoundException e){
      e.printStackTrace();
    }
    return database;
  }
*/

/*
  Database loadDatabase(){
    Database database = null;
    File databaseFile = new File(DB_FILE_NAME);
    ObjectInputStream in = null;
    ZipEntry entry = null;
    try {
      ZipInputStream zis = new ZipInputStream(new BufferedInputStream(new FileInputStream(databaseFile)));
      in = new ObjectInputStream(zis);
      while((entry = zis.getNextEntry()) != null){
        database = (Database)in.readObject();
      }
      in.close();
    }
    catch(IOException e){
      e.printStackTrace();
    }
    catch(ClassNotFoundException e){
      e.printStackTrace();
    }
    return database;
  }
*/

  Database loadDatabase(){
    Database database = null;
    File databaseFile = new File(DB_FILE_NAME);
    ObjectInputStream in = null;
    ZipEntry entry = null;
    try {
      ZipFile zf = new ZipFile(databaseFile);
      entry = zf.getEntry(DB_ENTRY_NAME);
      BufferedInputStream bis = new BufferedInputStream(zf.getInputStream(entry));
      in = new ObjectInputStream(bis);
      database = (Database)in.readObject();
      in.close();
      zf.close();
    }
    catch(IOException e){
      e.printStackTrace();
    }
    catch(ClassNotFoundException e){
      e.printStackTrace();
    }
    return database;
  }

  void saveDatabase(){
    //DEBUG
    System.out.println("Saving database...");
    File databaseFile = new File(DB_FILE_NAME);
    ObjectOutputStream out = null;
    try {
      ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(new FileOutputStream(databaseFile)));
      ZipEntry entry = new ZipEntry(DB_ENTRY_NAME);
      zos.putNextEntry(entry);
      out = new ObjectOutputStream(zos);
      out.writeObject(database);
      out.flush();
      out.close();
      //DEBUG
      System.out.println("Done!");
    }
    catch(IOException e){
      e.printStackTrace();
    }
  }

  public void lookUp(int key){
    // Look up a record in the db
    //DEBUG
    System.out.println("Looking up record with key " + key + "...");
    System.out.println(database.get(new Integer(key)));
    //DEBUG
    System.out.println("Done!");
  }



  public static void main(String[] args){
    // Load or create database
    DatabaseTest dt = new DatabaseTest();
    // Look up the 51st record
    dt.lookUp(50);
    // You would only do this if it had been created or changed
    dt.saveDatabase();
  }

}

class Database extends Hashtable {
  public Database (int initialCapacity){
    super(initialCapacity);
  }
}

class Record implements Serializable {
  int key;
  int field2;
  double field3;
  String field4;

  public Record(int key, int field2, double field3, String field4){
    this.key = key;
    this.field2 = field2;
    this.field3 = field3;
    this.field4 = field4;
  }

  public int hashValue(){
    return key;
  }

  public String toString(){
    return  new StringBuffer().append("[")
                              .append("key=")
                              .append(key)
                              .append(",")
                              .append("field2=")
                              .append(field2)
                              .append(",")
                              .append("field3=")
                              .append(field3)
                              .append(",")
                              .append("field4=")
                              .append(field4)
                              .append("]")
                              .toString();
  }
}
What are these awful 30 x 50 (approx) windows that open when we post conick?
This is all very interesting. But as conick pointed out, loading everything in memory might be a problem. As it is a small microcontroller, memory is limited. I'm confused :-\.

And SAX seems a bit complicated to implement. Well, more than the serialized method... Am I wrong?

Does anyone have an idea for a database on a small embedded system with limited processor / memory power?

The "hard disk" is a 1MB flash card (will be expanded later) with a small filesystem. The "memory" is a 512 bytes page SRAM, if I remember correctly...

Would it be possible to modify the serialized hashtable method so that only what's needed is kept in memory? For example, what if I saved a serialized hashtable containing pointers to individual files, each containing one record? This is not very clean, I know, and I'll have to see whether the small filesystem can handle it, but wouldn't it allow better "memory" management? It is not the fastest if I have to browse through all the records... Does this seem stupid?

Well, we can possibly think of ways to improve memory management, but you have been unclear about what your records are and how large they are. Do you not know this? If not, it's hard to assess.
Records would look like:

   Integer UniqueID (*)
   Byte Addr
   Byte X
   Byte Y
   Byte Z
   String Part (max 30 chars)
   String Description (optional if too big or truncated if necessary)
   Word Qty

This would be the biggest record type. The maximum number of records would be 5000.

I have a small question along the way. When working with FileStreams to append to a text file, does Java load the file in memory or append on the disk?

Thanks for your help :-)
Okay, sorry everyone, I think serialization is *not* supported by the current firmware of the TINI we own. Remember I told you almost everything in JDK 1.2 was supported? Well, we found one of the current limitations.

It is currently available in another firmware, but as I cannot prototype anything now, it would help if we could find other solutions... :-\

Gee, I feel bad. I was loving the solutions we were getting here.

Any *other* ideas?

Thanks again.
>>I think serialization is *not* supported by the current firmware of TINI we own.

What does this mean - do you mean it does not support persistence? Serialization itself is specific to Java.
Yes, but the TINI is an embedded system using *almost* JDK 1.2. It has its own classes, which are similar (almost transparent, we could say), so when you develop you don't see the difference, but you compile against their tini.jar files.

Taken from their Limitations.txt file:

-=---------------=-
TINI Firmware 1.02e
-=---------------=-

  [...]

- TINI does not currently support serialization.


However, it is supported in Firmware 1.10, which does not work on every board...

http://www.ibutton.com/TINI/software/index.html

It is, despite all this, a very interesting microcontroller, compared to old microcontrollers programmed in ASM or C. TINI and Javelin (from http://www.parallax.com) are both Java-ready microcontrollers.

Pity. conick's xml might do it. Minimal markup would add 7 bytes per field.
Or if you just saved the fields to a file, with a Description field of 24 bytes, you could store a record in 64 bytes.
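
One way the 64-byte figure can work out, as a rough sketch: 4 bytes for UniqueID, 4 single bytes for Addr/X/Y/Z, 30 bytes for Part, 24 for Description and 2 for Qty. The sketch assumes one byte per character (plain ASCII) and that RandomAccessFile is usable on the target, which is not confirmed for TINI; the FixedRecordFile class and its method names are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical fixed-width record file: every record occupies exactly 64 bytes,
// so record number n starts at byte n * 64 and can be reached with one seek.
final class FixedRecordFile {
    static final int PART_LEN = 30;
    static final int DESC_LEN = 24;
    static final int HEADER_LEN = 4 + 4; // int id + addr, x, y, z
    static final int RECORD_SIZE = HEADER_LEN + PART_LEN + DESC_LEN + 2; // 64

    // Pads with spaces or truncates to exactly len bytes (ASCII assumed).
    static byte[] fixed(String s, int len) {
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            out[i] = (byte) (i < s.length() ? s.charAt(i) : ' ');
        }
        return out;
    }

    static void write(RandomAccessFile f, int slot, int id, byte addr, byte x, byte y,
                      byte z, String part, String desc, int qty) throws IOException {
        f.seek((long) slot * RECORD_SIZE);
        f.writeInt(id);
        f.writeByte(addr);
        f.writeByte(x);
        f.writeByte(y);
        f.writeByte(z);
        f.write(fixed(part, PART_LEN));
        f.write(fixed(desc, DESC_LEN));
        f.writeShort(qty); // 16-bit quantity
    }

    static int readId(RandomAccessFile f, int slot) throws IOException {
        f.seek((long) slot * RECORD_SIZE);
        return f.readInt();
    }

    static String readPart(RandomAccessFile f, int slot) throws IOException {
        f.seek((long) slot * RECORD_SIZE + HEADER_LEN);
        byte[] buf = new byte[PART_LEN];
        f.readFully(buf);
        return new String(buf, "ISO-8859-1").trim();
    }

    static int readQty(RandomAccessFile f, int slot) throws IOException {
        f.seek((long) slot * RECORD_SIZE + HEADER_LEN + PART_LEN + DESC_LEN);
        return f.readUnsignedShort();
    }

    public static void main(String[] args) throws IOException {
        RandomAccessFile f = new RandomAccessFile("records.dat", "rw");
        write(f, 0, 1001, (byte) 1, (byte) 2, (byte) 3, (byte) 4, "bolt", "M4 bolt", 10);
        System.out.println(readId(f, 0) + " " + readPart(f, 0) + " " + readQty(f, 0));
        f.close();
    }
}
```

Only the record being read or written is ever in memory. The cost is flexibility: adding a field means rewriting the file with a new record size.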
Wow, you guys have been busy.

I didn't realize the platform wasn't 1.2 compliant. If it doesn't support serialization, I really doubt it supports a standard XML parser either.

If there are DB implementations for use with TINI, I think that is your best bet. Make sure it's not an in-memory DB, otherwise you run into the same memory problem as with the hashtable and DOM implementations.

Another (ugly) possibility is to use flat fixed-width or delimited files in some way. You could probably come up with a proprietary file format. Maybe something based on a line in the file being a "record". You wouldn't load the whole file into memory but look for the first token on a line for your id. If it matches read the rest of the line and parse.
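
The line-per-record idea could look roughly like the sketch below: read one line at a time, compare only the first token with the wanted id, and parse the rest only on a match. The '|' delimiter, the LineScan class and the sample data are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical line-oriented lookup: only one line is held in memory at a time.
final class LineScan {

    // Returns the fields of the first line whose first token equals id, or null.
    static String[] find(Reader source, String id, char delimiter) throws IOException {
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            int cut = line.indexOf(delimiter);
            String first = cut < 0 ? line : line.substring(0, cut);
            if (first.equals(id)) {
                return split(line, delimiter); // parse only the matching line
            }
        }
        return null;
    }

    // Splits on a single-character delimiter, keeping empty fields.
    static String[] split(String line, char delimiter) {
        List<String> parts = new ArrayList<String>();
        int start = 0;
        int cut;
        while ((cut = line.indexOf(delimiter, start)) >= 0) {
            parts.add(line.substring(start, cut));
            start = cut + 1;
        }
        parts.add(line.substring(start));
        return parts.toArray(new String[parts.size()]);
    }

    public static void main(String[] args) throws IOException {
        String data = "1|bolt|M4 bolt|10\n2|nut|M4 nut|25\n";
        String[] fields = find(new StringReader(data), "2", '|');
        System.out.println(fields[1] + " " + fields[3]);
    }
}
```

In real use the Reader would be a FileReader over the data file. A lookup costs a scan of, on average, half the file, which may be acceptable for a few thousand short records.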

CEHJ: That's an interesting database implementation. If you abstract out Record and give it a public API you could use it for just about anything.
>>If you abstract out Record and give it a public API you could use it for just about anything.

Shall be going into production tomorrow :-)

Actually I found I got an 85% compression ratio on the data. Saving and loading were a bit slow though - about 10 seconds each on my machine. Maybe a bit of nio might help..?
Just out of interest, loop_until, 5000 of your records serialized to 28,433 bytes with my code. You'd probably get better compression with non-random data.
Wow CEHJ, your implementation is very nice. I might not use it for this project, but I'll keep it in mind.

=================================

Someone told me:

"This is probably your only hope:

http://hsqldb.sourceforge.net/

Hypersonic SQL was an unusual database in that it does not have a file store, i.e. all data is kept in memory.  This will work to your advantage on TINI because you will not take a double memory hit of a complete database in file space and the same data in RAM space. When HSQLDB quits, it can be setup to dump the database to a file, but what it dumps is the complete SQL commands to recreate the memory image.  The database originated in Java 1 days, so it may run--if not you get source code."

=================================

And, finally, I have *3* more small questions:

1. When working with FileStreams to append to a text file, does java load the file in memory or append on the disk?
2. Is it possible to "index" a line number and access it very quickly?
3. >> "You wouldn't load the whole file into memory but look for the first token on a line for your id. If it matches read the rest of the line and parse."
  Well, related to the first question. I guess I can only load line by line?
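
On questions 1 and 2, a small sketch of how this behaves on a standard JDK (TINI's file system may differ, so treat it as an assumption to verify): opening a FileWriter in append mode writes at the end of the file without reading the existing contents, and a RandomAccessFile can jump straight to a byte offset recorded earlier. The AppendDemo class and the '|' record format are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.Writer;

// Hypothetical helper: appends lines and remembers where each one starts.
final class AppendDemo {

    // Appends one line and returns the byte offset at which it begins.
    // Assumes one byte per character (plain ASCII data).
    static long appendLine(File file, String line) throws IOException {
        long offset = file.length(); // 0 if the file does not exist yet
        Writer out = new BufferedWriter(new FileWriter(file.getPath(), true)); // true = append
        try {
            out.write(line);
            out.write('\n');
        } finally {
            out.close();
        }
        return offset;
    }

    // Reads the single line that starts at the given byte offset.
    static String readLineAt(File file, long offset) throws IOException {
        RandomAccessFile in = new RandomAccessFile(file, "r");
        try {
            in.seek(offset);
            return in.readLine();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = new File("records.txt");
        appendLine(file, "1|bolt|10");
        long second = appendLine(file, "2|nut|25");
        System.out.println(readLineAt(file, second));
    }
}
```

A line number by itself cannot be looked up quickly when lines vary in length, but a stored byte offset per line (or fixed-length lines) makes the jump a single seek.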

=================================

Conclusion:

  - I'll keep serialization in mind.
  - I think I'll have a look at Hypersonic SQL.
  - Finally, I might end up doing my own database implementation using a proprietary file with an index file by its side to speed up searching and to keep track of what each column means and its width. :-\ Not very beautiful, but I think the simplest way might do the trick.

Don't forget my 3 little questions ;-). Thanks!
ASKER CERTIFIED SOLUTION
CEHJ
Thanks CEHJ. Your help was greatly appreciated. Your knowledge was of great help and I learned interesting things.

Have a nice day!

conick:

Please go claim your points: https://www.experts-exchange.com/questions/20411815/Points-for-conick.html.

I know you didn't mind about the points, but your help was also greatly appreciated.

No problem. I'm quite interested in these kinds of restrictions actually and would be keen to hear how you get on with TINI. Keep in touch if you want:
cj@DONTEVENTHINKOFSPAMMINGproteanit.net
[delete the obvious :-)]
Okay, I'll give you feedback on that :-).
Hypersonic SQL is sort of a memory hog. You would probably be better served going with CEHJ's suggestion to come up with a way yourself to store the information in files.

I like the idea of having an "index" file that can be read quickly and that points to an area of the larger data file(s).  If there are many records you could use a b-tree implementation for faster access to indexes.
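
A simple version of that index file, short of a full b-tree, might look like the sketch below: fixed 8-byte entries (record id, then offset into the data file) kept sorted by id and binary-searched directly on disk, so the index never has to be loaded into memory. The OffsetIndex class and the entry layout are assumptions for illustration, and RandomAccessFile support on the target is not confirmed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical on-disk index: sorted entries of (int id, int offset).
final class OffsetIndex {
    static final int ENTRY_SIZE = 8; // 4-byte id + 4-byte offset

    // Appends one entry. Ids must be added in ascending order so that
    // the file stays sorted for the binary search below.
    static void append(RandomAccessFile index, int id, int offset) throws IOException {
        index.seek(index.length());
        index.writeInt(id);
        index.writeInt(offset);
    }

    // Binary search over the index file itself.
    // Returns the data-file offset for id, or -1 if the id is not present.
    static int lookup(RandomAccessFile index, int id) throws IOException {
        long lo = 0;
        long hi = index.length() / ENTRY_SIZE - 1;
        while (lo <= hi) {
            long mid = (lo + hi) / 2;
            index.seek(mid * ENTRY_SIZE);
            int midId = index.readInt();
            if (midId == id) {
                return index.readInt(); // the offset stored right after the id
            }
            if (midId < id) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        RandomAccessFile index = new RandomAccessFile("records.idx", "rw");
        index.setLength(0); // start from an empty index for the demo
        append(index, 10, 0);
        append(index, 20, 64);
        append(index, 35, 128);
        System.out.println(lookup(index, 20));
        index.close();
    }
}
```

With 5000 records the search needs at most 13 seeks per lookup and the index occupies about 40 KB. Inserting ids out of order would require rewriting part of the file, which is where a b-tree starts to pay off.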

I think you'll want to stay away from "in-memory" anything (whether it be DBs or hashtables etc).
Thanks conick, I think I'll follow your advice (and CEHJ's) and make my own implementation of a database with a little of everything we talked about here that's applicable.

:-)

Have a nice day everyone!