BACKGROUND:
I have written an engineering application that uses a ClientDataset with DBGrid and navigator components, which makes development easier and "cleaner" in terms of both the code and the on-screen appearance. Previously, I had been using my own scheme for storing the data sets, with TurboPower's Orpheus data table to enter, edit, and view data.
The purpose of my application (or, more accurately, subcomponent of an application) is to manage several data sets consisting of terrain data points. The data sets can be manipulated by combining them in various ways to make new data sets. They are stored together in a single database file as nested data sets.
~~~~~~~~~~~~~~~~~~~~~~~~
INPUT DATA DESCRIPTION:
The raw "terrain" data are coordinate points obtained from ground surveys and saved in "point files" (using AutoCAD terminology) that come right out of survey instruments. The raw data files can have a variety of formats. Data for a "point" consists of at least three floating point numbers (X, Y, and Z coordinates), but may also include a unique point ID number (integer), and a point code (string). These files are in CSV format and frequently contain more than 200,000 coordinate points.
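To make the record layout concrete, here is a minimal sketch of parsing one line of such a point file. It assumes one particular column order (Number, X, Y, Z, Description); real files vary, so the order would need adjusting. TTerrainPoint and ParsePointLine are illustrative names only, the sample line is made up, and StrictDelimiter requires Delphi 2006 or later.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
program ParsePointDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Illustrative record; field names mirror those used in the import loop.
  TTerrainPoint = record
    Number: Integer;
    X, Y, Z: Double;
    Description: string;
  end;

// Parses one line ASSUMED to be laid out as: Number,X,Y,Z,Description
// Returns False if a required field is missing or is not numeric.
function ParsePointLine(const Line: string; out P: TTerrainPoint): Boolean;
var
  Fields: TStringList;
  Code: Integer;
begin
  Result := False;
  Fields := TStringList.Create;
  try
    Fields.StrictDelimiter := True;  // split on commas only, not on spaces
    Fields.Delimiter := ',';
    Fields.DelimitedText := Line;
    if Fields.Count < 4 then Exit;
    // Val always expects '.' as the decimal separator, whatever the locale.
    Val(Trim(Fields[0]), P.Number, Code);
    if Code <> 0 then Exit;
    Val(Trim(Fields[1]), P.X, Code);
    if Code <> 0 then Exit;
    Val(Trim(Fields[2]), P.Y, Code);
    if Code <> 0 then Exit;
    Val(Trim(Fields[3]), P.Z, Code);
    if Code <> 0 then Exit;
    // The point code is optional.
    if Fields.Count > 4 then
      P.Description := Trim(Fields[4])
    else
      P.Description := '';
    Result := True;
  finally
    Fields.Free;
  end;
end;

var
  P: TTerrainPoint;
begin
  // Made-up sample line, for illustration only.
  if ParsePointLine('101,5021.37,1833.92,412.05,EDGE OF ROAD', P) then
    WriteLn(P.Number, ' ', P.X:0:2, ' ', P.Y:0:2, ' ', P.Z:0:2, ' ', P.Description)
  else
    WriteLn('Could not parse line');
end.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Files that carry only X, Y, and Z would need a different field mapping; this sketch rejects lines with fewer than four fields.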
~~~~~~~~~~~~~~~~~~~~~~~~
PERFORMANCE PROBLEM:
Since the ClientDataset.LoadFromFile method only supports the CDS and XML file formats, I am reading the CSV file and inserting the data into the ClientDataset (see code snippet below). After about 20k-30k inserts, performance slows down dramatically. However, once the data have been inserted into the ClientDataset and saved to a file, in either CDS or XML format, LoadFromFile only takes a couple of seconds. So the time it takes to insert the CSV data into the ClientDataset more than offsets its convenience.
I am interested in finding a solution to the slow import of CSV files into a ClientDataset. For me, that would be the cleanest way of storing my data files, specifically by making use of nested data sets and storing them as binary (*.CDS) files. I prefer saving the CDS in binary format because it makes it harder for people to fiddle with the data outside of my application's control than a text format would. If it is possible to muck up a text data file, people will do it. They can still alter a binary file, but if they do not understand what they see, they seem less likely to touch it.
If there is no solution to the ClientDataset insert performance, I would like to know about alternative solutions.
My DB programming knowledge is scant and this is my first EE question.
~~~~~~~~~~~~~~~~~~~~~~~~
SAMPLE CODE (includes attempts to insert fields by name and by number):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Factor := 100/(TerrainDataSet.PointList.Count - 1);
dlgProgressBar.Show;
ClientDataSet2.LogChanges := False;
with ClientDataSet2 do begin
//  Open;
  LogChanges := False;
  for i := 0 to TerrainDataSet.PointList.Count - 1 do begin
    // Keep the UI responsive and let the user cancel with Esc
    Application.ProcessMessages;
    if ((GetKeyState(VK_Escape) and 128) = 128) then begin
      if MessageDlg('Stop importing data?', mtConfirmation, [mbYes, mbNo], 0) = mrYes then Break;
    end;
    TP := TerrainDataSet.PointList[i];
    // Attempt 1 (disabled): assign each field by name
//    Append;
//    FieldByName('Number').AsInteger := i;
//    FieldByName('X').AsFloat := TP^.X;
//    FieldByName('Y').AsFloat := TP^.Y;
//    FieldByName('Z').AsFloat := TP^.Z;
//    FieldByName('Description').AsString := TP^.Description;
    // Attempt 2: assign all fields by position in one call
    AppendRecord([TP^.Number,TP^.X,TP^.Y,TP^.Z,TP^.Description]);
    if (i mod 100 = 0) then dlgProgressBar.ProgressBar.Percent := Round(Factor*i);
  end;
end;

dlgProgressBar.Close;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~