Read a large database into memory more efficiently

riskassessor asked
Medium Priority | 223 Views | Last Modified: 2010-04-05
I have a program that I would like to make more efficient. A large part of the processing time is spent reading two large databases into memory (about 150,000 records each) so that I can do computations on them. I am reading them in with the construction:

mydatabasetable.first;
while not mydatabasetable.eof do begin
<assign the database fields to various variables>
mydatabasetable.next;
end;

I am finding that it takes several minutes to read each database. Is there a faster way to get the data into memory?

Commented:
I think the "<assign the database fields to various variables>" part takes most of the time. Check that, e.g. by just iterating through mydatabasetable with no assignments at all. If that is the case, consider changing the assigning part, e.g. some of the computations or operations could be done later.
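To make this concrete, here is a minimal sketch of the timing test suggested above, assuming mydatabasetable is the open TDataSet from the question (GetTickCount is from the Windows unit, ShowMessage from Dialogs). If the bare pass is fast, the time is going into the assignments rather than the iteration itself:

```delphi
// Time a pass over the table with no field access at all.
procedure TimeBareIteration;
var
  Start: Cardinal;
begin
  Start := GetTickCount;
  mydatabasetable.First;
  while not mydatabasetable.Eof do
    mydatabasetable.Next;              // no assignments, just iteration
  ShowMessage(Format('Bare pass: %d ms', [GetTickCount - Start]));
end;
```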

Commented:
If you use mydatabasetable['fieldname'], change it to mydatabasetable.FieldByName('fieldname').As...
It may be a little faster, since it avoids a Variant conversion.
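Along the same lines, calling FieldByName once per field before the loop and reusing the TField references avoids the per-record name lookup entirely. A minimal sketch, with hypothetical field names:

```delphi
// Look each TField up once, before the loop, and reuse the references.
procedure ReadWithCachedFields;
var
  NameField, AgeField: TField;
  SomeName: string;
  SomeAge: Integer;
begin
  NameField := mydatabasetable.FieldByName('Name');
  AgeField  := mydatabasetable.FieldByName('Age');
  mydatabasetable.First;
  while not mydatabasetable.Eof do
  begin
    SomeName := NameField.AsString;    // no by-name lookup per record
    SomeAge  := AgeField.AsInteger;
    mydatabasetable.Next;
  end;
end;
```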

Author

Commented:
My code already implements both the above suggestions. Thanks.

Commented:
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, DB, ADODB;

type
  TForm1 = class(TForm)
    ado: TADOQuery;
    Button1: TButton;
    procedure FormCreate(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    Buffer: TList;
  end;

type
  pDbData = ^rDbData;
  rDbData = record
    Name: string[255];
    Age: integer;
    Telephone: integer;
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
begin
  Buffer := TList.Create;
end;

procedure TForm1.FormDestroy(Sender: TObject);
begin
  while Buffer.Count <> 0 do
  begin
    Dispose(pDbData(Buffer[0]));  // typed dispose, so the record is freed correctly
    Buffer.Delete(0);
  end;
  freeandnil(Buffer);
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  k: pDbData;
begin
  ado.Open;
  while not (ado.Eof) do
  begin
    new(k);
    k.Name := ado.Fields.Fields[0].AsString;
    k.Age := ado.Fields.Fields[1].AsInteger;
    k.Telephone := ado.Fields.Fields[2].AsInteger;
    Buffer.Add(k);
    ado.Next;
  end;
end;

end.

ado.Fields.Fields[X].AsSomething may help a little bit. But the datasets are big: have you tried cutting a dataset into more than one part, so you could work with the parts in more than one thread?

Author

Commented:
Kristao, thanks for the algorithm and ideas.

The algorithm you propose is essentially the same as the one already in my program.

Regarding your suggestion of working with more than one thread: I have not tried it, but I would not have expected it to be any quicker on a single-processor machine.

In reply to several commenters: I find it makes little difference to the speed whether I use Fields.Fields[X].As..., FieldByName('fieldname').As..., or FieldValues['fieldname'].
Commented:
OK, since these are big datasets, I suppose you need to use the data that is in them. This idea could make your software a little quicker:

One thread reads data from the dataset into a buffer.

A second thread takes data from the buffer and works on it, so you don't need to wait until the whole dataset is loaded into memory.

There will be a little contention in DataPut(Data: Pointer) and DataGet(var Data: Pointer), because in a multithreaded program you need to use a TCriticalSection. > "TCriticalSection allows a thread in a multi-threaded application to temporarily block other threads from accessing a block of code."

I use this kind of technique myself. My software gets a very big dataset, with more than 80,000 records in it, and I can't wait until all the data is in memory. I start reading the dataset and putting the info into a buffer, and other threads just take the data from it and start working on it :). In my case there is 1 data reader and up to 10 data workers :)
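A minimal sketch of the DataPut/DataGet pair described above, guarding the shared buffer with a TCriticalSection (Classes for TList, SyncObjs for TCriticalSection; the names and the nil-on-empty convention are illustrative, and pDbData is the pointer type from the earlier code sample):

```delphi
var
  BufferLock: TCriticalSection;   // guards Buffer across threads
  Buffer: TList;

// Producer side: called by the reader thread for each record.
procedure DataPut(Data: Pointer);
begin
  BufferLock.Acquire;
  try
    Buffer.Add(Data);
  finally
    BufferLock.Release;
  end;
end;

// Consumer side: called by worker threads.
// Returns nil if the buffer is momentarily empty.
function DataGet: Pointer;
begin
  BufferLock.Acquire;
  try
    if Buffer.Count > 0 then
    begin
      Result := Buffer[0];
      Buffer.Delete(0);
    end
    else
      Result := nil;
  finally
    BufferLock.Release;
  end;
end;
```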

regards
Kristao.
