Most efficient SQL Insert / update

AlexPonnath
Hi, wondering what the most effective way would be to do the following.

I have 2 tables: one is the current status of a device, the second is its history. I always want to keep the current info in one table and the history in a second table, so that the current info table stays smaller and faster and large queries on the history don't bog it down.
My question now is what is the easiest way to do this, since I ultimately have to either insert or update the current table and then also insert into the history table.

Is it better to query the current table and insert if there is no match / update if there is, or just delete the record and always insert a new one?

As for the history table, depending on the above solution I could either query the current one and insert that row into history, or just always insert into history as well as into the current table.

Just as a note, there will be between 50,000 and 100,000 records in the current table and up to 1 million updates per hour, so that's why I am looking at performance.

Thanks

Alex  
Dale Burrell, Director
Commented:
I wouldn't assume that using 2 tables is the best way to go, to be honest... with the right indexes etc. even a large table should run quickly.

Also just to note, you've listed this under both SQL Server and MySQL, which are radically different databases... it would help to know which you are using.

My preference would be to use a single table, with a field which points to the parent record, so the parent is the current data and the children (defined by those which have a link to a parent) are the history. You can build an index which will make that fast.

Then put a trigger on the table that inserts a history record whenever a parent record is changed.

This same principle can be used if you use a separate table for the history, and to be honest, if you started with a single table and then found that performance really was an issue, you could easily modify the code to use a second table. You could even run as I've suggested and then move the history records to a second table in a batch job overnight or similar, depending on what's best.

Hope that helps

Commented:
Please clarify a couple of things:

Do you intend to capture all changes or just a snapshot at periodic intervals?

How current does the history need to be?

Does the history contain change versions of the same entity over time, or just the last version of each entity (especially after it gets "deleted")?

Author

Commented:
As for the DB, for now in the beta stage we will use MS SQL Server 2008, but we are thinking of going to MySQL down the road because of cost and some of the modules available for MySQL.

Somehow I am not following the single table with a pointer to the parent record and the children...

Author

Commented:
It is GPS data, which means there is a record in the current table which has the deviceid and the lat and long. In the history I want to store all the historical values, which are sent every 5 minutes, for the position of the device, so I can tell where it was over the last hours etc.

 
Dale Burrell, Director
Commented:
create table MyTable
(
  Id uniqueidentifier,
  -- .. all your fields
  ParentId uniqueidentifier
)

Parent (live) row: Id1, All your fields, null
Child 1 (history) row: Id2, All your fields, Id1
Child 2 (history) row: Id3, All your fields, Id1

--trigger (rough)

insert into MyTable (All your fields, ParentId)
  select All your fields, Id from Deleted -- i.e. old data

Something along those lines...
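
To make that a bit more concrete, assuming the GPS columns mentioned elsewhere in the thread (a device id, lat, long and a date), a rough version could look like this; the column names and types are only for illustration:

create table MyTable
(
  Id uniqueidentifier not null default newid() primary key,
  DeviceId int,
  lat decimal(9,6),
  long decimal(9,6),
  date datetime,
  ParentId uniqueidentifier null  -- null = live row, otherwise points at the live row
)
GO

-- after a live row is updated, copy its old values back in as a history (child) row
create trigger trMyTable_History on MyTable
after update
as
begin
  insert into MyTable (Id, DeviceId, lat, long, date, ParentId)
  select newid(), d.DeviceId, d.lat, d.long, d.date, d.Id
  from Deleted d
  where d.ParentId is null  -- only version live rows, never history rows
end
GO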

Author

Commented:
That sample would store all the data in a single table, which I don't think is good for performance. Especially if I need to delete records from the history table every night, it would impact the query performance. Keep in mind the history could have 10 million or more records.

Dale Burrell, Director

Commented:
As I said, with the correct design, e.g. indexes etc., I don't think the number of records will be an issue... it will depend on your database of choice, but enterprise databases, e.g. SQL Server, are designed to handle that quantity of data and more.

And also as I said, it's trivial to use the trigger to insert into another table if that's what you prefer.

Author

Commented:
My question was more along the lines of what will be more efficient:

doing a select on the row and, if count > 0, doing an update, otherwise an insert; or doing a delete of the record and always an insert.

What is more effective and faster?
Commented:
I would create a single procedure for "capture_location"

In capture_location, your ratio of existing to non-existing devices should be very, very high. Therefore, I propose you fire off the update, followed by an insert only when the update fails to change any rows.

begin trans
    insert history
    update current where device matches
    if no rows updated
        insert current
commit trans
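
Spelled out in T-SQL, inside the capture_location procedure, that pattern might look roughly like this; the CurrentData / HistoryData table and column names are borrowed from the example further down the thread and are only assumptions:

-- assumed tables: CurrentData(DeviceId, lat, long, date) and HistoryData with the same columns
begin tran

    -- always log the new reading into history
    insert into dbo.HistoryData (DeviceId, lat, long, date)
    values (@DeviceId, @lat, @long, getdate())

    -- try the update first, since it succeeds for the vast majority of calls
    update dbo.CurrentData
    set lat = @lat, long = @long, date = getdate()
    where DeviceId = @DeviceId

    -- only a brand-new device falls through to the insert
    if @@rowcount = 0
        insert into dbo.CurrentData (DeviceId, lat, long, date)
        values (@DeviceId, @lat, @long, getdate())

commit tran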

     



Dale Burrell, Director
Commented:
In general, the less complexity, the faster the database will run, so simple inserts/updates may run faster than checking if the rows exist first. There are other factors to consider as well, such as indexes. In particular a clustered index will slow down inserts, as will all indexes to some extent, but it depends on whether you need to be able to get the data out quickly or not. Foreign keys also slow down inserts/updates, so if you're not too worried about data integrity (or you're pretty confident it can't be wrong) then you could leave those off. Even wrapping the query in a stored procedure may slow it down a bit.

However, the only way you are going to know for sure is to run the system and see what the performance is like. I would set up a couple of test systems, try the different approaches offered above, and see which performs best under heavy load. I don't know of any other way you can be sure in advance of how it will perform.

In my experience databases can usually handle lots of simple data better than complex code and/or queries. Also, you can usually optimise performance for insert/update or for access, but it is harder to optimise for both. So it does depend on what's most important.
Top Expert 2012

Commented:
>>What is more effective and faster?<<
Perhaps MERGE is more efficient than both of those options; unfortunately, as Dale has pointed out, only you are going to be able to determine the best approach to take.
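
For reference, an upsert with MERGE (SQL Server 2008+, so it would not carry over to MySQL) against the CurrentData table sketched later in the thread might look roughly like this; table and column names are assumptions:

-- single-statement upsert of the latest reading into the current table
MERGE dbo.CurrentData AS tgt
USING (SELECT @DeviceId AS DeviceId, @lat AS new_lat, @long AS new_long) AS src
    ON tgt.DeviceId = src.DeviceId
WHEN MATCHED THEN
    UPDATE SET lat = src.new_lat, long = src.new_long, date = GETDATE()
WHEN NOT MATCHED THEN
    INSERT (DeviceId, lat, long, date)
    VALUES (src.DeviceId, src.new_lat, src.new_long, GETDATE());

The history insert would still be a separate statement (or a trigger) alongside it.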
Olaf Doschke, Software Developer
Commented:
In what case do you really need an update of an existing record? This is really just simple logging of GPS positions, so in the case of a 1-table solution I'd also prefer and suggest that you only have inserts.

Assuming you log the GPS positions of many devices, each having some device ID, you can find the last known position by adding a datetime field; querying for the last record of some device will give you its last known position.

select top 1 * from gpspositions where deviceid = ? order by datetime desc

So you need an index on deviceid (nonclustered) and a clustered index on (deviceid, datetime); that should do it.
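
A rough illustration of that layout (gpspositions is the table named in the query above; the other column names and types are only assumptions):

-- single append-only log of GPS readings; the latest row per device is the "current" position
create table gpspositions
(
  deviceid int not null,
  lat decimal(9,6) not null,
  long decimal(9,6) not null,
  datetime datetime not null
)

-- clustered index so the per-device, most-recent lookup is a short seek
create clustered index cx_gpspositions on gpspositions (deviceid, datetime)

-- last known position of one device
select top 1 * from gpspositions where deviceid = @DeviceId order by datetime desc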

In the other case, if you still insist on sticking with a separate table for last positions and history, you will only have inserts into it once for each new device, so you can do that in a separate part of your application where you add a new GPS device.

Otherwise you will only have updates, and can then use an INSTEAD OF UPDATE trigger to a) do the update 1:1 as intended and b) insert the last state into the history table.
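
A minimal sketch of that trigger, assuming the CurrentData / HistoryData layout proposed later in the thread (names and columns are assumptions only):

-- on every update of a device's current row, archive the old values first
CREATE TRIGGER trCurrentData_Update ON dbo.CurrentData
INSTEAD OF UPDATE
AS
BEGIN
    -- b) copy the last known state into the history table
    INSERT INTO dbo.HistoryData (DeviceId, lat, long, date)
    SELECT DeviceId, lat, long, date FROM Deleted

    -- a) then apply the update 1:1 as the caller intended
    UPDATE c
    SET c.lat = i.lat, c.long = i.long, c.date = i.date
    FROM dbo.CurrentData c
    JOIN Inserted i ON i.DeviceId = c.DeviceId
END
GO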

I don't see why you would have a problem in first finding out if there is a record to update or not.

Bye, Olaf.
Olaf Doschke, Software Developer

Commented:
In direct answer to your direct question: of course an update is faster than delete+insert, as it's only one operation and it only updates a record, in your case just the lat/long fields.

And since I assume you will seldom have the need for an insert, and could do that in a separate section of your application for adding devices, if you don't want to add that you could simply do as dgmg suggests: first update, then check @@ROWCOUNT, and if that's 0, then insert. You will only have that slow case for a new device once in its whole lifetime.

Bye, Olaf.
I would do something like this:

CREATE TABLE CurrentData (DeviceId INT, lat INT, long INT, date DATETIME)

CREATE TABLE HistoryData (DeviceId INT, lat INT, long INT, date DATETIME)

CREATE PROCEDURE InsertData (@DeviceId INT, @lat INT, @long INT)
AS
BEGIN
      IF EXISTS (SELECT DeviceId FROM CurrentData (NOLOCK) WHERE DeviceId = @DeviceId)
      BEGIN
            --device already tracked: copy the current row to history, then update it
            INSERT INTO dbo.HistoryData (DeviceId, lat, long, date)
            SELECT DeviceId, lat, long, date FROM CurrentData (NOLOCK) WHERE DeviceId = @DeviceId

            UPDATE CurrentData SET lat = @lat, long = @long, date = GETDATE() WHERE DeviceId = @DeviceId
      END
      ELSE
      BEGIN
            --first reading for this device: just insert
            INSERT INTO dbo.CurrentData (DeviceId, lat, long, date)
            VALUES (@DeviceId, @lat, @long, GETDATE())
      END
END
GO

On the current data table, set DeviceId to be your primary key, and inserts and updates should be fast...

Commented:
I have no knowledge of MySQL, but in SQL 2005 and above you can try partitioned tables. History and current data will all be in the same table, but logically the data is divided based on a formula we define, like date, month etc.

Check the following articles:
http://msdn.microsoft.com/en-us/library/ms345146(SQL.90).aspx
and http://msdn.microsoft.com/en-us/library/aa964122(SQL.90).aspx
We implemented sliding window partitioning in one of our projects; the partitioning was based on date, with 60 partitions in total (current data + the previous 59 days' data).
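
For illustration only, a bare-bones date-based partition setup might look something like this (boundary dates, names and the single filegroup are all assumptions, a real sliding window needs the switch/merge maintenance described in the linked articles, and note that table partitioning in SQL Server 2008 is an Enterprise Edition feature):

-- partition the history by day so old readings can be switched out cheaply
CREATE PARTITION FUNCTION pfHistoryByDay (datetime)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2012-01-02', '2012-01-03')  -- one boundary per day

CREATE PARTITION SCHEME psHistoryByDay
AS PARTITION pfHistoryByDay ALL TO ([PRIMARY])

CREATE TABLE HistoryDataPartitioned
(
    DeviceId INT,
    lat INT,
    long INT,
    date DATETIME
) ON psHistoryByDay (date)
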
G Trurab Khan, Snr. Development Manager

Commented:
Define INSTEAD OF triggers for update & insert on the Status table.
G Trurab Khan, Snr. Development Manager
Commented:
Also, as there is a huge number of updates, you can keep on adding to a single table and at some point run a procedure to transfer all entries other than the latest one per device to the history table and remove them from the status table, thus downsizing the current status table.
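
A rough sketch of such a clean-up job, assuming the single gpspositions logging table from the earlier suggestion and a separate HistoryData table (both names and columns are assumptions):

-- archive and remove in one statement: delete all but each device's latest row,
-- letting OUTPUT write the deleted rows straight into the history table
DELETE g
OUTPUT DELETED.deviceid, DELETED.lat, DELETED.long, DELETED.datetime
    INTO dbo.HistoryData (DeviceId, lat, long, date)
FROM dbo.gpspositions g
WHERE g.datetime < (SELECT MAX(g2.datetime)
                    FROM dbo.gpspositions g2
                    WHERE g2.deviceid = g.deviceid)

Run this off-peak (e.g. as the overnight batch job mentioned earlier) so the delete does not compete with the incoming updates.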
