asked on

t-SQL Large table dumps

Hi there,
i am using SQL SERVER t-SQL.
I have a table using SSIS, i load the table in the warehouse from a live Database every 15mins.
Because there is so much information i would like to maybe split the table and create dumps for maybe every month but i do not really know where to begin.

Because when reporting using this table i could be pulling data for many months i didn't want to be including loads of tables in the query and causing run time errors.
I was trying to think of creating a table which would hold an id for each 'monthly table' and by calling the
id in this table it would pull the data from the other tables.

Has any one any ideas on how to do this, or of another idea that could be much better, please advice?
If I am not explaining myself please let me know and I will try again,
Thank you

dbaSQL

Possibly a horizontally partitioned view, sitting atop each of your monthly tables -- thus the query only references the view, but selects from the varying tables, as needed.

http://msdn2.microsoft.com/en-us/library/aa933141.aspx
http://www.sqlmag.com/Articles/ArticleID/8234/8234.html?Ad=1

this way your data appears as if it's all in one table

fesnyng

dbaSQL -- thank you for that answer. Does the partitioned view offer an advantage over a 2005 partitioned table set up along the same lines?

dbaSQL

To be honest, I think the partitioned table may be more advantageous. I say this, however, without hands on experience. I have only worked w/the v2000 partitioned view. In my opinion, go v2005 if you can.

see here: http://msdn2.microsoft.com/en-us/library/ms345146.aspx

To my knowledge, it will still be avail to you much like the v2000 p.views, in that you are physically separating your data, but also accessing it via a unified logical view

Putoch

ASKER

Thank you,
Can i still use Partitioned Views if i do not have indexing on my Table.
i cant really have a PK on my table as i don't really have one unique id, as the data looks as follows:

Mac address / Modemid /time inserted / site id
Now the modem ID is the driver behind the query, I want to see what mac address or time or site the modem has hit. and in this table the modem could have hit a different site at diffent times?

Also i was looking at the Check with the partition, i would like to do it with the time-inserted as i want to do it per month. but i read that you can't do the check on timestamps?

dbaSQL

for a paritioned view, your partition key must be part of the table primary key.
in the msdn2 link i sent before, scroll down to 'table rules' and 'column rules'

i suppose you could use modemID as your partition key/primary key. i say that because you say this is the 'driver'. otherwise, i'd have gone with siteID.

i am wondering, though -- do you really have enough data to warrant a partitioned structure?

Putoch

ASKER

Perhaps not, i have 11209830 rows in my table, and this is only since March 07.
I would like to do something for better management of the table rather then just letting it grow anyways.

the thing about using the modem id as the primary key i think, is that the same modem could have multiply appearances in the table because it has hit a different site id at a different time (if you follow me)
I have been trying to follow the example you sent from this link http://www.sqlmag.com/Articles/ArticleID/8234/8234.html?Ad=1 as this deals with months but i am finding it hard to follow, Sorry for all the questions and lack of understanding.

if you think that a partitioned view is not what i am looking for what else do you think i could use?

Thank you.

dbaSQL

I, personally, don't know that you really warrant a partitioned table structure. It won't hurt, but you may have some limitations w/the paritioned key values that push you a different direction.

let me ask this --- what are your retention requirements? how long do you need to keep how much data online/avail for use?

Putoch

ASKER

this is the whole reason for my table.
You see, i take this data from a live db which only holds information on a modem untill its updated.
so on the Live DB this table will have one modem in the whole db,
The reasoning for my table in the data warehouse db is so we can store the information flow on the modem for as long as possiable.

dbaSQL

well then, possibly the partitioned structure is where you want to go. i am still inclined to push you toward the v2005 structure, though.

>>the thing about using the modem id as the primary key i think, is that the same modem could have multiply appearances in the table because it has hit a different site id at a different time (if you follow me)

when you say 'primary key', if you're actually referring to the table pk, and not the partitioned key, why not use a composite key?

Mac address / Modemid /time inserted / site id

modemid can be in this table more than once on the same siteid, yet different time inserted, yes?
if so, make timeInserted,ModemID,SiteID your pk. timeInserted is first, as I assume it has the most selectivity of the three attributes.

if that is not possible, and the same modemID can be in there mutiple times for the varying sites, but with the same insertTime, then do this for your pk: timeInserted,ModemID,MacAddress,SiteID

I believe I would then try to make a check constraint based on date --- timeinserted -- assuming this is datetime and not timestamp data type. Is it?

Putoch

ASKER

Yes its datetime not timestamp.
Ok i am going to try this so and i will let you know how it goes, thank you so much for help and patience!!

dbaSQL

No problem at all, Putoch. I am just hoping I have been able to help.

Putoch

ASKER

Hi there,
Ok i created some partitioned tables and a view to hold these tables, however i know this sounds so silly but how to i insert information into the partiitioned view.

I don't know what way to set up the insert statement into the partitioned Tables ?

please help.


CREATE TABLE PreMarch2007
(mac_address varchar(32),
eid varchar(32) NOT NULL,
time_inserted datetime NOT NULL,
btsid varchar(32),
CONSTRAINT UQ_PreMarch2007 UNIQUE  (time_inserted,eid) , 
CONSTRAINT CHK_PreMarch2007_time_inserted 
	CHECK (time_inserted <= '2007-03-01'))
 
 
CREATE TABLE MarchApril2007
(mac_address varchar(32),
eid varchar(32) NOT NULL,
time_inserted datetime NOT NULL,
btsid varchar(32),
CONSTRAINT UQ_MarchApril2007 UNIQUE  (time_inserted,eid) , 
CONSTRAINT CHK_MarchApril2007_time_inserted 
	CHECK (time_inserted between '2007-03-01'and '2007-05-01'))
 
 
 
CREATE TABLE May2007
(mac_address varchar(32),
eid varchar(32) NOT NULL,
time_inserted datetime NOT NULL,
btsid varchar(32),
CONSTRAINT UQ_May2007 UNIQUE  (time_inserted,eid) , 
CONSTRAINT CHK_May2007_time_inserted 
	CHECK (time_inserted between '2007-05-01'and '2007-06-01'))
 
 
 
CREATE TABLE June2007
(mac_address varchar(32),
eid varchar(32) NOT NULL,
time_inserted datetime NOT NULL,
btsid varchar(32),
CONSTRAINT UQ_June2007 UNIQUE  (time_inserted,eid) , 
CONSTRAINT CHK_June2007_time_inserted 
	CHECK (time_inserted between '2007-06-01'and '2007-07-01'))
 
--select * from radius_cpetracking where time_inserted < = '20070301' order by time_inserted desc
 
 
 
GO
CREATE VIEW CEPTRACKING
AS
SELECT * FROM PreMarch2007
UNION ALL
SELECT * FROM MarchApril2007
UNION ALL
SELECT * FROM May2007
UNION ALL
SELECT * FROM June2007
GO

Open in new window

Putoch

ASKER

Can i just do this:
INSERT INTO PreMarch2007 (mac_address,eid,time_inserted,btsid)
SELECT mac_address,eid,time_inserted,btsid
FROM RADIUS_CPETRACKING where time_inserted <= '2007-03-01'

Is there any way i can create a stored procedure to do this to insert monthy dates from now on so i do not have to continue to run this every month or how would i do this?

ASKER CERTIFIED SOLUTION

dbaSQL

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

dbaSQL

>>Is there any way i can create a stored procedure to do this to insert monthy dates from now on so i do not have to continue to run this every month or how would i do this?

yes, of course.

IF OBJECT_ID('insert_proc','p') IS NOT NULL
DROP PROC insert_proc
GO
create procedure insert_proc (
@start datetime,
@end datetime
)
AS

DECLARE @counter int
DECLARE @lastreturned int

SET @counter = 0
SET NOCOUNT ON
SET ROWCOUNT 5000 --- use batches so you don't blow the transaction log
SET @lastreturned = 1 --- just so you can enter the loop
WHILE @lastreturned >0
BEGIN
      INSERT databasename.dbo.CEPTRACKING
      SELECT mac_address, eid, time_inserted,btsid
      FROM databasename.dbo.RADIUS_CPETRACKING
      WHERE time_inserted >= @start
      AND time_inserted < @end

      SET @lastreturned = @@ROWCOUNT -- you're trapping the rowcount from the prev insert
      SET @counter = @counter + @lastreturned -- add it into the counter
END
SET ROWCOUNT 0
SET NOCOUNT OFF
SELECT @counter
go

then you just fire it like this:
exec databasename.dbo.insert_proc @start='whateverdate',@end='whateverdate'

i'm pretty sure that will work