gbzhhu
asked on
SQL Server - create aggregated data from a table
Hi,
I have a table sales history which has about 1 million records but could go up to 3 or so millions.
CREATE TABLE salesData(
[id] [int] IDENTITY(1,1) NOT NULL,
[originId] [nvarchar](20) NOT NULL,
[orderId] [nvarchar](250) NOT NULL,
[orderDate] [datetime] NOT NULL,
[statusRef] [int] NOT NULL,
[orderSequence] [int] NOT NULL,
[productNumber] [int] NOT NULL,
[customerNumber] [int] NOT NULL,
[finalQty] [int] NOT NULL,
[finalValue] [decimal](18, 2) NOT NULL,
[invoiceNumber] [nvarchar](20) NOT NULL,
[modifiedDate] [datetime] NOT NULL,
[deleted] [bit] NOT NULL,
CONSTRAINT [PK_salesData] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
The data on this table needs to be available to mobile devices. Basically a sales agent will select a customer in the mobile app then a list of products will be displayed. Now along this product list, for each product, I need to display how many that customer bought last 12 months, month by month. Something like this.
Sales : Jan Feb Mar Apr etc (for rest of months)
20 11 0 4
Product: Dyson Hoover blah blah
1. The salesData is updated daily from another system.
2. The mobile apps access a stored procedure (not written yet) that needs to return the data above (customerNumber, ProductNumber, then the months and values/counts). It returns only 12 months (from current month to the same month previous year)
3. Lots of mobiles will be accessing stored procedure
4. The returning of the data from the stored procedure needs to be fast
5. The data is big as I mentioned
So I thought this:
I would create a dedicated aggregated data table aggregatedSalesData. Not sure how to structure this table, the part that displays months and values. Not sure whether to keep them flat columns m1, m1value, m2, m2Value etc or columns like monthNum, MonthValue so that each month goes as a row.
I saw this
http://sqlhints.com/2014/03/09/how-to-get-monthly-data-in-sql-server/
Is this the best way for my situation.
I need advice as to whether I am going the right way with this and if I am help with the query to move data everyday from the salesData to the aggregatedSalesData
Many thanks
I have a table sales history which has about 1 million records but could go up to 3 or so millions.
CREATE TABLE salesData(
[id] [int] IDENTITY(1,1) NOT NULL,
[originId] [nvarchar](20) NOT NULL,
[orderId] [nvarchar](250) NOT NULL,
[orderDate] [datetime] NOT NULL,
[statusRef] [int] NOT NULL,
[orderSequence] [int] NOT NULL,
[productNumber] [int] NOT NULL,
[customerNumber] [int] NOT NULL,
[finalQty] [int] NOT NULL,
[finalValue] [decimal](18, 2) NOT NULL,
[invoiceNumber] [nvarchar](20) NOT NULL,
[modifiedDate] [datetime] NOT NULL,
[deleted] [bit] NOT NULL,
CONSTRAINT [PK_salesData] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
The data on this table needs to be available to mobile devices. Basically a sales agent will select a customer in the mobile app then a list of products will be displayed. Now along this product list, for each product, I need to display how many that customer bought last 12 months, month by month. Something like this.
Sales : Jan Feb Mar Apr etc (for rest of months)
20 11 0 4
Product: Dyson Hoover blah blah
1. The salesData is updated daily from another system.
2. The mobile apps access a stored procedure (not written yet) that needs to return the data above (customerNumber, ProductNumber, then the months and values/counts). It returns only 12 months (from current month to the same month previous year)
3. Lots of mobiles will be accessing stored procedure
4. The returning of the data from the stored procedure needs to be fast
5. The data is big as I mentioned
So I thought this:
I would create a dedicated aggregated data table aggregatedSalesData. Not sure how to structure this table, the part that displays months and values. Not sure whether to keep them flat columns m1, m1value, m2, m2Value etc or columns like monthNum, MonthValue so that each month goes as a row.
I saw this
http://sqlhints.com/2014/03/09/how-to-get-monthly-data-in-sql-server/
Is this the best way for my situation.
I need advice as to whether I am going the right way with this and if I am help with the query to move data everyday from the salesData to the aggregatedSalesData
Many thanks
ASKER
Hi ste5an,
I don't know what is meant by transactional table. I simply created the table with the SQL Server defaults and populate it from an import_salesData table which gets truncated everyday. Note that performance for the copying of data from salesData to aggregatedSalesData is not an issue. This task happens 4 am in the morning and will not affect the mobile devices. It is the reading of the data for the mobile device (from aggregatedSalesData table unless I am corrected) that needs to be fast. That is why I am thinking of pre creating the aggregated data
I will do some reading on the article you suggested
Thank you
I don't know what is meant by transactional table. I simply created the table with the SQL Server defaults and populate it from an import_salesData table which gets truncated everyday. Note that performance for the copying of data from salesData to aggregatedSalesData is not an issue. This task happens 4 am in the morning and will not affect the mobile devices. It is the reading of the data for the mobile device (from aggregatedSalesData table unless I am corrected) that needs to be fast. That is why I am thinking of pre creating the aggregated data
I will do some reading on the article you suggested
Thank you
The key factor for speed is to cluster the data the way you need to use it. In particular, get rid of the false notion that id "should" be the "default" clustering key -- that's just an AWFUL myth, the worst one in dbs.
Based on the limited info so far, it sounds as if you need to cluster the aggregatedSalesData on:
( customerNumber, productNumber, orderMonth )
Btw, yes, you do need to go back and better cluster the salesData table itself too. It might need clustered by orderDate first, or by customerNumber first, I can't tell without more details, but it should NOT be clustered on id, that much I know already.
Based on the limited info so far, it sounds as if you need to cluster the aggregatedSalesData on:
( customerNumber, productNumber, orderMonth )
Btw, yes, you do need to go back and better cluster the salesData table itself too. It might need clustered by orderDate first, or by customerNumber first, I can't tell without more details, but it should NOT be clustered on id, that much I know already.
ASKER
Hi Scott,
First, I would provide any info to enable to help me better, so please just ask. I just do not know what info is useful for you to help me
I get the clustering of the aggregatedSalesData. The only data that I really want on that table (if I end up using that table) is customerNumber, productNumber, monthValues (aggregated)
For salesData, how can I be sure that orderDate is unique (order can be sent the same second) also customerNumber is repeated so not unique. Perhaps I could cluster on orderDate and invoiceNumber combined?
First, I would provide any info to enable to help me better, so please just ask. I just do not know what info is useful for you to help me
I get the clustering of the aggregatedSalesData. The only data that I really want on that table (if I end up using that table) is customerNumber, productNumber, monthValues (aggregated)
For salesData, how can I be sure that orderDate is unique (order can be sent the same second) also customerNumber is repeated so not unique. Perhaps I could cluster on orderDate and invoiceNumber combined?
The clustering key you specify doesn't have to be unique. Although, if it isn't, it can't be a constraint, so you'd need to use a:
CREATE CLUSTERED INDEX salesData__CL ON dbo.salesData ( orderDate )
statement instead of a CONSTRAINT to create the index.
If you want to make it unique yourself, just add id to the end of the index.
CREATE UNIQUE CLUSTERED INDEX ... ( orderDate, id )
Typically order data is mostly viewed by date, with the most recent data read the most often by far.
If you frequently / most always use that data by customer# also, it may be better to cluster by that.
Running the query below will show you what SQL considers to be "missing indexes" and usage stats for existing indexes. That will help a lot in seeing what's really going on. Post those results and we can review them as well.
USE [<your_db_name_here>]
SET DEADLOCK_PRIORITY LOW --probably irrelevant, but just in case
DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname
--NOTE: showing missing indexes can take some time; set to 0 if you don't want to wait.
SET @list_missing_indexes = 1
SET @table_name_pattern = 'salesData'
PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)
IF @list_missing_indexes = 1
BEGIN
SELECT
GETDATE() AS capture_date,
DB_NAME(mid.database_id) AS Db_Name,
dps.row_count,
OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
mid.equality_columns, mid.inequality_columns, mid.included_columns,
user_seeks, user_scans, ca1.max_days_active, unique_compiles,
last_user_seek, last_user_scan, avg_total_user_cost, avg_user_impact,
system_seeks, system_scans, last_system_seek, last_system_scan, avg_total_system_cost, avg_system_impact,
mid.statement, mid.object_id, mid.index_handle
FROM sys.dm_db_missing_index_de tails mid WITH (NOLOCK)
CROSS APPLY (
SELECT DATEDIFF(DAY, create_date, GETDATE()) AS max_days_active FROM sys.databases WHERE name = 'tempdb'
) AS ca1
LEFT OUTER JOIN sys.dm_db_missing_index_gr oups mig WITH (NOLOCK) ON
mig.index_handle = mid.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_gr oup_stats migs WITH (NOLOCK) ON
migs.group_handle = mig.index_group_handle
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = mid.object_id AND
dps.index_id IN (0, 1)
--order by
--DB_NAME, Table_Name, equality_columns
WHERE
1 = 1
AND mid.database_id = DB_ID() --only current db
AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern
--AND mid.object_id IN (OBJECT_ID('<table_name_1> '), OBJECT_ID('<table_name_2>' ))
ORDER BY
--avg_total_user_cost * (user_seeks + user_scans) DESC,
Db_Name, Table_Name, equality_columns, inequality_columns
END --IF
PRINT 'Midpoint @ ' + CONVERT(varchar(30), GETDATE(), 120)
-- list index usage stats (seeks, scans, etc.)
SELECT
ius2.row_num, DB_NAME() AS db_name,
CASE WHEN i.name LIKE ca2.table_name + '%'
THEN '~' + SUBSTRING(i.name, LEN(ca2.table_name) + 1, 200)
ELSE i.name END AS index_name,
CASE WHEN i.is_unique = 0 THEN 'N' ELSE 'Y' END + '.' +
CASE WHEN i.is_primary_key = 0 AND i.is_unique_constraint = 0 THEN 'N' ELSE 'Y' END AS [uniq?],
ca2.table_name,
i.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
dps.row_count,
SUBSTRING(key_cols, 3, 8000) AS key_cols, SUBSTRING(nonkey_cols, 3, 8000) AS nonkey_cols,
ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
fk.Reference_Count AS fk_ref_count,
DATEDIFF(DAY, CASE WHEN o.create_date > ca1.sql_startup_date THEN o.create_date
ELSE ca1.sql_startup_date END, GETDATE()) AS max_days_active,
FILEGROUP_NAME(i.data_spac e_id) AS filegroup_name,
ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
o.object_id = i.object_id
CROSS JOIN (
SELECT create_date AS sql_startup_date FROM sys.databases WHERE name = 'tempdb'
) AS ca1
CROSS APPLY (
SELECT OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name
) AS ca2
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal > 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
ic.key_ordinal
FOR XML PATH('')
) AS key_cols (key_cols)
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal = 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
COL_NAME(object_id, ic.column_id)
FOR XML PATH('')
) AS nonkey_cols (nonkey_cols)
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = i.object_id AND
dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stat s ius WITH (NOLOCK) ON
ius.database_id = DB_ID() AND
ius.object_id = i.object_id AND
ius.index_id = i.index_id
LEFT OUTER JOIN (
SELECT
database_id, object_id, MAX(user_scans) AS user_scans,
ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+us er_scans
FROM sys.dm_db_index_usage_stat s WITH (NOLOCK)
WHERE
database_id = DB_ID()
--AND index_id > 0
GROUP BY
database_id, object_id
) AS ius2 ON
ius2.database_id = DB_ID() AND
ius2.object_id = i.object_id
LEFT OUTER JOIN (
SELECT
referenced_object_id, COUNT(*) AS Reference_Count
FROM sys.foreign_keys
WHERE
is_disabled = 0
GROUP BY
referenced_object_id
) AS fk ON
fk.referenced_object_id = i.object_id
WHERE
i.object_id > 100 AND
i.is_hypothetical = 0 AND
i.type IN (0, 1, 2) AND
o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
(
o.name LIKE @table_name_pattern AND
o.name NOT LIKE 'dtprop%' AND
o.name NOT LIKE 'filestream[_]' AND
o.name NOT LIKE 'MSpeer%' AND
o.name NOT LIKE 'MSpub%' AND
--o.name NOT LIKE 'queue[_]%' AND
o.name NOT LIKE 'sys%'
)
--AND OBJECT_NAME(i.object_id /*, DB_ID()*/) IN ('tbl1', 'tbl2', 'tbl3')
ORDER BY
--row_count DESC,
--ius.user_scans DESC,
--ius2.row_num, --user_scans&|user_seeks
db_name, table_name,
-- list clustered index first, if any, then other index(es)
CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END,
key_cols
PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)
SET DEADLOCK_PRIORITY NORMAL
CREATE CLUSTERED INDEX salesData__CL ON dbo.salesData ( orderDate )
statement instead of a CONSTRAINT to create the index.
If you want to make it unique yourself, just add id to the end of the index.
CREATE UNIQUE CLUSTERED INDEX ... ( orderDate, id )
Typically order data is mostly viewed by date, with the most recent data read the most often by far.
If you frequently / most always use that data by customer# also, it may be better to cluster by that.
Running the query below will show you what SQL considers to be "missing indexes" and usage stats for existing indexes. That will help a lot in seeing what's really going on. Post those results and we can review them as well.
USE [<your_db_name_here>]
SET DEADLOCK_PRIORITY LOW --probably irrelevant, but just in case
DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname
--NOTE: showing missing indexes can take some time; set to 0 if you don't want to wait.
SET @list_missing_indexes = 1
SET @table_name_pattern = 'salesData'
PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)
IF @list_missing_indexes = 1
BEGIN
SELECT
GETDATE() AS capture_date,
DB_NAME(mid.database_id) AS Db_Name,
dps.row_count,
OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
mid.equality_columns, mid.inequality_columns, mid.included_columns,
user_seeks, user_scans, ca1.max_days_active, unique_compiles,
last_user_seek, last_user_scan, avg_total_user_cost, avg_user_impact,
system_seeks, system_scans, last_system_seek, last_system_scan, avg_total_system_cost, avg_system_impact,
mid.statement, mid.object_id, mid.index_handle
FROM sys.dm_db_missing_index_de
CROSS APPLY (
SELECT DATEDIFF(DAY, create_date, GETDATE()) AS max_days_active FROM sys.databases WHERE name = 'tempdb'
) AS ca1
LEFT OUTER JOIN sys.dm_db_missing_index_gr
mig.index_handle = mid.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_gr
migs.group_handle = mig.index_group_handle
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = mid.object_id AND
dps.index_id IN (0, 1)
--order by
--DB_NAME, Table_Name, equality_columns
WHERE
1 = 1
AND mid.database_id = DB_ID() --only current db
AND OBJECT_NAME(mid.object_id)
--AND mid.object_id IN (OBJECT_ID('<table_name_1>
ORDER BY
--avg_total_user_cost * (user_seeks + user_scans) DESC,
Db_Name, Table_Name, equality_columns, inequality_columns
END --IF
PRINT 'Midpoint @ ' + CONVERT(varchar(30), GETDATE(), 120)
-- list index usage stats (seeks, scans, etc.)
SELECT
ius2.row_num, DB_NAME() AS db_name,
CASE WHEN i.name LIKE ca2.table_name + '%'
THEN '~' + SUBSTRING(i.name, LEN(ca2.table_name) + 1, 200)
ELSE i.name END AS index_name,
CASE WHEN i.is_unique = 0 THEN 'N' ELSE 'Y' END + '.' +
CASE WHEN i.is_primary_key = 0 AND i.is_unique_constraint = 0 THEN 'N' ELSE 'Y' END AS [uniq?],
ca2.table_name,
i.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
dps.row_count,
SUBSTRING(key_cols, 3, 8000) AS key_cols, SUBSTRING(nonkey_cols, 3, 8000) AS nonkey_cols,
ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
fk.Reference_Count AS fk_ref_count,
DATEDIFF(DAY, CASE WHEN o.create_date > ca1.sql_startup_date THEN o.create_date
ELSE ca1.sql_startup_date END, GETDATE()) AS max_days_active,
FILEGROUP_NAME(i.data_spac
ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
o.object_id = i.object_id
CROSS JOIN (
SELECT create_date AS sql_startup_date FROM sys.databases WHERE name = 'tempdb'
) AS ca1
CROSS APPLY (
SELECT OBJECT_NAME(i.object_id/*,
) AS ca2
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal > 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
ic.key_ordinal
FOR XML PATH('')
) AS key_cols (key_cols)
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal = 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
COL_NAME(object_id, ic.column_id)
FOR XML PATH('')
) AS nonkey_cols (nonkey_cols)
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = i.object_id AND
dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stat
ius.database_id = DB_ID() AND
ius.object_id = i.object_id AND
ius.index_id = i.index_id
LEFT OUTER JOIN (
SELECT
database_id, object_id, MAX(user_scans) AS user_scans,
ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+us
FROM sys.dm_db_index_usage_stat
WHERE
database_id = DB_ID()
--AND index_id > 0
GROUP BY
database_id, object_id
) AS ius2 ON
ius2.database_id = DB_ID() AND
ius2.object_id = i.object_id
LEFT OUTER JOIN (
SELECT
referenced_object_id, COUNT(*) AS Reference_Count
FROM sys.foreign_keys
WHERE
is_disabled = 0
GROUP BY
referenced_object_id
) AS fk ON
fk.referenced_object_id = i.object_id
WHERE
i.object_id > 100 AND
i.is_hypothetical = 0 AND
i.type IN (0, 1, 2) AND
o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
(
o.name LIKE @table_name_pattern AND
o.name NOT LIKE 'dtprop%' AND
o.name NOT LIKE 'filestream[_]' AND
o.name NOT LIKE 'MSpeer%' AND
o.name NOT LIKE 'MSpub%' AND
--o.name NOT LIKE 'queue[_]%' AND
o.name NOT LIKE 'sys%'
)
--AND OBJECT_NAME(i.object_id /*, DB_ID()*/) IN ('tbl1', 'tbl2', 'tbl3')
ORDER BY
--row_count DESC,
--ius.user_scans DESC,
--ius2.row_num, --user_scans&|user_seeks
db_name, table_name,
-- list clustered index first, if any, then other index(es)
CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END,
key_cols
PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)
SET DEADLOCK_PRIORITY NORMAL
ASKER
Many thanks Scott.
Working through this now and will feedback
Working through this now and will feedback
ASKER
I got 2 outputs as follows. What does that tell you?
2015-01-12 11:30:45.107 pah_promotionPortal_DE_Sta ge 479848 salesData [customerNumber], [efssCustomerNumber] NULL [orderDate], [productNumber] 1 0 101 1 2015-01-09 17:50:36.473 NULL 9.29434155590326 97.15 0 0 NULL NULL 0 0 [pah_promotionPortal_DE_St age].[dbo] .[salesDat a] 2016726237 16159
2015-01-12 11:30:45.107 pah_promotionPortal_DE_Sta ge 479848 salesData [efssCustomerNumber] NULL [orderDate], [productNumber], [customerNumber] 1 0 101 1 2015-01-09 17:50:27.120 NULL 6.70718491954374 96.07 0 0 NULL NULL 0 0 [pah_promotionPortal_DE_St age].[dbo] .[salesDat a] 2016726237 16157
21 pah_promotionPortal_DE_Sta ge PK_salesData Y.Y salesData 1 479848 id NULL 0 22 0 2 NULL 2015-01-09 17:50:36.473 NULL 2015-01-07 17:18:21.140 NULL 5 PRIMARY 0 4 0 0 NULL 2015-01-09 17:35:05.453 NULL NULL
2015-01-12 11:30:45.107 pah_promotionPortal_DE_Sta
2015-01-12 11:30:45.107 pah_promotionPortal_DE_Sta
21 pah_promotionPortal_DE_Sta
ASKER
I have now created the aggregated table.
CREATE TABLE [dbo].[salesDataAggregated ](
[customerRef] [int] NOT NULL,
[productRef] [int] NOT NULL,
[m1_totalCount] [int] NULL,
[m1_totalValue] [decimal](18, 2) NULL,
[m2_totalCount] [int] NULL,
[m2_totalValue] [decimal](18, 2) NULL,
[m3_totalCount] [int] NULL,
[m3_totalValue] [decimal](18, 2) NULL,
[m4_totalCount] [int] NULL,
[m4_totalValue] [decimal](18, 2) NULL,
[m5_totalCount] [int] NULL,
[m5_totalValue] [decimal](18, 2) NULL,
[m6_totalCount] [int] NULL,
[m6_totalValue] [decimal](18, 2) NULL,
[m7_totalCount] [int] NULL,
[m7_totalValue] [decimal](18, 2) NULL,
[m8_totalCount] [int] NULL,
[m8_totalValue] [decimal](18, 2) NULL,
[m9_totalCount] [int] NULL,
[m9_totalValue] [decimal](18, 2) NULL,
[m10_totalCount] [int] NULL,
[m10_totalValue] [decimal](18, 2) NULL,
[m11_totalCount] [int] NULL,
[m11_totalValue] [decimal](18, 2) NULL,
[m12_totalCount] [int] NULL,
[m12_totalValue] [decimal](18, 2) NULL,
[modifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_salesDataToday_1] PRIMARY KEY CLUSTERED
(
[customerRef] ASC,
[productRef] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[salesDataAggregated ] WITH CHECK ADD CONSTRAINT [FK_salesDataToday_custome r] FOREIGN KEY([customerRef])
REFERENCES [dbo].[customer] ([id])
GO
ALTER TABLE [dbo].[salesDataAggregated ] CHECK CONSTRAINT [FK_salesDataToday_custome r]
GO
ALTER TABLE [dbo].[salesDataAggregated ] WITH CHECK ADD CONSTRAINT [FK_salesDataToday_product ] FOREIGN KEY([productRef])
REFERENCES [dbo].[product] ([id])
GO
ALTER TABLE [dbo].[salesDataAggregated ] CHECK CONSTRAINT [FK_salesDataToday_product ]
GO
ALTER TABLE [dbo].[salesDataAggregated ] ADD CONSTRAINT [DF_salesDataToday_modifie dDate] DEFAULT (getdate()) FOR [modifiedDate]
GO
CREATE TABLE [dbo].[salesDataAggregated
[customerRef] [int] NOT NULL,
[productRef] [int] NOT NULL,
[m1_totalCount] [int] NULL,
[m1_totalValue] [decimal](18, 2) NULL,
[m2_totalCount] [int] NULL,
[m2_totalValue] [decimal](18, 2) NULL,
[m3_totalCount] [int] NULL,
[m3_totalValue] [decimal](18, 2) NULL,
[m4_totalCount] [int] NULL,
[m4_totalValue] [decimal](18, 2) NULL,
[m5_totalCount] [int] NULL,
[m5_totalValue] [decimal](18, 2) NULL,
[m6_totalCount] [int] NULL,
[m6_totalValue] [decimal](18, 2) NULL,
[m7_totalCount] [int] NULL,
[m7_totalValue] [decimal](18, 2) NULL,
[m8_totalCount] [int] NULL,
[m8_totalValue] [decimal](18, 2) NULL,
[m9_totalCount] [int] NULL,
[m9_totalValue] [decimal](18, 2) NULL,
[m10_totalCount] [int] NULL,
[m10_totalValue] [decimal](18, 2) NULL,
[m11_totalCount] [int] NULL,
[m11_totalValue] [decimal](18, 2) NULL,
[m12_totalCount] [int] NULL,
[m12_totalValue] [decimal](18, 2) NULL,
[modifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_salesDataToday_1] PRIMARY KEY CLUSTERED
(
[customerRef] ASC,
[productRef] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[salesDataAggregated
REFERENCES [dbo].[customer] ([id])
GO
ALTER TABLE [dbo].[salesDataAggregated
GO
ALTER TABLE [dbo].[salesDataAggregated
REFERENCES [dbo].[product] ([id])
GO
ALTER TABLE [dbo].[salesDataAggregated
GO
ALTER TABLE [dbo].[salesDataAggregated
GO
ASKER
I have this query that will display the data to be copied to it daily. Because the month data is created by the pivot (query copied from some website and adopted) I am struggling with the syntax of inserting into the defined table even though columns count and data type are equal
SELECT * FROM
(SELECT c.id [Customer],
p.id [Product],
DATENAME(MONTH, orderDate) [Month],
COUNT(1) [SalesCount]
FROM salesData sd
INNER JOIN customer c ON c.efssNumber = sd.efssCustomerNumber AND c.number = sd.customerNumber
INNER JOIN product p ON p.productNumber = sd.productNumber
WHERE orderDate BETWEEN DATEADD(m,-12,GETDATE()) AND GETDATE() and c.id = 5428
GROUP BY c.id, p.id, DATENAME(MONTH, orderDate)) AS MontlySalesData
PIVOT( SUM([SalesCount])
FOR Month IN ([January],[February],[Mar ch],[April ],[May],
[June],[July],[August],[Se ptember],[ October],[ November],
[December])) AS MNamePivot
SELECT * FROM
(SELECT c.id [Customer],
p.id [Product],
DATENAME(MONTH, orderDate) [Month],
COUNT(1) [SalesCount]
FROM salesData sd
INNER JOIN customer c ON c.efssNumber = sd.efssCustomerNumber AND c.number = sd.customerNumber
INNER JOIN product p ON p.productNumber = sd.productNumber
WHERE orderDate BETWEEN DATEADD(m,-12,GETDATE()) AND GETDATE() and c.id = 5428
GROUP BY c.id, p.id, DATENAME(MONTH, orderDate)) AS MontlySalesData
PIVOT( SUM([SalesCount])
FOR Month IN ([January],[February],[Mar
[June],[July],[August],[Se
[December])) AS MNamePivot
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks Scott.
This is new work and the tables are new with data pulled from old system. These minimal activity you noticed is just me. They do not even exists in live system yet.
I'll use the suggested cluster. Thank you.
This is new work and the tables are new with data pulled from old system. These minimal activity you noticed is just me. They do not even exists in live system yet.
I'll use the suggested cluster. Thank you.
So the first question is: is your table salesData a transactional table?
You should read Kimball: Dimensional Modeling Techniques.
And especially Transaction Fact Tables andPeriodic Snapshot Fact Tables.