dbaSQL
asked on
Must optimize this procedure, need an assist
This is v2008 R2. A customer asked me to optimize a script. Typically not a problem, but I've actually never seen anything structured like this before. It is an insert into a very poorly designed table that unfortunately, cannot be altered at this time. The table has 219 NVARCHAR columns, all drastically oversized when the column length is compared to the actual datalength. But again, I can't alter the table right now.
I am not certain, but I believe the NVARCHAR(MAX) columns may be one of the bigger problems. There are 11 of them. This one, for example --
Attributes.value('(/cpColl ection/gro up/propert y[@id="Rel atedAssets "]/value)[ 1]', 'NVarChar(MAX)'),
These are the relevant data stats:
select count(*) from Products 37736
select max(len(relatedassets)) from Products 0
select count(*) from Products where relatedAssets IS NOT NULL 37736
select count(*) from Products where relatedAssets = '' 37736
I've checked all 11 NVARCHAR(MAX) columns. They are all similar in that most of the data values in the columns are blank/empty strings. But again, I can't change the table def right now. I've let them know it is very questionable, but I've got to get the script optimized before we can look at revising the table, if in fact, they're going to do it. Side question; how much storage is reserved, if any, on a NVARCHAR(MAX) column? Say it's a single blank character/empty space. What is the actual overhead on that in a NVARCHAR(MAX) column?
Regardless, one would think I could still improve the DML to get the data into the table. These are very small datasets. It's an insert of about 37K records, which is running upwards of 5hrs. I have the MAX datalength for every column. With that information, what is the most effective approach for improving the performance of this insert?
The attached is just a draft. I'm not worried about the TRUNCATE. It is the INSERT that I'm looking for tips on improving.
ProcedureName.sql
I am not certain, but I believe the NVARCHAR(MAX) columns may be one of the bigger problems. There are 11 of them. This one, for example --
Attributes.value('(/cpColl
These are the relevant data stats:
select count(*) from Products 37736
select max(len(relatedassets)) from Products 0
select count(*) from Products where relatedAssets IS NOT NULL 37736
select count(*) from Products where relatedAssets = '' 37736
I've checked all 11 NVARCHAR(MAX) columns. They are all similar in that most of the data values in the columns are blank/empty strings. But again, I can't change the table def right now. I've let them know it is very questionable, but I've got to get the script optimized before we can look at revising the table, if in fact, they're going to do it. Side question; how much storage is reserved, if any, on a NVARCHAR(MAX) column? Say it's a single blank character/empty space. What is the actual overhead on that in a NVARCHAR(MAX) column?
Regardless, one would think I could still improve the DML to get the data into the table. These are very small datasets. It's an insert of about 37K records, which is running upwards of 5hrs. I have the MAX datalength for every column. With that information, what is the most effective approach for improving the performance of this insert?
The attached is just a draft. I'm not worried about the TRUNCATE. It is the INSERT that I'm looking for tips on improving.
ProcedureName.sql
Instead of parsing the column over and over, This should be faster:
Extract the values first into a temp table:
Then do a pivot on it for you insert operation.
Extract the values first into a temp table:
DECLARE @OtherTable TABLE
(
DataId INT ,
AutoNumber INT ,
Attributes XML
);
INSERT INTO @OtherTable
( DataId ,
AutoNumber ,
Attributes
)
VALUES ( 1 ,
1 ,
N'<cpCollection><group><property id="RelatedAssets">a</property><property id="PleaseNote">test</property></group></cpCollection>'
);
SELECT DataId ,
AutoNumber ,
Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue
FROM @OtherTable
OUTER APPLY Attributes.nodes('/cpCollection/group/property') A ( Property );
Then do a pivot on it for you insert operation.
ASKER
A pivot for the insert ?
Yup.
DECLARE @OtherTable TABLE
(
DataId INT ,
AutoNumber INT ,
Attributes XML
);
INSERT INTO @OtherTable
( DataId ,
AutoNumber ,
Attributes
)
VALUES ( 1 ,
1 ,
N'<cpCollection><group><property id="RelatedAssets">a</property><property id="PleaseNote">test</property></group></cpCollection>'
);
DECLARE @Temp TABLE
(
DataId INT ,
AutoNumber INT ,
PropertyID NVARCHAR(MAX) ,
PropertyValue NVARCHAR(MAX)
);
INSERT INTO @Temp
SELECT DataId ,
AutoNumber ,
Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue
FROM @OtherTable
OUTER APPLY Attributes.nodes('/cpCollection/group/property') A ( Property );
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets ,
P.PleaseNote
FROM @Temp T PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( [RelatedAssets], [PleaseNote] ) ) P;
ASKER
Olaf, there are no XML columns in the table. We do not have to use this as the method for insertion, but again, because I have not seen this structured INSERT before, I'm not entirely sure of the correct translation. I could just remove all of the ATTRIBUTE paths in the insert, and just use the column names as per a normal insert, no?
ste5an, because there are no columns in the targeted table of XML datatype, would you still suggest the method you've posted?
ste5an, because there are no columns in the targeted table of XML datatype, would you still suggest the method you've posted?
ASKER
Nope. While there are no XML columns in the table that is inserted to, the data is coming from an XML column in the source instance. I am not an XML parsing guru by any means. I will attempt your suggestion now, ste5an.
ASKER
as I mentioned, ste5an, I am not an XML guru. The statement you've suggested ; Presumably I am adding all of the other values in to the Attributes string for the inser to @OtherTable. Where then am I joining the @OtherTable to the actual OtherTable before the Products table insert, to get the values that are NOT in the OtherTable..XML column?
>the data is coming from an XML column in the source instance
Well, that was, what I told you initially:
Code like Attribute.value(...) makes no sense, if Attribute would not be an XML string or field.
While ste5ans solution might work, I'd try to go up that rain of data merging and see, if you could not even change how the data arrives as XML, there are better tools than SQL Server to handle XML and convert it to the single attributes you need, so this could already be addressed outside of SQL Server.
Bye, Olaf.
Well, that was, what I told you initially:
in the XML field (named Attribute) of the othertable
Code like Attribute.value(...) makes no sense, if Attribute would not be an XML string or field.
While ste5ans solution might work, I'd try to go up that rain of data merging and see, if you could not even change how the data arrives as XML, there are better tools than SQL Server to handle XML and convert it to the single attributes you need, so this could already be addressed outside of SQL Server.
Bye, Olaf.
In regard of the part of the problem not having similar schema, you have to use default values for the non existant columns, eg via COALESCE or ISNULL or simply via specifying literal values.
Along the lines of:
or
In this version with COALESCE the expression1 can be anything, that eventually has a value for the newcol column, but if it's NULL the value is set to 'text' (or a number, date or whatever type you need.
Bye, Olaf.
Along the lines of:
INSERT INTO TABLE_B (col1, col2, newcol, col3) SELECT col1, col2, 'text' as newcol, col3 FROM TABLE_A;
or
INSERT INTO TABLE_B (col1, col2, newcol, col3) SELECT col1, col2, COALESCE(expression1,'text') as newcol, col3 FROM TABLE_A;
In this version with COALESCE the expression1 can be anything, that eventually has a value for the newcol column, but if it's NULL the value is set to 'text' (or a number, date or whatever type you need.
Bye, Olaf.
ASKER
>>there are better tools than SQL Server to handle XML and convert it to the single attributes you need, so this could already be addressed outside of SQL Server.
As I mentioned, Olaf, I am not in a position to do anything other than optimize this INSERT the absolute best that I can right now. There is also no reason to introduce a new, non-SQL tool to perform this manipulation, because this database and application is on it's way out. My goal is simple. I need to improve this statement as best as possible. A 5 hour runtime is simply not acceptable for inserting 37K records from a different table in the same database -- even if the XML parsing is a part of the process.
As I mentioned, Olaf, I am not in a position to do anything other than optimize this INSERT the absolute best that I can right now. There is also no reason to introduce a new, non-SQL tool to perform this manipulation, because this database and application is on it's way out. My goal is simple. I need to improve this statement as best as possible. A 5 hour runtime is simply not acceptable for inserting 37K records from a different table in the same database -- even if the XML parsing is a part of the process.
If you don't even look at the previous step you have no chance to see, whether there is a much simpler solution than to use a pivot here at this step.
If you get the task of parting peas and lentils and you find a few steps earlier in the process peas and lentils are mixed together, then this is the step of the process to change.
I'm hinting: There is some place the XML is generated. It might come in as the import files generated from data, then export/import step could be changed much easier and with much less effort.
Bye, Olaf.
If you get the task of parting peas and lentils and you find a few steps earlier in the process peas and lentils are mixed together, then this is the step of the process to change.
I'm hinting: There is some place the XML is generated. It might come in as the import files generated from data, then export/import step could be changed much easier and with much less effort.
Bye, Olaf.
I could not help further, when you're not posting your actual code and a concise and complete example. This includes table DDL and sample data INSERT statements as runnable T-SQL script.
ASKER
>>If you don't even look at the previous step you have no chance to see, whether there is a much simpler solution than to use a pivot here at this step.
Olaf, I have already discussed this with the customer. It is not an option.
I love lentils.
ste5an, the sample was good. I appreciate it. I am not 100% clear, but I will try to start modeling the actual construct with the other properties pulled from the XML. If you could help me with the join between @OtherTable and OtherTable, I would be very grateful.
Also, the script in the procedure is the statement entirely. The actual code, I mean. Aside from the genericized object names: ProcedureName, Products, OtherTable. Other than these names, the code is actual.
I have attached the table definition for both the Products and OtherTable tables.
ProductsAndOtherTableDef.sql
Olaf, I have already discussed this with the customer. It is not an option.
I love lentils.
ste5an, the sample was good. I appreciate it. I am not 100% clear, but I will try to start modeling the actual construct with the other properties pulled from the XML. If you could help me with the join between @OtherTable and OtherTable, I would be very grateful.
Also, the script in the procedure is the statement entirely. The actual code, I mean. Aside from the genericized object names: ProcedureName, Products, OtherTable. Other than these names, the code is actual.
I have attached the table definition for both the Products and OtherTable tables.
ProductsAndOtherTableDef.sql
ASKER
ste5an, Your example is good, but it is passing in the test values explicitly. it is not clear to me how to model the attributes into @OtherTable for the actual select from the real table.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
>>@OtherTable is just a placeholder for your table named otherTable.
Now that's embarrassing. Let me see if I can do this. Back soon.
Now that's embarrassing. Let me see if I can do this. Back soon.
ASKER
>> Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue,
This part of the insert into @Temp puzzles me, ste5an. Would you mind adding one or even two more of the properties from the XML, so that I can see more clearly how to structure this?
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue,
This part of the insert into @Temp puzzles me, ste5an. Would you mind adding one or even two more of the properties from the XML, so that I can see more clearly how to structure this?
Just run this against your table:
SELECT DataId ,
AutoNumber ,
Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue
FROM dbo.OtherTable
OUTER APPLY Attributes.nodes('/cpCollection/group/property') A ( Property );
ASKER
very helpful. thank you. back soon, hopefully
ASKER
this is becoming huge... I have to step out soon, ste5an. hopefully you will be around later. I'll get back on this tonight. I am finally at this point in the logic. just now beginning it.
INSERT dbo.Products
( DataId ,
AutoNumber ,
RelatedAssets ,
InsertionTime
)
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets ,
CASE WHEN P.InsertionTime = '' THEN NULL
ELSE CAST(P.InsertionTime AS DATETIME)
END
FROM @Temp T PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( RelatedAssets, InsertionTime ) ) P;
INSERT dbo.Products
( DataId ,
AutoNumber ,
RelatedAssets ,
InsertionTime
)
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets ,
CASE WHEN P.InsertionTime = '' THEN NULL
ELSE CAST(P.InsertionTime AS DATETIME)
END
FROM @Temp T PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( RelatedAssets, InsertionTime ) ) P;
ASKER
Well, that SELECT you suggested here ID: 41510140 worked fine. I even added several other columns to the output in addition to yours, it looked good. From there I spend a ton of time getting the other values into the process -- but when I attempt to run it, I just get a whole lot of 'invalid column names..'
Like this:
Msg 207, Level 16, State 1, Line 11
Invalid column name 'RelatedAssets'.
Msg 207, Level 16, State 1, Line 12
Invalid column name 'InsertionTime'.
Msg 207, Level 16, State 1, Line 13
Invalid column name 'AvailableInKitSetAsst'.
Msg 207, Level 16, State 1, Line 14
Invalid column name 'KitSetAsst'.
Msg 207, Level 16, State 1, Line 15
Invalid column name 'NewItemMeetingDate'.
This is my sequence of events:
DECLARE @Temp table
Load it from the OtherTable with OUTER APPLY on my Attributes.nodes
Load the final table with pivoted select on @Temp
Are you available, ste5an?
Like this:
Msg 207, Level 16, State 1, Line 11
Invalid column name 'RelatedAssets'.
Msg 207, Level 16, State 1, Line 12
Invalid column name 'InsertionTime'.
Msg 207, Level 16, State 1, Line 13
Invalid column name 'AvailableInKitSetAsst'.
Msg 207, Level 16, State 1, Line 14
Invalid column name 'KitSetAsst'.
Msg 207, Level 16, State 1, Line 15
Invalid column name 'NewItemMeetingDate'.
This is my sequence of events:
DECLARE @Temp table
Load it from the OtherTable with OUTER APPLY on my Attributes.nodes
Load the final table with pivoted select on @Temp
Are you available, ste5an?
ASKER
I just realized 'it worked fine' because when I ran it before, it was just all of the columns on the table that were not the XML properties. Now that I've added in all of the XML nodes (if that's what they're called), the columns are not being found.
I will keep working it.
I will keep working it.
ASKER
Msg 207, Level 16, State 1, Line 12
Invalid column name 'InsertionTime'.
For the above error, I've added this to the statement:
Property.value('@insertion time','dat etime') InsertionTime,
No longer errors as invalid, but the returned data is just 'NULL'
Invalid column name 'InsertionTime'.
For the above error, I've added this to the statement:
Property.value('@insertion
No longer errors as invalid, but the returned data is just 'NULL'
No, because the Property table, which ste5ans query generates, is not having XML in it, it has tow columns PropertyID and PropertyValue, so looking up @insertiontype in Property, means looking for the record with PropertyID="InsertionTime" and reading its PropertyValue, maybe casting to the right result type.
Ste5ans query takes the XML and creates a classic table from it, but it can't go as far as having InsertionTime etc columns, all the XML nodes are transferred to a record each, consisting of a pair of PropertyID (or name) and PropertyValue.
Bye, Olaf.
Ste5ans query takes the XML and creates a classic table from it, but it can't go as far as having InsertionTime etc columns, all the XML nodes are transferred to a record each, consisting of a pair of PropertyID (or name) and PropertyValue.
Bye, Olaf.
The only thing you need to do is to add the columns here. For each attribute parsing call in you original SQL like
Attributes.value('(/cpColl ection/gro up/propert y[@id="RelatedAssets"]/value)[1]', 'CHAR(1)')
You need to add the id as column in the PIVOT clause
PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( RelatedAssets ) ) P;
Then you can use this name in the SELECT for the INSERT as:
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets
FROM @Temp T PIVOT [..]
You just need to apply the type cast as Olaf said.
Attributes.value('(/cpColl
You need to add the id as column in the PIVOT clause
PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( RelatedAssets ) ) P;
Then you can use this name in the SELECT for the INSERT as:
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets
FROM @Temp T PIVOT [..]
You just need to apply the type cast as Olaf said.
ASKER
I have done this for all of the values:
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets
FROM @Temp T PIVOT [..]
but this part is still unclear to me. can you add one or two additional columns into the shortened example you gave me, ste5an?
PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( RelatedAssets ) ) P;
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets
FROM @Temp T PIVOT [..]
but this part is still unclear to me. can you add one or two additional columns into the shortened example you gave me, ste5an?
PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( RelatedAssets ) ) P;
HHCIB? Just copy every id value from the XPaths of your original query into the column clause:
DECLARE @Temp TABLE
(
DataId INT ,
AutoNumber INT ,
PropertyID NVARCHAR(MAX) ,
PropertyValue NVARCHAR(MAX)
);
INSERT INTO @Temp
SELECT DataId ,
AutoNumber ,
Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue
FROM dbo.OtherTable
OUTER APPLY Attributes.nodes('/cpCollection/group/property') A ( Property )
WHERE ( ModuleId = '3c3cfc3d-4c00-4f0b-bf68-ca2d9f42b352' )
SELECT *
FROM @Temp T PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( RelatedAssets, TextBox1, ProjSalesPerItem, ProdCatSection ) ) P;
ASKER
I wasn't sure what 'HHCIB' meant.... Google defined it for me right away. I must not be expressing or conveying myself well at all. I will keep at it and provide status when complete. Thank you, ste5an.
Well, sometimes patience is not a friend of mine.. sry.
Indeed pivot is not in everybodies toolset - mentally. Every functionality SQL Server offers is there, but you also have to comprehend it.
In the end the IN clause has to have all the property names in quiotes, PropertyID is a char value and can only be IN a list of char values.
...FOR PropertyID IN ( 'RelatedAssets', 'TextBox1', 'ProjSalesP'...)
It was no wonder you got the series of invalid column name errors. Those column names only result in the pivot, but not beforehand. So @Temp does not have these columns, P, the alias of the pivot result - has the columns P.RelatedAssets, P.TextBox1, ...
...in the end it makes me think, if you really need to name all the properties, couldn't you simply do FOR 1=1 or FOR PropertyID LIKE '%' as short for any propertyID, which would put all the original IDs as column names of P?
It's really hard to do without any sample data available.
Bye, Olaf.
In the end the IN clause has to have all the property names in quiotes, PropertyID is a char value and can only be IN a list of char values.
...FOR PropertyID IN ( 'RelatedAssets', 'TextBox1', 'ProjSalesP'...)
It was no wonder you got the series of invalid column name errors. Those column names only result in the pivot, but not beforehand. So @Temp does not have these columns, P, the alias of the pivot result - has the columns P.RelatedAssets, P.TextBox1, ...
...in the end it makes me think, if you really need to name all the properties, couldn't you simply do FOR 1=1 or FOR PropertyID LIKE '%' as short for any propertyID, which would put all the original IDs as column names of P?
It's really hard to do without any sample data available.
Bye, Olaf.
ASKER
Are you able to point out the flaws in the attached? This has become terribly critical.
working.sql
working.sql
The part
Only has to have the "column" names coming from the XML values, not all the otehr fields. And all those names have to be in quotes. They are just the literal name values of PropertID and the pivoting makes those name values columns.
All other normal columns don't come out of the pivoting of @Temp data.
Bye, Olaf.
FOR PropertyID IN(
DataId,
AutoNumber,
ModifyDate,
StartDate,
EndDate,
IsEnabled,
...
Only has to have the "column" names coming from the XML values, not all the otehr fields. And all those names have to be in quotes. They are just the literal name values of PropertID and the pivoting makes those name values columns.
All other normal columns don't come out of the pivoting of @Temp data.
Bye, Olaf.
I'm wrong about putting the PropertyID values in quotes. This sample put together from already posted samples results in a final record having the XML nodes as columns:
Not PropertyID only has the names coming from the XML snippet, not the DataId and AutoNumber columns already being single columns of @Temp before pivoting.
Do you see now how this works? The two records of @Temp are transposed to the two fields of the final result.
The PropertyValue column is NVARCHAR(MAX), for this can hold any textual part of the XML, the conversion to whatever end result type has to be done, where NVARCHAR(MAX) does not convert implicitly to the dbo.Products destination table column type.
We can't do this for you without knowing table definitions and XML snippets.
Bye, Olaf.
DECLARE @Temp TABLE
(
DataId INT ,
AutoNumber INT ,
PropertyID NVARCHAR(MAX) ,
PropertyValue NVARCHAR(MAX)
);
DECLARE @OtherTable TABLE
(
DataId INT ,
AutoNumber INT ,
Attributes XML
);
INSERT INTO @OtherTable
( DataId ,
AutoNumber ,
Attributes
)
VALUES ( 1 ,
1 ,
N'<cpCollection><group><property id="RelatedAssets">a</property><property id="PleaseNote">test</property></group></cpCollection>'
);
Insert Into @Temp
SELECT DataId ,
AutoNumber ,
Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue
FROM @OtherTable
OUTER APPLY Attributes.nodes('/cpCollection/group/property') A ( Property );
Select * From @Temp
SELECT P.DataId ,
P.AutoNumber ,
P.RelatedAssets ,
CASE WHEN P.PleaseNote = '' THEN NULL ELSE CAST(P.PleaseNote AS char(4)) END as PleaseNote
FROM @Temp T PIVOT ( MIN(PropertyValue) FOR PropertyID IN ( [b]RelatedAssets, PleaseNote [/b]) ) P;
Not PropertyID only has the names coming from the XML snippet, not the DataId and AutoNumber columns already being single columns of @Temp before pivoting.
Do you see now how this works? The two records of @Temp are transposed to the two fields of the final result.
The PropertyValue column is NVARCHAR(MAX), for this can hold any textual part of the XML, the conversion to whatever end result type has to be done, where NVARCHAR(MAX) does not convert implicitly to the dbo.Products destination table column type.
We can't do this for you without knowing table definitions and XML snippets.
Bye, Olaf.
ASKER
>>We can't do this for you without knowing table definitions and XML snippets.
I have already sent the table definition in this post: ID: 41509902
I don't have any snippets, but the script that i provided is exactly what is being run.
my attempt to run the piece you just posted fails w/this:
Msg 102, Level 15, State 1, Line 40
Incorrect syntax near 'RelatedAssets'.
if i take the b's out from around the values in the pivot, it works fine. I think one of the problems here is that what works is always the samples with test data values. When I am selecting the data from the XML column in the OtherTable, I cannot bring this together. Yet.
I have already sent the table definition in this post: ID: 41509902
I don't have any snippets, but the script that i provided is exactly what is being run.
my attempt to run the piece you just posted fails w/this:
Msg 102, Level 15, State 1, Line 40
Incorrect syntax near 'RelatedAssets'.
if i take the b's out from around the values in the pivot, it works fine. I think one of the problems here is that what works is always the samples with test data values. When I am selecting the data from the XML column in the OtherTable, I cannot bring this together. Yet.
ASKER
Aside from removing the non-XML columns from the PropertyID IN list, where are the columns supposed to be? I've got them all in there, yet the logic still won't even compile. Seemingly the columns are referenced too many times.
I've got a ton of these:
Msg 207, Level 16, State 1, Procedure procedurename, Line 282
Invalid column name 'RelatedAssets'.
Msg 207, Level 16, State 1, Procedure procedurename, Line 283
Invalid column name 'InsertionTime'.
and a whole bunch of these;
Msg 265, Level 16, State 1, Procedure procedurename, Line 999
The column name "RelatedAssets" specified in the PIVOT operator conflicts with the existing column name in the PIVOT argument.
Msg 265, Level 16, State 1, Procedure procedurename, Line 1000
The column name "InsertionTime" specified in the PIVOT operator conflicts with the existing column name in the PIVOT argument.
and just one of these:
Msg 8156, Level 16, State 1, Procedure procedurename, Line 1228
The column 'RelatedAssets' was specified multiple times for 'p'.
Are the xml column names supposed to be in the select from @Temp or just the PropertyID IN list?
I've got a ton of these:
Msg 207, Level 16, State 1, Procedure procedurename, Line 282
Invalid column name 'RelatedAssets'.
Msg 207, Level 16, State 1, Procedure procedurename, Line 283
Invalid column name 'InsertionTime'.
and a whole bunch of these;
Msg 265, Level 16, State 1, Procedure procedurename, Line 999
The column name "RelatedAssets" specified in the PIVOT operator conflicts with the existing column name in the PIVOT argument.
Msg 265, Level 16, State 1, Procedure procedurename, Line 1000
The column name "InsertionTime" specified in the PIVOT operator conflicts with the existing column name in the PIVOT argument.
and just one of these:
Msg 8156, Level 16, State 1, Procedure procedurename, Line 1228
The column 'RelatedAssets' was specified multiple times for 'p'.
Are the xml column names supposed to be in the select from @Temp or just the PropertyID IN list?
Well, sorry, the part between [ b ] and [ /b ] should have been rendered bold, that wasn't intended part of the code that was just code to render text bold, it doesn't work inside a code section.
I see the table definitions, but I don't see othertable data in post 41509902, the dbo.Othertable contains the XML that is parsed ... AS PropertyID and ... AS PropertyValue. I'd be interested in values of the Attribute field, we already knew it has to be an XML field, but only knowing the XML values in it could make the whole process testable.
Bye, Olaf.
I see the table definitions, but I don't see othertable data in post 41509902, the dbo.Othertable contains the XML that is parsed ... AS PropertyID and ... AS PropertyValue. I'd be interested in values of the Attribute field, we already knew it has to be an XML field, but only knowing the XML values in it could make the whole process testable.
Bye, Olaf.
ASKER
>> I don't see othertable data in post 41509902, the dbo.Othertable contains the XML that is parsed
It's at the bottom of the script. Both tables, same script, i just checked.
<<but only knowing the XML values in it could make the whole process testable.
is that not in the original script that I attached? All of this --
Attributes.value('(/cpColl ection/gro up/propert y[@id="Rel atedAssets "]/value)[ 1]', 'NVarChar(MAX)'),
CASE Attributes.value('(/cpColl ection/gro up/propert y[@id="Ins ertionTime "]/value)[ 1]', 'VarChar(36)') WHEN '' THEN NULL ELSE Attributes.value('(/cpColl ection/gro up/propert y[@id="Ins ertionTime "]/value)[ 1]', 'DateTime') END, Attributes.value('(/cpColl ection/gro up/propert y[@id="Ava ilableInKi tSetAsst"] /value)[1] ', 'NVarChar(256)'),
It's at the bottom of the script. Both tables, same script, i just checked.
<<but only knowing the XML values in it could make the whole process testable.
is that not in the original script that I attached? All of this --
Attributes.value('(/cpColl
CASE Attributes.value('(/cpColl
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
No,
Attributes.value('(/cpColl ection/gro up/propert y[@id="Rel atedAssets "]/value)[ 1]', 'NVarChar(MAX)'),
This is not XML, it is a expression parsing some single node from the XML in the Atttributes field.
All I/we can see from that is, that the XML contains a strucutre with cpCollection elements containing group elements containing property elements.
Bye, Olaf.
Attributes.value('(/cpColl
This is not XML, it is a expression parsing some single node from the XML in the Atttributes field.
All I/we can see from that is, that the XML contains a strucutre with cpCollection elements containing group elements containing property elements.
Bye, Olaf.
ASKER
>>All the column, which come from the XML can't be part of @Temp.
Yes! I was just typing that, thanking you both for all of your help, and saying that the insert into @temp is the problem, because the columns are not known yet. let me look at your examples more closely.
Yes! I was just typing that, thanking you both for all of your help, and saying that the insert into @temp is the problem, because the columns are not known yet. let me look at your examples more closely.
Simply do
Bye, Olaf.
SELECT Top 2 Attributes FROM dbo.Othertable
Bye, Olaf.
ASKER
Attached.
top2.txt
top2.txt
ASKER
i think i see the light.... joke. sort of.
below is the new @temp declaration and insert. I believe I finally see where my mistake was. unfortunately, it is not clear to me yet whether my approach is going to be the solution that I need (tomorrow at 4AM).
I say that only because it is still running.... i'm not trying to do anything other than this piece right now. i suppose i could TOP it to see if it functions properly. Assuming this is the correct construct, I would immediately insert into the OtherTable the the pivot.
The logic we are trying to remedy ran more than 5 hours. I simply must do better than that.
what would you advise?
DECLARE @Temp TABLE (
DataId UNIQUEIDENTIFIER,
AutoNumber INT,
ModifyDate DATETIME,
StartDate DATETIME,
EndDate DATETIME,
IsEnabled BIT,
Title NVARCHAR(256),
[Target] NVARCHAR(256),
Summary NVARCHAR(512),
PropertyID NVARCHAR(MAX),
PropertyValue NVARCHAR(MAX)
);
INSERT @Temp
SELECT
DataId,
AutoNumber,
ModifyDate,
StartDate,
EndDate,
IsEnabled,
Title,
[Target],
Summary,
Property.value('./@id', 'NVARCHAR(MAX)') PropertyID,
Property.value('.', 'NVARCHAR(MAX)') PropertyValue
FROM
dbo.OtherTable OUTER APPLY Attributes.nodes('/cpColle ction/grou p/property ') a (Property)
WHERE
(ModuleId = '3c3cfc3d-4c00-4f0b-bf68-c a2d9f42b35 2');
below is the new @temp declaration and insert. I believe I finally see where my mistake was. unfortunately, it is not clear to me yet whether my approach is going to be the solution that I need (tomorrow at 4AM).
I say that only because it is still running.... i'm not trying to do anything other than this piece right now. i suppose i could TOP it to see if it functions properly. Assuming this is the correct construct, I would immediately insert into the OtherTable the the pivot.
The logic we are trying to remedy ran more than 5 hours. I simply must do better than that.
what would you advise?
DECLARE @Temp TABLE (
DataId UNIQUEIDENTIFIER,
AutoNumber INT,
ModifyDate DATETIME,
StartDate DATETIME,
EndDate DATETIME,
IsEnabled BIT,
Title NVARCHAR(256),
[Target] NVARCHAR(256),
Summary NVARCHAR(512),
PropertyID NVARCHAR(MAX),
PropertyValue NVARCHAR(MAX)
);
INSERT @Temp
SELECT
DataId,
AutoNumber,
ModifyDate,
StartDate,
EndDate,
IsEnabled,
Title,
[Target],
Summary,
Property.value('./@id', 'NVARCHAR(MAX)') PropertyID,
Property.value('.', 'NVARCHAR(MAX)') PropertyValue
FROM
dbo.OtherTable OUTER APPLY Attributes.nodes('/cpColle
WHERE
(ModuleId = '3c3cfc3d-4c00-4f0b-bf68-c
ASKER
definitely making headway! wow. I cannot believe how I did not see this last night. it is right there in your code samples. i really am so sorry for the wasted time. truly.
i'm almost there. back shortly with status.
i'm almost there. back shortly with status.
Execute this as another sample:
I took part of the XML and the IN cluase only has those names coming from that XML.
Ignore all the nulls, that's just because I don't have othertable sample data aside of the XML, but it's sufficient to show how the XML is split into several records and then pivoted to create one row with all the needed columns.
If you still don't see how it works, I can't help you further, also because my time now runs out.
I wish you good luck.
Bye, Olaf.
DECLARE @Temp TABLE
(
[DataId] [uniqueidentifier] NULL,
[ModuleId] [uniqueidentifier] NULL,
[BatchId] [uniqueidentifier] NULL,
[AutoNumber] [int] NULL,
[IntegrationId] [nvarchar](256) NULL,
[StartDate] [datetime] NULL,
[EndDate] [datetime] NULL,
[Title] [nvarchar](256) NULL,
[Summary] [nvarchar](512) NULL,
[IsEnabled] [bit] NULL,
[IsFullTextIndexed] [bit] NULL,
[IsEcommerce] [bit] NULL,
[AllowDataSync] [bit] NULL,
[CreateDate] [datetime] NULL,
[ModifyDate] [datetime] NULL,
[Target] [nvarchar](256) NULL,
PropertyID NVARCHAR(MAX) ,
PropertyValue NVARCHAR(MAX)
);
DECLARE @OtherTable TABLE
(
[DataId] [uniqueidentifier] NULL,
[ModuleId] [uniqueidentifier] NULL,
[BatchId] [uniqueidentifier] NULL,
[AutoNumber] [int] NULL,
[IntegrationId] [nvarchar](256) NULL,
[StartDate] [datetime] NULL,
[EndDate] [datetime] NULL,
[Title] [nvarchar](256) NULL,
[Summary] [nvarchar](512) NULL,
[IsEnabled] [bit] NULL,
[IsFullTextIndexed] [bit] NULL,
[IsEcommerce] [bit] NULL,
[AllowDataSync] [bit] NULL,
[Attributes] [xml] NULL,
[CreateDate] [datetime] NULL,
[ModifyDate] [datetime] NULL,
[Target] [nvarchar](256) NULL
);
INSERT INTO @OtherTable
(
Attributes
)
VALUES ( N'<cpCollection moduleId="38460957-9aea-4165-8f68-4d7391afe1de" dataId="119108ac-01ab-4126-9bd3-0000acda2844">
<group id="Serialize" name="Serialize">
<property id="RelatedProducts">
<value />
</property>
<property id="LegacyData">
<value />
</property>
<property id="Instructor">
<value />
</property>
<property id="Taxonomy">
<value>bf83a5f7-52d3-47b2-a82f-4f4fd3f184dd</value>
</property>
<property id="Keywords">
<value>test, test2</value>
</property>
</group>
</cpCollection>'
);
Insert Into @Temp
SELECT [DataId],
[ModuleId],
[BatchId],
[AutoNumber],
[IntegrationId],
[StartDate],
[EndDate],
[Title],
[Summary],
[IsEnabled],
[IsFullTextIndexed],
[IsEcommerce],
[AllowDataSync],
[CreateDate],
[ModifyDate],
[Target],
Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue
FROM @OtherTable
OUTER APPLY Attributes.nodes('/cpCollection/group/property') A ( Property );
Select * From @Temp
SELECT P.*
FROM @Temp T PIVOT ( MIN(PropertyValue) FOR PropertyID IN (RelatedProducts,LegacyData,Instructor,Taxonomy,Keywords)) P;
I took part of the XML and the IN cluase only has those names coming from that XML.
Ignore all the nulls, that's just because I don't have othertable sample data aside of the XML, but it's sufficient to show how the XML is split into several records and then pivoted to create one row with all the needed columns.
If you still don't see how it works, I can't help you further, also because my time now runs out.
I wish you good luck.
Bye, Olaf.
E.g.
Use the result set for your INSERT statement. Just add the necessary CASTs and CASE statements as in your original INSERT.
-- Use the correct data types instead of INT:
DECLARE @Temp TABLE
(
DataId INT ,
AutoNumber INT ,
ModifyDate INT ,
CreateDate INT ,
StartDate INT ,
EndDate INT ,
IsEnabled INT ,
Title INT ,
[Target] INT ,
Summary INT ,
PropertyID NVARCHAR(MAX) ,
PropertyValue NVARCHAR(MAX)
);
INSERT INTO @Temp
SELECT DataId ,
AutoNumber ,
ModifyDate,
CreateDate ,
StartDate ,
EndDate ,
IsEnabled ,
Title ,
[Target] ,
Summary ,
Property.value('./@id', 'NVARCHAR(MAX)') AS PropertyID ,
Property.value('.', 'NVARCHAR(MAX)') AS PropertyValue
FROM dbo.OtherTable
OUTER APPLY Attributes.nodes('/cpCollection/group/property') A ( Property );
-- This is how to get the result set:
SELECT P.*
FROM @Temp T PIVOT ( MIN(PropertyValue) FOR PropertyID IN (
[RelatedAssets],
[InsertionTime],
[AvailableInKitSetAsst],
[KitSetAsst],
[NewItemMeetingDate],
[PleaseNote],
[RelatedProductsTitle],
[RelatedContent],
[RelatedProducts1],
[RelatedProducts2],
[RelatedProducts3],
[AdditionalInfo],
[CopyOK],
[CountryOfOrigin],
[YesNo1],
[YesNo2],
[GroupHeader],
[GroupingAttribute],
[GroupID],
[InfoTemplate],
[Abrasive_Mesh],
[Accuracy],
[ArticleNo],
[Author_Host],
[BailOpening],
[BandShankWidth],
[BindingStyle],
[Brand],
[Capacity],
[CapInsideDimension],
[CaratWeight],
[CastTemperature],
[ChannelWidth],
[Clarity],
[ClaspLength],
[ClaspType],
[COE], [Color],
[ColorFamily],
[ColorNo],
[CupSize],
[CureTemperature],
[CureTime],
[CustomService],
[CuttingSurfaceDimension],
[Density],
[Diameter],
[Dimensions],
[DrillStyle],
[Durometer],
[FabricationMethod],
[FaceShape],
[FaceSize],
[Finished],
[FitsCordChainSize],
[FocalLength],
[Form_CastingMetal],
[Form_Clay],
[Form_Fabrication],
[Form_Modeling],
[Gauge],
[GermanCutNo],
[Grade_Bead_Stone_Cab],
[Grade_Clay],
[HandleLength],
[Hardness_Clay],
[Hardness_Fabrication],
[HeadLength],
[HeadShape],
[HeadSize],
[HeadType],
[HeadWeight],
[HoleSize],
[HotFoilStampingType],
[ImpressionPlacement],
[InsideDimension],
[InteriorDimensions],
[JawLength],
[JawStyle],
[JewelryType],
[KaratPurity],
[LargeSetName],
[LengthPerWeight],
[LengthRange],
[LensDiameter],
[Letter],
[LinkGaThickness],
[LinkID],
[LinkLength],
[LiquidusTemperature],
[MagnificationPower],
[Material_Chain],
[Material_General_Kits_Sets],
[Material_Bead],
[Material_GiftBox_Display],
[Material_Modeling_Carving],
[Material_Pendant_Bail_Component],
[Material_Ring],
[Material_Rotary_Cutting_Polishing],
[Material_Stone_Cab_Cameo],
[Material_Stringing],
[MaximumCuttingCapacity],
[MaximumLength],
[Media],
[MeltTemperature],
[MetalColor],
[MetalColor_Fabrication],
[MetalType],
[MetalType_Fabrication],
[ModelNo],
[Modes],
[MohsHardness],
[MountedStoneSizes],
[MountingBezelSizes],
[NetWeight],
[NoOfStrands],
[OutsideDimension],
[OverallLength],
[PackSize],
[PadSize],
[PageCount],
[PegSize],
[PinSize],
[PlatformDimensions],
[PostSize],
[Power],
[ProngHeight],
[Range],
[RefractiveIndex],
[RingID],
[RingSizeUS],
[RunningTime],
[SectionDimensions],
[SettingType],
[ShankSize],
[Shape_Bead],
[Shape_Disc_Stamping_Tag],
[Shape_Fabrication_Metal],
[Shape_File],
[Shape_General_Kits_Sets],
[Shape_Jumpring_Springring_Headpin],
[Shape_Pendant_Bail_Component],
[Shape_Ring],
[Shape_Stone_Cab_Cameo],
[ShipWeight],
[Shrinkage],
[SignetPlateSize],
[Size],
[SizeNo],
[SolidusTemperature],
[SpecificGravity],
[SpoolPkgQuantity],
[StepDimension],
[StoneDepth],
[StoneShape],
[StoneSize],
[StoneType_Bead],
[StoneType_Stone_Cab_Cameo],
[Style_Chain_Necklace_Braclet],
[Style_Clasp],
[Style_Disc_Stamping_Tag],
[Style_Earring_Nut],
[Style_General_Kits_Sets],
[Style_Giftbox_Display],
[Style_Incidentals],
[Style_Jumpring_Springring_Headpin],
[Style_Pendant_Bail_Component],
[Style_Ring],
[Style_Setting],
[Subject],
[SubSetName],
[Sustainable],
[SwissCut],
[TensileStrength],
[TestStrength],
[TipWidth],
[Treatment],
[TubeInsideDimension],
[TweezerStyle],
[Type_Clay_Enamel_Glass],
[Type_File],
[Type_Giftbox_Display],
[Type_Optical],
[VulcanizationTemperature],
[WallHeight],
[WallThickness],
[WeightPerLength],
[WheelDiameter],
[Width],
[WidthRange],
[KitSetIncludes],
[MarketingKeywords],
[Keywords],
[OK],
[OrigPrice],
[PhotoOK],
[PM],
[ProdCatSection],
[ProdPageNmbr],
[RTFEditor1],
[RTFEditor2],
[ProductName],
[ProductionStage],
[ProductNamePrint],
[RTFEditor3],
[SM],
[IntegrationId],
[UnitCost],
[WebEAV],
[CatalogEAV],
[CatalogPage],
[CatalogDate],
[InitOrder],
[ProjSalesPerItem],
[TtlProjSales],
[Has360],
[Colorspace],
[Dimensionspixels],
[FileName],
[ICCProfileIdentifier],
[ModificationUser],
[ModifiedTime],
[OriginalFileSize],
[YesNo3],
[TextBox1],
[Resolutionppi],
[CheckedOut],
[Taxonomy],
[QualityMarked]
) ) P;
Use the result set for your INSERT statement. Just add the necessary CASTs and CASE statements as in your original INSERT.
ASKER
New script is attached. I did TOP it, just to see it work; the @Temp was loaded fine, but it failed to insert due to NULL values not allowed in RelatedAssets. However, that column does have a default:
ALTER TABLE dbo.Products ADD DEFAULT (N'') FOR [RelatedAssets]
This is also the column that i mentioned in the very first post -- RelatedAssets. There is no data in the column, in any of the production records. it is a blank space that the very wise table designer is inserting with a default constraint. but that's another story...
anyway, @Temp is being loaded, I've done a select just on that data, and it looks good. Many of the propertyvalue's are blank, but that is the way they designed this thing. given the default constraint, I am not sure why the process is failing.
do you see my flaw?
This one, for example --
Attributes.value('(/cpColl ection/gro up/propert y[@id="Rel atedAssets "]/value)[ 1]', 'NVarChar(MAX)'),
These are the relevant data stats:
select count(*) from Products 37736
select max(len(relatedassets)) from Products 0
select count(*) from Products where relatedAssets IS NOT NULL 37736
select count(*) from Products where relatedAssets = '' 37736
newScript.sql
ALTER TABLE dbo.Products ADD DEFAULT (N'') FOR [RelatedAssets]
This is also the column that i mentioned in the very first post -- RelatedAssets. There is no data in the column, in any of the production records. it is a blank space that the very wise table designer is inserting with a default constraint. but that's another story...
anyway, @Temp is being loaded, I've done a select just on that data, and it looks good. Many of the propertyvalue's are blank, but that is the way they designed this thing. given the default constraint, I am not sure why the process is failing.
do you see my flaw?
This one, for example --
Attributes.value('(/cpColl
These are the relevant data stats:
select count(*) from Products 37736
select max(len(relatedassets)) from Products 0
select count(*) from Products where relatedAssets IS NOT NULL 37736
select count(*) from Products where relatedAssets = '' 37736
newScript.sql
ASKER
I changed the TOP to 500, and ran it again. It loaded @Temp immediatly, but failed with the no NULLs in RelatedAssets error. I think it's just that the default will only apply if we DON'T insert into it explicitly. yet we are attempting to do so, because i am trying to optimize this script and building it out properly... maybe that's it.
this is the only reference to RelatedAssets in the raw script that they gave me:
Attributes.value('(/cpColl ection/gro up/propert y[@id="Rel atedAssets "]/value)[ 1]', 'NVarChar(MAX)'),
i don't know XML manipulation well enough to understand why that, too, is not producing the same error.
this is the only reference to RelatedAssets in the raw script that they gave me:
Attributes.value('(/cpColl
i don't know XML manipulation well enough to understand why that, too, is not producing the same error.
Take a look at the XML colum for the failing row. Does the property with ID RelatedAssets exists and has a value?
When not, then this maybe another reason for the long running process, cause a rollback could happen.
When it has a value, which does not match the data type of your Products table RelatedAssets column try to cast it, maybe you even need a CASE to do it properly.
p.s. can you post at least some data of the Attribute column?
When not, then this maybe another reason for the long running process, cause a rollback could happen.
When it has a value, which does not match the data type of your Products table RelatedAssets column try to cast it, maybe you even need a CASE to do it properly.
p.s. can you post at least some data of the Attribute column?
ASKER
Is there any way to use a default parm to input the XML column if the propertyvalue is a blank space?
ASKER
>>Take a look at the XML colum for the failing row. Does the property with ID RelatedAssets exists and has a value?
That's what I was saying earlier, ste5an. RelatedAssets has no data in any of the records. The default constraint is a blank space. They are writing it in that way. I've worked around that by declaring a parm in the logic, and using that in the insert, but this is only possible because again, that column has no data anywhere, and they load it with a blank space via default.
DECLARE @RelatedAssets CHAR(2) = ''
A similar problem occurred with another column - Pleasenote. This value is populated in many cases, but in just as many, it is a blank space. I cannot work around it like I did with RelatedAssets, because in many cases the data is there. This column also has the default constraint of a blank space -- (N'') -- but I can't work around it like I did RelatedAssets, because that will overwrite all of the valid data.
I hate to even consider this, as I am sure there will be overhead, but can I check the p.Pleasenote and pass in the blank space only when it is NULL ?
<<p.s. can you post at least some data of the Attribute column?
I did up there in ID: 41512522. I can post more if you need.
That's what I was saying earlier, ste5an. RelatedAssets has no data in any of the records. The default constraint is a blank space. They are writing it in that way. I've worked around that by declaring a parm in the logic, and using that in the insert, but this is only possible because again, that column has no data anywhere, and they load it with a blank space via default.
DECLARE @RelatedAssets CHAR(2) = ''
A similar problem occurred with another column - Pleasenote. This value is populated in many cases, but in just as many, it is a blank space. I cannot work around it like I did with RelatedAssets, because in many cases the data is there. This column also has the default constraint of a blank space -- (N'') -- but I can't work around it like I did RelatedAssets, because that will overwrite all of the valid data.
I hate to even consider this, as I am sure there will be overhead, but can I check the p.Pleasenote and pass in the blank space only when it is NULL ?
<<p.s. can you post at least some data of the Attribute column?
I did up there in ID: 41512522. I can post more if you need.
Use ISNULL(columnName, ' ') to get a blank.
ASKER
If not for the blank spaces, I could do this, and coalesce it with the p.Pleasenote
DECLARE @PleaseNote CHAR(2) = ''
DECLARE @PleaseNote CHAR(2) = ''
ASKER
a lot of them, same thing. i'm going through it one at a time with the ISNULLs. back soon.
ASKER
I really need to become much more versed with XML manipulation. this is nuts.
i finally got it functional (see attached), and it only insert three records into the Products table.
I know this system is questionable. It could be just that, but I don't know the XML back to front enough to know whether that's the case.
I topped it at 500. Would you say the attached is correct?
maybe.sql
i finally got it functional (see attached), and it only insert three records into the Products table.
I know this system is questionable. It could be just that, but I don't know the XML back to front enough to know whether that's the case.
I topped it at 500. Would you say the attached is correct?
maybe.sql
ASKER
I also wonder how you might approach it to minimize logging during the execution? It's been running about 17 minutes. The log isn't too bad, but it is growing. Both the tempdb and the database.
ASKER
@#*$''
(9189814 row(s) affected)
Msg 8152, Level 16, State 10, Line 48
String or binary data would be truncated.
The statement has been terminated.
(9189814 row(s) affected)
Msg 8152, Level 16, State 10, Line 48
String or binary data would be truncated.
The statement has been terminated.
ASKER
lovely error output. would be great if it could tell me where the problem is...
I am selecting the whole construct into a new table. Assuming it completes successfully, I will compare the column lengths and figure out where the problem is. I welcome other all suggestions. :-)
I am selecting the whole construct into a new table. Assuming it completes successfully, I will compare the column lengths and figure out where the problem is. I welcome other all suggestions. :-)
ASKER
It ran for 00:28:52 and completed successfully with this output:
(9189814 row(s) affected)
(37736 row(s) affected)
Presumably, that is the 1st insert into @Temp, followed by the final insert into Products. But - it wasn't Products, it was my test table, so I can try to find the source of the 8152 error.
If I can find that, I feel very, very good about this. 28 minutes vs 5 hours is a godsend.
(9189814 row(s) affected)
(37736 row(s) affected)
Presumably, that is the 1st insert into @Temp, followed by the final insert into Products. But - it wasn't Products, it was my test table, so I can try to find the source of the 8152 error.
If I can find that, I feel very, very good about this. 28 minutes vs 5 hours is a godsend.
ASKER
Hopefully one of you is still awake.... I figured out the 8152 error. All of the columns that we collected from the XML were defined as NVARCHAR(MAX). This is because we set it up that way here:
PropertyID NVARCHAR(MAX),
PropertyValue NVARCHAR(MAX)
In all but 11 cases, this will exceed the column datalength in the table that is targeted for the final dataset.
What is the correct way to address this? Is there anyway to defined the output column dynamically, based on that which is retrieved from the XML ?
PropertyID NVARCHAR(MAX),
PropertyValue NVARCHAR(MAX)
In all but 11 cases, this will exceed the column datalength in the table that is targeted for the final dataset.
What is the correct way to address this? Is there anyway to defined the output column dynamically, based on that which is retrieved from the XML ?
ASKER
ste5an or Olaf, are either of you still available?
ASKER
I left the PropertyID and PropertyValue at NVARCHAR(MAX), but I LTRIM(RTRIM med every one of the columns on the final insert.
It ran this time for 00:20:01, and failed with this:
(9189814 row(s) affected)
Msg 8152, Level 16, State 10, Line 48
String or binary data would be truncated.
The statement has been terminated.
Still working it. All suggestions are appreciated. This must be functional before morning.
It ran this time for 00:20:01, and failed with this:
(9189814 row(s) affected)
Msg 8152, Level 16, State 10, Line 48
String or binary data would be truncated.
The statement has been terminated.
Still working it. All suggestions are appreciated. This must be functional before morning.
You're almost there, you just need the final conversions of the Nvarchar values to the dbo.product datatypes now.
You have the needed Code in your original attachment.
This can't be automatically done by neither the pivot step and even less by the step extracting all the. XML properties into the PropertyValue column. A column can't have different types for different rows. And the pivot step putting all the Nvarchar values into separate columns can't know the destination type you need.
Your original Code must have had CASTs or CONVERTs of all the single parsed values. Look back at it.
Bye, Olaf.
You have the needed Code in your original attachment.
This can't be automatically done by neither the pivot step and even less by the step extracting all the. XML properties into the PropertyValue column. A column can't have different types for different rows. And the pivot step putting all the Nvarchar values into separate columns can't know the destination type you need.
Your original Code must have had CASTs or CONVERTs of all the single parsed values. Look back at it.
Bye, Olaf.
ASKER
I am so glad that you are still awake, Olaf. Thank you! You say the original code must have had CASTS or CONVERTS, but no. It doesn't. I've pasted a few of the values below. You'll see there are no explicit CASTS/CONVERTS, but at the end of each Attribues.value, the correct datatype and length is there. InsertionTime, for example, is varchar(36), then AvailableKitSetAsst is NVARCHAR(256), NewItemMeetingDate is VARCHAR(36), etc...
How would you handle it, Olaf? Please.
CASE Attributes.value('(/cpColl ection/gro up/propert y[@id="Ins ertionTime "]/value)[ 1]', 'VarChar(36)') WHEN '' THEN NULL ELSEAttributes.value('(/cp Collection /group/pro perty[@id= "Insertion Time"]/val ue)[1]', 'DateTime') END AS [InsertionTime],
Attributes.value('(/cpColl ection/gro up/propert y[@id="Ava ilableInKi tSetAsst"] /value)[1] ', 'NVarChar(256)') AS [AvailableInKitSetAsst],
CASE Attributes.value('(/cpColl ection/gro up/propert y[@id="New ItemMeetin gDate"]/va lue)[1]', 'VarChar(36)') WHEN '' THEN NULL ELSE Attributes.value('(/cpColl ection/gro up/propert y[@id= "NewItemMeetingDate"]/valu e)[1]', 'DateTime') END AS [NewItemMeetingDate],
Attributes.value('(/cpColl ection/gro up/propert y[@id="Ple aseNote"]/ value)[1]' , 'NVarChar(MAX)') AS [PleaseNote],
Attributes.value('(/cpColl ection/gro up/propert y[@id="Rel atedProduc tsTitle"]/ value)[1]' , 'NVarChar(256)') AS [RelatedProductsTitle],
How would you handle it, Olaf? Please.
CASE Attributes.value('(/cpColl
Attributes.value('(/cpColl
CASE Attributes.value('(/cpColl
Attributes.value('(/cpColl
Attributes.value('(/cpColl
ASKER
I suppose I could just take this information, and do the CAST/CONVERT myself. but not without some decent overhead, i'm guessing.
Overhead? No. Diligence? Yes.
The overhead that took the time was doing all these separate Attributes.value() expressions. You're already past any need of XML knowledge, you now just have all the columns you need, just as nvarchar(max) values and need to convert them as needed.
We've already pointed out this in many posts pf me and ste5an:
41511226 "casting to the right result type."
41511335 "You just need to apply the type cast as Olaf said."
41512352 "...the conversion to whatever end result type has to be done, where NVARCHAR(MAX) does not convert implicitly to the dbo.Products destination table column type."
...
41512562 "Use the result set for your INSERT statement. Just add the necessary CASTs and CASE statements as in your original INSERT."
Bye, Olaf.
The overhead that took the time was doing all these separate Attributes.value() expressions. You're already past any need of XML knowledge, you now just have all the columns you need, just as nvarchar(max) values and need to convert them as needed.
We've already pointed out this in many posts pf me and ste5an:
41511226 "casting to the right result type."
41511335 "You just need to apply the type cast as Olaf said."
41512352 "...the conversion to whatever end result type has to be done, where NVARCHAR(MAX) does not convert implicitly to the dbo.Products destination table column type."
...
41512562 "Use the result set for your INSERT statement. Just add the necessary CASTs and CASE statements as in your original INSERT."
Bye, Olaf.
Ste5an also gave an example for the cast of a XML datetime string snippet put into p.InsertionTime to the SQL Server datetimetype:
Bye, Olaf.
CASE WHEN P.InsertionTime = '' THEN NULL
ELSE CAST(P.InsertionTime AS DATETIME)
END As InsertionTime
Bye, Olaf.
A final note: You should keep in mind XML is nothing but a string.
nvarchar(max) was choosen as type of the PropertyValue, because it can hold any string part of the xml node in a property element. It may be oversized in many cases, but it's autotrimmed to the value length, many things can automatically convert implicitly without a CAST or CONVERT, eg values being a uniqueidentifier should go over as they are.
PropertyValue can't have the right datatype, which you should know as dba, it is one column and a column has one type - for all rows, you can't change type per row. And the pivot can create the columns from the records, but it can only populate them with what it has in PropertyValue, so all columns are nvarchar in that P result of the pivot.
If you had a xml conversion tool or the Attributes column would already be expanded in the othertable this could be done much more elegant for sure. SQL Server is not the tool for this.
Bye, Olaf.
nvarchar(max) was choosen as type of the PropertyValue, because it can hold any string part of the xml node in a property element. It may be oversized in many cases, but it's autotrimmed to the value length, many things can automatically convert implicitly without a CAST or CONVERT, eg values being a uniqueidentifier should go over as they are.
PropertyValue can't have the right datatype, which you should know as dba, it is one column and a column has one type - for all rows, you can't change type per row. And the pivot can create the columns from the records, but it can only populate them with what it has in PropertyValue, so all columns are nvarchar in that P result of the pivot.
If you had a xml conversion tool or the Attributes column would already be expanded in the othertable this could be done much more elegant for sure. SQL Server is not the tool for this.
Bye, Olaf.
ASKER
Hoping either of you are still up. We have success. The attached completed in 00:16:25. I've looked at the data in my test table, comparison to to the production data -- it looks good.
I'm going to wrap it into a procedure and do it once more. i welcome all suggestions.
16 minutes vs 5 hours is a win.
withCASTS.sql
I'm going to wrap it into a procedure and do it once more. i welcome all suggestions.
16 minutes vs 5 hours is a win.
withCASTS.sql
Caveat: Use NVARCHAR() for the property value column. XML files use in almost every case Unicode. Using VARCHAR as you did will result in loss of characters not matching that charset.
And belief me, sooner or later on of those columns will contain such content.
And belief me, sooner or later on of those columns will contain such content.
ASKER
be that the case, the casts i perform later in the construct should be all that's necessary to come back from the NVARCHAR(MAX). no?
No, cause you already stored the information in @Temp using the wrong data type. Those casts cannot restore lost information.
Just change the type in @Temp to NVARCHAR(MAX).
Just change the type in @Temp to NVARCHAR(MAX).
First of all congratulations.
I only have the same critizism as Ste5an: Why Change from NVARCHAR(MAX)? This type always will be filled with whatever length the XML property element has, not more, not less.
Any errors you had because of truncated data were from the step inserting in dbo.Products, not because of the MAX size, so there is no reason to limit this to 6035, your target fields are often much shorter anyway.
I also think you'll only ever have NULL values in P when a property is absent in some Attributes XML. You can't bee too cautious, when target fields are not nullable.
While we're at that topic: At one step you thought a default value of dbo.Products would be taken instead of NULLs. Well, No. The Default values come in effect, if a column isn't targetted in an INSERT or if you insert the keyword DEFAULT, but not, when you insert a NULL literally or from a source column, which is NULL, therefore you indeed need all the ISNULL or COALESCE expressions. That also has nothing to do with XML or pivoting at all.
Bye, Olaf.
I only have the same critizism as Ste5an: Why Change from NVARCHAR(MAX)? This type always will be filled with whatever length the XML property element has, not more, not less.
Any errors you had because of truncated data were from the step inserting in dbo.Products, not because of the MAX size, so there is no reason to limit this to 6035, your target fields are often much shorter anyway.
I also think you'll only ever have NULL values in P when a property is absent in some Attributes XML. You can't bee too cautious, when target fields are not nullable.
While we're at that topic: At one step you thought a default value of dbo.Products would be taken instead of NULLs. Well, No. The Default values come in effect, if a column isn't targetted in an INSERT or if you insert the keyword DEFAULT, but not, when you insert a NULL literally or from a source column, which is NULL, therefore you indeed need all the ISNULL or COALESCE expressions. That also has nothing to do with XML or pivoting at all.
Bye, Olaf.
ASKER
Will do, ste5an. I ran it 6 different times myself since my last post, and they just ran it in production. It completed in 00:10:22, which is a tremendous gain over the 5+ hours it was running before, and the've confirmed the data is good. I cannot thank you both enough. Lot learned on this. Definitely appreciated.
ASKER
Understood about the default, Olaf. As I mentioned in this one: ID: 41512589
Regardless, I still can't thank you both enough. I learned a great deal today. If you don't mind, I will split the points.
Regardless, I still can't thank you both enough. I learned a great deal today. If you don't mind, I will split the points.
Ste5an deserves most points for the main ideas of the solution.
hmm, where are the good ol' times of beerware?
ASKER
All of your input and your patience is greatly appreciated. Thank you both for the assist.
ASKER
ste5an, you are correct. I did strip off the tailend of a few values. I put your suggested correction in, but not until afterwards. Thinking surely this will be fine.... but no, it wasn't. I'm running a skinny version of what you both helped me to model the other day, and am trying to dump the truncated and identifiers into a temp table. it's running forever, in comparison to the final product that I used the other day. May I show you? I don't mind opening another ticket, but I am here first, in hopes of getting your attention.
ASKER
Disregard. I got it. But thank you anyway.
Bye, Olaf.