Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

BULK INSERT / FORMAT FILE / COMPUTED COLUMNS

Posted on 2016-07-27
14
Medium Priority
?
162 Views
Last Modified: 2016-10-07
Simple explanation: I have a table with several computed columns. I am attempting to BULK INSERT data into the table. For simplicity sake, assume the table schema and format file look like this:
CREATE TABLE t1(
    col1 varchar(10),
    col2 varchar(10),
    col3 as col1 + '-' + col2,
    col4 varchar(10)
    col5 varchar(10)
)

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="10" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="10" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
  <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="10" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
  <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="10" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="col1" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="2" NAME="col2" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="3" NAME="col4" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="4" NAME="col5" xsi:type="SQLVARYCHAR"/>
 </ROW>
</BCPFORMAT>

Open in new window

When data is loaded, it is as if col3 tries to load into the computed column (no errors) and col4 contains data for col5 and col5 is null. For each computed column in the table, data is "shifted" to the left by one column/field, which leaves the columns at the end with null values.
Any idea how to fix this in a BULK INSERT? I know I can use INSERT / SELECT FROM OPENROWSET(..., but would rather stay with the bulk insert as I'd have to revise code for 25 files/tables.

Thanks!
0
Comment
Question by:dbbishop
  • 5
  • 3
  • 2
  • +2
14 Comments
 
LVL 41

Assisted Solution

by:Sharath
Sharath earned 332 total points (awarded by participants)
ID: 41731635
If possible, can you load the data with BULK INSERT into a stage table without col3 (computed column). Then use the stage table and load data into your target table.
0
 
LVL 70

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 664 total points (awarded by participants)
ID: 41731637
With xml, it's probably easiest to create a view of the table with only non-computed columns, then import into the view.  SQL should be able to handle that just fine.

CREATE VIEW t1_import_view
AS
SELECT col1, col2, col4, col5
FROM t1
0
 
LVL 66

Assisted Solution

by:Jim Horn
Jim Horn earned 672 total points (awarded by participants)
ID: 41731641
Similar to the above, how 'bout an Execute SQL statement that drops the computed columns, then do your load, then another Execute SQL statement that re-adds the computed columns?

>col3 tries to load into the computed column (no errors)
Sounds like a mapping error that can easily be avoided if the computed column doesn't exist in the first place.

>For each computed column in the table, data is "shifted" to the left by one column/field
This type of error will happen if your file uses text file delimeters, say "this is a text field", and inside that text field is the same character, say "this "comment" throws an error".
0
Nothing ever in the clear!

This technical paper will help you implement VMware’s VM encryption as well as implement Veeam encryption which together will achieve the nothing ever in the clear goal. If a bad guy steals VMs, backups or traffic they get nothing.

 
LVL 15

Author Comment

by:dbbishop
ID: 41731642
So basically, are you both saying that there is no way to explicitly import into a table with computed columns?
0
 
LVL 70

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 664 total points (awarded by participants)
ID: 41731646
I don't believe so using xml, unless it's the last column(s) in the table that you want to skip.  I guess, then, another option would be to move all computed columns to the end of the table.
0
 
LVL 66

Accepted Solution

by:
Jim Horn earned 672 total points (awarded by participants)
ID: 41731656
>So basically, are you both saying that there is no way to explicitly import into a table with computed columns?
I can't speak to the XML aspects of this question, but my response is 'Yes you can, but it's not really a good idea'.

Perhaps a better idea would be to create a 'staging' data with only the columns needed, and import into that table.  Then do all your data quality checks, and when that's done INSERT from that table into your final destination table.
0
 
LVL 15

Author Comment

by:dbbishop
ID: 41731665
I'm bulk loading into a staging table. I don't really want to create a pre-staging table, load it, then transfer data (millions of rows) into the 'staging' table. Views might be doable. Moving computed columns to the end of the table might be doable, and just changing the query to an INSERT INTO/SELECT FROM OPENROWSET(BULK...) might be the best way to go (no additional objects to maintain.

It just sucks that the documentation tells you how to skip columns in the XML, but it apparently doesn't work for computed columns. YEAH M$!!
0
 
LVL 15

Author Comment

by:dbbishop
ID: 41731674
St. Jimbo: two things, the columns on the end being null are a result of the 'shifting' that is taking place because of computed columns. If there is 1 computed column in the table, the last column of the table will be NULL. If there are 4 computed columns, the last 4 will be null. It has nothing to do with delimiters in the files.
Second, you say, "I can't speak to the XML aspects of this question, but my response is 'Yes you can, but it's not really a good idea'. " Why would it not be a good idea to skip computed columns? That statement doesn't make sense to me.
0
 
LVL 143

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 332 total points (awarded by participants)
ID: 41731745
according to the specs, it should work
https://msdn.microsoft.com/en-us/library/ms179250.aspx

as alternative, I would suggest a view without the computrd col, or drop the column snd add it back afterwards as brute force solution
0
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 41732057
Btw, I suggested importing into a view w/o the computed column(s) in the very first post.
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 41732067
sorry, I started eriting, and on posting, network was lost ( walking on the beach ...)
post submitted when network came back...
0
 
LVL 15

Assisted Solution

by:dbbishop
dbbishop earned 0 total points (awarded by participants)
ID: 41732189
Angel,
Yeah, it should but I think computed columns throw something into the mix that is not accounted for. I've done format files in the past, but not with computed columns. I've skipped columns (and fields) before (not on this job), so when it didn't work, that link was the first place I looked.

I opted for changing the code to use INSERT INTO (column list) / SELECT * FROM OPENROWSET(BULK...)m and leaving the computed columns out of the inset list. Works like a charm.

The ETL process (in place before I got here) drops and recreates ALL staging objects tied to the client schema (tables, functions, views, etc.) on each run and revising all that code made this the simpler process. I wish I could have gotten a definitive answer regarding skipping columns when they are computed, because the standard documentation certainly does not apply.

Thanks for all the alternate suggestions. I just don't think they'd have been best in our case, although if I had to go with one, it probably would have been the views.
0
 
LVL 15

Author Comment

by:dbbishop
ID: 41826399
not that I have any objection with the suggested point split, I believe my solution of using OPENROWSET with INSERT/SELECT was the most productive.

There were some good suggestions, but overall, the idea of loading 25+ million rows of data into a pre-staging staging table and then loading the staging table from that did not seem very productive. I ran several tests (the purpose of this whole 'experiment' was to optimize the current process) and that proved to be the most efficient way to process the data.  

Although it may have been beneficial to just move the computed columns to the end of the table schema, changing the schema would have been difficult, and one of the requirements was to not make schema changes if possible.
0

Featured Post

NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
Via a live example, show how to shrink a transaction log file down to a reasonable size.
Viewers will learn how the fundamental information of how to create a table.

971 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question