VistarTechnologies
asked on
Importing JSON into multiple SQL tables using SQLBulkCopy or parameterized insert statements.
I have several JSON source files (about 500 MB total, containing 76,000 individual records) that I'm importing into a SQL database using C#. I would prefer to import with SqlBulkCopy, but to use it for multiple tables I would need to effectively build the entire database in memory in C# before importing, or else read through every JSON file once for each table I'm importing into.
Neither option seemed good to me, so I have written code that imports each line of data individually using parameterized SQL queries, but that hits the database once per record for each table. There is also a potential problem: our database has a record-keys table that holds the next primary key value for each table, and to avoid writing to it after every row insert (which would double the number of database writes), I'm tracking the key values in C# and updating the table at the end. That tracking code is more complicated than I would like, so the risk of a bug is higher than it would be with a single update after a SqlBulkCopy.
The most important question is which would be faster: parsing the JSON files once per database table and using SqlBulkCopy for one insert per table, or parsing the JSON files once and writing each record individually? If the times are similar, it might be worth rewriting the code to parse the files once per table anyway, just to simplify it and reduce the likelihood of primary key errors.
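For what it's worth, the two options aren't mutually exclusive: you can parse the JSON once, accumulate one in-memory DataTable per target table, and then issue a single SqlBulkCopy per table. A rough sketch of that pattern (Newtonsoft.Json assumed; the connection string, directory, and the table/column names are placeholders, not your actual schema):

```csharp
// Sketch: parse all JSON files once, buffer rows into a DataTable per target
// table, then do one SqlBulkCopy per table. Names below are illustrative.
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using Newtonsoft.Json.Linq;

class BulkImportSketch
{
    static void Main()
    {
        string connectionString = "...";   // placeholder
        string sourceDir = @"C:\json";     // placeholder

        var orders = new DataTable();
        orders.Columns.Add("OrderId", typeof(int));
        orders.Columns.Add("Customer", typeof(string));

        // Single pass over the source files.
        foreach (var file in Directory.EnumerateFiles(sourceDir, "*.json"))
        {
            var root = JArray.Parse(File.ReadAllText(file));
            foreach (var record in root)
            {
                orders.Rows.Add((int)record["OrderId"], (string)record["Customer"]);
                // ...add rows to the other tables' DataTables here...
            }
        }

        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.Orders";
                bulk.WriteToServer(orders); // one round trip per table, not per row
            }
            // ...repeat for the other DataTables...
        }
    }
}
```

The memory cost is one DataTable per table rather than the whole database, since you can flush and bulk-copy in batches if 76,000 records is too much to hold at once.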
ASKER
I am using SQL 2016, but due to the number and complexity of the data manipulations would prefer to use C#.
To read JSON in C#, you typically deserialise it into your own classes.
You can start with this free library: Newtonsoft.Json
https://www.nuget.org/packages/Newtonsoft.Json/
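As a minimal illustration of that approach (the class and property names here are invented for the example, not taken from your data):

```csharp
// Deserialise a JSON array of records into strongly typed objects
// using Newtonsoft.Json. "CustomerRecord" and its properties are examples.
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public class CustomerRecord
{
    public int Id { get; set; }
    public string Name { get; set; }
}

class DeserializeSketch
{
    static void Main()
    {
        string json = File.ReadAllText("customers.json");
        List<CustomerRecord> records =
            JsonConvert.DeserializeObject<List<CustomerRecord>>(json);
        // records is now a normal list you can loop over for your inserts.
    }
}
```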
If you're using SQL Server 2016 or above, you can read JSON content using the OPENJSON function:
OPENJSON (Transact-SQL)
https://docs.microsoft.com/en-us/sql/t-sql/functions/openjson-transact-sql?view=sql-server-2017
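For example, something along these lines lets you shred a JSON document into rows and do a single set-based INSERT per table (the table, columns, and JSON properties below are illustrative):

```sql
-- Illustrative only: shred a JSON array into rows and insert in one statement.
DECLARE @json NVARCHAR(MAX) =
    N'[{"Id":1,"Name":"Alice"},{"Id":2,"Name":"Bob"}]';

INSERT INTO dbo.Customers (Id, Name)
SELECT Id, Name
FROM OPENJSON(@json)
WITH (
    Id   INT           '$.Id',
    Name NVARCHAR(100) '$.Name'
);
```

You could pass each JSON file's content as an NVARCHAR(MAX) parameter and run one such statement per table, which keeps the data manipulation in C# while avoiding per-row inserts.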