Avatar of LenCepeda
LenCepeda
 asked on

Database Design Advice

Hello - Please let me start by mentioning I am completely new to MS SQL server and database design.  I have read a DB design book and looked at a few online tutorials.  

Here goes.  I am attempting to import a file (.csv) from another system on a nightly basis into a new MS SQL 2012 DB (Enterprise Ed).  Once the CSV is imported and formatted as a table (Sales) it will be related to one other table (CustomerID) within the DB.  I would like the CSV Import to append new data to a pre-existing table and overwrite any records within that table that have like "SalesID" number.

So far, i setup a MS Server 2012 VM, installed MS SQL Server 2012, Created the DB. successfully manually imported a CSV into the DB to create the initial SALES table and now I am researching solutions that will allow me to import newly created CSV on schedule.  It looks like SSIS / Bulk Insert should do the trick so i installed those feature and began to research next steps.

Once the DB is complete, it will be connected with MS Excel Powerpivots that users can access and refresh.  I have some experience establishing those types of files so i think we will be fine once we work our the basic DB.

1.  Am I on the right track?
2.  Should the imported CSV's go directly into the pre-existing table or should that data get imported into a new table and merged using another process?
3. Once the data is merged in the "master" SALES table, should it get copied elsewhere to allow the Powerpivot connection or is it ok to allow the Powerpivot to connect directly to the "master" sales table

Thank you, any advice would be greatly appreciated.
Microsoft SQL ServerDatabases

Avatar of undefined
Last Comment
Olaf Doschke

8/22/2022 - Mon
Scott Pletcher

1.  Broadly, yes.  Specifically, no.  You skipped the part when you do a true design process to get from a CSV file to normalized relational tables.  The correct design will almost never be to load a spreadsheet into a single table.  At least some code/lookup tables should be needed.
2.  The CSV import should go into a staging table, not the main data table(s).  Staging tables are extremely common.
3.  It's ok to allow Powerpivot to use the main table (for reads only).  If necessary, you can use snapshot isolation later to speed it up.
LenCepeda

ASKER
Thank you, Scott.

I guess i jumped into this a bit prematurely without exactly having a well thought out plan or ER Diagram but sometimes I feel I pick up new things by just trying (and failing).  In any case, I was able to create a Data Flow SSIS package and successfully import the CSV into an existing table or should i say staging table now.  The next step will be to figure out a way to schedule a job to run the data flow package on a regular schedule (appending new SALES data to the staging table and modifying/deleting existing SALES data based on SALESID).  Is this what you mean by a code/lookup table?  If so, can you please elaborate?    Thank you and best regards.
ASKER CERTIFIED SOLUTION
Scott Pletcher

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
SOLUTION
Olaf Doschke

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck