Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win


How to deal with file import into database? asp.net, MSSql

Posted on 2012-03-18
Medium Priority
Last Modified: 2012-03-21
Purely at a theoretical level, I don't know if there is a "correct" answer. I know this is really general but I would appreciate some advice.

I have a project that requires a file import on a daily basis. I'd like to optimize the database but I'm not sure what's best in this situation. I've never had to deal with a daily import of data before.

The daily list is a .csv file that is a list of vehicles with make, model, year, condition, but also includes stuff like address, city, etc. The database itself will include this imported list and mainly administration type things like: Admin, Users, Offers and then layout/display type info to facilitate searches/display, etc.

Is it worth the time to import the list, then transfer to tables like, Address, Vehicle, Users, Admin and do the foreign key, typical database structure?

Or should I just leave the list as is and only optimize the Admin stuff?

Question by:MHenry
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
LVL 42

Expert Comment

ID: 37735512
By "optimize", I think you are mean "normalize".   Optimization means designing the database so that you get good performance and/or space utilization.  Normalization means designing the database to promote data integrity.  

In some cases, a normalization and optimization are at odds; but not always.  The goal is to find a happy balance within the bounds of your stakeholder requirements.

Since the database seems to be for reporting and analysis, I don't much see the point of foreign key constraints (except possibly to identify problems in the source data).  Ditto for normalization.  You may want to consider restructuring the CSV to address specific performance issues however.  That includes decomposition as well as aggregation and indexing.

Assisted Solution

mr_nadger earned 1000 total points
ID: 37736646
How much data are you importing?
Is the data is primarily for reporting or searching?
Is the data read-only, or can users amend it through whichever tool will be doing the searching?
How powerful is the server hosting your database?

As Dqmq said, you need to know what the data's for, before you can really plan the database.
Personally, I'd put in an auto-incrementing integer primary key field, if no unique field exists in the data, to help with the searching (users love a simple ID key!).

The vital step would be indexing it, e.g. make/model/year, and state/city, this would help with both searches and reporting (we recently had a very humbling experience with a Microsoft SQL designer giving us a lesson on Indexing for Dummies, so make sure you understand indexing).
After each import, re-index the data through tsql as the built-in re-indexing is a bit dubious, and make sure you run a nice dbcc checkdb every so often to make sure your database stays healthy.

Author Comment

ID: 37742866
It's about 250mb. And is actually being imported twice daily.

I was just thinking that (for example) if there's an address in the import list and an address for users the typical design would be to have an address table and do all the foreign key stuff.

But I've been thinking lately that there is no real reason to try to maneuver all that data into different tables - especially since it's being updated twice a day. It just seems to me that it's just more of an opportunity for something to go wrong.

Mainly just wanted to see if there's a compelling reason to do so...

LVL 42

Accepted Solution

dqmq earned 1000 total points
ID: 37743727
>Mainly just wanted to see if there's a compelling reason to do so...

"Compelling reasons" need to come from your business case, not from our outsiders view.  The integrity of a reports/analysis database is dependent on the loading process which can be fairly tightly controlled.   Very often, the overhead of referential integrity on top of that is not cost effective.  So, in that sense, I agree with your "gut":  why add complexity risk, unless  there is a justified benefit.  Is there?

That said, I can tell you that many report/analysis imports are restructured for better alignment with the queries that are anticipated.  That's done both for performance reasons and to simplify the query development.   Your call.

Author Comment

ID: 37743772
I don't think there is a compelling reason. But that's why I'm asking here. ;)

I can move the data into more traditional tables once a customer adds the vehicle to his list. And when it's moved into the user's list I can do all the more complex queries. The imported list is mainly just a "choose from this list" kind of thing.

At least that's the way I'm thinking about it now.

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

609 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question