Remodel Poorly Designed SQL Server Database Tables

Hello experts

how to redesign below table which has 11523477 rows .

CREATE TABLE [dbo].[nsPDFDocument](
      [DocumentID] [varchar](100) NOT NULL,
      [DocumentDate] [char](10) NOT NULL,
      [DocumentTime] [char](10) NOT NULL,
      [DocumentSize] [int] NOT NULL,
      [PDFData] [image] NULL,
      [FileName] [varchar](255) NULL,
      [OwnerID] [varchar](50) NULL,
      [Title] [varchar](150) NULL,
      [CreationDate] [char](10) NULL,
      [CreationTime] [char](10) NULL,
      [Provisional] [char](1) NULL,
      [PreviousVersionID] [varchar](100) NULL,
      [isLockedBy] [varchar](50) NULL,
      [SecondaryStorageURI] [varchar](150) NULL,
      [PreviousExportUri] [varchar](150) NULL,
      [PurgedDocumentHash] [varbinary](max) NULL

i have two index on this table :Document id (clustered) and ownerid (non-unique,non-clustered)

In terms of design, the following problems I identified

1. No keys or constraints on the table
2.No primary key
3.One of the problems identified with the original datatypes
i am new to database design ,so please help me to understand and redesign table .
thanks .
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Paul MacDonaldDirector, Information SystemsCommented:
I would establish a primary key as a matter of rote.  The use of other keys, constraints, data types, etc, is unanswerable without knowing more about how this table interacts with its application, other tables, etc.
MD SHAMIMAuthor Commented:
that table we use for store all pdf files and export to the secondary storage .
Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
Off the top of my head..
  • There's really not enough information here to nail down the modeling.  Sample values?  How is this table used?  Is there frequent INSERTS, UPDATES, SELECTs?
  • Any date values should be in a column with a date/datetime data type, and not a char.
  • Find out if there's a reason for separate date and time columns, and if no combine them into a single datetime column.
  • Having a table name with Document in it plus a column DocumentID implies that should be the primary key, and not just clustered.
  • Any foreign keys on this puppy?  If there's an OwnerID that implies there's an Owner table with a related ID column.  Same with Version.
  • 11 million rows, is there any 'old' data in this table that should be archived/deleted?  Having columns for DocumentDate/CreatedDate implies that you can execute some logic to the effect of 'delete any rows that are x years old'.
  • Since your varchar columns have a lot of round numbers for lengths like 100 and 150, I'd eyeball the values in these columns to make sure the lengths are not excessively large.  For example, having a varchar(50) column for state codes that will only contain two characters.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Olaf DoschkeSoftware DeveloperCommented:
I agree with others, a single table most often is not enough to see normalization and/or performance problems.

I consider negative: image is an outdated datatype. If this is the PDF document/file itself you would perhaps even use a filetable to store the file and only relate to that via this table.

DocumentID seems to be the unique id I would expect as primary key. OwnerID and PreviousVersionID seem to be foreign keys, which should have a foreign key constraint these IDs need to exist as primary key in other tables. PreviousVersionID might even be the DocumentID of a previous version of the file, which should get a better name, but that's also a matter of taste.

On the positive side, you don't have and repeated columns with numbering (address1,address2,...)

That the clustered index is on DocumentID is a sign it was designed by someone not having a good idea about the importance of the clustered index to cluster data you want to fetch in groups, but the nature of that ID being a varchar might point out a primary key having some categorizing nature, which clusters documents in a good way. I'm pessimistic about that, though. As other IDs also are varchar I wouldn't bet on that. It might even be columns, which store a uniqueidentifier, while that has its own data type.

It's necessary to see more of the data, to which other tables this table relates and what queries are done on it.

Bye, Olaf.
MD SHAMIMAuthor Commented:
thanks Jim Horn for your wonderful thoughts .i was able to solve the problem by following your steps .
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.