Solved

Difference between nonwide and  wide table??

Posted on 2008-10-22
11
2,607 Views
Last Modified: 2012-05-05
Hi,

I have a huge amount of data to store. In total of 500.000 different columns with the same index key.
What is the best way to go with this? Because with SQL server express edition max columns are 1024 with a nonwide table else 30.000 with a wide table. Anybody has suggestions?

Regards,
Dennis
0
Comment
Question by:Schuttend
  • 4
  • 3
  • 2
  • +1
11 Comments
 
LVL 22

Expert Comment

by:dportas
Comment Utility
You can create many tables (more than 2 billion in fact) so the limit on columns per table theoretically shouldn't be too much of a problem.

I would first look very long and hard at your design to be sure whether 500,000 columns really makes sense for this. If you are using Express, which is limited to 4GB per database then each of your columns could only contain an average of less than 8KB of data! That seems totally absurd.

Are you sure there isn't a more effective design using fewer columns? Could you explain a bit more about what your data is and what all these columns mean?
0
 
LVL 51

Expert Comment

by:Mark Wills
Comment Utility
Check out capacity comments / specs SQL Server 2008 Books Online : http://msdn.microsoft.com/en-us/library/ms143432.aspx  whiule they can be that wide, have a look at the updates - only 4096 on a wide table. So, it will get ugly.

Any chance of making it XML data and hold it on an XML data type ? Maybe having the often used attributes available.

But very seriously, you will / must group those different attributes / properties into significantly more manageable groups - and all be keyed the same way so you can simply join.  It does mean you will need a "master" and then anything up to 500 tables hanging of it.

Can you show some of it ? Describe some of it ?

Doubt that express will cope, given dportas' statement.


0
 
LVL 13

Expert Comment

by:AielloJ
Comment Utility
Do you really mean 500,000 columns - or did you mean 500,000 rows?  Just developing the table creation schema then typing in and naming 500,000 columns would literally take months.  I would also inagine than most, if not all, of the tools available would choke.  Can you describe what this application is?
0
 
LVL 51

Expert Comment

by:Mark Wills
Comment Utility
@AielloJ : my thoughts too, or at least "wonder what the definition of a column is in Schuttend speak", that is why I have asked Schuttend to "show" or describe the data...
0
 

Author Comment

by:Schuttend
Comment Utility
Hi all,

The data is trading system data. With this I mean I have 100.000 + trading systemsn which will generate a tradingsignal every day. With signals , additional caluclations are being made which are also insert into the database. Later on I want to use selection criteria to select some signals from the database.
Each day, every system has
date,signal,equity,drawdown    columns

I have 100.000+ systems, therefore the database wiil get quite large. The tradingdata can start from 1920 onwards, so this means alot of days and also alot of columns.
Does this explain something?

Regards,
Dennis
0
Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

 
LVL 13

Accepted Solution

by:
AielloJ earned 200 total points
Comment Utility
Schuttend:

Your explanation helps.  What did you mean about the trading data starting from 1920 onwards.  Do you mean the year 1920?  Your technical naming appears to confuse rows and columns.  It appears that each day, you will add 100,000 rows (one for each trading system) , with 5 columns in each row (date, signal, equity, drawdown - I only see 4 columns).  Imagine your database table looking like a spreadsheet.  A sheet with 100,000 columns would be almost impossible to manage.  A sheet with 100,000 rows of 5 columns - not so bad.

It may be helpful to visit: http://www.datamodel.org and make sure your database design is normalized.  This will always be a critical factor in the success or failure of a database project.

I don't think you have to worry about wide or not, but whether all your data will fit in a 4Gb limit if you'e using Express versions of the database.

Regards,

JRA
0
 
LVL 22

Expert Comment

by:dportas
Comment Utility
You don't need 1000s of columns. You need one column to identify the source system plus the four columns you mentioned.

If you haven't designed a database before then it would be a good idea to hire some help for your project.
0
 
LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 150 total points
Comment Utility
With that type of transactional volume being added each day - ie new rows, then would recommend something a bit more robust than Express.

Need to think of the data base as a spreadsheet - individual worksheets for lookup type functions such as company, origin (location or office), trader name, etc, then your "trading" worksheet which does have company ID, date, trading type (buy, sell), volume traded, office traded, trader  etc etc. There maybe a dozen or so columns associated with a single "trade", and a few masterfiles for lookups of codes in that "trade" transaction...

Depending on how much detail you need access to, you will most likely have an archiving facility where individual transactions are rolled up into a "day" snap shot like : trade item, open, close, volume, or such like. that way you will be converting hundreds of thousands transactions into thousands of daily summaries. That will help keep the volume in check.

If you can come up with a fairly aggressive archiving strategy, then you might get away with Express to begin with, and certainly good enough to develop a database and data model from, but as soon as you get significant volume, then you will be "stuck". Express is the same engine as Standard and Enterprise, (just throttled back), so anything you do can be migrated.

By the sounds of it you do need to think carefully about database models / database design, and think along the lines of "spreadsheets" when talking about data elements, and more importantly, what you want out of the system. Describing more how you work with the data, and what that data is, and where you can get access to it (ie the inputs). Then, once you have that description, can then start talking about database designs, but not before. Again, more of a business description rather than trying to incorporate database elements...

And would suggest you do buy in some consulting - might cost a few hundred for some ideas, or a few thousand, or several tens of thousands (and for some datawarehousing projects, then can be considerably higher). It really depends on the service you need, such as, design only, design+build, design+build+data_takeup, design+build+ETL+maintain, and then there is size, complexity, features, etc. The way you can control costs is to be very accurate in your business description, how data flows, what you want out, where to get it from (e.g. electronic feed, manual maintenance).
0
 

Author Comment

by:Schuttend
Comment Utility
I have read some more on the subject. I indeed do not need more then about 10 columns. The next step would be to import all data into sql server.
Each day when new data is available more records will be added. As i am working with hugh amount of data from vb6 I need a tool for importing array data at once into sql server. Looping through the array is just not possible, because this is just too slow.
Can somebody give me any suggestions?

Regards,
Dennis
0
 
LVL 51

Expert Comment

by:Mark Wills
Comment Utility
Yep, document what you can, then post that doco here, then we can start to work on it... Good starting point is an example of what you are importing and what you think might be needed to store that data.

Would be inclined to import into a "staging" area. Validate those results then update / insert into the "real" database table.

Can all be done via SQL server.
0
 

Author Comment

by:Schuttend
Comment Utility
Please find attached a example file which needs to be imported. Currently you see this columns,

Date,Open,High,Low,Close,Modelsignal1,Modelsignal2 etc

I have about 100.000 different modelsignals for the same date.
Date interval will start at about the 1990 until current date. So each day additional data has to be imported into the database.

Additional calculations needs to be done with modelsignals. For example:
With Modelsignalsignal1 and [close] values ,calculations will be made to derive Equity1 and Drawdown1 and maybe in the future more values.

So per model there is
date
modelsignal
equity
drawdown

All this has to be imported into a database.

Usage:
user can specify selection criteria (Equity,Drawdown,...)  and trading start date.  Selected modelsignals are combined into 1 general tradingsignal.

Does this help??
Example.txt
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Use this article to create a batch file to backup a Microsoft SQL Server database to a Windows folder.  The folder can be on the local hard drive or on a network share.  This batch file will query the SQL server to get the current date & time and wi…
This video discusses moving either the default database or any database to a new volume.
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now