Data management in conjunction with Excel

Ian Bell
Hi,

I have an Excel CSV database, or will have when completed, consisting of around 200 columns and almost one million rows, at least 1 GB in size.
I then need to add around 35 columns of formulas copied down 170k rows, and I intend to sort the data each time I want to do some calcs.
I fear that Excel will not be able to handle such a workload and wondered how best, in terms of efficiency and speed, I could go about solving this puzzle.
Thanks if you can shed some light on it.

Ian
Top Expert 2014

Commented:
Don't use Excel as a database
Ian Bell, retired

Author

Commented:
I had thought of using Access and sending the data I require for analysis to Excel. I have about 40 such datasets to be queried. Is this a good option, or are there better options available, considering that I am code illiterate but willing to learn some?
Thanks
Roy Cox, Group Finance Manager

Commented:
You definitely can't use the CSV format.

That many formulas would probably result in slow calculation. Maybe use Access.

Ian Bell, retired

Author

Commented:
Currently I have around 40 Excel xlsx worksheets, and managing and updating them is a nightmare.
If I were to load all the data into Access, could I export what I require to an Excel xlsx file, do the calcs, and then delete the Excel data while keeping the Access data intact? Or is there a better way of handling this?
Thanks Roy for responding.
Fabrice Lambert, Consulting
Distinguished Expert 2017

Commented:
Hmm, 200 columns plus 35 additional columns: we are getting pretty close to Access's limit (the maximum is 255 columns), and at 1 GB x 40 we are way over Access's size limit (it can handle up to 2 GB).

Better to turn toward a more robust SQL engine, such as SQL Server, MySQL, etc.
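This approach can be tried in miniature before committing to a server install. A minimal sketch using Python's built-in SQLite as a stand-in for SQL Server or MySQL; the column names and sample data are made up for illustration:

```python
import csv
import io
import sqlite3

# Hypothetical three-column sample standing in for the real 200-column CSV.
sample = io.StringIO(
    "race_id,horse,finish_time\n"
    "1,Alpha,62.1\n"
    "1,Bravo,63.4\n"
    "2,Alpha,61.8\n"
)

conn = sqlite3.connect(":memory:")  # SQL Server or MySQL is driven the same way
conn.execute("CREATE TABLE results (race_id INTEGER, horse TEXT, finish_time REAL)")

# Stream the CSV into the table row by row; no 1,048,576-row ceiling applies here.
conn.executemany(
    "INSERT INTO results VALUES (:race_id, :horse, :finish_time)",
    csv.DictReader(sample),
)

# Let the engine do the sorting and aggregating instead of Excel.
rows = conn.execute(
    "SELECT horse, COUNT(*) FROM results GROUP BY horse ORDER BY horse"
).fetchall()
print(rows)  # [('Alpha', 2), ('Bravo', 1)]
```

The same pattern scales to the full dataset: the database holds everything, and Excel only ever sees the result of a query.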
Ian Bell, retired

Author

Commented:
I have MySQL installed but am not able to work with it, as a good understanding of code is required, which I don't have.
Yes, now I remember Access can't handle this, as 2 GB chokes it.
Roy Cox, Group Finance Manager

Commented:
Here are some specifications for Access:

Access specifications

If you can work with Access then you can build reports to export to Excel
Rob Henson, Finance Analyst

Commented:
Just because a table of 200 columns and 1 million rows takes up 1 GB in Excel doesn't necessarily mean that it will be the same in Access.

An Access table is just raw data whereas the Excel sheet will have a certain amount of extra data related to the sheet.
Ian Bell, retired

Author

Commented:
Roy, I wondered if storing it as an xlsb file would do the trick. If so, how easy is it for Access to export the data, or a copy of the data, to Excel?
I need to run regression and I am familiar with Excel and Solver.
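For what it's worth, a straight-line least-squares fit of the kind Solver is often used for can also be computed directly, outside Excel. A minimal sketch in Python using the closed-form slope/intercept formulas; the numbers are made up:

```python
# Made-up sample data; in practice these would come from the exported columns.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for y = slope * x + intercept:
# slope = cov(x, y) / var(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x
print(slope, intercept)
```

For one predictor this gives exactly the answer Solver converges to when minimising the sum of squared errors.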
Ian Bell, retired

Author

Commented:
Hi Rob, this storage thing has become a bit of an old chestnut for me, and I haven't yet found a solution.
Due to my lack of coding and other tech skills it is sometimes beyond me. I wish that when I was younger I had studied programming and statistics instead of chasing skirt and drinking beer :)
Roy Cox, Group Finance Manager

Commented:
From Excel specifications and limits:

32-bit environment is subject to 2 gigabytes (GB) of virtual address space, shared by Excel, the workbook, and add-ins that run in the same process. A data model’s share of the address space might run up to 500 – 700 megabytes (MB), but could be less if other data models and add-ins are loaded.

64-bit environment imposes no hard limits on file size. Workbook size is limited only by available memory and system resources.

Excel 2016 offers the Large Address Aware functionality that lets 32-bit Excel 2016 consume twice the memory when users work on a 64-bit Windows operating system. For more information, see Large Address Aware capability change for Excel.

Note: Adding tables to the Data Model increases the file size. If you don’t plan to create complex Data Model relationships using many data sources and data types in your workbook, uncheck the Add this data to the Data Model box when you import or create tables, pivot tables, or data connections.

I don't use Access much, but I'm sure that you could build a report to export to Excel quite easily.
Ian Bell, retired

Author

Commented:
Thanks Roy, that is one for the short list. Do you mind me asking what others you work with, or any other methods you could recommend?
Top Expert 2014
Commented:
SQL Server Express (free) has a 10 GB limit, so you'd get some breathing room.

It is quite likely that you will get a lot of space relief by normalizing your CSV/Excel data, eliminating duplicate data.
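Normalization in miniature: the sketch below (Python with SQLite as a stand-in, hypothetical column names) moves a repeated text value out into a lookup table referenced by an integer key, which is where the space savings mentioned above come from:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: the track name is repeated on every row (hypothetical columns).
flat = [
    ("Ascot", "2019-06-18", "Alpha"),
    ("Ascot", "2019-06-18", "Bravo"),
    ("Ascot", "2019-06-19", "Alpha"),
    ("Epsom", "2019-06-05", "Bravo"),
]

# Normalized: store each track once, reference it by integer key.
conn.execute("CREATE TABLE tracks (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE runs (track_id INTEGER REFERENCES tracks(id), "
    "race_date TEXT, horse TEXT)"
)

for track, race_date, horse in flat:
    conn.execute("INSERT OR IGNORE INTO tracks (name) VALUES (?)", (track,))
    (track_id,) = conn.execute(
        "SELECT id FROM tracks WHERE name = ?", (track,)
    ).fetchone()
    conn.execute("INSERT INTO runs VALUES (?, ?, ?)", (track_id, race_date, horse))

# The repeated string now lives in one place; a join reconstructs the flat view.
(n_tracks,) = conn.execute("SELECT COUNT(*) FROM tracks").fetchone()
joined = conn.execute(
    "SELECT t.name, r.race_date, r.horse "
    "FROM runs r JOIN tracks t ON t.id = r.track_id"
).fetchall()
print(n_tracks, len(joined))
```

With 40 datasets of similar data, repeated names like these are usually where most of the duplicate bytes hide.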
Ian Bell, retired

Author

Commented:
Aikimark, all these options are no doubt excellent. My problem is getting my head around the coding. I have known for a long time that Access is not going to handle these large, memory-sucking sheets.
Roy Cox, Group Finance Manager

Commented:
Mostly I use Excel. I've done a little with Access and FileMaker Pro (which was very good).

If you have a lot of calculating to do with the data and are more comfortable with Excel, then you could keep the data in Access and export the relevant chunks to Excel - create a template in Excel with all the formulas, etc.
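That workflow - keep the data in the database, export only the relevant chunk for the Excel template - might look roughly like this (Python with SQLite standing in for Access; the table, columns, and filter are hypothetical):

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (race_id INTEGER, horse TEXT, finish_time REAL)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, "Alpha", 62.1), (1, "Bravo", 63.4), (2, "Alpha", 61.8)],
)

# Pull only the chunk needed for today's calcs, already sorted by the
# database rather than by Excel.
cursor = conn.execute(
    "SELECT race_id, horse, finish_time FROM results "
    "WHERE race_id = ? ORDER BY finish_time",
    (1,),
)

# Write it as a CSV that the Excel template can open or import.
buf = io.StringIO()  # stands in for open('chunk_for_excel.csv', 'w', newline='')
writer = csv.writer(buf)
writer.writerow([col[0] for col in cursor.description])
writer.writerows(cursor)
print(buf.getvalue())
```

The master data never leaves the database; the exported file is disposable, exactly as suggested earlier in the thread.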
Ian Bell, retired

Author

Commented:
Hi Roy et al,

Apologies for the late reply. Considering the large amount of data, including future additions, I have opted to go with SQL Server Express, as it can handle a maximum of 10 GB. I had installed it along with MySQL a year or so ago.
I just now need to get my head around it. I notice that SQL Server 2017, which I have, is free, and the latest, 2019, is available as an evaluation copy.
Ian
Ian Bell, retired

Author

Commented:
Aikimark, thanks for confirming SQL Server Express as a solution. You mentioned normalising data and deleting duplicates. How exactly would I go about that?
Ian
Top Expert 2014

Commented:
Read articles on 3NF (third normal form). The illustrations with such articles should give you a start. Post a new question if you get stuck.
Ian Bell, retired

Author

Commented:
Thanks, I am just looking at it now. I have opted for your suggestion to use SQL Server Express. Are you suggesting this SQL method from personal experience of using it? If you don't mind me asking.
Top Expert 2014

Commented:
SQL Server Express has been packaged with Visual Studio since the early 2000s, so I've had it for at least a decade.  It is now easier to use than the early days.
Ian Bell, retired

Author

Commented:
I am having problems connecting to the server (database type: Database Engine).
But that's another problem.
Ian Bell, retired

Author

Commented:
Thanks guys, you were all indeed helpful. It all just highlights how little I know, considering I've been on computers daily since the first PC came out in the eighties.
Ian Bell, retired

Author

Commented:
Thanks Aikimark, I've added those links to my browser and will open them each time I start up.
