Data Appliances?!

Posted on 2006-04-01
Medium Priority
Last Modified: 2010-04-03

At my boss's request, I've begun to investigate Data Appliances. I'm still vague about what they are and how they work. I did find the following quote:

"The data appliance for BI is a purpose-built database machine specifically used to manage analytical data and retrieve the results from massive data analyses with impressive performance – a matter of seconds or minutes instead of hours or even days."

It's the "impressive performance -" part that grabbed my attention! Is this really true?

We're a small company that does high-end analysis of our clients' sales/advertising data. Our work is project by project; that is, we get 40 gigs of data from the client, do client-specific analysis, and deliver it in cubes. It's quite different from working for one company where the database is established and the analytics developed go into production and get used over and over. For us, each client's data is different and each client wants different analytics. When we've delivered, it's done and over with; we dump the data and move on to the next project.

It would be great if we could speed up our development time.

Any info that would help me understand what appliances are and what the learning curve is would be most appreciated.


Question by:studioEtc
LVL 44

Accepted Solution

scrathcyboy earned 1000 total points
ID: 16352863
A data appliance is a large storage array with a fiber optic connection to the local network (yes, this is fast) or to your network via the internet (no, the performance is NOT impressive). If you use and dump 40 GB of data, then no, these are not for you. They cost $10,000 and up, and they are for companies that use the same data over and over again and need 10 terabytes or so of online storage, shared between many offices across the land. That is where fiber appliances are good.

For your needs, send the client an 80 GB IBM drive, tell them to upload all the data to it, get it back, analyze the data, then burn the results to DVD for archival storage. Wipe the drive clean, send it to the next client, and now you have an extremely cost-effective data analysis solution. And I saved your company $10K or more just by giving this feedback, right?

Assisted Solution

IPKON_Networks earned 1000 total points
ID: 16360253
I agree. A DA is not really suitable for you unless you have a stack of money available to spend on a high-end solution.

I would suggest a different approach. I assume you are already getting the 40GB from clients in some form or another. To speed up development, are you able to improve the upload process? An FTP site allowing data transfer to you, together with a standardised file format, would make importing into your database easier.

If this was via XML, then get them to add an XSLT to automate the data load process. It sounds like you don't have huge storage requirements, so an alternative is a centrally hosted file solution (instead of the FTP site above).

What part of the process are you trying to improve? Data capture, upload, formatting, storage, reporting/analysis, feedback?

Hope this helps

Author Comment

ID: 16362801
There can be no standard format... for each project we have to design a database and then write import code to normalize the data as it is loaded.
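To give a rough idea of the kind of per-project import code described above, here is a minimal Python sketch. Everything specific in it is an assumption made up for illustration: the flat CSV layout, the column names (`product`, `date`, `amount`), the `product` and `sale` tables, and the use of SQLite. A real project would have its own schema and database.

```python
import csv
import io
import sqlite3


def load_normalized(conn, csv_text):
    """Load flat sales rows into a small normalized schema (hypothetical)."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS product (
            product_id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS sale (
            sale_id INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            sale_date TEXT NOT NULL,
            amount REAL NOT NULL
        );
        """
    )
    product_ids = {}  # cache: product name -> product_id
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["product"].strip()
        if name not in product_ids:
            # Store each distinct product once, then look up its key.
            conn.execute("INSERT OR IGNORE INTO product (name) VALUES (?)", (name,))
            product_ids[name] = conn.execute(
                "SELECT product_id FROM product WHERE name = ?", (name,)
            ).fetchone()[0]
        # The sale row refers to the product by key instead of repeating its name.
        conn.execute(
            "INSERT INTO sale (product_id, sale_date, amount) VALUES (?, ?, ?)",
            (product_ids[name], row["date"], float(row["amount"])),
        )
    conn.commit()


# Made-up sample data standing in for a client's flat file.
SAMPLE = """product,date,amount
Widget,2006-03-01,19.99
Gadget,2006-03-01,5.00
Widget,2006-03-02,19.99
"""

conn = sqlite3.connect(":memory:")
load_normalized(conn, SAMPLE)
```

The repeated product names end up as single rows in `product`, and each `sale` row points at one of them by key, which is the normalize-while-loading step in miniature.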

The quote in my original question seemed to indicate that the analysis part was speedy (queries that took days now running in minutes). We don't have queries that take days, but we do have queries that take a lot longer than we want them to.

I know an appliance is overkill for the amount of data we have.

Also, I did a search here on Experts-Exchange for Data Appliances, and found some posts for Snap drives. Is a Snap drive the same thing you are all talking about?


LVL 44

Expert Comment

ID: 16363496
I thought of FTP upload too, but there is too much risk of exposing proprietary data, and many clients struggle to get big files onto an FTP server, so it is not a very good solution for 40GB.

Yes snap is one type of network appliance, read about the marketing hype here -

But this is still not a "solution" for your database import, customization, and output. It is just a big virtual storage medium at high cost, for companies needing online access to terabytes of data.

We seem to be talking about two different things here: (1) what a network appliance is and (2) what you need to solve your business issues.

I don't see any correlation between them, unless you need to store hundreds of clients' data on an array at the same time and run query processing on them all. Remember, any network appliance is going to be slower than direct disk access on your local machine. It may be comparable to a server's drives, but it will always be slower than having the data on a local hard disk during development and testing.


Question has a verified solution.
