Data Appliances?!


At my boss's request, I've begun to investigate Data Appliances. I'm still vague about what they are and how they work. I did find the following quote:

"The data appliance for BI is a purpose-built database machine specifically used to manage analytical data and retrieve the results from massive data analyses with impressive performance – a matter of seconds or minutes instead of hours or even days."

It's the "impressive performance -" part that grabbed my attention! Is this really true?

We're a small company that does high-end analysis of our clients' sales/advertising data. Our work is project by project; that is, we get 40 gigs of data from the client, do client-specific analysis, and deliver it in cubes. It's quite different from working for one company where the database is established and the analytics developed go into production and get used over and over. For us, each client's data is different and each client wants different analytics. When we've delivered, it's over and done with; we dump the data and move on to the next project.

It would be great if we could speed up our development time.

Any info that would help me understand what appliances are and what the learning curve is would be most appreciated.



A data appliance is a large storage array with a fiber-optic connection to the local network (yes, this is fast) or to your network via the internet (no, the performance is NOT impressive). If you use and then dump 40 GB of data, these are not for you. They cost $10,000 and up, and they are for companies that use the same data over and over again and need 10 terabytes or so of online storage shared between many offices across the land. That is where fiber appliances are good.

For your needs, send the client an 80 GB IBM drive, tell them to upload all the data to it, get it back, analyze the data, then burn the results to DVD for archival storage.  Wipe the drive clean, send it to the next client, and now you have an extremely cost effective data analysis solution.  And I saved your company $10K or more just by giving this feedback, right?

I agree. A DA is not really suitable for you unless you have a stack of money available to spend on a high-end solution.

I would suggest a different approach. I assume you are already getting the 40GB from clients in some form or another. To speed up development, are you able to improve the upload process? An FTP site for transferring data to you, plus a standardised file format, would make importing into your database easier.

If this was via XML, then get them to add an XSLT to automate the data load process. It sounds like you don't have huge storage requirements, so an alternative to the FTP site above is a centrally hosted file solution.

What part of the process are you trying to improve? Data capture, upload, formatting, storage, reporting/analysis, feedback?

Hope this helps
studioEtc (Author) Commented:
There can be no standard format... for each project we have to design a database and then write import code to normalize the data as it is loaded.
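As a rough illustration of the kind of import code described above (normalizing the data as it is loaded), here is a minimal sketch in Python using the standard-library sqlite3 and csv modules. The table names, column names, and sample rows are all made up for the example; a real project would have its own schema and most likely a different database engine.

```python
import csv
import io
import sqlite3


def load_normalized(conn, rows):
    """Load flat (product, region, amount) rows into a small normalized schema.

    The schema and column names are hypothetical, chosen only for this sketch.
    """
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS product (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS region  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS sale (
            product_id INTEGER REFERENCES product(id),
            region_id  INTEGER REFERENCES region(id),
            amount     REAL
        );
    """)

    def lookup_id(table, name):
        # Insert the name if it is new, then return its key either way.
        # `table` only ever comes from the two literals below, never from input data.
        conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
        cur = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,))
        return cur.fetchone()[0]

    for row in rows:
        product_id = lookup_id("product", row["product"].strip())
        region_id = lookup_id("region", row["region"].strip())
        conn.execute(
            "INSERT INTO sale (product_id, region_id, amount) VALUES (?, ?, ?)",
            (product_id, region_id, float(row["amount"])),
        )
    conn.commit()


# Made-up sample standing in for a client's flat file.
SAMPLE = "product,region,amount\nWidget,East,10.5\nWidget,West,4\nGadget,East,7\n"

conn = sqlite3.connect(":memory:")
load_normalized(conn, csv.DictReader(io.StringIO(SAMPLE)))
```

Repeated product and region names end up as single rows in their lookup tables, and the sale table only stores the keys, which is the part that has to be redesigned for each client's data.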

The quote in my original question seemed to indicate that the analysis part was speedy (queries that took days now running in minutes). We don't have queries that take days, but we do have queries that take a lot longer than we want them to.

I know an appliance is overkill for the amount of data we have.

Also, I did a search here on Experts-Exchange for Data Appliances, and found some posts for Snap drives. Is a Snap drive the same thing you are all talking about?


I thought of FTP upload too, but there is too much risk of exposing proprietary data, and many clients struggle to get big files onto an FTP server, so it is not a very good solution for 40GB.

Yes, Snap is one type of network appliance; read about the marketing hype here -

But this is still not a "solution" for your database import, customization, and output. It is just a big virtual storage medium at high cost, for companies needing online access to terabytes of data.

We seem to be talking about two different things here: (1) what a network appliance is and (2) what you need to solve your business issues.

I don't see any correlation between them, unless you need to store hundreds of clients on a data array at the same time and run query processing on them all. Remember, any network appliance is going to be slower than direct disk access on your local machine. It may be comparable to a server's drives, but it will always be slower than having the data on a local hard disk during development and testing.
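
If you want to check the local-versus-network difference on your own setup rather than take my word for it, a rough sketch like the one below times a sequential read of a file and reports MB/s. The function name and the temporary demo file are made up for the example; point it at the same large file on a local disk and on the network share to compare. Treat the numbers as approximate, since the operating system's file cache can inflate the result for a file that was just written or recently read.

```python
import os
import tempfile
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Read the whole file sequentially and return approximate MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)


# Demo on a small temporary file; in practice, pass the path of a large
# data file on the drive (local or network) you want to measure.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
    demo_path = tmp.name

mb_s = read_throughput_mb_s(demo_path)
os.remove(demo_path)
print(f"{mb_s:.1f} MB/s")
```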