Moving existing Oracle data warehouse to new server.  What type of HP servers is suitable? Would like real world advice.

Posted on 2007-12-04
Medium Priority
Last Modified: 2013-12-06
Current System Info:

HP 11 server = 2 CPUS, 4 gigs of memory
datawarehouse = 80 gigs
Largest tables (partitionned) = 10 gigs
Potential data growth over the next 3 years = 400 gigs

Hi, I'm in the market for a new data warehouse server and I'd like to know other's recommendation.
We use HP and Oracle 10g.  I'd like to take advantage of oracle parallel processing.
Currently our 2 CPU server takes 20 hours to complete one of our ETL batch run.  This has become unacceptable to the client.
***(YES our instance is tuned correctly)***
  We our now experiencing a bottleneck with I/O.  Memory is fine.  Indexes are being used but the data
load is now so massive that our inserts take too long.  SGA is running at optimal.
**PLEASE PLEASE don't tell me I need to review my code, indexing, analyze plan**
Everything is tuned.  It is the server that is the bottleneck.

How many CPUs should I go with?
How many disk controllers?

I'd just like others input using real world experience.  I'd like to lower ourt batch window to less than an house (each week we load approx. 1.5 gigs worth of new data).
Question by:datalor
  • 3
  • 2
  • 2
  • +2
LVL 40

Expert Comment

ID: 20403937

I am not HP H/W expert, but from the current env. we have , I would recommend:

4 CPUs @ 1.5 GHz or higher
8 - 16 GB RAM
Multiple disks (2-4) with multiple controllers (2-3)

Author Comment

ID: 20404709
That's great, that's what I'm looking for.
LVL 57

Expert Comment

ID: 20405024
http://h71019.www7.hp.com/ActiveAnswers/cache/383733-0-0-0-121.html has some approved configurations, you'll need to create an account to access it but that's free.
Get 10% Off Your First Squarespace Website

Ready to showcase your work, publish content or promote your business online? With Squarespace’s award-winning templates and 24/7 customer service, getting started is simple. Head to Squarespace.com and use offer code ‘EXPERTS’ to get 10% off your first purchase.


Accepted Solution

JJSmith earned 1200 total points
ID: 20406228

You should be looking at something like the rp3440 or the rp4440 (PA-RISC) as a minimum with ultra320 disks (at least 4 spindles ). If you have a few pennies extra then the rx series would give you that extra grunt (itainium).

However that said, if you've identified I/O bottleneck then any server that gives you muliple paths to disk spindles should help resolve your issue.

We have servers ranging from rp3410 (single CPU) with 4 disk trays (2120) through to SuperDomes with XP arrays - our biggest improvement in data crunching came when we started to move hosts onto a SAN fabric - expensive, but typically gives faster I/O than directly attached storage.

Be advised if you purchase the 2120, we did, with Ultra320 SCSI HBA and Ultra320 Disk Drives - only to find out that the 2120 doesn't work above Ultra160 ;-)

Parallel processing at the oracle end is all well and good, particularly with partitioned tables - but the I/O will just serialize if you don't have enough spindles.

Good look with your procudment mechanism!


LVL 57

Expert Comment

ID: 20406648
I'm not sure about whether a SAN is faster or not; looking at the TPC results sometimes HP use a SAN and sometimes DAS, for your superdome the HP-UX/oracle solution uses 50 MSA100s in a SAN but the Windows/SQL version uses 49 RAID controllers as DAS. You're certainly right that it's spindles that count though.

How much more do you spend on the disk subsystem than on the server?


Author Comment

ID: 20406656
Thanks...that helps.
FYI: Our current DB server is HP 9000/800 L1000.
I think it was purchased in the late 90s.
Both CPUs are 400Mhz.
LVL 21

Expert Comment

ID: 20408270
You need to pin down what the actual problem with your current system is to ensure you don't hit it again in a couple of years time.

You _know_ it's I/O, but on what? Is the system paging (swapping) because it doesn't have enough memory? Could a bigger SGA size get around frequent re-reads & writes? (again,  more memory). Better to let Oracle (SGA) cache frequently used data than rely on HP-UX disk caching.

Or could the LVM layout be better organised for performance? Is the I/O bottleneck on log writes or tables writes? A cheapo disk array could give you performance improvements with your existing server over just using internal disks with no RAID/striping.

Author Comment

ID: 20411250

paging (swapping) is not a problem, we have sufficient memory and the SGA is sized correctly
However swapping does occur and I believe this is unavoidable when sorts occur on very large data sets.  These sorts are being made by the BI tool and are done on large fact tables or datamarts. This isn't the problem though. Most queries use the sort_area_size efficiently and only sort to disk when the data set is too big.  The problem is reading/writing from/to large indexes and tables.  The CPUs are not being pushed but they are simply waiting on the data to be retrieved (disk reads/ disk writes).  The bottle neck is not on log writes.  The dabase system stats point to waits occurring only on direct reads/writes to disk.

But i'll look into exactly how the data is distributed across the LVM.  This is a bit of a grey area for me.
What should I be looking for?  Tables/indexes spread across multiple disks/disk controllers?  I can see the different disks but how can tell how many controllers I have? And what about spindles?  If I purchase disks with multiple spindles.  How can quarantee that the server i purchase will have multiple paths to disk spindles?
LVL 21

Assisted Solution

tfewster earned 800 total points
ID: 20415620
> ..how can tell how many controllers I have..

The device name shown by vgdisplay or lvdisplay will show you the controller ids, e.g. c2t5d0, c3t5d0  (c3 and c3 are separate controllers; On a SAN that could be 2 paths to the same "target" [logical disk], but it sounds as if you're using direct attached disk).

If you could post the output of `bdf`  and `vgdisplay -v`, we should be able to help analyse your current setup.  

[As the L1000 doesn't do hardware RAID, I'd guess that you have 4 x 36GB or 72Gb internal disks and are using mirroring for resilience; Couple that with the fact that one built-in controller is slower than the other on an L-Class, mirroring will kill write performance.]

lvdisplay -v /dev/<vgname>/<lvname> will show you the detailed layout of a logical volume on disks; If the volume is striped across multiple disks, that `lvdisplay` command will show the Logical Extents of the LV alternating across Physical Extents on different disks.

`sar -d` will show you how busy the individual disks are; You can also get a 60 day evaluation licence for Glance, which will show you how busy the individual Logical Volumes/filesystems are - Though I guess you already have a good idea of that through the database monitoring tools

Your options for expansion are basically:
1) "Dumb" expansion disk cabinets like the DS2120, using LVM to mirror and stripe the disks; Or purchase  Veritas Volume Manager software to allow RAIDing
2) "Smart" storage, with battery backed cache that can handle RAIDing & multi-pathing; The higher cost of the smart array is offset by not having to mirror for resilience.

In either case, go for additional I/O cards to connect to your expansion storage, whether its a Fibre or SCSI disk array. I'm not sure if HP still have a SCSI attach option for their smart storage, so you'll need to talk to an HP reseller at some stage to find out the current product set.
As others have said, the more physical disks (spindles), the better - Though the cache on a smart array will mask this by caching writes and (with some luck) pre-fetching reads.

Featured Post

Get 10% Off Your First Squarespace Website

Ready to showcase your work, publish content or promote your business online? With Squarespace’s award-winning templates and 24/7 customer service, getting started is simple. Head to Squarespace.com and use offer code ‘EXPERTS’ to get 10% off your first purchase.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Why Shell Scripting? Shell scripting is a powerful method of accessing UNIX systems and it is very flexible. Shell scripts are required when we want to execute a sequence of commands in Unix flavored operating systems. “Shell” is the command line i…
New style of hardware planning for Microsoft Exchange server.
This video shows how to copy a database user from one database to another user DBMS_METADATA.  It also shows how to copy a user's permissions and discusses password hash differences between Oracle 10g and 11g.
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

569 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question