[Last Call] Learn about multicloud storage options and how to improve your company's cloud strategy. Register Now

x
?
Solved

Moving existing Oracle data warehouse to new server.  What type of HP servers is suitable? Would like real world advice.

Posted on 2007-12-04
9
Medium Priority
?
571 Views
Last Modified: 2013-12-06
Current System Info:

HP 11 server = 2 CPUS, 4 gigs of memory
datawarehouse = 80 gigs
Largest tables (partitionned) = 10 gigs
Potential data growth over the next 3 years = 400 gigs

Hi, I'm in the market for a new data warehouse server and I'd like to know other's recommendation.
We use HP and Oracle 10g.  I'd like to take advantage of oracle parallel processing.
Currently our 2 CPU server takes 20 hours to complete one of our ETL batch run.  This has become unacceptable to the client.
***(YES our instance is tuned correctly)***
  We our now experiencing a bottleneck with I/O.  Memory is fine.  Indexes are being used but the data
load is now so massive that our inserts take too long.  SGA is running at optimal.
**PLEASE PLEASE don't tell me I need to review my code, indexing, analyze plan**
Everything is tuned.  It is the server that is the bottleneck.

How many CPUs should I go with?
How many disk controllers?

I'd just like others input using real world experience.  I'd like to lower ourt batch window to less than an house (each week we load approx. 1.5 gigs worth of new data).
0
Comment
Question by:datalor
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +2
9 Comments
 
LVL 40

Expert Comment

by:omarfarid
ID: 20403937
Hi,

I am not HP H/W expert, but from the current env. we have , I would recommend:

4 CPUs @ 1.5 GHz or higher
8 - 16 GB RAM
Multiple disks (2-4) with multiple controllers (2-3)
0
 

Author Comment

by:datalor
ID: 20404709
That's great, that's what I'm looking for.
0
 
LVL 56

Expert Comment

by:andyalder
ID: 20405024
http://h71019.www7.hp.com/ActiveAnswers/cache/383733-0-0-0-121.html has some approved configurations, you'll need to create an account to access it but that's free.
0
Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

 
LVL 6

Accepted Solution

by:
JJSmith earned 1200 total points
ID: 20406228

You should be looking at something like the rp3440 or the rp4440 (PA-RISC) as a minimum with ultra320 disks (at least 4 spindles ). If you have a few pennies extra then the rx series would give you that extra grunt (itainium).

However that said, if you've identified I/O bottleneck then any server that gives you muliple paths to disk spindles should help resolve your issue.

We have servers ranging from rp3410 (single CPU) with 4 disk trays (2120) through to SuperDomes with XP arrays - our biggest improvement in data crunching came when we started to move hosts onto a SAN fabric - expensive, but typically gives faster I/O than directly attached storage.

Be advised if you purchase the 2120, we did, with Ultra320 SCSI HBA and Ultra320 Disk Drives - only to find out that the 2120 doesn't work above Ultra160 ;-)

Parallel processing at the oracle end is all well and good, particularly with partitioned tables - but the I/O will just serialize if you don't have enough spindles.

Good look with your procudment mechanism!

 




0
 
LVL 56

Expert Comment

by:andyalder
ID: 20406648
I'm not sure about whether a SAN is faster or not; looking at the TPC results sometimes HP use a SAN and sometimes DAS, for your superdome the HP-UX/oracle solution uses 50 MSA100s in a SAN but the Windows/SQL version uses 49 RAID controllers as DAS. You're certainly right that it's spindles that count though.

How much more do you spend on the disk subsystem than on the server?

www.tpc.org/tpch/results/tpch_results.asp?orderby=hardware
0
 

Author Comment

by:datalor
ID: 20406656
Thanks...that helps.
FYI: Our current DB server is HP 9000/800 L1000.
I think it was purchased in the late 90s.
Both CPUs are 400Mhz.
0
 
LVL 21

Expert Comment

by:tfewster
ID: 20408270
You need to pin down what the actual problem with your current system is to ensure you don't hit it again in a couple of years time.

You _know_ it's I/O, but on what? Is the system paging (swapping) because it doesn't have enough memory? Could a bigger SGA size get around frequent re-reads & writes? (again,  more memory). Better to let Oracle (SGA) cache frequently used data than rely on HP-UX disk caching.

Or could the LVM layout be better organised for performance? Is the I/O bottleneck on log writes or tables writes? A cheapo disk array could give you performance improvements with your existing server over just using internal disks with no RAID/striping.
0
 

Author Comment

by:datalor
ID: 20411250
THanks:

paging (swapping) is not a problem, we have sufficient memory and the SGA is sized correctly
However swapping does occur and I believe this is unavoidable when sorts occur on very large data sets.  These sorts are being made by the BI tool and are done on large fact tables or datamarts. This isn't the problem though. Most queries use the sort_area_size efficiently and only sort to disk when the data set is too big.  The problem is reading/writing from/to large indexes and tables.  The CPUs are not being pushed but they are simply waiting on the data to be retrieved (disk reads/ disk writes).  The bottle neck is not on log writes.  The dabase system stats point to waits occurring only on direct reads/writes to disk.

But i'll look into exactly how the data is distributed across the LVM.  This is a bit of a grey area for me.
What should I be looking for?  Tables/indexes spread across multiple disks/disk controllers?  I can see the different disks but how can tell how many controllers I have? And what about spindles?  If I purchase disks with multiple spindles.  How can quarantee that the server i purchase will have multiple paths to disk spindles?
0
 
LVL 21

Assisted Solution

by:tfewster
tfewster earned 800 total points
ID: 20415620
> ..how can tell how many controllers I have..

The device name shown by vgdisplay or lvdisplay will show you the controller ids, e.g. c2t5d0, c3t5d0  (c3 and c3 are separate controllers; On a SAN that could be 2 paths to the same "target" [logical disk], but it sounds as if you're using direct attached disk).

If you could post the output of `bdf`  and `vgdisplay -v`, we should be able to help analyse your current setup.  

[As the L1000 doesn't do hardware RAID, I'd guess that you have 4 x 36GB or 72Gb internal disks and are using mirroring for resilience; Couple that with the fact that one built-in controller is slower than the other on an L-Class, mirroring will kill write performance.]


lvdisplay -v /dev/<vgname>/<lvname> will show you the detailed layout of a logical volume on disks; If the volume is striped across multiple disks, that `lvdisplay` command will show the Logical Extents of the LV alternating across Physical Extents on different disks.

`sar -d` will show you how busy the individual disks are; You can also get a 60 day evaluation licence for Glance, which will show you how busy the individual Logical Volumes/filesystems are - Though I guess you already have a good idea of that through the database monitoring tools


Your options for expansion are basically:
1) "Dumb" expansion disk cabinets like the DS2120, using LVM to mirror and stripe the disks; Or purchase  Veritas Volume Manager software to allow RAIDing
2) "Smart" storage, with battery backed cache that can handle RAIDing & multi-pathing; The higher cost of the smart array is offset by not having to mirror for resilience.

In either case, go for additional I/O cards to connect to your expansion storage, whether its a Fibre or SCSI disk array. I'm not sure if HP still have a SCSI attach option for their smart storage, so you'll need to talk to an HP reseller at some stage to find out the current product set.
 
As others have said, the more physical disks (spindles), the better - Though the cache on a smart array will mask this by caching writes and (with some luck) pre-fetching reads.
0

Featured Post

Free learning courses: Active Directory Deep Dive

Get a firm grasp on your IT environment when you learn Active Directory best practices with Veeam! Watch all, or choose any amount, of this three-part webinar series to improve your skills. From the basics to virtualization and backup, we got you covered.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Backups and Disaster RecoveryIn this post, we’ll look at strategies for backups and disaster recovery.
New style of hardware planning for Microsoft Exchange server.
This video shows how to Export data from an Oracle database using the Original Export Utility.  The corresponding Import utility, which works the same way is referenced, but not demonstrated.
Via a live example, show how to restore a database from backup after a simulated disk failure using RMAN.
Suggested Courses

650 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question