  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1645

Backup Performance and Fibre Channel Zoning

Hi,

We recently acquired an EMC VNX5300 and a FC Tape Library.
The original zoning configuration was not following EMC’s best practice for single HBA zoning so I went ahead and created new zones.
Before the change, one of my backup jobs that used to take 11+ hours had gone down to 5+, but after creating the new zones the backup job is taking 11+ hours again to complete.

To accomplish the backup, I am using a Windows 2003 server with Arcserve R16 on it. This server, BackupServer1, has dual 8Gb HBAs and connects to both FC switches.
The backup job backs up a LUN that is attached to a Solaris server. The Solaris server uses a single 4Gb HBA connected to SANSwitch1.

VNX_SPA and VNX_SPB are the storage processors for the VNX5300.

The tape library is an (Overland) IBM 3573 FC with dual LTO6 drives.  I have one drive attached to SANSwitch1 and the other attached to SANSwitch2.

The switches are EMC (Brocade) DS-300B 8Gb FC.

I am obviously not a storage person and I just started learning about Fibre Fabric and Zoning.

Does my new zoning configuration make sense?
Could it be improved?
Is anything in the way it is configured causing my backup issue?

I know there are other variables, such as the backup server itself, the backup software agent, and the library, that could be the culprit, but I need to start somewhere.

Thanks in advance for your help.

BEFORE

SANSwitch1 (3 Zones)

VNX_ESX
Members:  ESXi1, ESXi2, ESXi3, VNX_SPA_P2, VNX_SPB_P2
VNX_SOLARIS
Members: BackupServer1, TLTape1, VNX_SPA_P2, VNX_SPB_P2, SolarisServer
VNX_WIN
Members: BackupServer1, TLTape1, VNX_SPA_P2, VNX_SPB_P2

SANSwitch2 (3 Zones)

VNX_ESX
Members:  ESXi1, ESXi2, ESXi3, VNX_SPA_P3, VNX_SPB_P3
BACKUP_ZONE
Members: BackupServer1, DocImagingServer, VNX_SPA_P3, VNX_SPB_P3
VNX_WIN
Members: BackupServer1, TLTape2, VNX_SPA_P3, VNX_SPB_P3

AFTER

SANSwitch1

VNX_ESXi1
Members: ESXi1, VNX_SPA_P2, VNX_SPB_P2
VNX_ESXi2
Members: ESXi2, VNX_SPA_P2, VNX_SPB_P2
VNX_ESXi3
Members: ESXi3, VNX_SPA_P2, VNX_SPB_P2
VNX_SOLARIS
Members: SolarisServer, VNX_SPA_P2, VNX_SPB_P2
VRC_BACKUP
Members: VRCData, BackupServer1_1
BACKUP_TL
Members: BackupServer1_1, TLTape1
VNX_BACKUPSERVER
Members:  BackupServer1_1, VNX_SPA_P2, VNX_SPB_P2
            
SANSwitch2

VNX_ESXi1
Members: ESXi1, VNX_SPA_P3, VNX_SPB_P3
VNX_ESXi2
Members: ESXi2, VNX_SPA_P3, VNX_SPB_P3
VNX_ESXi3
Members: ESXi3, VNX_SPA_P3, VNX_SPB_P3
BACKUP_TL
Members: BackupServer1_2, TLTape2
VNX_BACKUPSERVER
Members:  BackupServer1, VNX_SPA_P3, VNX_SPB_P3
VNX_DOCIMAGINGSRV
Members: Poseidon, VNX_SPA_P3, VNX_SPB_P3
DOCIMGSRV_BACKUPSERVER
Members: DocImagingServer, BackupServer1_2
Duncan MeyersCommented:
Yes - your 'after' zoning looks good; however, there are a couple of things that need your attention. First, the switches should be cross-connected to the SPs slightly differently. For SANSwitch1, you should have SPA Port 2 and SPB Port 3 connected.
For SANSwitch2, you should have SPA Port 3 and SPB Port 2 connected.

I'd recommend going slightly further with your zoning and configuring single-initiator, single-target zoning, so:
VNX_ESXi1
Members: ESXi1, VNX_SPA_P2, VNX_SPB_P2

Becomes

VNX_ESXi1-SPA_P2
Members: ESXi1, VNX_SPA_P2

and

VNX_ESXi1-SPB_P3
Members: ESXi1, VNX_SPB_P3

The reason you do this is to prevent an issue down the track where the SPs try to log into themselves if you configure SANCopy or MirrorView for replication. It just saves you a bit of time later. If you're sure you won't be using either of those options, then stick with your proposed configuration.
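For reference, single-initiator/single-target zones like those could be created on a Brocade switch roughly like this (a sketch only, assuming Brocade Fabric OS CLI and that device aliases such as ESXi1 and VNX_SPA_P2 are already defined with alicreate; the configuration name SAN1_CFG is hypothetical):

```
# One zone per initiator/target pair on SANSwitch1
zonecreate "VNX_ESXi1-SPA_P2", "ESXi1; VNX_SPA_P2"
zonecreate "VNX_ESXi1-SPB_P3", "ESXi1; VNX_SPB_P3"

# Add the new zones to the zone configuration, save, and activate
cfgadd "SAN1_CFG", "VNX_ESXi1-SPA_P2; VNX_ESXi1-SPB_P3"
cfgsave
cfgenable "SAN1_CFG"
```

cfgenable is disruptive in the sense that it pushes the new zone set fabric-wide, so it's usually done in a maintenance window.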
 
Gerald ConnollyCommented:
Before the change, one of my backup jobs which used to take 11+ hours went down to 5+ but after creating the new zones my backup jobs is taking again 11+ hours to complete.

What happened to make it go down from 11+ to 5+ hours BEFORE you made the change?
 
cartereverettAuthor Commented:
When it used to take 11+ hours, the LUN that I was backing up was on an old iSCSI SAN. When I moved the data to the new SAN and did the backup, it went from 11 to 6 hours.
So the change was the new SAN, and in my initial tests the zoning was configured not following single-HBA zoning best practices.

Something very strange happened last night. The AC unit in our server room stopped cooling, and since the ETA for the technician was 30, to avoid damage to the unit I completely shut down the VNX5300, including the additional DAEs. The unit was off for over an hour.
I did my regular backups last night and the backup went back to 6 hours.
Now I'm really confused...

 
cartereverettAuthor Commented:
Meyersd,

Thanks for the answer. Not to question your suggestion, but to try to understand it: why should the switches be cross-connected differently?

I attached a diagram of how they are connected.
 
Duncan MeyersCommented:
It's EMC best practice. It's to protect against a very rare fibre loop failure condition: Port 2 on both SPs is logically on the same fibre loop.

I don't think the diagram came through.

Can you post more information about the backup job? How big is it? How is the SAN disk that the backup job uses configured? Is the backup going straight to tape, or are you doing disk-to-disk-to-tape?
 
cartereverettAuthor Commented:
Thanks for the explanation. I created the new zones and now I have a 1:1 single-HBA zoning configuration.

The backup job is about 1.3TB. I created a 2TB LUN on Pool1. Pool1 has two tiers: Performance (RAID 5, 4+1) and Capacity (RAID 6, 6+2).
There was an initial full backup created. Now a daily differential backup gets written to the LUN. The differential happens on the Solaris box; that portion takes about an hour to complete. I then do a full backup of everything on the LUN to tape.

One thing I forgot to mention is that I mapped a LUN to a Windows server and it is showing up twice (see attached picture), and some other DGC LUNZ devices are showing with red X's.
On the Windows server I have dual HBAs, one connected to each switch.
Duplicate-LUNs.jpg
 
Duncan MeyersCommented:
Download PowerPath from Powerlink (https://powerlink.emc.com) and install it to resolve that problem. PowerPath handles path balancing and multi-path I/O. You can run it in unlicensed mode, but I'd recommend purchasing a license if this server is likely to need good storage performance.

You can also use Symantec Storage Foundation Basic as a freebie, but it is limited to two processors and 4 volumes under management. You can download it from the Symantec website.
 
Duncan MeyersCommented:
Try writing the backup directly to tape. If you've got LTO-4 or LTO-5, you should be able to get over 400GB/hour.
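As a rough sanity check of that figure (a sketch only; the 1.3 TB job size and the ~6-hour runtime are taken from earlier in the thread), the job is currently running at roughly half that rate:

```python
# Back-of-the-envelope check of the tape throughput claim above.
# The 1.3 TB job size and ~6-hour runtime come from earlier in the
# thread; 400 GB/hour is the quoted LTO-4/LTO-5 streaming estimate.

job_size_gb = 1.3 * 1000          # backup job, ~1.3 TB expressed in GB
observed_hours = 6.0              # current full-backup-to-tape runtime

effective_rate_gbph = job_size_gb / observed_hours   # GB/hour actually achieved
expected_hours = job_size_gb / 400.0                 # runtime at 400 GB/hour

print(f"Effective rate: {effective_rate_gbph:.0f} GB/hour")
print(f"Expected time at 400 GB/hour: {expected_hours:.2f} hours")
```

At ~217 GB/hour the drive is not streaming anywhere near its rated speed, which usually points at the source disk, the network of zones in the data path, or the backup server rather than the tape drive itself.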
 
cartereverettAuthor Commented:
The way it works is kind of cumbersome. The Solaris server has a vendor-supported application. The application uses Oracle for the DB.
The vendor, via the application, provides a way to do the backup. I can back up either to tape or to disk.
Either way takes the Oracle DB offline during the backup and brings it back online afterward. If I back up to tape, the application does a full backup to tape, which takes the Oracle DB offline for the 6+ or 11+ hours. We can't have the DB offline for more than two hours, so the only way to accomplish this is by using the "To disk" option, which does a differential backup to disk and only takes an hour to complete. I then do my full backup to tape from a Windows server.

So I can’t really do the backup directly to tape.

Hopefully I’m making sense…
 
cartereverettAuthor Commented:
By the way, I have an LTO6 tape library.
 
Duncan MeyersCommented:
Just a thought: you can do an Oracle RMAN backup direct to a Data Domain box with partial source-based dedupe - that would really fly.

For the pool that you're writing to: how much RAID 5 and how much RAID 6 disk do you have?
 
Duncan MeyersCommented:
And what tiering policy are you using?
 
cartereverettAuthor Commented:
3.5TB (RAID 5)
32TB (RAID 6)

"Start high then Auto-tier" policy
 
Duncan MeyersCommented:
If you zone the tape library to the Solaris server, can you have the Solaris server dump its disk-based data direct to tape?
 
Duncan MeyersCommented:
Thanks! Glad I could help.
