BES 5.0.1 Server Rebuild

Greetings Experts!

I have a situation with which I need assistance.  I have a BES farm with 6 servers.  Three are live and three are standby.  One of the standby is also live for BAS.

BES1 - Live
BES2 - Live
BES3 - Live
BES1 - Standby (also BAS Live)
BES2 - Standby (also BAS installed, but disabled, as redundant for BES1Standby)
BES3 - Standby

The three Live servers are physical boxes and the three standby servers are virtual machines.  They all connect to a single SQL farm.  See the attached map for more clarity:

[Attached image: Rough BES Diagram]

Right now, the three LIVE BES servers (1-3) are booting from SAN.  Due to SAN space limitations, I need them to boot from their physical boxes.  As far as I know, this will require an OS rebuild (please correct me if I am wrong) to get the OS from the SAN to the local, physical RAID5.  Server OS is Server 2008 (NOT R2) SP2.

My Question is this:

What is the best process to get the server rebuilt with the SAME BES instance name with as little downtime for the end users as possible?

DrUltima
Justin OwensITIL Problem ManagerAsked:
dblaylock0315Commented:
Since you have your SQL database on a separate server, this should be a fairly straightforward rebuild.  I would bring the standby server online to handle traffic, then just go through the rebuild and connect to the existing BES database.  I wouldn't expect the whole process to take more than 2 hours.

Justin OwensITIL Problem ManagerAuthor Commented:
I would agree with you on that.  My assumption is that I would need to rebuild the OS, reload BES and make it the BES1 standby (since BES1 standby will be active at that time).  Then use BES to flip the Active/Standby status of the two servers.  I have a problem with "should be" and "assumption", though.  Those two phrases have bitten me more than once.

Can any Experts confirm they have performed this function and that it works correctly as stated?

DrUltima
dblaylock0315Commented:
I understand completely.  I have several bite scars from "should be" and "assumption".  I'm actually planning on doing something similar this afternoon in my environment.  I'll post the results for you.
Justin OwensITIL Problem ManagerAuthor Commented:
Thank you.  I have about 1800 BES users on that server, and I don't feel comfortable without more feedback.

DrUltima
dblaylock0315Commented:
You're quite welcome.  Will let you know.
Justin OwensITIL Problem ManagerAuthor Commented:
How did it go for you?
dblaylock0315Commented:
Actually ran into a problem with one of my blades.  Had to get that back up and running.  Getting started on the BES here in a few minutes.
Mighty_SillyCommented:
Greetings,

Seems dBlayLock's got ya covered.  

Here's what I've done in my 3 BES to a SQL DB setup:
Original Config: 3 standalone BESs running Windows Server 2003 SP2 connecting to ONE DB.
End result:  3 standalone BESs running Windows Server 2008 SP2 (NOT R2) connecting to same DB.
***:  Standalone here means NOT HA.
(Assumption - you're running BES 5.x.)

Below are the general steps I took, skipping right over lots of details.  Feel free to check in on certain steps.
Since I wanted to upgrade my OS from 2003 to 2008, I moved all users on ONE BES to the other two BESs, then rebuilt its OS to 2008, brought it online as HA to one of the other BESs, and made it primary.  I then worked on the (now inactive) BES for the 2008 OS build, brought THAT online and HA'd it to the earlier 2008 server.  Then, using that same 2008 server, I HA'd it to the last 2003 BES and did it all over again.  Once I had all servers running 2008, I took the very first 2008 BES and made it standalone again.

So it went like this:
BES1, BES2, BES3
- BES2 tear down -killed and revived as HA to BES3
- worked on BES3. HA it to BES2. killed BES2 again and HA it to BES1
- worked on BES1. HA to BES2. killed BES last time.
- revived BES2 to its own instance

All steps were done during business hours.  The only 'Oops' we got was the initial HA cutover, where I neglected to test that ALL BES services were handled by the new 2008 OS BES before killing the 2003 instance of it.  That caused the MDS portion to die, so no users were able to browse via 'BlackBerry Browser' (browsing via the carrier's 'Internet Browser' was fine).
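
To make that 'Oops' concrete, here is a minimal Python sketch of a pre-cutover check. It is only an illustration under stated assumptions: the component names are placeholders (only MDS is mentioned in this thread), and the set of components seen as active on the new host is assumed to come from whatever monitoring you already have, not from any BES API.

```python
# Hypothetical component names for illustration; only "mds" comes from the thread.
REQUIRED_COMPONENTS = {"messaging", "dispatcher", "mds"}


def missing_components(required, active_on_new_host):
    """Return the required components the new host is not yet handling, sorted."""
    return sorted(set(required) - set(active_on_new_host))


def safe_to_retire_old_host(required, active_on_new_host):
    """True only when the new host handles every required component."""
    return not missing_components(required, active_on_new_host)


if __name__ == "__main__":
    # Assumed input: what an operator observed as active on the new 2008 host.
    observed = {"messaging", "dispatcher"}
    print(missing_components(REQUIRED_COMPONENTS, observed))
    print(safe_to_retire_old_host(REQUIRED_COMPONENTS, observed))
```

With the sample input, the check reports MDS as missing and refuses the go-ahead, which is the situation described above: the old instance should stay up until the list of missing components is empty.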

Cheers!
-silly-

Justin OwensITIL Problem ManagerAuthor Commented:
I have over 5000 BES users and cannot move from server to server without experiencing issues (as has been proven in the past) due to some other networking considerations unique to my environment.

I appreciate you sharing your experience, though.

DrUltima
Mighty_SillyCommented:
No problem.  Totally agree on the 'Move' process, which would kick off a mini-EA that could muck things up.

But you wouldn't need to move users like I did.  Are you running BES 5.x?  Are your 'StandBy' BESs in a High Availability (HA) setup, pairing up with the Primaries?

If so, then you're home free.  Of course, that's taking into account you actually had tested the HA fail over before and are confident that it'd work in your environment :-)

I have to admit I was a bit skeptical of the HA feature (great in theory, but bad in practice)... but oh no, I was amazed at how quickly the services failed over.

We have over 1100 BBs on one of the BESs (the other two are beta and lab rats so don't count) and my various failovers of THAT BES didn't give me any grief (again, aside from the initial 'Oops').  In my 'move' users phase I picked the lab rats to go first... heheh.  And yeah, even though I had plenty of practice, I still managed to mess up on the 1100 BBs BES by killing the standby instance too soon.  So yeah, 'Haste DOES make Waste' !!!!

That said, take your time and only start the process when you feel comfortable, unless you can get a 'major outage' approval from your customer base for the actual OS build phase, which I doubt.

Good luck,
-silly-
Justin OwensITIL Problem ManagerAuthor Commented:
dblaylock0315,

Did you make any progress?

DrUltima
Justin OwensITIL Problem ManagerAuthor Commented:
I am going to give this one a couple of more days and then finalize the Question.

DrUltima
dblaylock0315Commented:
I did run into a few issues.  Still working through them.  Unfortunately, I've got a lot of other irons in the fire.  Having an issue with failover at the moment.  Hoping to spend some more time on it this afternoon.  With any kind of luck I'll have it sorted before the end of the day.
Justin OwensITIL Problem ManagerAuthor Commented:
Thank you, Experts.
Mighty_SillyCommented:
Hi DrUltima,

Thank you for the points.  Now that this is closed and dblaylock is busy, look at my situation from the point AFTER I HAD to move users - the ONLY reason I had to do that was because we didn't have a Standby setup.  You, on the other hand, already have this in place, so you can skip that part of my comment.

Again, if you're running BES 5.0 in High Availability (HA) mode, you are set to get this going (unless you already got it done).  Simply bring a standby (fail-over) up to be Primary/Active and work on your Primary (now inactive).  Once that's done, simply rinse & repeat the process until you have the OS running locally on each piece of hardware.  Now if you are NOT on BES 5.0, then you will have to deal with a quick downtime while you bring up the StandBy to take the Primary's place.  BES 5.0's HA feature can do that rather quickly.
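
The rinse & repeat process can be written out as an ordered checklist. The Python sketch below is only a planning aid under stated assumptions: the host names (BES1-Live, BES1-Standby, and so on) are made up to mirror the server list in the question, the step wording is paraphrased from this thread, and nothing here talks to BES itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaPair:
    """One BES instance with its primary and standby hosts (names are illustrative)."""
    instance: str
    primary: str
    standby: str


def rolling_rebuild_plan(pairs):
    """Return the ordered checklist for rebuilding each primary, one pair at a time."""
    steps = []
    for pair in pairs:
        steps.extend([
            f"{pair.instance}: fail over from {pair.primary} to {pair.standby}",
            f"{pair.instance}: confirm every BES service is handled by {pair.standby}",
            f"{pair.instance}: rebuild the OS on {pair.primary}, reinstall BES, reconnect to the existing database",
            f"{pair.instance}: fail back from {pair.standby} to {pair.primary}",
        ])
    return steps


# Hypothetical host names modeled on the six-server farm in the question.
PAIRS = [HaPair(f"BES{n}", f"BES{n}-Live", f"BES{n}-Standby") for n in (1, 2, 3)]

if __name__ == "__main__":
    for number, step in enumerate(rolling_rebuild_plan(PAIRS), start=1):
        print(f"{number:2d}. {step}")
```

Working through one pair completely before starting the next keeps only one instance on its standby at any time, so each set of users sees at most one failover and one failback.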

Good luck,
and again, thanks for giving me some credit.

Regards,
-silly-
Justin OwensITIL Problem ManagerAuthor Commented:
It is done and went without a hitch.  Both of your comments affirmed what I thought I already knew. :)  Failover was seamless (minus about a 35 second lapse in service).  Failback was the same.

Cheers,

Justin
dblaylock0315Commented:
My sincere apologies for the lack of response.  Not a one-man show out here at all, but anytime I mention BES all I hear are crickets chirping and feet pounding as everyone else runs for the hills.  Not sure why. :)  I finally did get everything working and was able to make my changes, switching back and forth between standby and primary.  Glad yours went well, DrUltima, and thanks for the points despite my not responding in a timely fashion.
Justin OwensITIL Problem ManagerAuthor Commented:
No worries... All of us (Experts) are volunteers here.  I understand how life can pull you away from here.  It has happened to me in the past, and I am sure it will happen again in the future.  Being pulled away from here didn't diminish the value of your original response.

Cheers,

Justin