Service availability management basics

Posted on 2013-01-30
Medium Priority
Last Modified: 2013-03-07
3 questions;

1) If you were independently assessing an IT department's adherence to good-practice service "availability management" (in this question the services are file and print services and database services such as SQL Server and Oracle RDBMS), what specifically should you be looking for at ground level? What metrics and what evidence indicate good availability management? What exactly should admins be doing for good availability management?

2) Second, what does poor availability management look like? What sort of lack of procedures leads to poor availability management, i.e. can you provide some practical examples? What metrics indicate bad availability management?

3) Can you give a non-IT guide to what, technically, you are doing differently between availability management and performance management?
Question by:pma111
LVL 51

Expert Comment

by:Keith Alabaster
ID: 38838451
This is a homework question, and I won't do those. You'll just get a redirect to something like Microsoft's MOF and SMF, which you can read for yourself.

If you'd like to rephrase it in a way that expresses the issue you have to address, then I will.

Author Comment

ID: 38838669
Your suspicions are wrong

Author Comment

ID: 38838849
You may as well delete this question. You've wrongly labelled it a homework question (again). It's actually a risk team asking the IT staff responsible for service delivery for some input based on experience in that field. It's always useful to hear tales from people in the field, but seemingly that's not permitted anymore. There are no points to be awarded for the above post, and that post will probably stop any other responses, so all in all the question may as well be deleted.

LVL 51

Accepted Solution

Keith Alabaster earned 1500 total points
ID: 38840622
That's not the case at all - but I will delete it if that is what you want, let me know. No interest in the points, they are just for fun - and with over 5 million I don't worry about them anymore. It won't stop responses from the 'point hunters' but the experts will have taken the same view as I did anyway.

What I do concern myself over is the question itself.  

Your last response starts to answer my response to you by putting some parameters against the question. For example, your question 'what does poor availability management look like?' does not have any context, whereas a similar query might be 'How do I judge the quality or effectiveness of Availability Management?'

Pretty sure I have answered a number of questions for you in the past and quite happy to do so again both now with this one and in the future on whatever question you have that comes into my area of concern.

As I say, quite happy to answer it based on the additional info you have provided - or delete it if that is still what you want.


Author Comment

ID: 38905011
'How do I judge the quality or effectiveness of Availability Management?'
Is probably much better wording on reflection
LVL 51

Expert Comment

by:Keith Alabaster
ID: 38908267
Firstly you will need to define several things, and personally I put these into my Service Catalogue which is included in our Enterprise Architecture documentation set and made available to our users.

1) 'What' is it that you are making available?
2) 'What' is the acceptable definition of available - from both your perspective and your users'?
3) 'What' (if any) are the performance metrics that are tied to 'Available'?
4) 'What' exceptions/planned downtime/maintenance windows are allowed in a calendar month?

As examples for each to better describe what I mean I'll use a couple of variations. This MAY seem like waffle but everyone will have a different interpretation - especially if something becomes 'not available' - unless it is written down and communicated.

1) This could be a single internal application such as a CRM, a complete end-to-end web service such as Office 365 which uses multiple elements including Internet, firewalls plus a hosted solution, or it could be a hybrid of all of them.

2) Available can have different connotations - depending upon where you sit.... A server can be up and running perfectly, having 100% up-time, but if the network connections have failed it is not available to users. Similarly, if one component of a service has failed but all the other components are running as expected (or advertised), does this meet or fail expectations? Take Microsoft Exchange and all of its constituent parts.... Exchange is running perfectly but your ISP goes down temporarily, so remote access to Outlook Web Access is inoperative. Would you (or your users) say the Exchange service is now not available, or just a constituent component?

Pedantic? Maybe, but it can make a real difference financially if not set out in expectations.

3) Performance - we have all been caught out here if no definition is made. The 1Gb Internet connection fails and you fail over to the 5Mb ADSL broadband connection. Access to web sites now takes 30 seconds. The system works - so it is available - but users say the speed makes it unworkable and it is therefore not available.....

4) Self-evident and needs no real explanation, except to confirm the obvious: anything declared can be taken off the Availability schedule without penalty.
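
To make points 2 to 4 concrete, here is a minimal sketch of how an availability figure could be worked out once the definitions are written down. It is an illustration only: the `Sample` shape, the 5 second response limit and the sample values are all assumptions for the example, not taken from any particular monitoring tool or from our catalogue.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring probe result (hypothetical shape for illustration)."""
    reachable: bool          # could a user actually reach the service?
    response_seconds: float  # measured end-to-end response time
    in_maintenance: bool     # taken inside a declared maintenance window?


def availability_percent(samples, max_response_seconds):
    """Percentage of in-scope samples that count as 'available'.

    Samples inside declared maintenance windows are excluded (point 4).
    A reachable but too-slow sample counts as not available (point 3).
    """
    in_scope = [s for s in samples if not s.in_maintenance]
    if not in_scope:
        raise ValueError("no samples outside maintenance windows")
    good = sum(
        1 for s in in_scope
        if s.reachable and s.response_seconds <= max_response_seconds
    )
    return 100.0 * good / len(in_scope)


# Made-up probe results, one per line.
samples = [
    Sample(True, 0.8, False),    # normal service
    Sample(True, 30.0, False),   # working, but on the slow failover link
    Sample(False, 0.0, True),    # down, but inside a declared window
    Sample(True, 1.2, False),    # normal service
    Sample(False, 0.0, False),   # unplanned outage
]
print(availability_percent(samples, max_response_seconds=5.0))
```

The same probe data gives a different figure if the agreed response limit changes, which is exactly why the definition has to be agreed with the users before anything is measured.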

We have several hundred entries in our Service Catalogue with each covering off the four points above. Actually we have a few additional entries which also include our own criteria (based upon the four at the beginning) which then tell us how much forward capacity (storage, network & Internet bandwidth, memory, CPU load, spare virtualisation hosts) we need to keep for each service we offer. In turn, this allows us to work out what the cost is for each service also.

We use system tools such as SCOM, Solarwinds and vCentral, configured with those parameters, to trigger alerts when they are breached.

The big one was resilience, and we arrived at three tiers:
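
The idea of checking measured forward capacity against the figures held for each service can be sketched in a few lines. This is not how SCOM or any of the other tools are configured; it is a plain illustration, and the function name, the resource names and the percentages are all hypothetical.

```python
def headroom_alerts(minimum_free, measured_free):
    """Return alert messages for resources below their agreed headroom.

    Both arguments map a resource name to a free-capacity percentage.
    A resource with no measurement is reported as well, because a silent
    gap in monitoring is itself a risk.
    """
    alerts = []
    for resource, minimum in sorted(minimum_free.items()):
        free = measured_free.get(resource)
        if free is None:
            alerts.append(f"{resource}: no measurement available")
        elif free < minimum:
            alerts.append(f"{resource}: {free}% free, below agreed {minimum}%")
    return alerts


# Hypothetical catalogue figures and current measurements for one service.
agreed = {"storage": 20, "cpu": 30, "memory": 25}
measured = {"storage": 12, "cpu": 45}

for line in headroom_alerts(agreed, measured):
    print(line)
```

Keeping the agreed minimums as data, separate from the check itself, means the same check can be reused for every catalogue entry.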

a) Must be available at all times
b) Up within four hours
c) Whatever.........

For each service this allowed us to talk to the business and demonstrate the costs of each service based upon THEIR view of availability, and a compromise was struck in each case. Bottom line, we agreed what 'Availability' actually meant with the business users - based upon business value - and the cost of each category.
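
Once the tiers are agreed, judging an individual outage against them is simple. The sketch below is an assumption-laden illustration: the names `Tier`, `MAX_OUTAGE_HOURS` and `within_target` are made up, tier (c) is read here as "no agreed target", and tier (a) is read as zero tolerated outage.

```python
from enum import Enum


class Tier(Enum):
    A = "must be available at all times"
    B = "up within four hours"
    C = "whatever"  # read here as: no agreed restoration target


# Maximum tolerated outage in hours per tier; None means no agreed target.
MAX_OUTAGE_HOURS = {Tier.A: 0.0, Tier.B: 4.0, Tier.C: None}


def within_target(tier, outage_hours):
    """Return True if an outage of this length stays within the tier's target."""
    limit = MAX_OUTAGE_HOURS[tier]
    return limit is None or outage_hours <= limit


print(within_target(Tier.B, 3.5))  # restored inside the four-hour target
print(within_target(Tier.B, 4.5))  # target missed
```

Counting how often `within_target` comes back False per tier over a month is one possible measure of how well the agreed levels are being met.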

We judge the effectiveness of Availability Management by ensuring we do not over-provision costly IT services, equipment and resources for business services that do not warrant them, while those that do warrant them are sufficiently buffered to cover continuity until we can react to an issue.
LVL 51

Expert Comment

by:Keith Alabaster
ID: 38963498
Thanks :)

Question has a verified solution.
