Solved

Service availability percentage?

Posted on 2014-11-28
Last Modified: 2014-12-19
What standard values for service or system availability can realistically be defined?

I want to express the availability of all my IT services as a percentage, for example 99% uptime for the AD service. Is there an industry standard for the values to use? Claiming 100% or 60% will not work in reality. Please advise me on how to define my availability targets based on a standard.

Please help me justify the above.
Question by:cur
9 Comments
 
LVL 29

Assisted Solution

by:Rich Weissler
Rich Weissler earned 215 total points
You've entered into the realm of Service Level Agreements (SLA) and definitions of System Availability, and there is a fair amount you could read on the topic.

Yes, you can absolutely define your IT Services in terms of a percentage, but be crystal clear about what you are measuring.  You would normally define a service ("Directory Services", or "Name Resolution", or "Print Services", for example).  Then define what constitutes available, for example, user logins completing in under five seconds, or print spoolers accepting output within ten seconds.  Any time your systems cannot meet the minimum service requirements, they are considered unavailable.  Add up all the time they are unavailable, and divide that by the period of time you have agreed to be available.  That gives the unavailable fraction; subtract it from 100% to get the availability percentage.
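As a rough illustration of the calculation described above, here is a minimal Python sketch. The function name, the outage list format, and the example dates are all made up for illustration, and it assumes the recorded outages do not overlap each other.

```python
from datetime import datetime, timedelta

def availability_percent(outages, period_start, period_end):
    """Return the availability (%) of one service over the agreed period.

    outages: list of (start, end) datetimes during which the service did
    not meet its agreed definition of "available" (hypothetical format,
    assumed not to overlap).
    """
    agreed = (period_end - period_start).total_seconds()
    down = 0.0
    for start, end in outages:
        # Count only the part of each outage that falls inside the period.
        s = max(start, period_start)
        e = min(end, period_end)
        if e > s:
            down += (e - s).total_seconds()
    # Unavailable fraction subtracted from 1, expressed as a percentage.
    return 100.0 * (1.0 - down / agreed)

# Hypothetical 30-day period with two outages totalling 3 hours.
period_start = datetime(2014, 11, 1)
period_end = period_start + timedelta(days=30)
outages = [
    (datetime(2014, 11, 5, 9, 0), datetime(2014, 11, 5, 11, 0)),    # 2 hours
    (datetime(2014, 11, 20, 14, 0), datetime(2014, 11, 20, 15, 0)),  # 1 hour
]
november = availability_percent(outages, period_start, period_end)
print(round(november, 3))
```

What counts as an outage is whatever you and your customer agreed "available" means; the arithmetic is the easy part.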

I've seen some less useful ways of defining IT services as a percentage.  In one system, which was actually used at a former employer of mine, they added up the total downtime of each server during the week, and divided by the number of seconds in a week.  This would give a percentage.  (When I failed to get across to the powers that be why this wasn't a useful number, I turned off a test server and left it off for a few weeks.  My "IT Service number" was negative from then on, but everything meaningful was available.)  The first measure I gave above provides an incentive to build fault-tolerant systems.  The second measure tends to provide a disincentive for failover and load-balanced clusters.
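To make the contrast between the two measures concrete, here is a small Python sketch. The server names, the hour-by-hour data, and the exact form of the per-server formula (summed downtime subtracted from one week) are assumptions for illustration; the post only says the downtime was summed and divided by a week.

```python
WEEK_HOURS = 168

# Hypothetical week: the hours (0..167) during which each server was down.
server_down_hours = {
    "dc1": set(range(10, 14)),   # 4 h, patched while dc2 handled logins
    "dc2": set(range(40, 44)),   # 4 h, patched while dc1 handled logins
    "test": set(range(0, 168)),  # test server switched off all week
}

# Servers that actually back the (hypothetical) directory service.
service_members = ["dc1", "dc2"]

def per_server_percent(down_hours_by_server):
    # Per-server measure: add up all server downtime, divide by one week.
    total_down = sum(len(hours) for hours in down_hours_by_server.values())
    return 100.0 * (1 - total_down / WEEK_HOURS)

def service_percent(down_hours_by_server, members):
    # Service measure: the service is down only in hours when every
    # redundant member is down at the same time.
    all_down = set.intersection(*(down_hours_by_server[m] for m in members))
    return 100.0 * (1 - len(all_down) / WEEK_HOURS)

flawed = per_server_percent(server_down_hours)
service = service_percent(server_down_hours, service_members)
print(f"per-server measure: {flawed:.2f}%")
print(f"service measure:    {service:.2f}%")
```

With this data the per-server number goes negative even though logins never failed, and adding a second domain controller makes it worse, not better, because there is one more box whose patch windows count against you.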

To get more information about how this should be measured, and the meanings of the measure, I'd suggest digging into Service Level Agreements (SLA), especially as they pertain to ITIL.  I think the best advice I could possibly give you on how to define availability and how to define the percentages is to come to an agreement with your customers.  They may be (and frequently are) internal customers, and it may take a few iterations to find out what is important to the customers... but carefully define both (1) what constitutes the IT Service, and (2) what constitutes available, from the orientation of the customer.
 
LVL 23

Assisted Solution

by:David
David earned 214 total points
Continuing in the above vein, your negotiations should distinguish between scheduled and unscheduled outages -- the latter being the "unavailable" time Razmus mentions. Another factor is whether or not the database is required in off hours.  It may be feasible and cost effective to move downtime to weekends if the business requirement is Monday to Friday.
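A minimal Python sketch of that distinction, assuming a Monday-to-Friday, 08:00 to 18:00 agreement. The record format, the hours, and the function name are hypothetical; the point is only that scheduled work and off-hours time drop out of the calculation once the agreement says so.

```python
def sla_availability_percent(outages, agreed_hours):
    """outages: list of (hours_inside_service_window, scheduled) tuples.

    Only unscheduled downtime inside the agreed service window counts
    against the SLA (an assumption; your agreement may say otherwise).
    """
    unscheduled = sum(hours for hours, scheduled in outages if not scheduled)
    return 100.0 * (1 - unscheduled / agreed_hours)

# Hypothetical month: 22 working days x 10 agreed hours per day.
agreed_hours = 22 * 10

outages = [
    (1.5, False),  # unscheduled crash during business hours
    (0.5, False),  # unscheduled network fault during business hours
    (3.0, True),   # agreed maintenance window, announced in advance
    (0.0, True),   # weekend patching: zero hours inside the service window
]

result = sla_availability_percent(outages, agreed_hours)
print(f"{result:.2f}%")
```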

It's one thing to do critical patching off-hours, but pretty abusive if you require your day-shift DBA(s) to routinely come on-site at 02.00 on a Sunday morning, just because the customer didn't want to cooperate.

As a business moves toward around-the-clock operation, the risk of an outage, and the cost to mitigate that outage, will both rise quickly.  Physical databases can be protected with Real Application Clusters, and a site can be protected with a redundant hot database (Data Guard, GoldenGate, etc.) that's in a different geographical area.

"Industry standards" are meaningless, as the whole thing comes down to the risk tolerance of those who are approving the budget.  The age of your physical equipment, the versions of your software, and the reliability of your electrical system make each shop unique.  Perhaps you would benefit from thinking in terms of how much an outage would cost versus the cost of preventing one.
 
LVL 25

Assisted Solution

by:madunix
madunix earned 71 total points
The following are points you should look for in an SLA (between you and your service provider); a bad SLA misses some of these:
- classification of issues: what is considered critical priority and what is considered low priority. When you have an issue, you do not want to spend time on classification; this should already be clear. In the end, make sure that you, not the service provider, decide the priority in case of disagreement.
- response times: if you log an incident, how long before you get someone on the line who can help you? These response times are typically defined per priority.
- work-around times: how quickly does your service provider guarantee that you are up and running again? Note that a work-around means that some other minor functionality may not work anymore.
- final solution times: how long before the problem is fully resolved?
- uptimes: can your service provider guarantee uptimes? Typically this is expressed as a percentage: 99%, 99.9% .. all the way up to 99.999%, aka five nines. Five nines is typically used for telco-grade fully redundant systems, allowing only about 5 minutes of downtime per year. What is offered here should also be reflected in your purchase agreement.
- penalty clauses: what happens if the service provider cannot meet the SLA? Do they offer a discount or give you money back?
- performance reporting: does your service provider give you monthly metrics on their service performance?
- escalation path: who can you call if you are not happy about the service? You may want to discuss multiple levels of escalation up to the highest official of the service provider's organization. Typically escalation levels are also agreed for cases where response, work-around and/or final solution times are exceeded by an agreed margin.
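
To see what the uptime percentages in the list above mean in practice, here is a short Python sketch that converts a target into allowed downtime per year. The function name is made up, and it assumes a 365-day year of around-the-clock service.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored

def allowed_downtime_minutes(uptime_percent, period_minutes=MINUTES_PER_YEAR):
    # Downtime budget is the unavailable fraction times the period length.
    return period_minutes * (1 - uptime_percent / 100.0)

for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.1f} min/year ({minutes / 60:.2f} h/year)")
```

Each extra nine cuts the budget by a factor of ten: 99% leaves roughly 87.6 hours a year, while five nines leaves a little over 5 minutes.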

http://en.wikipedia.org/wiki/High_availability
http://www.techrepublic.com/article/build-your-sla-with-these-five-points-in-mind/
http://www.cisco.com/c/en/us/support/docs/availability/high-availability/15117-sla.html
 

Author Comment

by:cur
Thank you for your valuable information. What happens if the downtime is continuous? Say the 1% of allowed downtime occurs in one continuous stretch: is that normal, or do we need to define that too?

Because a whole year could run without any issue, and a failure during only the last 3 days of the year would still create bad customer satisfaction.
 
LVL 29

Assisted Solution

by:Rich Weissler
Rich Weissler earned 215 total points
The definition of how much downtime is 'acceptable', and the repercussions of that downtime, are all things which need to be defined with your customer, and spelled out in a Service Level Agreement (SLA).  Normally getting from 99% to 99.9% to 99.99% uptime costs resources, and the customer/consumer of IT services is in the best position to determine how much the uptime is worth to them.
 
LVL 23

Assisted Solution

by:David
David earned 214 total points
Agreed, this is supposed to be a bilateral (two-way) agreement:  who gets what and when if terms are met, and if terms aren't met then.......  If it's not in the written document, it's not a requirement.

A customer who demands 99.999% uptime should pay (a lot) more than someone who is okay with weekend maintenance windows.  Again to the above points, how much risk is involved and how much extra cost is involved to mitigate that risk?  So it's not an "industry" thing, but rather a unique arrangement between a service provider and a consumer.

Two side comments:  it really, really helps for the service provider to practice and test their ability to deliver, prior to pricing things out.  It's great to have well-documented plans and standby systems, until you remember no one has bothered to read the fine manual.....
 

Author Comment

by:cur
My question is: if I say 40 hours of downtime, can it be 40 hours in one go? That would not be worthwhile for the business. I think we need to define the downtime on a weekly or monthly basis, don't we? Otherwise the first 11 months can run without any issue and the last month can have all the downtime.
 
LVL 23

Assisted Solution

by:David
David earned 214 total points
In my opinion, yes, if you want, but I don't feel that's realistic.  I may be missing your point, sorry if so. Also, it's very significant whether your "forty hours" is measured against 52 40-hour weeks or against 52 168-hour weeks.  How long does it take for you to recover your hardware, software, and customer data in a total catastrophe?  What if your systems, networks, and backup are kept in a building suddenly condemned due to fire on another floor?  How willing are you to "insure" events completely outside of your control?

As your customer, I don't particularly care whether you offer 40 hours down per year.  I care about your promised / contracted response to a service call within x number of minutes. I care about knowing the estimated recovery time, and updates to it.  I care about unscheduled outages daily because of a software bug.

I care about my service provider assuring me that s/he will "own" the problem, and not point to the LAN people then shrug, "not my problem".  HTH.
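
To put numbers on the "forty hours" point above, here is a quick Python sketch comparing the same 40 hours of downtime against 52 40-hour weeks and against 52 168-hour weeks. The function and variable names are made up for illustration.

```python
def availability_percent(down_hours, agreed_hours):
    # Availability is the agreed time minus downtime, as a percentage.
    return 100.0 * (1 - down_hours / agreed_hours)

down_hours = 40
business_hours = 52 * 40     # 2,080 agreed hours per year
around_the_clock = 52 * 168  # 8,736 agreed hours per year

vs_business = availability_percent(down_hours, business_hours)
vs_24x7 = availability_percent(down_hours, around_the_clock)

print(f"40 h against 52 x 40 h weeks:  {vs_business:.2f}%")
print(f"40 h against 52 x 168 h weeks: {vs_24x7:.2f}%")
```

The same 40 hours comes out at roughly 98.08% in the first case and 99.54% in the second, so the percentage means little until the agreed service hours are written down.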
 
LVL 29

Accepted Solution

by:Rich Weissler
Rich Weissler earned 215 total points
Absolutely valid concerns.  (And we used to joke about shutting down in December and just going home, because we'd already reached our target service level, and the service levels were defined without an eye on this.)  Bring this up, and determine what you and your customer consider reasonable and doable.  (Again, it's perfectly acceptable to have another department or group internal to a single company as your 'customer'.)
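
One way to address the "all the downtime in the last month" concern is to agree on a monthly downtime budget and a cap on any single continuous outage, in addition to the yearly figure. The Python sketch below is only an illustration: the limits (4 hours per month, 2 hours per outage), the record format, and the function name are hypothetical, and the real values are whatever you and your customer settle on.

```python
from collections import defaultdict

def check_sla(outages, monthly_budget_hours, max_single_outage_hours):
    """outages: list of (month, hours) tuples, one per continuous outage.

    Returns a list of human-readable violations (empty if none).
    """
    per_month = defaultdict(float)
    violations = []
    for month, hours in outages:
        per_month[month] += hours
        # Cap on the length of any one continuous outage.
        if hours > max_single_outage_hours:
            violations.append(
                f"{month}: single outage of {hours} h exceeds "
                f"{max_single_outage_hours} h"
            )
    # Budget on the total downtime within each month.
    for month, total in sorted(per_month.items()):
        if total > monthly_budget_hours:
            violations.append(
                f"{month}: {total} h down exceeds monthly budget of "
                f"{monthly_budget_hours} h"
            )
    return violations

# Hypothetical year: eleven clean months, then 40 hours down in December.
outages = [("2014-12", 40.0)]
yearly_total = sum(hours for _, hours in outages)
violations = check_sla(outages, monthly_budget_hours=4.0,
                       max_single_outage_hours=2.0)

print(f"yearly total: {yearly_total} h")
for line in violations:
    print(line)
```

Here a yearly target of 40 hours would still be met, but both the monthly budget and the single-outage cap flag December, which is the behaviour the customer actually cares about.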