• Status: Solved

IT contingency plans samples

Dear all,
I am developing the IT contingency plan for my company. This is the first time I have done this, and honestly I have a lot of questions. For example, should the IT contingency plan be activated only when the company's main database is down, or also when a secondary system is down?
How much should it cover? Every server and system? Down to the departmental level, or only global systems?
I hope you can guide me to a site where I can download good templates or samples.
Thanks
sparks2000
 
Toxacon commented:
* Find out which systems support your core business functions.
* Find out how much it will cost per hour if these servers are down.
* Find out how much it will cost per hour if every computer is down.
* Find out how much it will cost per hour if one site is down.
* Find out how much downtime cost you can tolerate.
* Figure out why downtime would occur (virus attack, no electricity, civil unrest, sabotage, etc.) and how likely each cause is.
* Calculate prices for different recovery models against different threats, and let the CEO or CIO decide how much money to invest in fault or threat tolerance.
* Make the plans according to the budget.
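
The cost and likelihood steps above can be turned into a rough comparison. The following is only a minimal sketch: the threat names, frequencies, hourly cost, and recovery options are made-up figures, and the `Threat` and `RecoveryOption` classes are hypothetical helpers, not part of any standard method.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    events_per_year: float  # estimated likelihood (assumption)
    outage_hours: float     # expected downtime per event with no mitigation


@dataclass
class RecoveryOption:
    name: str
    annual_cost: float         # what the option costs per year
    downtime_reduction: float  # fraction of downtime avoided, 0.0 to 1.0


def expected_annual_loss(threats, cost_per_hour):
    """Sum of likelihood x outage hours x hourly downtime cost."""
    return sum(t.events_per_year * t.outage_hours * cost_per_hour for t in threats)


def net_annual_cost(option, baseline_loss):
    """Option price plus the loss that still remains after buying it."""
    residual_loss = baseline_loss * (1 - option.downtime_reduction)
    return option.annual_cost + residual_loss


# Illustrative numbers only.
threats = [
    Threat("no electricity", events_per_year=0.5, outage_hours=8),
    Threat("virus attack", events_per_year=0.25, outage_hours=24),
]
baseline = expected_annual_loss(threats, cost_per_hour=1000)

options = [
    RecoveryOption("nightly backup only", annual_cost=1000, downtime_reduction=0.5),
    RecoveryOption("warm standby", annual_cost=6000, downtime_reduction=0.75),
]
for option in options:
    print(option.name, net_annual_cost(option, baseline))
```

A table like this is what the CEO or CIO would look at when deciding how much to invest; the real inputs have to come from the business, not from IT alone.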
 
Steve commented:
Hi sparks2000,

IT contingency/disaster recovery plans should cover many aspects of your system, but the details depend on your company, as they differ in each case.

In general, consider the following:

What software/services are regarded as critical to business function
What hardware is needed to ensure this software/service works
What hardware is regarded as critical to business function
What cost you associate with any downtime of these services/functions (per day or per hour as required)

Specific elements to consider include:
Internet access
E-mail flow
access to documents/files
any databases/specific software used

Based on this you can start to assess the backups or contingency plans you have in place to see if they sufficiently cover the requirements or loss of business you have identified above.

You should also take this further to assess other systems that are not deemed 'business critical', but be aware that the cost of these being unavailable is usually much lower.

If you identify services that really are critical to your business, you should ensure they are backed up and that you have assessed how quickly you could get the services running again in the events below:

Server failure/crash - replacement parts needed
Server failure/crash - entire server replacement needed
Data corruption
Building inaccessible (e.g. fire/flood/snowed in)
Theft of hardware
Total loss of systems (e.g. building destroyed by fire and systems are irreparable)

In most cases, companies who go through this process realise that the amount of complaining they did when you tried to convince them to buy a backup drive and good backup software was a bit ridiculous.

Choosing the cheapest option for backups turns out to be a HUGE mistake when the system breaks and the business loses £20,000 per day for a week while you try to build a new server and recover the data.

All of a sudden, that £1000 backup solution doesn't sound so expensive...
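
The arithmetic behind that comparison, as a quick sketch. It assumes "a week" means seven days of lost business; the variable names are only illustrative.

```python
# Figures from the example above; the 7-day week is an assumption.
daily_loss_gbp = 20_000
outage_days = 7
backup_cost_gbp = 1_000

outage_cost_gbp = daily_loss_gbp * outage_days
cost_ratio = outage_cost_gbp // backup_cost_gbp

print(f"Outage cost: £{outage_cost_gbp:,}")
print(f"That is {cost_ratio} times the price of the backup solution")
```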
 
madunix commented:
A business impact analysis (BIA) is one of the key steps in the development of a business continuity plan (BCP). A BIA will identify the diverse events that could impact the continuity of the operations of an organization. Depending on the complexity of the organization, there could be more than one plan to address various aspects of business continuity and disaster recovery. These do not necessarily have to be integrated into one single plan; however, each plan should be consistent with other plans to have a viable business continuity planning strategy.

The BCP is designed to create a state of readiness that would minimize disruptions and enable rapid recovery following a disaster occurrence. It is reviewed and updated, at a minimum, on an annual basis to ensure the continuous accuracy and validity of the plan.

Normally a BCP Steering Committee directs and approves all matters relating to the development, testing, implementation and ongoing maintenance of the BCP.

In our case, the BCP differentiates between minor contingencies (where we can still use our principal operations site) and major contingencies (where our principal site is no longer usable and operations or systems have to be run from the back-up site).

Our plan covers all aspects of the business procedure, including systems. The plan contains details of the hardware, software and staffing requirements for off-site processing in circumstances that render our main processing area inoperable. It allocates tasks to pre-designated staff members involved in the establishment of the off-site operation. We have back-up machines in our offsite centre which contain full copies of all applications and data. The data is mirrored from the production environment.

The business continuity plan (BCP) should be reviewed every time a risk assessment is completed for the organization. Training of the employees and a simulation should be performed after the business continuity plan has been deemed adequate for the organization. Although the primary business objective of the BCP is to mitigate the risk and impact of a business interruption, the dominating objective remains the protection of human life.


Make sure you understand the purpose of disaster planning. The BCP is there to ensure the business’s survival, not just to recover computer/server systems. Please note that creating a plan will be a new experience and, without guidance, a huge amount of time will be spent learning about disaster recovery. Management has a key role to play in the planning, business process prioritization and risk assessment, and development and finalization of continuity plans for the organization.

Our plan covers the following aspects of disaster recovery:
• Composition of the recovery management team
• Their roles, responsibilities and contact details
• Business recovery priorities
• Alternate business location
• Infrastructure requirements, including hardware and software
• Staff/customer notification checklists
• Communication and administrative procedures
• Recovery procedures
• List of documents maintained off-site
• External contact list

A good starting point is a self-analysis exercise, which will help clear the clouds and give a true picture of the organization’s preparedness in the event of a disaster. Below are a few points that you can use for reference:
- What types of applications / services are hosted?
- Which departments or units depend on them?
- Speak to departments and assess the impact of unavailability of systems / services.
- Identify their RTO (maximum acceptable downtime before critical activities must be recovered) and RPO (maximum acceptable data loss that can still support resumption of critical activities).
- Do a risk assessment at core IT level, meaning at the hardware and software level, and check whether HA (High Availability) and DR are in place. If not, your strategies on clustering / replication will play an important role.
- Do a risk assessment at data center / facility level for anything that can hamper your services. This will lead you to planning alternate DR sites and network recovery requirements along with application / system recovery.
- Once the strategies have been completed, prepare detailed plans that cover the step-by-step recovery of the applications / systems and identify the people involved, with their contact details, roles and responsibilities.
- Test the plan to check whether recovery is successful. The effectiveness of the BCP can best be evaluated by reviewing the results from business continuity tests, since many organizations simply have a BCP framework in the form of a paper document.
- Keep the BCP documents updated.
etc.
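
As a rough illustration of the RTO / RPO point above, the sketch below compares each service's targets with what the current setup delivers. It assumes the worst-case data loss is about one backup interval and that restore time comes from your last recovery test; the `Service` class, the `objective_gaps` function, and all figures are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Service:
    name: str
    rto: timedelta               # maximum acceptable downtime
    rpo: timedelta               # maximum acceptable data loss
    backup_interval: timedelta   # time between backups or replication points
    measured_restore: timedelta  # restore time seen in the last recovery test


def objective_gaps(service):
    """Return which objectives the current setup fails to meet."""
    gaps = []
    # Worst case, everything since the last backup is lost.
    if service.backup_interval > service.rpo:
        gaps.append("RPO")
    if service.measured_restore > service.rto:
        gaps.append("RTO")
    return gaps


# Illustrative services and figures only.
payroll = Service("payroll", rto=timedelta(hours=24), rpo=timedelta(hours=24),
                  backup_interval=timedelta(hours=24),
                  measured_restore=timedelta(hours=6))
orders = Service("order database", rto=timedelta(hours=4), rpo=timedelta(hours=1),
                 backup_interval=timedelta(hours=24),
                 measured_restore=timedelta(hours=6))

for service in (payroll, orders):
    print(service.name, objective_gaps(service) or "meets objectives")
```

Any service that reports a gap is a candidate for the clustering / replication strategies mentioned in the list.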

For templates, search:
http://www.scribd.com
http://www.docstoc.com/
http://iase.disa.mil/index2.html
http://searchstorage.bitpipe.com/
http://searchdatabackup.techtarget.com/search/1,293876,sid187,00.html?query=CDP&x=0&y=0
http://www.elitetele.com/Disaster-Recovery/
http://www.redbooks.ibm.com/redbooks/pdfs/sg246844.pdf
http://www.allhandsconsulting.com/toolbox/BCP_2-07a.PDF
http://www.redbooks.ibm.com/cgi-bin/searchsite.cgi?query=Disaster+AND+Recovery
Also search Google for the following: "searchdisasterrecovery DR Plan.doc"

 
madunix commented:
You could audit your BCP with the following questions, i.e. a survey for your Business Continuity (Contingency) and Disaster Recovery Plan.

1) Does your organization have written plans for business continuity and IT disaster recovery to follow during a disaster or significant disruption?
YES ?
NO ?

2) If you answered “Yes” to question (1) do the established plans cover critical business functions with recovery priorities?
YES ?
NO ?

3) Have you performed a business impact analysis including Recovery Time Objective and Recovery Point Objective?
YES ?
NO ?

4) Does your Business Impact Analysis calculate and classify the financial risk of disturbances to all vital functions?
YES ?
NO ?

5) Have you taken actions to mitigate known risks and single points of failure (e.g. power loss, physical access, etc.)?
YES ?
NO ?

6) Do you have a dedicated team of professionals focused on business continuity and/or IT disaster recovery?
YES ?
NO ?

7) If you answered “No” to question (6), is there an established external business continuity and disaster recovery service provider to handle your planning needs?
YES ?
NO ?

8) Is senior management fully committed to disaster recovery and business continuity?
YES ?
NO ?

9) Are your disaster recovery costs, options, and disaster declaration procedures understandable?
YES ?
NO ?

10) Do you have a sufficient budget to support your disaster recovery program?
YES ?
NO ?


11) Is your business continuity plan updated regularly to keep it current with hardware, software, business and staffing changes?
YES ?
NO ?

12) Is there an organized training and awareness program for your employees?
YES ?
NO ?

13) Does your disaster recovery centre have an operation centre?
YES ?
NO ?

14) Is there remote accessibility to your disaster recovery centre?
YES ?
NO ?


Tests

15) If you answered “Yes” to question (1), is the plan periodically tested?
YES ?
NO ?

16) If you answered “Yes” to question (15), how often is the plan tested?
Annually                  -----------------------
Semi-annually            -----------------------
Other (Please specify)      -----------------------


17) Did you test the plan in 20xx?
YES ?
NO ?


18) If you answered “Yes” to question (17), please specify the test dates and whether the tests were satisfactory.
Test date                       Satisfactory: Yes         No
(1) ----------------------                    ……          ……
(2) ----------------------                    ……          ……
(3) ----------------------                    ……          ……
(4) ----------------------                    ……          ……
(5) ----------------------                    ……          ……
(6) ----------------------                    ……          ……

18.1) Who rates the success criteria of the executed tests? Internally Rated ? Other ?

19) Do the tests include market participants who have direct or indirect relations with your organizations?
YES ?
NO ?

19.1) Do you practice spontaneous tests to recover from a Disaster Recovery Site and resume the day from that location?
YES ?
NO ?

20) Have you tested your plan using a worst-case scenario?
YES ?
NO ?

21) Has your plan been tested for the possibility of facility loss?
YES ?
NO ?


22) If you answered “Yes” to question (20) or (21), did testing prove that you can meet all Recovery Time Objectives and Recovery Point Objectives?
YES ?
NO ?

23) In the event of a disaster, how long does it take you to bring your system back up? (Please specify)

……………………………………………………………………………………………


Crisis Communication

24) Does your organization have a documented crisis management process?
YES ?
NO ?

25) If you answered “Yes” to question (24), during the event of a crisis does the process cover internal and external communications?
YES ?
NO ?

26) In the case of a disaster are you prepared to address liabilities and responsibilities?
YES ?
NO ?

27) In the event of an outage or emergency do you provide detailed contact information?
YES ?
NO ?


Back-up and Recovery Best Practices

28) Do you have a recovery strategy?
YES ?
NO ?

29) If you answered “Yes” to question (28), what is your organization recovery strategy?
-Hot Sites
-Warm Sites
-Cold sites
-Duplicate information processing facilities
-Mobile sites
-Reciprocal arrangements with other organizations

30) Where is your disaster recovery centre, and how many kilometers is it from your organization? (Please specify)
………………………………………………………………………………………………………


31) Do you have a backup strategy?
YES ?
NO ?

32) Do you have written backup and archive procedures?
YES ?
NO ?

33) Do you have industry-standard back-up solutions? (media, tape drives, library, software etc.)
YES ?
NO ?

34) Do you have a migration policy to "refresh" tape technology and data formats every three to five years to ensure sufficient permanent access?
YES ?
NO ?

35) Do you always use the "verify" option to ensure that your system backups are working?
YES ?
NO ?


36) Do you periodically test your back-up media?
YES ?
NO ?

37) Can you access your past data with your back-up strategy?
YES ?
NO ?

38) Are backups fully automated for unattended operation (autoloaders, etc.)?
YES ?
NO ?

39) If your backups are manual, do you follow a sound process and written procedures?
YES ?
NO ?

40) If your backups are not manual, do you have online backup?
YES ?
NO ?

41) Does your current backup and recovery methodology fulfill management’s business uptime needs?
YES ?
NO ?


Archive Best Practices

42) Do you regularly send your backup copy to a safe, off-site archive?
YES ?
NO ?

43) Do you have a retention period for backup data to meet legal obligations?
YES ?
NO ?

44) Is media properly taken care of when shipped, handled, stored, and used?
YES ?
NO ?

45) Is your archive system designed to facilitate data format standards and an archive tape tracking method?
YES ?
NO ?



GLOSSARY
Backup Strategy: Planned approach to data protection through a backup-policy that assigns the backup responsibilities to the appropriate personnel or departments, and sets the duplication time cycles.

Business continuity (contingency): A state of continued, uninterrupted operation of a business.

Business continuity management: A whole-of-business approach that includes policies, standards, and procedures for ensuring that specified operations can be maintained or recovered in a timely fashion in the event of a disruption. Its purpose is to minimize the operational, financial, legal, reputational and other material consequences arising from a disruption.

Business continuity plan: A component of business continuity management. A business continuity plan is a comprehensive written plan of action that sets out the procedures and systems necessary to continue or restore the operation of an organization in the event of a disruption.

Business impact analysis: A component of business continuity management. Business impact analysis is the process of identifying and measuring (quantitatively and qualitatively) the business impact or loss of business processes in the event of a disruption. It is used to identify recovery priorities, recovery resource requirements, and essential staff and to help shape a business continuity plan.

Cold Site: An information system (IS) backup facility that has the necessary electrical and physical components of a computer facility, but does not have the computer equipment in place. The site is ready to receive the necessary replacement computer equipment in the event the users have to move from their main computing location to the alternative facility.

Crisis Management: Set of procedures applied in handling, containment, and resolution of an emergency in planned and coordinated steps.

Critical business functions: Any activity, function, process, or service, the loss of which would be material to the continued operation of the financial industry participant, financial authority, and/or financial system concerned. Whether a particular operation or service is “critical” depends on the nature of the relevant organization or financial system. Data centre operations are an example of critical operations to most financial industry participants. Examples of critical services to financial systems include, but are not limited to, large value payment processing, clearing and settlement of transactions, and supporting systems such as funding and reconciliation services.

Disaster Recovery: Activities and programs designed to return the organization to an acceptable condition. The ability to respond to an interruption in services by implementing a disaster recovery plan to restore an organization's critical business functions.

Disaster Recovery Plan: A set of human, physical, technical and procedural resources to recover, within a defined time and cost, an activity interrupted by an emergency or disaster.

Duplicate (redundant) IPFs (information processing facilities): They are dedicated, self-developed recovery sites that can back up critical applications. They can range in form from a standby hot site to a reciprocal agreement with another company installation.

Hot Sites: They are fully configured and ready to operate within several hours. The equipment, network and systems software must be compatible with the primary installation being backed up. The only additional needs are staff, programs, data files and documentation.

Mobile Sites: This is a specially designed trailer that can be quickly transported to a business location or to an alternate site to provide a ready-conditioned IPF. These mobile sites can be connected to form larger work areas and can be preconfigured with servers, desktop computers, communications equipment, and even microwave and satellite data links. They are a useful alternative when there are no recovery facilities in the immediate geographic area. They are also useful in case of a widespread disaster and are a cost-effective alternative to duplicate IPFs for a multi-office organization.

Reciprocal Agreements with other organizations: This is a less frequently used method between two or more organizations with similar equipment or applications. Under the typical agreement, participants promise to provide computer time to each other when an emergency arises.

Recovery Point Objective (RPO): The recovery point objective is determined based on the acceptable data loss in case of disruption of operations. It indicates the earliest point in time to which it is acceptable to recover the data. RPO effectively quantifies permissible amount of data loss in case of interruption.

Recovery Strategy: A recovery strategy identifies the best way to recover a system in case of an interruption, including disaster, and provides guidance based on which detailed recovery procedures can be developed.

Recovery Testing: A test to check the system’s ability to recover after a software or hardware failure.

Recovery Time Objective (RTO): The recovery time objective is determined based on the acceptable down time in case of disruption of operations. It indicates the earliest point in time at which the business operations must resume after disaster.

Warm Sites: They are partially configured, usually with network connections and selected peripheral equipment, such as disk drives, tape drives, and controllers, but without the main computer. Sometimes a warm site is equipped with a less powerful CPU than the one generally used. The assumption behind the warm site concept is that the computer can usually be obtained quickly for emergency installation and, since the computer is the most expensive unit, such an arrangement is less costly than a hot site.

Worst case scenario: Worst possible environment or outcome out of the several possibilities in planning or simulation. Imagining a worst case scenario helps in planning expenditure cuts, in formulating contingency plans, and in setting aside enough reserves to cushion the impact if the event or situation actually occurs.