Data Locations / Backup/Archive operations

Posted on 2011-03-14
Last Modified: 2013-11-05
Can anyone with experience of backing up databases help with the question below? It is about where the data for a database-driven application “could reside”, depending on setup and business operations/policy.

I basically want to identify everywhere that data for a specific application “goes”. It is sensitive data, so I want control over, or at least some insight into, everywhere this data goes during backup/retention/archive/test procedures. So I need to understand a typical backup/archive/retention workflow for an application based on a large database containing sensitive data. Could anyone provide such an example?

Also, what kind of documentation would typically describe this data “path/flow”, i.e. everywhere the data held in the database that drives this application does or could end up? Is there a name for this type of document that I could ask for? What infrastructure/architecture in a large application could this data reside on for backup/retention/archive/business continuity reasons? What are the stages of the archive/backup/retention workflow, and where does the data go during these stages?

To my non-expert mind, the data lives on the hard disc of the database server, i.e. the same server where the MSSQL service is running. But I assume that for larger applications the data may well be spread across numerous servers/storage devices.

Any pointers welcome. Terminology that is as simple as possible would be greatly appreciated, to show a non-backup/DBA person the flow of where data from a database-driven app goes/resides.
Question by: pma111

Expert Comment

ID: 35129116
The short answer is "It depends."

Each environment is unique. Without knowing the size, layout, etc. of your environment, I will assume a normal, decent-sized IT shop.

You're correct, in that the data will reside on the disk on the database server.  This disk can be local disk, meaning internal to the box, or SAN disk, which appears local but actually sits on a storage array that is centralized.

With typical backup software, the database files, logs, or whatever else the backup methodology covers are moved via Ethernet through the backup server to tape. The data is not typically kept on the backup server itself. The backup server holds metadata, which is just a record of the data. The file with the real data in it will sit on a tape.

Now, there are hundreds of caveats to this. The data can sit on the backup server, the data can go to multiple places (virtual tapes plus replication), and so on. There are many possible scenarios. But, in general, you'll have your copy on your database server, and copies on tapes.

If you want a report of this, you'll most likely have to ask your backup administrator, who can provide you with a report or catalog dump of which files are on which tapes.

Please let me know if you have any specific questions, or if you can provide any details on your backup environment.
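
To illustrate what such a catalog report can tell you, here is a minimal Python sketch. It assumes a hypothetical export of catalog records; the `CatalogEntry` fields, file names, tape labels, and sites are all invented for illustration and do not reflect the export format of any particular backup product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One hypothetical row of a backup catalog dump."""
    source_file: str   # file that was backed up
    media_kind: str    # e.g. "tape" or "virtual tape"
    media_label: str   # label of the tape or virtual volume
    site: str          # where that media is kept


def copies_by_file(entries):
    """Group catalog rows so each file lists every place a copy resides."""
    copies = defaultdict(set)
    for entry in entries:
        copies[entry.source_file].add(
            (entry.media_kind, entry.media_label, entry.site)
        )
    return {name: sorted(places) for name, places in copies.items()}


# Invented sample data, standing in for a real catalog dump.
CATALOG = [
    CatalogEntry("appdb.mdf", "tape", "T00041", "offsite vault"),
    CatalogEntry("appdb.mdf", "virtual tape", "VT0007", "primary datacentre"),
    CatalogEntry("appdb_log.ldf", "tape", "T00042", "offsite vault"),
]

for name, places in copies_by_file(CATALOG).items():
    print(name, places)
```

Grouping by source file rather than by tape answers the question "where does this sensitive file end up?" directly; grouping by tape instead would suit someone auditing a specific piece of media.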

Author Comment

ID: 35129151
Thanks for the reply

Is this workflow called anything in particular? I.e. the x workflow?

Also, are backups on tapes typically encrypted? What kind of device would someone need to restore them if they got physical access to the tapes? And what sort of cost is involved for such a restore device?

So in summary typically data goes from:

db server disc > backup server disc > tape

Author Comment

ID: 35129184
And for a retention policy.

Is it typical for huge databases that data will roll onto different disks, or will it just be backed up to tape and restored if ever required?

I wasn't sure how larger databases work, as I assume the disks on a database/data server can fill up quite quickly?

Expert Comment

ID: 35133727
Encryption is starting to come around, but is not really mainstream yet unless you have newer tape drives. Most likely, if you haven't asked for encryption, it's not encrypted.

Again, how easy it is to read the data depends on the backup software that wrote the tape. Some tapes can easily be imported, some are harder.

The transaction log files on database servers get backed up frequently. This lets the log space be reused, so the disks do not fill up as easily.

The data doesn't really sit on the backup server; it's moved from the client machine to the tape directly. Some products use disk for staging, as again there are many options.

Author Comment

ID: 35137389
Thanks for all the advice.

A couple of final things. Why do folk use a SAN to store the application's database, as opposed to using the local disc on the database server? What are the pros and cons of sticking with the local disc of the DB server and not using a SAN to store the app's DB?

And how does a SAN "appear"? I am not a network admin, but to see other servers and their various discs/shares in our domain we just use Explorer, share enumeration and port scans, and it is relatively easy to see which are DB servers running MS-SQL, email servers running Exchange, file servers, domain controllers, etc. Would a SAN device just look like a domain member server running Server 2008 or something, or would it be in a totally different domain and be spread across numerous servers as opposed to one? I appreciate that a domain full of servers, with the various local or virtual discs on the servers/workstations, and a SAN may be totally different concepts.

Expert Comment

ID: 35144696
Performance, reliability, scalability, and cost savings.

The SAN itself is not seen on the network. It's attached via Fibre Channel and uses SCSI commands. There's another centralized storage option called NAS, which does use the network. It's shared much like a share on a server, such as \\server\share. SANs are not something you scan for. Your SAN admin creates maps from the storage array to the server needing the storage. He then presents the storage to the server, which sees it just like it would internal disk. The server doesn't know it's not internal, nor does it care.
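
The idea of presenting storage only to chosen servers can be pictured with a small model. The following Python sketch is an illustration under simplifying assumptions: the `MASKING` table, host names, and LUN identifiers are hypothetical, and a real environment enforces this in the storage array and the switches rather than in a script.

```python
# Hypothetical masking table: LUN id -> hosts the SAN admin has presented it to.
MASKING = {
    "LUN-01": {"dbserver01"},
    "LUN-02": {"dbserver01", "dbserver02"},
    "LUN-03": {"mailserver01"},
}


def visible_luns(host, masking=MASKING):
    """Return the LUNs a given host can see, sorted by id."""
    return sorted(lun for lun, hosts in masking.items() if host in hosts)


print(visible_luns("dbserver01"))    # ['LUN-01', 'LUN-02']
print(visible_luns("workstation7"))  # []
```

A host that is not in the table sees nothing at all, which is why a SAN volume does not show up when browsing or scanning the network the way a file share does.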

Author Comment

ID: 35145978
How so with cost savings? And, I guess, with the other three: performance, reliability, scalability?

And say an admin had to pick a file off the SAN: how do they essentially "log on" to the SAN, and what prevents a malicious insider from logging on to the SAN as well? I.e. what type of remote access tools/authentication is used to obtain access?

Author Comment

ID: 35145998
>>Your SAN admin creates maps from the storage array to the server needing the storage

Are you saying that if, say, \\databaseserver\ needed to write files to SAN drive X, then in Explorer or My Computer on \\databaseserver\ you'd see a mapped drive X as if it were a local drive? I.e. if you got access to \\databaseserver, could you potentially also gain access to SAN drive X through the mapped drive? Can a normal user, given the right credentials/know-how, map a SAN drive to their machine as if it were a local/network drive, to save files to/from the SAN?

Accepted Solution

LinuxNubb earned 250 total points
ID: 35153443
Not mapped as in a mapped drive; it's a SAN "zone" set on the Fibre Channel switches. You cannot map to the LUNs presented from a SAN.

However, the NAS piece I talked about can be mapped to.  These are network shares that sit on the storage arrays.  They have permissions just like any other network share.

Performance - You can configure the storage arrays to be very high performing disks. New technologies such as Enterprise Flash Disks allow 30x the performance of a typical Fibre Channel drive.

Reliability - You have the ability to create very reliable LUNs on the storage arrays, as well as the ability to make snapshots or clones in the case of disaster.

Scalability - Need more disk, add more storage to the array.

Cost Savings - Most of the disk that is local to a server is wasted space. With a SAN array, you can ensure you are only buying and powering the disk drives that you are actually using. Not to mention the ease of managing your server environment without having to add new disk drives to individual servers all the time.
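
The wasted-space point can be made concrete with some arithmetic. The figures in this Python sketch are invented purely for illustration (ten servers, 600 GB of local disk each, 150 GB actually used, 30% headroom on a shared pool); real numbers will vary widely.

```python
def utilization(used_gb, provisioned_gb):
    """Fraction of provisioned capacity that is actually in use."""
    return used_gb / provisioned_gb


# All figures below are hypothetical.
SERVERS = 10
LOCAL_DISK_GB = 600        # local disk bought per server
USED_PER_SERVER_GB = 150   # space each server really uses
HEADROOM_PERCENT = 30      # spare capacity kept in the shared pool

total_used = SERVERS * USED_PER_SERVER_GB                          # 1500 GB
local_provisioned = SERVERS * LOCAL_DISK_GB                        # 6000 GB
pooled_provisioned = total_used * (100 + HEADROOM_PERCENT) // 100  # 1950 GB

print(utilization(total_used, local_provisioned))             # 0.25
print(round(utilization(total_used, pooled_provisioned), 2))  # 0.77
```

Under these assumed numbers, the local-disk layout leaves three quarters of the purchased capacity idle, while a shared pool sized to actual use plus headroom needs roughly a third of the raw disk.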

An admin cannot typically just pick a file off a SAN drive. The SAN admin doesn't have access to the file system on the SAN drive. The only way someone could get access to the data is to map the LUN in the array to their hacked server, and then mount the LUN locally. This is no trivial task, as there are many pieces required to make this happen. It would be much easier for the hacker to get onto the box with the LUN than to steal it from somewhere else.
