Determine Exactly Which Hard Drive the Database-Files/Transaction-Files are Stored

We have an old Sybase server whose database is acting up.  We have tried rebuilding the file-system and the database file.  But the problem returns.  We want to replace the hard drive that the database-files and transaction-files are stored.  We want to determine exactly which hard drive it is because we are not familiar with Unix.  Moreover, we also want to see if those files are stored in the same hard drive as the operating system or not; if they are, we will need to re-install the operating system as well as restoring the database to the new hard drive.  Please help me to determine these two things.

So far, I have found these:

•      I use sp_helpdb command and sp_helpdevice command and find that the database files and the transaction files are stored in these physical devices:

/dev/rdsk/c0t0d0s1
/dev/rdsk/c0t3d0s4
d_master
/dev/rdsk/sybdbs2

Open in new window


•      I want to know more about those physical devices.  I use the following commands to examine them:

df -k /dev/rdsk/c0t0d0s1
df -k /dev/rdsk/c0t3d0s4
df -k d_master
df -k /dev/rdsk/sybdbs2

Open in new window


The df command complains that the first three devices are “not a block device, directory or mounted resource”.

On the other hand, the df command shows the following info for the last device:

Filesystem        kbytes   used  avail capacity Mounted on
/dev/dsk/c0t3d0s0 576558 371019 147889    71%   /

Open in new window


In any case, this doesn’t tell me which drive(s) those devices are on.

•      We don’t see any external device attached to the Sybase server.  Having said this, there is a backup Sybase server, and the backup Sybase server has an external device attached to it (through a SCSI cable).  At this point, I assume the database-files and transaction-files are all stored inside the Sybase server.


By the way, the Sybase server uses this Unix operating system:

SunOS <my-server-name> 5.4 Generic_101945-62 sun4m sparc

By the way, what we have tried so far to repair the database are:

•      Tried
dbcc checkalloc(<db_name>, fix)

Open in new window

.  Unfortunately, this command could not fix and could not complete.

•      Tried
drop-db/add-new-db/restore-db-from-backup

Open in new window

.  Unfortunately the restore failed to complete.

•      Tried
fsck-to-fix-the-devices

Open in new window

.  It could not complete and complained about “MAGIC NUMBER WRONG”.

•      Tried
Analyze-option-in-format-command-to-repair-Disk-0

Open in new window

, and then add-new-db and restore-db-from-backup.  This method seemed to work.  But after one week or so, we found a table has I/O error.  Honestly, we don’t even know if the databases are really in Disk-0 or not.

Please help me to determine which hard drive those database-files and transaction-files are stored, and whether they are in the same hard drive as the Unix operating system.  Thanks in advance.
LVL 1
homerslmpsonAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Joe WoodhousePrincipal ConsultantCommented:
Ok first things first.

Your Sybase install is using raw partitions, not files. You cannot use any file-based commands on these. Think of them as areas of unformatted disk being directly addressed by your Sybase dataserver.

"Which drives are they on" is up to your Solaris setup. That's actually a question for your Solaris system administrator - it's not a database question. The last of the four seems to be some kind of logical volume but again it's still a raw partition and you can't actually use any file commands on it.

The Solaris "format" command is used to create and manage raw partitions.

Note that "d_master" isn't even a raw partition. It's an alias within ASE. To find out which raw partition it actually uses you need to look at the RUNSERVER file which by default is $SYBASE/$SYBASE_ASE/install/RUN_{servername}, unless you are on a very old version of ASE in which case it is in $SYBASE/install/RUN_{servername}.

Perhaps that's the next thing you can tell us - the ASE version?

My next questions are about the scripts you are running which are not in fact Sybase commands but are something that someone at your site has written. Can you show us the contents of restore-db-from-backup, fsck-to-fix-the-devices and Analyze-option-in-format-command-to-repair-Disk-0 please?

It would also be handy to see the output of the commands. When you say they didn't work - what happened? Did you get error messages?
0
homerslmpsonAuthor Commented:
My co-worker will be in the office within the hour.
He has a programming background.  And while he doesn't fully understand Unix, etc, he still has a basic understanding of the commands, etc.
I'm the network administrator - sadly I have NO idea what any of this stuff does.
He asked me to post the question on here in hopes someone can help us.
I'll send him an email now asking him to update this with answers to your questions asap.
Thanks in advance!
0
homerslmpsonAuthor Commented:
I am the co-worker of homerslmpson.  Please call me Jay.

The output from "select @@version" is:

SQL Server/11.0.3.2/P/Sun_svr4/OS 5.4/SWR 7578 Rollup/OPT/Mon Nov  3 22:19:21 PST 1997

Open in new window


The content of our /sybase/install/RUN_SYBASE script is:

#!/bin/sh
#
# SQL Server Information:
#  name:                   SYBASE
#  master device:          /dev/rdsk/c0t0d0s3
#  master device size:     51200
#  errorlog:               /sybase/install/SYBASE.err
#  interfaces:             /sybase
#
/sybase/bin/dataserver -d/dev/rdsk/c0t0d0s3 -sSYBASE -e/sybase/install/SYBASE.err \
-i/sybase

Open in new window


I have some additional info that I have found since I posted that question.  I add those info here just in case they are useful:

The output from cat /etc/vfstab is:

#device              device              mount         FS     fsck  mount    mount
#to mount            to fsck             point         type   pass  at boot  options
#
#/dev/dsk/c1d0s2     /dev/rdsk/c1d0s2    /usr          ufs    1     yes      -
/proc                -                   /proc         proc   -     no       -
fd                   -                   /dev/fd       fd     -     no       -
swap                 -                   /tmp          tmpfs  -     yes      -
/dev/dsk/c0t3d0s0    /dev/rdsk/c0t3d0s0  /             ufs    1     no       -
/dev/dsk/c0t3d0s6    /dev/rdsk/c0t3d0s6  /usr          ufs    1     no       -
/dev/dsk/c0t3d0s7    /dev/rdsk/c0t3d0s7  /export       ufs    2     yes      -
/dev/dsk/c0t0d0s5    /dev/rdsk/c0t0d0s5  /freespace    ufs    2     yes      -
/dev/dsk/c0t0d0s0    /dev/rdsk/c0t0d0s0  /sybase       ufs    2     yes      -
/dev/dsk/c0t3d0s3    /dev/rdsk/c0t3d0s3  /usr/openwin  ufs    2     yes      -
/dev/dsk/c0t3d0s1    -                   -             swap   -     no       -

# The following lines have been commented-out to allow Sybase to access these
# partitions and Raw Partitions.  Nov-24-1999
# /dev/dsk/c0t0d0s3  /dev/rdsk/c0t0d0s3  /master       ufs    2     yes      -
# /dev/dsk/c0t0d0s1  /dev/rdsk/c0t0d0s1  /sybdbs       ufs    2     yes      -
# /dev/dsk/c0t3d0s4  /dev/rdsk/c0t3d0s4  /syblogs      ufs    2     yes      -
# /dev/dsk/c0t3d0s5  /dev/rdsk/c0t3d0s5  /sybscurty    ufs    2     yes      -
# /dev/dsk/c0t0d0s4  /dev/rdsk/c0t0d0s4  /sybtemproc   ufs    2     yes      -

Open in new window


The output from format command is:

AVAILABLE DISK SELECTIONS:
 0. c0t0d0 <IBM-DNES-309170-SA30 cyl 11195 alt 2 hd 5 sec 320>
    /iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@0,0
 1. c0t3d0 <SEAGATE-ST34520N-1206 cyl 9004 alt 2 hd 4 sec 246>
    /iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@3,0

Open in new window


Those are not scripts.  Those are just the "methods" that I had tried to attempt to fix the problem.  For example, the method "drop-db/add-new-db/restore-db-from-backup" is simply describing what I tried to do -- that was drop the database, add a new and empty database, and restore data to the new database from backup.  Basically I described the 4 different methods that I had tried.

When I tried the method of "dbcc checkalloc(<db_name>, fix)", I got the I/O error.  The error message is something similar to this one (it may not be the exact error message because I cannot run the database in single-user mode to use the "fix" option right now; having said this, based on my memory the error message is pretty much the same as this one):

I/O error detected during read for
   BUF pointer = '0xed318720',
   MASS pointer = '0xed318720',
   (Buf#: '0'),
   page ptr = '0xecdbe000',
   dbid = '5',                  <-- This is our database
   Mass virtpage = '50372477',
   Buffer page = '40829',
   Mass status = '0x4000100',
   Buffer status = '0x1',
   size = '2048',
   cache (id: 0) = 'default data cache'.

Open in new window


When I tried the method of "drop-db/add-new-db/restore-db-from-backup", I got this error message during restoring the database:

Backup Server session id is:  136.  Use this value when executing the
'sp_volchanged' system stored procedure after fulfilling any volume change
request from the Backup Server.
Backup Server: 6.28.1.1: Dumpfile name 'isprod15160120CD ' section number 0001
mounted on disk file '/export/home/syb11.dump/isprod.dump'
Backup Server: 4.58.1.1: Database isprod: 5114 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 11830 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 17336 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 25886 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 34828 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 46942 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 54464 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 58630 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 68264 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 77318 kilobytes LOADed.
Backup Server: 4.58.1.1: Database isprod: 83048 kilobytes LOADed.
Backup Server: 3.14.2.4: Logical page 302845725 is not in the given database
map.
Msg 8009, Level 16, State 1:
Line 1:
Error encountered by Backup Server.  Please refer to Backup Server messages for
details.

Open in new window


When I tried the "fsck-to-fix-the-devices" method, I got a "MAGIC NUMBER WRONG" error.  For example, when I tried to fix the /dev/rdsk/c0t0ds1 device that some of the database-files are on, I used this command:

fsck /dev/rdsk/c0t0d0s1

Open in new window


And I got this error:

** /dev/rdsk/c0t0d0s1
BAD SUPER BLOCK: MAGIC NUMBER WRONG
USE AN ALTERNATE SUPER-BLOCK TO SUPPLY NEEDED INFORMATION;
eg. fsck [-F ufs] -o b=# [special ...]
where # is the alternate super block. SEE fsck_ufs(1M).

Open in new window


Finally I tried the "Analyze-option-in-format-command-to-repair-Disk-0" method, this worked for a while.  What I did were the following: (1) Put the system in single-user mode. (2) Run format command. (3) Choose Disk 0 (actually I had no idea if the database files were indeed in Disk 0). (4) Choose "Analyze" option to scan and repair Disk 0. (5) Quit out from Disk 0 and choose Disk 1 (there are only two disks).  (6) Choose "Analyze" option to scan and repair Disk 1. (7) Restore the database without seeing any error.

This last method seemed to work, and there was no problem for 1 or 2 weeks.  But recently when I checked the database using dbcc checkdb(..) command, I started seeing I/O error in one table.  The I/O error message is almost the same as the last time.  The only difference is that it occurs to a different table and in a different pointer value:

Checking stkreq2_o
Msg 823, Level 16, State 1:
Line 1:
I/O error detected during read for
   BUF pointer = '0xed335fa0',
   MASS pointer = '0xed335fa0',
   (Buf#: '0'),
   page ptr = '0xed034000',
   dbid = '5',                  <-- This is our database
   Mass virtpage = '50373723',
   Buffer page = '42075',
   Mass status = '0x4000100',
   Buffer status = '0x1',
   size = '2048',
   cache (id: 0) = 'default data cache'.

Open in new window


Seem like the problem is back.  This means the last method might not have fixed anything.  Might be that method had just moved things around and didn't trigger the error until recently.  Now the problem occurs to a different table (stkreq2_o).  Luckily for us at least for now is that the table is not being used.  It was left over from someone's attempt to test something.

If you need additional info, please let us know.  Thanks for your help.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Joe WoodhousePrincipal ConsultantCommented:
Ok that does sound like the underlying disk is going bad. It also looks like the Sybase raw partitions and your Solaris file systems are mixed and distributed between two underlying physical disks:

/dev/rdsk/c0t0d0s*
/dev/rdsk/c0t3d0s*

I think you're looking at a complete rebuild of your box as it won't be possible to just swap out the Sybase database devices without affecting the O/S disk.

That's quite a vintage version of Sybase you have there, too. I haven't worked on that version since I think 1998. :)
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
homerslmpsonAuthor Commented:
OK, I assume you are saying that I have two physical hard drives:

Disk 0 that is c0t0d0s
Disk 1 that is c0t3d0s

I also assume you are saying this:
All devices that have "c0t0d0s" prefix are in Disk 0 (c0t0d0s). Example: /dev/rdsk/c0t0d0s1
All devices that have "c0t3d0s" prefix are in Disk 1 (c0t3d0s). Example: /dev/rdsk/c0t3d0s4

This means our database file that is stored in sybdbs (/dev/rdsk/c0t0d0s1) is in Disk 0, and that is the IBM hard drive.  Because the database is the one giving me problem, this means Disk 0 (IBM hard drive) is having a problem that needs to be replaced.

On the other hand, Disk 1 (Seagate hard drive) has the transaction log.  I assume transaction log doesn't have error, right?  This means Seagate hard drive is OK, and I can keep it, right?

If I understand the output from mount command and from /etc/vfstab correctly, the bad IBM hard drive not only has the database file, but also has Sybase program installed in it.  But seem like it doesn't have the Unix operating system in it.  Does this mean that I need to replace the IBM hard drive, re-install Sybase in it, and restore our database into it?  But we don't need to re-install Unix, right?

Please let me know if I understand this correctly or not.  Thanks.
0
robocatCommented:
Assuming the database error is due to an error on disk 0, you can replace this disk without affecting the Solaris OS.

However I have 2 remarks:

This operation will require a good understandig of Solaris, because you need to be able to recreate the partitions, restore the sybase filesystems from backup and also restore the raw partitions. If you don't have these skills, make sure you have somebody available that does, because it's really easy to mess things up.

Also, in these days running a database from a single disk is not considered acceptable from a redundancy perspective (unless the database is really really unimportant).

From both remarks I conclude it is better to rebuild the database on a newer system and leave the old one as is.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Unix OS

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.