07.20.2005 at 12:44AM PDT, ID: 21497744

Poor NFS performance - options required to reduce getattr percentage

Asked by glennstewart in Unix Setup

Tags: nfs, getattr

Hi All,

This has been a long-term problem that I've tried to solve with reference to some online documentation and by trial and error.

Problem:
NFS performance between an Informix database server and an application server is poor.

Symptom:
Some scripts that run overnight perform an existence test on an NFS-mounted file before reading or writing it (i.e. if [ -f $file ]). Despite the fact that the file is actually there, the result is $?=1.
Furthermore, running nfsstat every half hour, with a daily reset, shows a getattr percentage on the client side that is far higher than expected.

Documentation I have referred to:

(1) http://www.princeton.edu/~psg/unix/Solaris/troubleshoot/nfsstat.html

In particular:
getattr > 40%:
The client attribute cache can be increased by setting the actimeo mount option. Note that this is not appropriate where the attributes change frequently, such as on a mail spool. In these cases, mount the filesystems with the noac option.

(2) http://www.hn.edu.cn/book/NetWork/NetworkingBookshelf_2ndEd/nfs/ch18_06.htm

getattr > 60%:
Check for possible non-default attribute cache values on NFS clients. A very high percentage of getattr requests may indicate that the attribute cache window has been reduced or set to zero with the actimeo or noac mount option. It can also indicate that the NFS filesystem implementation is doing a poor job of attribute caching.

(3) http://docs.hp.com/en/B1031-90043/ch02s03.html

Set actimeo=1 or actimeo=3 for a directory that is used and modified frequently by many NFS clients. This ensures that the file and directory attributes are kept reasonably up to date, even if they are changed frequently from various client locations.

------------

With the above in mind, several actimeo values have been tried, including actimeo=1 and actimeo=3. Both still produced a getattr share of greater than 85%.

The following is example nfsstat output:

===============================================
# nfsstat -c

Client rpc:
Connection oriented:
calls       badcalls    badxids     timeouts    newcreds    badverfs
3364645     79          79          0           0           0
timers      cantconn    nomem       interrupts
0           0           0           79
Connectionless:
calls       badcalls    retrans     badxids     timeouts    newcreds
0           0           0           0           0           0
badverfs    timers      nomem       cantsend
0           0           0           0

Client nfs:
calls       badcalls    clgets      cltoomany
3313421     79          3313490     0
Version 2: (0 calls)
null        getattr     setattr     root        lookup      readlink
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
read        wrcache     write       create      remove      rename
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
link        symlink     mkdir       rmdir       readdir     statfs
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
Version 3: (3311489 calls)
null        getattr     setattr     lookup      access      readlink
0 0%        2898793 87% 1331 0%     112749 3%   29612 0%    4 0%
read        write       create      mkdir       symlink     mknod
9521 0%     85283 2%    9729 0%     0 0%        0 0%        0 0%
remove      rmdir       rename      link        readdir     readdirplus
2310 0%     0 0%        5165 0%     0 0%        123228 3%   13698 0%
fsstat      fsinfo      pathconf    commit
3667 0%     0 0%        0 0%        16399 0%

Client nfs_acl:
Version 2: (0 calls)
null        getacl      setacl      getattr     access
0 0%        0 0%        0 0%        0 0%        0 0%
Version 3: (2002 calls)
null        getacl      setacl
0 0%        2002 100%   0 0%
===============================================
And mount points:

# nfsstat -m

/calypso from databaseserver:/calypso
 Flags:         vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,rsize=32768,wsize=32768,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=3,acdirmin=3,acdirmax=3

/topcall from topcall:C:\TCLFI
 Flags:         vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,rsize=8192,wsize=8192,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

/opt/informix from databaseserver:/opt/informix
 Flags:         vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,rsize=32768,wsize=32768,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=3,acdirmin=3,acdirmax=3

/calypso/reports from databaseserver:/calypso/reports
 Flags:         vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,rsize=32768,wsize=32768,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=3,acdirmin=3,acdirmax=3

/calypso/archive from databaseserver:/calypso/archive
 Flags:         vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,rsize=32768,wsize=32768,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=3,acdirmin=3,acdirmax=3
===============================================
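
The getattr share reported in the nfsstat -c output above can be recomputed from the raw counters, which is handy when comparing half-hourly samples by hand. This is only a sketch: the function name op_share is made up for illustration, and the figures are copied from the Version 3 section of the output above.

```sh
#!/bin/sh
# Share of one NFS operation as a percentage of all calls.
# Arguments: <operation count> <total calls>
op_share() {
    awk -v n="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "0.0"; exit }   # avoid division by zero after a reset
        printf "%.1f\n", 100 * n / total
    }'
}

# Counters from "Version 3: (3311489 calls)" above.
op_share 2898793 3311489   # getattr, prints 87.5
op_share 112749 3311489    # lookup, prints 3.4
```

nfsstat rounds to whole percentages, which is why the same counters show up as 87% and 3% in the listing.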

I suspect that this will require outages to test (which are spaced 2 weeks apart) and further trial and error. So while it might not be a difficult question to answer, the person answering might not have the complete picture. Furthermore, the person answering will have to be patient while waiting for me to allocate points.

For this reason, I have decided to allocate more points.
Over time, I might allocate extra points if there are not many submissions.

So I'll start off at just above moderately difficult: 170 points.


Thanks in advance for any contributions.
About this solution

Zone: Unix Setup
Tags: nfs, getattr
Solution Provided By: gheist
Participating Experts: 3
Solution Grade: A
 
 
07.21.2005 at 04:47AM PDT, ID: 14492659
> Despite the fact that the file is actually there, the response if $?=1.
This is a known NFS problem, unfortunately.
My experience is that this happens when different processes create and check a file (path) at the same time.
Is this the case for you?
It's some time since I struggled with this, but IIRC the only workaround was to tweak the NFS caching options, in particular to switch caching off, which carries a performance penalty (for obvious reasons).
Assisted Solution
 
07.25.2005 at 12:18AM PDT, ID: 14516347
Please post full onconfig and uname -a for Informix problem
Please post uname -a on involved systems for nfs problem
Assisted Solution
 
07.25.2005 at 12:19AM PDT, ID: 14516350
Don't forget onstat -p for Informix too.
Assisted Solution
 
07.25.2005 at 02:13PM PDT, ID: 14521947
The way I read it, you've reduced the attribute caching timeout on the client, so attributes will be re-read (getattr) from the NFS server _more_ frequently. Unless you have other NFS clients accessing these files, or the database/application suppliers advise against it, I would go the other way and _increase_ the attribute cache timeouts on the client (at least the ac*max ones) to the default of 60 seconds.
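
As a rough illustration of that suggestion, the sketch below only prints a Solaris-style mount command with larger acregmax/acdirmax values; it does not run it. The helper name ac_mount_cmd is hypothetical, the remaining options are copied from the nfsstat -m output in the question, and the acregmin/acdirmin values are left at the 3 seconds currently in use. Treat it as a starting point to adapt during a test slot, not a verified fix.

```sh
#!/bin/sh
# Dry run: print, rather than execute, an NFS mount command for review.
# Arguments: <server:/export> <mount point> <ac*max value in seconds>
ac_mount_cmd() {
    # Only the two *max timeouts vary; everything else mirrors the existing mounts.
    opts="vers=3,proto=tcp,hard,intr,acregmin=3,acregmax=$3,acdirmin=3,acdirmax=$3"
    echo "mount -F nfs -o $opts $1 $2"
}

ac_mount_cmd databaseserver:/calypso /calypso 60
```

Printing the command first makes it easy to check the option string before an outage window, and the same string can be reused for the other filesystems exported by databaseserver.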

During your next test slot, I suggest you:
- Temporarily disable any other NFS clients
- Experiment with the ac*max timeouts on the client
- If possible, NFS mount as few of the filesystems on the client as possible, to see if you can pin down which filesystems are getting hit most.  I figure you have about 100 files open on the NFS filesystems, each getting their attributes re-checked every 3 seconds; 100 * 60 * 60 *24 /3 is roughly 3 million getattrs each day.
- Try to identify the type of access or times when performance is seen to be poor, e.g. batch reports reading large chunks of data as opposed to small reads/writes by online users updating individual records.
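
The back-of-envelope estimate in the list above can be written out as a small calculation. It assumes, as the estimate does, roughly 100 open files that are each revalidated once per attribute cache timeout; the function name is made up for illustration, and a POSIX shell (for example ksh) is assumed for the arithmetic expansion.

```sh
#!/bin/sh
# Rough daily getattr volume if every open file is revalidated once per
# attribute cache timeout.
# Arguments: <open files> <timeout in seconds>
getattrs_per_day() {
    echo $(( $1 * 60 * 60 * 24 / $2 ))
}

getattrs_per_day 100 3    # 3-second timeout as currently mounted: 2880000
getattrs_per_day 100 60   # default 60-second maximum: 144000
```

Under the same assumptions, raising the timeout from 3 to 60 seconds cuts the estimate by a factor of 20. Real workloads will differ, since files are not touched at a constant rate.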

On the NFS server, look for network/disk/CPU/memory bottlenecks that might be affecting its performance as an NFS server. Disk caching on the NFS-exported filesystems or additional/faster network cards might help.

Further reading:
O'Reilly - Managing NFS and NIS, 2nd Edition - See Chapters 18 and 16:
http://www.hn.edu.cn/book/NetWork/NetworkingBookshelf_2ndEd/nfs/index.htm
Assisted Solution
 
07.31.2005 at 05:24PM PDT, ID: 14566791
Thanks for comments so far. Taking into account all answers. Still waiting on outages for this type of work.
Patience required.... but will be offering these points as soon as a result is achieved.

Increasing points to this question now.
 
07.31.2005 at 09:56PM PDT, ID: 14567541
How do you imagine this working: you just increase the points and everyone gives you a full guide for 20+ system types?

Please help out with onstat -p, Informix's onconfig file and uname -a.
Assisted Solution
 
08.18.2005 at 08:58PM PDT, ID: 14706374
Apologies gheist.... not my intention to be tight on points here.
Not expecting any answers overnight either.

-------------------------------------
onstat -p

IBM Informix Dynamic Server Version 9.30.UC7     -- On-Line -- Up 23 days 15:38:38 -- 593920 Kbytes

Profile
dskreads pagreads bufreads %cached dskwrits pagwrits bufwrits %cached
3458560948 434667395 2931857666 0.00    16792627 55567591 89726596 81.28  

isamtot  open     start    read     write    rewrite  delete   commit   rollbk
1685025386 210798806 1836164194 2864919965 45851422 9112826  3884761  3221742  33726

gp_read  gp_write gp_rewrt gp_del   gp_alloc gp_free  gp_curs
0        0        0        0        0        0        0      

ovlock   ovuserthread ovbuff   usercpu  syscpu   numckpts flushes
0        0            0        1043561.38 233927.12 1157     3105    

bufwaits lokwaits lockreqs deadlks  dltouts  ckpwaits compress seqscans
910866565 52613    2538615337 138      0        2575     999790   479973  

ixda-RA  idx-RA   da-RA    RA-pgsused lchwaits
2307355166 74940115 90060497 2470403885 132697272
-------------------------------------
uname -a
SunOS server.domain.com.au 5.8 Generic_108528-29 sun4u sparc SUNW,Sun-Fire-880
-------------------------------------
 
08.18.2005 at 09:00PM PDT, ID: 14706383
onconfig:

#**************************************************************************
#
#                          INFORMIX SOFTWARE, INC.
#
#  Title:       onconfig
#  Description: Informix Dynamic Server 2000 Configuration Parameters
#
#**************************************************************************

# Root Dbspace Configuration

ROOTNAME        rootdbs         # Root dbspace name
ROOTPATH        /calPRD0/informix/dev/root_chk1
                                # Path for device containing root dbspace
ROOTOFFSET      0               # Offset of root dbspace into device (Kbytes)
ROOTSIZE        1000000         # Size of root dbspace (Kbytes)

# Disk Mirroring Configuration Parameters

MIRROR          0               # Mirroring flag (Yes = 1, No = 0)
MIRRORPATH                      # Path for device containing mirrored root
MIRROROFFSET    0               # Offset into mirrored device (Kbytes)

# Physical Log Configuration

PHYSDBS         rootdbs         # Location (dbspace) of physical log
PHYSFILE        100000          # Physical log file size (Kbytes)

# Logical Log Configuration

LOGFILES        155             # Number of logical log files
LOGSIZE         20000           # Logical log size (Kbytes)

# Diagnostics

MSGPATH         /opt/informix/online.log # System message log file path
CONSOLE         /opt/informix/console.log # System console message path
ALARMPROGRAM    /opt/informix/etc/eventalert.sh # Alarm program path
TBLSPACE_STATS  1               # Maintain tblspace statistics

# System Archive Tape Device

TAPEDEV         /calPRD0/informix/dev/archive
#TAPEDEV         /dev/null
TAPEBLK         16              # Tape block size (Kbytes)
TAPESIZE        40000000        # Maximum amount of data to put on tape (Kbytes)

# Log Archive Tape Device

LTAPEDEV        /calPRD0/informix/dev/logical
LTAPEBLK        16              # Log tape block size (Kbytes)
LTAPESIZE       40000000        # Max amount of data to put on log tape (Kbytes)

# Optical

STAGEBLOB                       # Informix Dynamic Server 2000 staging area

# System Configuration

SERVERNUM       0               # Unique id corresponding to a OnLine instance
DBSERVERNAME    creative        # Name of default database server
DBSERVERALIASES pncrtive        # List of alternate dbservernames
NETTYPE         tlitcp,5,200,NET
RESIDENT        1               # Forced residency flag (Yes = 1, No = 0)

MULTIPROCESSOR  1               # 0 for single-processor, 1 for multi-processor
NUMCPUVPS       4               # Number of user (cpu) vps
SINGLE_CPU_VP   0               # If non-zero, limit number of cpu vps to one

NOAGE           1               # Process aging
AFF_SPROC       0               # Affinity start processor
AFF_NPROCS      0               # Affinity number of processors

# Shared Memory Parameters

LOCKS           20000           # Maximum number of locks
BUFFERS         150000          # Maximum number of shared buffers
#BUFFERS         240000          # Maximum number of shared buffers
NUMAIOVPS       128             # Number of IO vps
PHYSBUFF        32              # Physical log buffer size (Kbytes)
LOGBUFF         32              # Logical log buffer size (Kbytes)
CLEANERS        32              # Number of buffer cleaner processes
SHMBASE         0xa000000         # Shared memory base address
SHMVIRTSIZE     204800          # initial virtual shared memory segment size
SHMADD          20480           # Size of new shared memory segments (Kbytes)
SHMTOTAL        0               # Total shared memory (Kbytes). 0=>unlimited
CKPTINTVL       1800            # Check point interval (in sec)
LRUS            64              # Number of LRU queues
LRU_MAX_DIRTY   2               # LRU percent dirty begin cleaning limit
LRU_MIN_DIRTY   1               # LRU percent dirty end cleaning limit
LTXHWM          50              # Long transaction high water mark percentage
LTXEHWM         60              # Long transaction high water mark (exclusive)
TXTIMEOUT       0x12c             # Transaction timeout (in sec)
STACKSIZE       32              # Stack size (Kbytes)

# System Page Size
# BUFFSIZE - OnLine no longer supports this configuration parameter.
#            To determine the page size used by OnLine on your platform
#            see the last line of output from the command, 'onstat -b'.


# Recovery Variables
# OFF_RECVRY_THREADS:
# Number of parallel worker threads during fast recovery or an offline restore.
# ON_RECVRY_THREADS:
# Number of parallel worker threads during an online restore.

OFF_RECVRY_THREADS 10              # Default number of offline worker threads
ON_RECVRY_THREADS 1               # Default number of online worker threads

# Data Replication Variables
DRINTERVAL      30              # DR max time between DR buffer flushes (in sec)
DRTIMEOUT       30              # DR network timeout (in sec)
DRLOSTFOUND     /export/home/informix/etc/dr.lostfound # DR lost+found file path

# CDR Variables
CDR_EVALTHREADS 1,2             # evaluator threads (per-cpu-vp,additional)
CDR_DSLOCKWAIT  5               # DS lockwait timeout (seconds)
CDR_QUEUEMEM    4096            # Maximum amount of memory for any CDR queue (Kbytes)
CDR_NIFCOMPRESS 0               # Link level compression (-1 never, 0 none, 9 max)

# Backup/Restore variables
BAR_ACT_LOG     /tmp/bar_act.log
BAR_MAX_BACKUP  0
BAR_RETRY       1
BAR_NB_XPORT_COUNT 10
BAR_XFER_BUF_SIZE 31
RESTARTABLE_RESTORE off
BAR_PROGRESS_FREQ 0

# Informix Storage Manager variables
ISM_DATA_POOL   ISMData
ISM_LOG_POOL    ISMLogs

# Read Ahead Variables
RA_PAGES        32              # Number of pages to attempt to read ahead
RA_THRESHOLD    30              # Number of pages left before next group

# DBSPACETEMP:
# OnLine equivalent of DBTEMP for SE. This is the list of dbspaces
# that the OnLine SQL Engine will use to create temp tables etc.
# If specified it must be a colon separated list of dbspaces that exist
# when the OnLine system is brought online.  If not specified, or if
# all dbspaces specified are invalid, various ad hoc queries will create
# temporary files in /tmp instead.

DBSPACETEMP     tempdbs         # Default temp dbspaces

# DUMP*:
# The following parameters control the type of diagnostics information which
# is preserved when an unanticipated error condition (assertion failure) occurs
# during OnLine operations.
# For DUMPSHMEM, DUMPGCORE and DUMPCORE 1 means Yes, 0 means No.

DUMPDIR         /calPRD0/informix/tmp # Preserve diagnostics in this directory
DUMPSHMEM       1               # Dump a copy of shared memory
DUMPGCORE       1               # Dump a core image using 'gcore'
DUMPCORE        0               # Dump a core image (Warning:this aborts OnLine)
DUMPCNT         1               # Number of shared memory or gcore dumps for
                                # a single user's session

FILLFACTOR      90              # Fill factor for building indexes

# method for OnLine to use when determining current time
USEOSTIME       0               # 0: use internal time(fast), 1: get time from OS(slow)

# Parallel Database Queries (pdq)
MAX_PDQPRIORITY 0               # Maximum allowed pdqpriority
DS_MAX_QUERIES                  # Maximum number of decision support queries
DS_TOTAL_MEMORY                 # Decision support memory (Kbytes)
DS_MAX_SCANS    10              # Maximum number of decision support scans

DATASKIP        off             # List of dbspaces to skip

# OPTCOMPIND
# 0 => Nested loop joins will be preferred (where
#      possible) over sortmerge joins and hash joins.
# 1 => If the transaction isolation mode is not
#      "repeatable read", optimizer behaves as in (2)
#      below.  Otherwise it behaves as in (0) above.
# 2 => Use costs regardless of the transaction isolation
#      mode.  Nested loop joins are not necessarily
#      preferred.  Optimizer bases its decision purely
#      on costs.
OPTCOMPIND      0               # To hint the optimizer
DIRECTIVES      1               # Optimizer DIRECTIVES ON (1/Default) or OFF (0)

ONDBSPACEDOWN   2               # Dbspace down option: 0 = CONTINUE, 1 = ABORT, 2 = WAIT
OPCACHEMAX      0               # Maximum optical cache size (Kbytes)

DEADLOCK_TIMEOUT 60              # Max time to wait of lock in distributed env.
# HETERO_COMMIT (Gateway participation in distributed transactions)
# 1 => Heterogeneous Commit is enabled
# 0 (or any other value) => Heterogeneous Commit is disabled
HETERO_COMMIT   0

SBSPACENAME                     # Default smartblob space name - this is where blobs
                       # go if no sbspace is specified when the smartblob is
                       # created. It is also used by some datablades as
                       # the location to put their smartblobs.
SYSSBSPACENAME                  # Default smartblob space for use by the Informix
                       # Server. This is used primarily for Informix Server
                       # system statistics collection.

BLOCKTIMEOUT    3600            # Default timeout for system block
SYSALARMPROGRAM /opt/informix/etc/evidence.sh # System Alarm program path

# Optimization goal: -1 = ALL_ROWS(Default), 0 = FIRST_ROWS
OPT_GOAL        -1
ALLOW_NEWLINE   0               # embedded newlines(Yes = 1, No = 0 or anything but 1)

#
# The following are default settings for enabling Java in the database.
# Replace all occurrences of /usr/informix with the value of $INFORMIXDIR.

#VPCLASS        jvp,num=1       # Number of JVPs to start with
JVPJAVAHOME     /usr/informix/extend/krakatoa/jre/
JVPHOME         /usr/informix/extend/krakatoa # Krakatoa installation directory
JVPPROPFILE     /usr/informix/extend/krakatoa/.jvpprops # JVP property file
JDKVERSION      1.2             # JDK version supported by this server
JVMTHREAD       native          # Java VM thread type (green or native)
# The path to the JRE libraries relative to JVPJAVAHOME
JVPJAVALIB      /lib/sparc/

# The JRE libraries to use for the Java VM
JVPJAVAVM       hpi:jvm:java:net:math:zip:jpeg
# Classpath to use upon Java VM start-up (use _g version for debugging)
# JVPCLASSPATH  /usr/informix/extend/krakatoa/krakatoa.jar:/usr/informix/extend/krakatoa/jdbc.jar
JVPCLASSPATH    /usr/informix/extend/krakatoa/krakatoa_g.jar:/usr/informix/extend/krakatoa/jdbc_g.jar

BAR_DEBUG_LOG   /tmp/bar_dbug.log # ON-Bar Debug Log - not in /tmp please
 
08.18.2005 at 09:00PM PDT, ID: 14706386
Points increased to 300
 
08.18.2005 at 10:42PM PDT, ID: 14706685
As you can see, the read cache hit rate is 0%, which is very bad for performance.
Most likely you have to index your tables for the most popular SELECT statements.
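
For reference, the %cached columns in the onstat -p output posted earlier in the thread appear to follow 100 * (buffer operations - disk operations) / buffer operations, floored at zero. That formula is an assumption here, although it does reproduce the 81.28 shown for writes. A small sketch, with a made-up function name:

```sh
#!/bin/sh
# Cache hit percentage from onstat -p style counters.
# Arguments: <buffer operations> <disk operations>
cache_pct() {
    awk -v buf="$1" -v dsk="$2" 'BEGIN {
        if (buf == 0) { print "0.00"; exit }
        pct = 100 * (buf - dsk) / buf
        if (pct < 0) pct = 0          # more disk reads than buffer reads
        printf "%.2f\n", pct
    }'
}

cache_pct 2931857666 3458560948   # bufreads, dskreads: prints 0.00
cache_pct 89726596 16792627       # bufwrits, dskwrits: prints 81.28
```

A reported 0.00 therefore only says that dskreads is at least as large as bufreads; it does not by itself identify the cause.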

NETTYPE         tlitcp,5,200,NET
I suggest using soctcp,4,200,NET (200 connections per CPU, from "Informix Unleashed", and 4 processors). _Do not even think of using ipcshm._

MAX_PDQPRIORITY 0               # Maximum allowed pdqpriority
Set it to something between 2 and 99 to let SELECT and INSERT statements coexist on your system

NUMAIOVPS       128             # Number of IO vps
Too many for sure; leave blank for autotuning (1 + number of dbspaces).

RA_PAGES        32              # Number of pages to attempt to read ahead
RA_THRESHOLD    30              # Number of pages left before next group
Reading 31 more pages instead of just one is a bit of overkill. Leave both blank for automatic tuning; this is the heaviest I/O load.

RESIDENT        1               # Forced residency flag (Yes = 1, No = 0)
SHMTOTAL        0               # Total shared memory (Kbytes). 0=>unlimited
Both of these can bring your system down.
SHMVIRTSIZE     204800          # initial virtual shared memory segment size
SHMADD          20480           # Size of new shared memory segments (Kbytes)

In the meantime, run ipcs -a and tune SHMVIRTSIZE closer to the shared memory actually allocated a day after startup, and SHMADD to something that covers the first month of work (assuming you have enough memory for all of that; if you do not, this becomes a disk hog). I prefer rounding to 2^x to put less load on the system memory manager. In the end, make SHMTOTAL less than (system memory - 100M)/(number of serious software packages, including Informix).
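
The sizing arithmetic above can be sketched in shell. This is only an illustration, not an Informix tool: the function names are made up, rounding *up* to the next power of two is my reading of "rounding to 2^x", and the 8 GB memory figure and the count of 2 applications are placeholder inputs. It assumes a POSIX shell with $(( )) arithmetic (ksh or /usr/xpg4/bin/sh rather than the old Solaris /bin/sh).

```shell
# Round a positive integer (for example a size in KB) up to the next power of two.
round_pow2() {
    n=$1
    p=1
    while [ "$p" -lt "$n" ]; do
        p=$((p * 2))
    done
    echo "$p"
}

# Upper bound for SHMTOTAL in KB:
# (system memory - 100 MB) / number of major applications on the host.
shmtotal_limit_kb() {
    mem_kb=$1
    apps=$2
    echo $(( (mem_kb - 102400) / apps ))
}

round_pow2 204800              # current SHMVIRTSIZE; prints 262144
round_pow2 20480               # current SHMADD; prints 32768
shmtotal_limit_kb 8388608 2    # hypothetical 8 GB host, 2 applications; prints 4143104
```

The real inputs should come from the ipcs -a figures collected after a day of running, not from the placeholder numbers used here.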

Before you restart Informix (onmode -kuy ; oninit -v), post "onstat -g glo" too; maybe I can give more tips.
Assisted Solution
 
08.18.2005 at 10:53PM PDT, ID: 14706723
I strongly suggest altering SHMVIRTSIZE since even BUFFERS do not fit in there....

So my plan is for you to restart Informix now and write down the ipcs results tomorrow (or on Monday).
Then post back in a month with all the reports asked for above.

SunOS server.domain.com.au 5.8 Generic_108528-29 sun4u sparc SUNW,Sun-Fire-880

Is this server the NFS server or the client? The NFS attribute cache is on the client.
Assisted Solution
 
08.19.2005 at 03:37PM PDT, ID: 14713471
The V880 is the database server and NFS server.
The application server (and hence the NFS client) is a 440 running the same version of Solaris.

It's the attr cache on the client I would be changing.

If I haven't already mentioned, it's the writing of files by the application server to the NFS mounted filesystem that is having problems.
In numerous scripts we have tests for file existence prior to working with a given file. In rare cases, the $? of such tests will be non-zero even though the files actually exist. (This is how the problem was identified.)
 
08.20.2005 at 02:27AM PDT, ID: 14714948
In vfstab or similar, in the mount options for the NFS mount:
actimeo=1 sets a single timeout of 1 second for all file and directory attribute cache limits, instead of the defaults of up to 60 seconds.
proto=tcp makes the mount use TCP, so data is not lost over routers etc...
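
As a sketch of how the two options combine, the helper below only prints a Solaris-style mount command for review; it does not mount anything. The function name, the server name nfsserver, and the paths are placeholders I made up, not values from this thread.

```shell
# Print (do not run) an NFS mount command with TCP transport and a
# single attribute cache timeout, so it can be reviewed before use.
nfs_mount_cmd() {
    resource=$1       # server:/exported/path
    mount_point=$2    # local directory
    timeout=$3        # attribute cache timeout in seconds
    echo "mount -F nfs -o proto=tcp,actimeo=$timeout $resource $mount_point"
}

nfs_mount_cmd nfsserver:/export/data /mnt/data 1
# prints: mount -F nfs -o proto=tcp,actimeo=1 nfsserver:/export/data /mnt/data
```

The same comma-separated option string goes in the last column of the /etc/vfstab entry.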
Assisted Solution
 
08.21.2005 at 09:23PM PDT, ID: 14721762
I've tried actimeo=1, actimeo=3 and actimeo=60 at different times. Each has the same symptoms.
I'm assuming that the proto=tcp is a default, because it's not in /etc/vfstab

/etc/vfstab
#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
<SNIP>
ausydcs02:/calypso         -       /calypso            nfs - yes actimeo=3
ausydcs02:/calypso/reports -       /calypso/reports    nfs - yes actimeo=3
ausydcs02:/calypso/archive -       /calypso/archive    nfs - yes actimeo=3
ausydcs02:/calypso/email   -       /calypso/email      nfs - yes actimeo=3
ausydcs02:/opt/informix    -       /opt/informix       nfs - yes actimeo=3
 
08.21.2005 at 10:34PM PDT, ID: 14721953
And what exact system error does your program get, so that it barfs out a nonzero return code?
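
One way to find out is to wrap the existence test so that a failure also records what ls reports for the same path. This is a rough sketch, assuming the scripts use a plain [ -f ] test; check_file is a made-up name, not something from the poster's scripts.

```shell
# Test for a regular file; if the test fails, log the error text that
# ls reports for the same path, with a timestamp, on stderr.
check_file() {
    file=$1
    if [ -f "$file" ]; then
        return 0
    fi
    # ls prints the system error (for example "No such file or directory")
    # on stderr; 2>&1 captures it into the variable.
    err=$(ls -ld "$file" 2>&1)
    echo "$(date '+%Y-%m-%d %H:%M:%S') test -f failed for $file: $err" >&2
    return 1
}
```

If ls succeeds at that point, the logged line shows the directory listing instead of an error, which would itself suggest the file became visible between the two calls.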
Assisted Solution
 
08.28.2005 at 03:07AM PDT, ID: 14770604
> In numerous scripts we have tests for file existence prior to working with a given file. In rare cases, the $? of such tests will be non-zero even though the files actually exist. (This is how the problem was identified.)

NFS is unreliable (in some situations), as already said.
In (shell) scripts in particular you may run into this problem, because each check is an atomic operation, as is the file creation, but the two together are not. Hence you can never be sure that a file still does not exist after you have checked for its non-existence.

As you also mention Solaris: I know that Sun is aware of such problems, and the only recommendation I got from Sun is to disable the cache in NFS *and* in the disk's BIOS to get rid of the problem. Disabling the cache is counterproductive, as you can imagine ...

If you like, I can give you some reliable (in most cases) script code to create a file only if it does not exist, but it will perform worse than disabling all caches ...
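
For illustration only (this is not the code offered above), a common way to fold the check and the creation into one step is the shell's noclobber option. The function name is made up. It assumes a shell that supports set -C (ksh or another POSIX shell, not the old Solaris /bin/sh), and whether the underlying exclusive create is truly atomic across NFS clients depends on the NFS version and client implementation, so treat that part as an assumption to verify.

```shell
# Create an empty file only if it does not already exist.
# Returns 0 if this call created the file, non-zero otherwise.
create_if_absent() {
    # noclobber (set -C) makes the ">" redirection fail instead of
    # truncating an existing file; the subshell keeps the option local.
    ( set -C; : > "$1" ) 2>/dev/null
}
```

A caller would branch on the return status instead of running a separate [ -f ] test first.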
Assisted Solution
 
08.28.2005 at 04:39AM PDT, ID: 14770742
Since you say
1) sometimes files are missing.
2) getattr load is too high.
Any value of the attribute cache timeout will be bad for one or the other. There is no way out.
Accepted Solution
 
09.01.2005 at 11:24PM PDT, ID: 14808009
I am doing an outage for this on Saturday.
I suppose I can only raise the attr cache for this test.
 
11.03.2005 at 09:00PM PST, ID: 15222576
Apologies for a late reply on this question.

Because of the no-win situation with performance, as correctly identified by gheist... this is now being investigated by Sun.
I appreciate the help on this log, and I have closed it with the most helpful submitter getting awarded points.
 
 