• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3388
  • Last Modified:

PostgreSQL: pg_clog/0736: No such file or directory

I'm a Linux and Postgres newbie that need to figure out why I suddenly get this error message on my website:
Query failed: ERROR: could not access status of transaction 1935960436 DETAIL: Could not open file "pg_clog/0736": No such file or directory.
I have searched Google but failed to find anything useful. Is 0736 a special code error og actually a missing file?
In my postgres directory: var/lib/postgresql/8.2/main/pg_clog
I have the following files:
0000 0001 0002 0003 0004 0005 0006 0007 0008 0009 000a 000b 000c

It is only the forum page that fails to run in table debatforum.
On that table I can't run:
select * from debatforum - I get the above error
But I can run this:
select * from debatforum limit 1

So there must be some corrupt data I need to get rid of.

When trying to help then please tell me where to run any commands, e.g. vacuum in phpPgAdmin tool, or directly on the Linux prompt (I use puTTy to connect as root)
0
simonphoenix
Asked:
simonphoenix
  • 8
  • 8
1 Solution
 
gheistCommented:
You have to create empty clog file (dd if=/dev/zero of=pg_clog/0736 bs=32768 count=512 (assuming you did not change default 16MB size) and zero out damaged pages. using postgresql conf file. That incurs data loss so copy data directory to safe place before proceeding.

I will not provide step-by-step since described actions require great accuracy and understanding, so spens few hours reading postgresql docs. If you get to step-by-step you can post it here to get some advice.
0
 
simonphoenixAuthor Commented:
I copied the pg_clog directory and tried to use your:
dd if=/dev/zero of=pg_clog/0736 bs=32768 count=512
, but got this error:
dd: opening `pg_clog/0736': No such file or directory

Am I missing a file named 0736? I only have the ones mentioned in main question. I read about creating an empty clog file manually, but how is that done? Can I copy one of the existing and renaming to 0736 and use your command?
0
 
gheistCommented:
What is your Linux distribution?
it could be different pg_clog and data directory is in use.

You may succeed using "locate pg_clog" if locate is installed and indexed.
0
Restore individual SQL databases with ease

Veeam Explorer for Microsoft SQL Server delivers an easy-to-use, wizard-driven interface for restoring your databases from a backup. No expert SQL background required. Web interface provides a complete view of all available SQL databases to simplify the recovery of lost database

 
simonphoenixAuthor Commented:
Well, now I made the command on my backup directory and got a file named 0736. I copied this into the pg_clog directory and restarted the server.
Now it runs again.
I'm still baffled - what happened to 0736 in the first place? Was it ever there before?
0
 
gheistCommented:
Power failure or application crash happened. That should be evident from "messages" log.

What type of filesystem you are using?
What is your distribution?

We can improve matters fro next time bad things happen.
0
 
simonphoenixAuthor Commented:
Filesystem: you mean PostgreSQL 8.2.7?
Distribution: Ubuntu Linux 7.10
Interface: Webmin 1.410 / puTTy (SSH)
0
 
gheistCommented:
filesystem type: from mount -v command output like ext3 or ntfs.


http://www.linux-security.cn/ebooks/ulk3-html/0596005652/understandlk-CHP-18-SECT-7.html
0
 
simonphoenixAuthor Commented:
Filesystem is ext3
0
 
gheistCommented:
Into /etc/fstab edit respective line which leads to postgresql data:
/dev/sda1 / ext3 defaults,noatime,data=journal 1 1

(doubles written data compared to writeback)

Have you found out what caused damage?
0
 
simonphoenixAuthor Commented:
What happens when using you config change?
Do I need to restart postgresql server?

A possible cause for the damage:
The virtual server maxed out its memory and deadlocked. After restart it seemed to work fine for a few days.
0
 
gheistCommented:
You have to remount filesystem.

So problem falls under "power fault" and having well-journaled filesystem will prevent future damage if you power off computer.

Did you trace guilty application (Like apache+php memory leak)? Did you apply any corrective actions(MaxRequestsPerChild 1000)???
0
 
simonphoenixAuthor Commented:
Our System Administrator said that mysqld and apache were dead due to out of memory. But mysql is not used at the moment.

Do you refer to apache2.conf file when you mention MaxRequestsPerChild?

The value is set to 0 by default for:
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>
and
<IfModule mpm_worker_module>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
0
 
gheistCommented:
Set MaxRequestsPerChild value to something else (as documented in comments around where you copied module parameters from) and disable mysql (using services-admin)
These actions will prevent future memory leaks.

You have to ensure that postgresql databases and logs are always synced to disk to accomodate future power failures. Your admin must be able to help.
0
 
simonphoenixAuthor Commented:
I have implemented your suggestions (MaxRequestsPerChild set to 1000 and /dev/sda1 / ext3 defaults,noatime,data=journal 1 1), and restarted apache2 and postgresql.

I really appriciate your help - well earned points. I love when possible reasons for preventing future errors are provided. That is the way to help, thanks.
0
 
simonphoenixAuthor Commented:
I understand that you shall not tamper with this kind of thing if you are a rookie, but I had to write this since I see a lot of implied knowledge about Linux and Postgresql that I can't decifer when reading documentation. It is not user friendly, at least not until you have had some training or experience.
Thanks again
0
 
gheistCommented:
I am not sure if Ubuntu reacts adequately to / mount options change
You should redo mkinitrd and grub/lilo at least (I do not know how this is done in Ubuntu)
Read tune2fs manual page ("man tune2fs" -e continue) - in worst case reboot will end up in root not mounting and running fsck & co off the livecd.
0

Featured Post

Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

  • 8
  • 8
Tackle projects and never again get stuck behind a technical roadblock.
Join Now