Link to home
Start Free TrialLog in
Avatar of marrowyung
marrowyung

asked on

full backup + differential PostgreSQL backup

hi,


Any full backup + differential PostgreSQL backup for PostgreSQL 9.3.6 and let it restore totally good to postgreSQL 11.4 and 13.4/13.5 ?


I just want to make the backup and restore to a much newer postgreSQL quickly.


Avatar of David Favor
David Favor
Flag of United States of America image

1) Backup/Restore != Quickly

Or rather, speed depends on size of data + number of indexes (along with index complexity) across all tables.

Backup/Restore speed is a function of your data + schemas.

2) You'll use one of the standard tools...

a) https://www.postgresql.org/docs/current/app-pgdump.html

b) https://www.postgresql.org/docs/current/app-pg-dumpall.html

3) If you use the standard pg_dump or pg_dumpall, likely you can go from your current version all the way up to PostgreSQL 14.
Avatar of marrowyung
marrowyung

ASKER

3) If you use the standard pg_dump or pg_dumpall, likely you can go from your current version all the way up to PostgreSQL 14.

this is what I am using and planned to use, but I want it to be faster
now the prediction is, 43x GB need to backup for 7 hours! not good, so including restore, it will be 14 hours ?

so I think all DB has full and differential backup so that I do full backup and restore it bEFORE hand and during switchover, only differential backup and restore to the target, right?

SOLUTION
Avatar of David Favor
David Favor
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Depending on the number of indexes and materialized views restoring from a pg_dump can take much longer than the dump itself.
yes!

yes, the ALTER and create index spend a lot of time!

pg_dump is pretty much an all or nothing backup method.  Not really meant to enable point-in-time  recoveries.

in terms of DB migration from an old version to a newer version, I have no choice to use pg_dumpall, anything else you can suggest to backup everything, nothing left from old server, to new DB server and it can be much faster!


You might look at archiving the WALs and using those to apply to a full backup:

so the WAL is just like T log in MS SQL and archive log in oracle, right ?

Hi,

As I have mentioned in other comments  and here in other similar questions you can speed up the restore process by setting number of jobs to a value less than or equal to number of cores of the database host.
This example
pg_restore --jobs=4 ...

Open in new window

executes restore with 4 jobs (thread) in parallel to speed things up.
This blog explains the difference of using -j or not using it as well as the compression benefit.

Regards,
     Tomas Helgi
hi,

right now I use this command to restore

psql.exe -U postgres -p 5433 -f <full path to the backup file>

and I assume pg_restore --jobs=4 ... is focus on the SINGLE backup file by the above command ?

let me have a check on pg_restore and update you shortly.

David ,

"
1) rsync -av your raw data files to another target (machine or container).

2) After first rsync, pause PostgreSQL, then do a 2nd rsync, which will bring the database consistent."

are you saying data directory of postgreSQL ?
 I tried to copy over the data folder of new postgreSQL, it will make target postgreSQL downgrade to the SAME old verison.
hi,

as according to this:
https://dan.langille.org/2013/06/10/using-compression-with-postgresqls-pg_dump/
https://simplebackups.com/blog/postgresql-pgdump-and-pgrestore-guide-examples/

so if the database doesn't yet exist, if we want to restore a database call ABC with pg_restore using a backup.sql by pg_dumpall , I can do.

pg_restore -U postgres -j 8 -Ft -C -d ABC < backup.sql

Open in new window

?

as I can see we can do pg_dump using options:

pg_dump -Z1 -Fc

so I am expecting I can use the same parameter -Z1 -Fc ?
e.g. pg_dumpall -U postgres -p 5432 -Z1 -Fc  > backup.sql ?  

from this link:
https://dan.langille.org/2013/06/10/using-compression-with-postgresqls-pg_dump/

In the following table, I'm using the 'real' value shown above. In addition, I'm also showing the time it took to restore the dump via pg_store -d testing.

so "testing" there is the name of the database going to restore?


Tomas ,

any update for me ?
Hi,

pg_restore -d testing backupfile

Open in new window

This means that pg_restore connects to the testing database and restores the backupfile into that database (as clearly stated in the manual ;) ).

Regards,
    Tomas Helgi
hi,

tks.

but what I also want to know is , after intergrate the parallel execution, is the command for restore is ok?

pg_restore -U postgres -j 8 -Ft -C -d ABC < backup.sql,

or it should be

pg_restore -U postgres -j 8 -Ft -C -d ABC  backup.sql
e.g. restore backup.sql to ABC database ?

and for pd_dumpall


so I am expecting I can use the same parameter -Z1 -Fc ?
e.g. pg_dumpall -U postgres -p 5432 -Z1 -Fc  > backup.sql ?  

Open in new window

ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
hi,

tks.

so "Best compression is -Z9. Take a look at the manual for params that can be used with pg_dumpall vs pg_dump."
should be
pg_dumpall -U postgres -p 5432 -Z19 -Fc  > backup.sql ?  

pg_restore -U postgres -j 8 -Ft -C -d ABC  backup.sql.tar 
Note that -Ft tells the restore that the backup is a tar archive file.

but using pg_dumpall the output format is .sql extension, so I should just remove -Ft ?


Hi,

The -Z and -F params does not work with pg_dumpall (only pg_dump) see the manual for more info.
And the -Z param is followed by only values 1 to 9 for example -Z9 (not -Z19  ;) ).

but using pg_dumpall the output format is .sql extension, so I should just remove -Ft ? 
Yes, the -F param is only valid when you use  pg_dump and pg_restore. Again, see the manual for more info.

Happy New Year!

Regards,
    Tomas Helgi
You specifically asked, "I just want to make the backup and restore to a much newer postgreSQL quickly."

For large databases, the only way to do a quick backup + restore is to work with raw files, avoiding pg_dump + pg_dumpall, which you've already likely determined won't work, so you opened this question.

Working with raw database files is orders of magnitudes faster for restores... And your backups may also orders of magnitudes faster.

This all depends on size of your data + table schemas.

Consider a restore of a large data file with 0 indexes, which will be fast.

Now add in many complex, mult-column indexes, so during a restore you're restoring data (fast), then rebuilding all indexes (can be massively slow).

Working with raw files, there is no index production step.

You just restore data + start database + you're back up + running.
Tomas  ,

Happy new year.

"he -Z and -F params does not work with pg_dumpall (only pg_dump) see the manual for more info."

this means we can't do anything as we can only use pg_dumpall for now , as we don't know if we use pg_dump for EACH user database what else will be missed during this project.

Yes, the -F param is only valid when you use  pg_dump and pg_restore. Again, see the manual for more info.

seems can only focus on pg_restore.

David.

tks.

Working with raw files, there is no index production step.

so how to backup and restore RAW file for postgreSQL ?

Now add in many complex, mult-column indexes, so during a restore you're restoring data (fast), then rebuilding all indexes (can be massively slow).

I agree on that.

Tomas,

pg_restore -U postgres -j 8  -C -d <target database>  backup.sql
by this, it seems it only restore the content of ONE DB only, not everything from the source DB?

so if I want to restore EVERYTHING from source do I have no choice to do:

psql.exe -U postgres -p 5435 -f <full path to the pg_dumpall output files>  ?

and also ask both command above to output error to a text file ?
1) pg_restore -U postgres -j 8  -C -d <target database>  backup.sql  > error.txt
2) psql.exe -U postgres -p 5435 -f <full path to the pg_dumpall output files> > error.txt

?
Ommit the -d <target database> in the pg_restore to restore everything.

Regards,
   Tomas Helgi

1) pg_restore -U postgres -j 8  -C -d <target database>  backup.sql  > error.txt

ok, just that pg_restore -U postgres -j 8  -C   backup.sql  > error.txt
now i do this:

 psql.exe -U postgres -p 5435 -f <full path to the pg_dumpall output files> > error.txt

and it seems that no error will be in the error.txt and ONLY show in the command prompt, this is the result when I tried to restore content from pg_dumpall from 9.3.6 to 11.14, by using psql.exe -U postgres -p 5435 -f <full path to the pg_dumpall output files> > error.txt :

psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:20: ERROR:  role "postgres" alread
y exists
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950869665: ERROR:  could not creat
e unique index "alf_lock_pkey"
DETAIL:  Key (id)=(8) is duplicated.
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950869721: ERROR:  could not creat
e unique index "alf_node_pkey"
DETAIL:  Key (id)=(845) is duplicated.
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950869729: ERROR:  could not creat
e unique index "alf_node_properties_pkey"
DETAIL:  Key (node_id, qname_id, list_index, locale_id)=(51394005, 240, -1, 1) i
s duplicated.
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950871476: ERROR:  could not creat
e unique index "idx_alf_lock_key"
DETAIL:  Key (shared_resource_id, excl_resource_id)=(8, 8) is duplicated.
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950871700: ERROR:  could not creat
e unique index "store_id"
DETAIL:  Key (store_id, uuid)=(6, 2e2d0396-f940-433c-b48a-f66966d43665) is dupli
cated.
CONTEXT:  parallel worker
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872050: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872058: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872146: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872162: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872170: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872218: ERROR:  insert or updat
e on table "alf_node" violates foreign key constraint "fk_alf_node_txn"
DETAIL:  Key (transaction_id)=(86296005) is not present in table "alf_transactio
n".
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872234: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872322: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872330: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872338: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:950872354: ERROR:  there is no uni
que constraint matching given keys for referenced table "alf_node"


Open in new window

 
psql:E:/DB_Backup/BackupUsingPostgre11_14.sql:20: ERROR:  role "postgres" alread
y exists

should  ignore it ? as the postgresql exists during installation ! or how can I force it to overwrite the existing one ?

for the rest of alert, any postgreSQL script to find out duplicated record ?

I am now testing the restore from 9.3.6 t o 13.5.1 and see the difference

how I try to ps_restore:

E:\Program Files\PostgreSQL\11\bin>pg_restore -U postgres -j 8  -C < E:\DB_Backu
p\BackupUsingPostgre11_14.sql  > E:\DB_Backup\error_restoreTO11_pgstore.txt

Open in new window

error:
pg_restore: [archiver] input file does not appear to be a valid archive

E:\Program Files\PostgreSQL\11\bin>pg_restore -U postgres -j 8  -C E:\DB_Backup\
BackupUsingPostgre11_14.sql  > E:\DB_Backup\error_restoreTO11_pgstore.txt

Open in new window

error:
pg_restore: [archiver] input file does not appear to be a valid archive

and only this one SEEMS working:

E:\Program Files\PostgreSQL\11\bin>pg_restore -U postgres -p 5433 -j 8  -C -f E:\DB_Backup\BackupUsingPostgre11_14.sql  > E:\DB_Backup\error_restoreTO11_pgstore
.txt

Open in new window

but the process seem shows nothing is going on, no DB created, any idea?



hi,

any update for me ?
one thing, I can't find if pg_dumpall totally lock/block any table for operation but pg_dump! any idea on if pg_dumpall also do not block/lock any table for long?

I just pg_dump do not lock/block tables at all so application can keep goes on as usual .

and here:
https://gpdb.docs.pivotal.io/6-5/utility_guide/ref/pg_dumpall.html

it just say:

--lock-wait-timeout=timeout
Do not wait forever to acquire shared table locks at the beginning of the dump. Instead, fail if unable to lock a table within the specified timeout.

and probably this statement present it is not going to lock:

Notes

Since pg_dumpall calls pg_dump internally, some diagnostic messages will refer to pg_dump.

so it is just the SAME as pg_dump ... ?

pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers).

hi,

I have a new post about PostgreSQL full and differential backup here, please help.

https://www.experts-exchange.com/questions/29233111/Alfresco-PostgrSQL-9-3-6-Continuous-Archiving-and-Point-in-Time-Recovery-PITR-setup.html

BTW ,I tried this:

pg_restore -U postgres -j 8  -C E:\DB_Backup\
BackupUsingPostgre11_14.sql  > E:\DB_Backup\error_restoreTO11_pgstore.txt

Open in new window

ONLY the 
error_restoreTO11_pgstore.txt 

Open in new window

can be created but nothing has been restore, so the pg_restore just pend there forever.
found out that full + WAL backup and restore ONLY work between the SAME postgreSQL version, so from 9.3.6 to 11.14.1 /13.5.2, it is not applicatable.


please also read:
https://www.percona.com/blog/2019/04/04/replication-between-postgresql-versions-using-logical-replication/

Here we’ll cover two types of replication you could implement:
  1. Replication between PostgreSQL 10 and 11 versions using built-in logical replication.
  2. Replication between PostgreSQL 9.4 or (<  PG 11) to PostgreSQL 11 using an extension named pglogical 

we are running postgreSQL 9.3.6, so we can't use any of this.

so seems must relies on pg_dumpall and restore ! in our case, 10 hours of downtime.
tks all but pg_restore is not going to work ! I will test pg_upgrade and then replication