Avatar of marrowyung
marrowyung
 asked on

rebuild index and update statistics for a postgresql 9.3.6 DB with command.

hi,


I am trying to rebuild index and update statistics for a postgresql 9.3.6 DB and the command I use is:


 REINDEX DATABASE  <Application database>;  
  REINDEX SYSTEM <Application database> ;
  VACUUM(FULL, ANALYZE, VERBOSE);


I ran that for the first time and  137GB database takes 9 hours to finish but later that the DB size seems goes down to 80GB.


PostgreSQLDatabases

Avatar of undefined
Last Comment
marrowyung

8/22/2022 - Mon
slightwv (䄆 Netminder)

and what is your question?

If you asking about why it is about half the size, did you recently delete a LOT of data in the database?  Rebuilding of indexes will remove the dead rows from the indexes.
ASKER CERTIFIED SOLUTION
Tomas Helgi Johannsson

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
David Favor

What you describe is expected behavior.
marrowyung

ASKER
hi,

I think this is the result of VACUUM(FULL, ANALYZE, VERBOSE);
What I want to know is, if the command I post is too much on index rebuild and update statistics ?

what should the correct one if it it not !

for postgreSQL 9.3., the command above will freeze the whole DB and I can't run anything else? that's why I am wondering why is it?

or any other higher version number of PostgreSQL, when running the command above, is not going to freeze the whole DB?

also it seems I can enable the autovacuum feature of postgreSQL so it free up space for us automatically?


Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
marrowyung

ASKER
what I found out is , this command:

REINDEX DATABASE  <Application database>;  
  REINDEX SYSTEM <Application database> ;
  VACUUM(FULL, ANALYZE, VERBOSE);

only work in PostgresQL 9.3.6 but not on 11.14, only this one ok:
VACUUM(FULL, ANALYZE, VERBOSE);


SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
marrowyung

ASKER
good ! then I am not sure why !!

so these is the command is the command I need to rebuild index and update statistics for all system and user database?

I run these command today again and the result :




Tomas Helgi Johannsson

Hi,
All three commands,
REINDEX DATABASE  <Application database>;  
  REINDEX SYSTEM <Application database> ;
  VACUUM(FULL, ANALYZE, VERBOSE);
are SQL commands and require that you to connect to the database that you intend to run it on. In other words, they only work on currently connected database and the tables/indexes that it holds. The error you get and the manual clearly imply this.

Regards,
     Tomas Helgi
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
marrowyung

ASKER
are SQL commands and require that you to connect to the database that you intend to run it on.

tks but I am connected to it, I can run select version(), but not that.
marrowyung

ASKER


it is connected.





marrowyung

ASKER
and I still found only this one:

 VACUUM(FULL, ANALYZE, VERBOSE);

can be run



I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
Tomas Helgi Johannsson

Hi,
Try
REINDEX system "alfresco";

Open in new window

If that works then your database was created with double quotes around the name.
Note that databases,tables,views etc created with double quotes around the name are case sensitive and therefore you need to address the objects with "name" but addressing the objects only with name will simply not work.
For example
SELECT * FROM "table"

Open in new window

will work but not
SELECT * FROM table

Open in new window



Regards,
     Tomas Helgi
marrowyung

ASKER
REINDEX system "alfresco";


still the same!

but I tried the postgres DB however it works.


so why?

I already drag and drop the name alfresco to the command windows and run it accordingly

Tomas Helgi Johannsson

Hi,

The screenshot you posted tells me that you are connected to the postgres database (the main cluster database - name in bold)  and not alfresco. Create another connection and set alfresco as the database. The connection config should point to alfresco. Then run the reindex alfresco commands using that connection.

Regards,
   Tomas Helgi
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
marrowyung

ASKER
marrowyung

ASKER
Tomas,

"work on all versions as intended and designed. I'm running these commands on both version 9.x, 11.x and 13.x without problems."

I found out why it doesn't works on my side now, is the management tools , if idle for a while, the command SEEMS can't find the SAME connection, and I need to open ONE MORE connection console and run that command:

 REINDEX  table alf_node_properties; --314GB tables.
REINDEX  table alf_node  ;    --80GB tables
REINDEX  table alf_child_assoc;   --17GB tables.
 REINDEX SYSTEM  alfresco;

  VACUUM(FULL, ANALYZE, VERBOSE);
  

Open in new window


BTW, in MS SQL to open a dedicated user database we use : use <database> go;

how about in postgreSQL?  

is there any command/script for postgreSQL to say only to rebuild the index of a table if the fragmentation is >30% and pass that process if not ?

Tomas Helgi Johannsson

Hi,

how about in postgreSQL?  
You open a dedicated connection to a database in PostgreSQL. The USE SQL keyword is only used and works in SQLServer, MySQL and MariaDB. See the relevent SQL keywords list in the DBMS manuals.

is there any command/script for postgreSQL to say only to rebuild the index of a table if the fragmentation is >30% and pass that process if not ? 
By querying the stat catalog tables you can easily build a script that executes VACUUM / REINDEX commands if a certain fragmentation threshold is reached.

Take a look at these links
https://wiki.postgresql.org/wiki/Show_database_bloat
https://fatdba.com/2018/09/04/fragmentation-or-bloating-in-postgresql-how-to-identify-and-fix-it-using-db-vacuuming/

Regards,
     Tomas Helgi
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
marrowyung

ASKER
You open a dedicated connection to a database in PostgreSQL. The USE SQL keyword is only used and works in SQLServer, MySQL and MariaDB. See the relevent SQL keywords list in the DBMS manuals. 

you mean we can ONLY do it in connection string level ? 
Tomas Helgi Johannsson

Hi,

you mean we can ONLY do it in connection string level ?                                   
Yes, as stated in the PostgreSQL manual.

Regards,
    Tomas Helgi
marrowyung

ASKER
tks.

this means when we try to AUTOMATE that, I can use windows task scheduler to with .bat file with that connection string and run that index rebuild.

the index rebuild so far so good ! so now just connection string does matter.

Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
marrowyung

ASKER
tks and I might come back later.
marrowyung

ASKER
I have a new thing from here, I rebuild 2 x largest table of the third party application and it has error and might be this is ONE of the reason the application is slow sometimes and it has a frequent running query JOIN them:

 REINDEX  table alf_node_properties;  --
 -- SQL Error [23505]: ERROR: could not create unique index "alf_node_properties_pkey"
  Detail: Key (node_id, qname_id, list_index, locale_id)=(51060016, 28, -1, 1) is duplicated.
 
REINDEX  table alf_node  ;  --
SQL Error [23505]: ERROR: could not create unique index "alf_node_pkey"
  Detail: Key (id)=(844) is duplicated.

anyway to let index rebuild keep going and ignore that?
marrowyung

ASKER
and this one:

REINDEX  table alf_child_assoc;  

result :
SQL Error [53100]: ERROR: could not write block 2853940 of temporary file: No space left on device
  Hint: Perhaps out of disk space?

we will have a lot of space left, why ?

Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
Tomas Helgi Johannsson

Hi,
On what disk do you store the PGDATA directory ?
Run this query to see the total-size of indexes on that table.
SELECT    pg_size_pretty (pg_indexes_size('alf_child_ass'));

Open in new window

You will need at least that size of free space on your disk and more for other workload.

Regards,
    Tomas Helgi
marrowyung

ASKER
SELECT    pg_size_pretty (pg_indexes_size('alf_child_ass')); 

7905 MB

and we have:





Tomas Helgi Johannsson

What about the table itself?

SELECT    pg_size_pretty (pg_relation_size('alf_child_ass
'));

Open in new window


Can you successfully run REINDEX on a single index of that table instead of all indexes for that table ?

REINDEX INDEX the_indexname;

Open in new window



Regards,
   Tomas Helgi
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
marrowyung

ASKER
after I run :
VACUUM(FULL, ANALYZE, VERBOSE);

for the DB I want to maintenance, the total sizes cut by 1/2, this is a good signal! and I also delete some backup file which is for test only and it make the driver has more free space,  from 103GB -> 688GB left

then I do that again:

REINDEX  table alf_child_assoc; --this is a 2.1 GB table.

and it can finish... ! so FREE space is that important (and 103GB is not enought..... ?!!) for the reindex rebuild do works.

and this:

 REINDEX  table alf_node_properties;  -- -- SQL Error [23505]: ERROR: could not create unique index "alf_node_properties_pkey"
  Detail: Key (node_id, qname_id, list_index, locale_id)=(51060016, 28, -1, 1) is duplicated.
 
REINDEX  table alf_node  ;  --SQL Error [23505]: ERROR: could not create unique index "alf_node_pkey"
  Detail: Key (id)=(844) is duplicated.

any idea on how to solve? 






Tomas Helgi Johannsson

Hi,

REINDEX  table alf_node_properties;  -- -- SQL Error [23505]: ERROR: could not create unique index "alf_node_properties_pkey"  Detail: Key (node_id, qname_id, list_index, locale_id)=(51060016, 28, -1, 1) is duplicated.
 
REINDEX  table alf_node  ;  --SQL Error [23505]: ERROR: could not create unique index "alf_node_pkey"  Detail: Key (id)=(844) is duplicated.


This means that you have duplicate entries in your table on these keys mentioned in the errors and you need to delete one of the duplicate record in order to be able to create these indexes.
Doing reindex on a busy table can lead to duplicate entries in a table between the time when the reindex process drops a unique index and then tries to recreate it.

Regards,
    Tomas Helgi
marrowyung

ASKER
Doing reindex on a busy table can lead to duplicate entries in a table between the time when the reindex process drops a unique index and then tries to recreate it.

PostgreSQL is not very good on that? I don't see oracle and MS SQL has this kind of issue. or they just handle it for us behind the scene ? the postgreSQL should lock the table when rebuilding the index and this process already slow down the whole DB for a long time..too long for the locking ..

" you need to delete one of the duplicate record in order to be able to create these indexes. "

so before the index process I have to do it.? and any query can help to find this out ? I want to do all that before we backup from 9.3.6 and restore it to 11.14 and 13.4/13.5

it should have primary key to make sure no duplicate entries are here... 

Your help has saved me hundreds of hours of internet surfing.
fblack61
marrowyung

ASKER
I am sorry sir, will VACUUM(FULL, ANALYZE, VERBOSE);, once cancel during the process if we found it is slow, corrupt any data?

Any experience, on the failure of this process corrupt data and make the DB in bad state?