Link to home
Start Free TrialLog in
Avatar of ivanovn
ivanovnFlag for United States of America

asked on

Insert into partitioned table hangs

I am running PostgreSQL 8.1.3 on Windows 2003 Server.

I have a master table that has many temporal partitions. I have a Java application that executes an insert on the master table every second or two (non-concurrently).

However, I noticed that at random times an insert statement just hangs. For example I will have 100 or 200 successfuly inserts and then it just sits there at the next insert and it never finishes. If I restart the Java application it throws a primary key violation for that last record that was hanging (aka the record has been inserted regardless of the statement hanging), and then it works for another couple of hundred statements until it fails again in the same fashion. Failure intervals vary from 20-30 statements prior to failure, to 500-1000 statements prior to failure.

The failing insert statement is no different than all the other ones, and it is supposed to insert into the same partition as all the other ones are.

I checked the pg_locks and this session is not holding any locks or waiting for any locks to be released. There are not many other transactions in the database, and ones that do exist are mostly selects in different schemas.

Here is an example of output from my Java application:

...

16:16:52: INSERT INTO master_table(id, speed, start) VALUES (84,57.7,'05-09-2006 20:53:42 -4:00')
DONE

16:16:53: INSERT INTO master_table(id, speed, start) VALUES (43,74.6,'05-09-2006 20:53:42 -4:00')
DONE

16:16:54: INSERT INTO master_table(id, speed, start) VALUES (71,74.2,'05-09-2006 20:53:42 -4:00')

This last statement never reaches the "DONE" print statement located immediatelly below the query execute statement.

This same application used to load the data into this table prior to it being partitioned and it never had issues.

What could be the issue and where should I be looking to fix this problem?
Avatar of m1tk4
m1tk4
Flag of United States of America image

When you say "temporal partitions", does it really mean you are partitioning your table based on ranges of "start" field? (say, once a day?) If yes, how are the ranges defined?

What's your primary key for the table and what is the exact type of "start" column?

Avatar of ivanovn

ASKER

Yes temporal partitioning is refering to partitioning on monthly date ranges. So the example rule for May 2006 partition is:

CREATE OR REPLACE RULE partition_05_06_insert AS
ON INSERT TO master_table WHERE ("start" >= '05-01-2006' AND "start" < '06-01-2006')
DO INSTEAD
INSERT INTO partition_05_06 VALUES
(
      NEW.id,
      NEW.speed,
      NEW."start"
);

The primary key is composite primary key containing all three fields (id, speed, and start).
The type of "start" column is timestamptz.

>>If I restart the Java application it throws a primary key violation for that last record that was hanging

This sounds to me like some kind of a bug in pg client libraries, that does not properly parse an error returned by the transaction. Here's what I'd do:

1. Try closing and reopening a connection for each insert and see if it improves/changes behaviour.

2. Consider a different approach - if the reason you started partitioning was slow data-mining queries for main table because it's unpartitioned index was huge, maybe it makes sense to keep the main table intact and and just build partial indexes (say, for last month) for those queries. The thing is, while table partitioning based on inheritance is somewhat new in PostgreSQL (although I'm sure you can implement it in 7.x it doesn't appear in the manual until 8.1), partial indexes have been there forever and are probably a lot better tested.
Avatar of ivanovn

ASKER

I attempted the first solution you suggested. I open a connection, execute an insert query, close the connection. However, this caused the same problem. After several hundreds insert statements, one of them shows that connection has been created, but the insert statement never finishes and connection does not get closed until I exit the application.

However, I noticed that even though I was opening and closing connections, the postgres server showed one persistant postgres.exe process that showed up when I started the application and disappeared when I killed the application. I expected to see a postgres.exe process show up and go away for every insert statement (aka every connection), but that didn't happen. Why?

The reason I started partitioning was slow data-mining due to large index, and I also wanted to be able to move data between different filesystems. For example I could move old data to a slower data storage unit by grabbing the appropriate partition and placing it to a new location. It is much easier to do this when you have 100 tables with 10 million records each as opposed to one table with 1 billion records.
ASKER CERTIFIED SOLUTION
Avatar of m1tk4
m1tk4
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Avatar of earth man2
earth man2
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of ivanovn

ASKER

The client is not running on the same machine as the database. I am not aware of any network issues. I have several other applications accessing the same server in the same fashion via network, and those are not having any issues.

I guess I could probably move the problem application over to the server for the time being and see if I observe the same problem.
Avatar of ivanovn

ASKER

I ran the application from the server directly, bypassing the network, and I got the same problem. The application ran for a while and then got stuck at an insert. So it appears not to be a network issue.
Avatar of ivanovn

ASKER

I am still having the same problem.

However, I had an interesting discovery today that I don't understand.

I left the insert query that was stuck running. Then, without changing
any parameters, I ran reload configuration, aka. sending SIGHUP, and
the query immediately completed and went on to the next insert.

I also noticed that I am having the similar issue with SELECT
statements. Same SELECT statement will run fine 3-4 times in a row and
then get stuck and run forever with no results. But if I reload the
configuration like above, the query immediatelly returns with correct
result.

Why does this happen, and could this give any insight in my problem?
I would suspect Windows specific problem or postgresql.jar.  Is Windows still considered as Trial status ?
Avatar of ivanovn

ASKER

Starting with version 8.0, PostgreSQL runs natively on MS Windows although they do caution that it's not as well supported by the extensive experience gained through years of running on Unix platforms (source: http://www.postgresql.org/docs/current/static/release-8-0.html).

For now I wrote a batch file executed through scheduler that signals SIGHUP to the DB once a minute. This seems to keep my application, but it is just a temporary hack. I am still looking for a real solution.

Is this a threading issue, ie are your threads locking up because you or the driver are not doing yield ?
Avatar of ivanovn

ASKER

Possible, but unlikely. The only thread is spawned from the wrapper application, and for testing purposes I removed the wrapper and ran the application as a single thread and I still had the same problem.

In addition the SELECT statement issues come from pgsql, which eliminates possibility of Java implementation being the problem.
Avatar of sibtay
sibtay

>For now I wrote a batch file executed through scheduler that signals SIGHUP to the DB once a minute. This seems to keep my >application, but it is just a temporary hack. I am still looking for a real solution.

speaking of temporary hacks, try running each of your insert statements in an explict transaction block (BEGIN; END;).
Avatar of ivanovn

ASKER

Thanks for the suggestion.

I did try placing each insert into its own transaction, but that caused the same problem. I took even a more drastic measure and tried disconnecting after each insert and reconnecting for the next insert. That caused the same issue.

So far I haven't gotten a solution for it, but there were some suggestions of 8.1.3 SYSV semaphore bug as a possible cause. I haven't had a chance to apply a rather questionable patch.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of ivanovn

ASKER

I haven't tried disabling the constraint_exclusion, so I will try that. However, that defeats the purpose of partitioning in the sense of performance since the planner would not eliminate partitions that do not contain the data satisfying the where clause.

I doubt that it's a locking issue since it's mostly one sessions accessing this table at all times. I guess I could check into it anyway.

Thanks.
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.
I will leave the following recommendation for this question in the Cleanup topic area:

Split between m1tk4 http:#16747538, earthman2 http:#16776435 and sibtay http:#16984256

Any objections should be posted here in the next 4 days. After that time, the question will be closed.

gheist
EE Cleanup Volunteer