• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 678
  • Last Modified:

Insert into partitioned table hangs

I am running PostgreSQL 8.1.3 on Windows 2003 Server.

I have a master table that has many temporal partitions. I have a Java application that executes an insert on the master table every second or two (non-concurrently).

However, I noticed that at random times an insert statement just hangs. For example I will have 100 or 200 successfuly inserts and then it just sits there at the next insert and it never finishes. If I restart the Java application it throws a primary key violation for that last record that was hanging (aka the record has been inserted regardless of the statement hanging), and then it works for another couple of hundred statements until it fails again in the same fashion. Failure intervals vary from 20-30 statements prior to failure, to 500-1000 statements prior to failure.

The failing insert statement is no different than all the other ones, and it is supposed to insert into the same partition as all the other ones are.

I checked the pg_locks and this session is not holding any locks or waiting for any locks to be released. There are not many other transactions in the database, and ones that do exist are mostly selects in different schemas.

Here is an example of output from my Java application:

...

16:16:52: INSERT INTO master_table(id, speed, start) VALUES (84,57.7,'05-09-2006 20:53:42 -4:00')
DONE

16:16:53: INSERT INTO master_table(id, speed, start) VALUES (43,74.6,'05-09-2006 20:53:42 -4:00')
DONE

16:16:54: INSERT INTO master_table(id, speed, start) VALUES (71,74.2,'05-09-2006 20:53:42 -4:00')

This last statement never reaches the "DONE" print statement located immediatelly below the query execute statement.

This same application used to load the data into this table prior to it being partitioned and it never had issues.

What could be the issue and where should I be looking to fix this problem?
0
ivanovn
Asked:
ivanovn
  • 9
  • 3
  • 3
  • +2
3 Solutions
 
m1tk4Commented:
When you say "temporal partitions", does it really mean you are partitioning your table based on ranges of "start" field? (say, once a day?) If yes, how are the ranges defined?

What's your primary key for the table and what is the exact type of "start" column?

0
 
ivanovnAuthor Commented:
Yes temporal partitioning is refering to partitioning on monthly date ranges. So the example rule for May 2006 partition is:

CREATE OR REPLACE RULE partition_05_06_insert AS
ON INSERT TO master_table WHERE ("start" >= '05-01-2006' AND "start" < '06-01-2006')
DO INSTEAD
INSERT INTO partition_05_06 VALUES
(
      NEW.id,
      NEW.speed,
      NEW."start"
);

The primary key is composite primary key containing all three fields (id, speed, and start).
The type of "start" column is timestamptz.

0
 
m1tk4Commented:
>>If I restart the Java application it throws a primary key violation for that last record that was hanging

This sounds to me like some kind of a bug in pg client libraries, that does not properly parse an error returned by the transaction. Here's what I'd do:

1. Try closing and reopening a connection for each insert and see if it improves/changes behaviour.

2. Consider a different approach - if the reason you started partitioning was slow data-mining queries for main table because it's unpartitioned index was huge, maybe it makes sense to keep the main table intact and and just build partial indexes (say, for last month) for those queries. The thing is, while table partitioning based on inheritance is somewhat new in PostgreSQL (although I'm sure you can implement it in 7.x it doesn't appear in the manual until 8.1), partial indexes have been there forever and are probably a lot better tested.
0
Fill in the form and get your FREE NFR key NOW!

Veeam is happy to provide a FREE NFR server license to certified engineers, trainers, and bloggers.  It allows for the non‑production use of Veeam Agent for Microsoft Windows. This license is valid for five workstations and two servers.

 
ivanovnAuthor Commented:
I attempted the first solution you suggested. I open a connection, execute an insert query, close the connection. However, this caused the same problem. After several hundreds insert statements, one of them shows that connection has been created, but the insert statement never finishes and connection does not get closed until I exit the application.

However, I noticed that even though I was opening and closing connections, the postgres server showed one persistant postgres.exe process that showed up when I started the application and disappeared when I killed the application. I expected to see a postgres.exe process show up and go away for every insert statement (aka every connection), but that didn't happen. Why?

The reason I started partitioning was slow data-mining due to large index, and I also wanted to be able to move data between different filesystems. For example I could move old data to a slower data storage unit by grabbing the appropriate partition and placing it to a new location. It is much easier to do this when you have 100 tables with 10 million records each as opposed to one table with 1 billion records.
0
 
m1tk4Commented:
>>However, I noticed that even though I was opening and closing connections, the postgres server
>>showed one persistant postgres.exe process that showed up when I started the application and
>>disappeared when I killed the application. I expected to see a postgres.exe process show up and
>>go away for every insert statement (aka every connection), but that didn't happen. Why?

I think it uses threaded model in Win32 to serve requests - if you start monitoring number of threads for the process you'll probably notice them.

I kind of suspect the problem may not be in your Postgres but the client libraries. To be sure, can you try adding these records by executing psql from a command line and passing an insert statement to it? Depending on the outcome, you would then either start looking for a bug in Postgres or your client libraries. There might be a workaround for this problem, but I wouldn't be betting on it.

Most likely you'll probably be back to using partial indexes. And also - think in terms of how many new and fast hard drives you could buy for the cost of the time you are spending on troubleshooting partitioned tables in PG ;)))
0
 
earth man2Commented:
Is the client running on the same machine as the database ?

If not is there some network fault ?
0
 
ivanovnAuthor Commented:
The client is not running on the same machine as the database. I am not aware of any network issues. I have several other applications accessing the same server in the same fashion via network, and those are not having any issues.

I guess I could probably move the problem application over to the server for the time being and see if I observe the same problem.
0
 
ivanovnAuthor Commented:
I ran the application from the server directly, bypassing the network, and I got the same problem. The application ran for a while and then got stuck at an insert. So it appears not to be a network issue.
0
 
ivanovnAuthor Commented:
I am still having the same problem.

However, I had an interesting discovery today that I don't understand.

I left the insert query that was stuck running. Then, without changing
any parameters, I ran reload configuration, aka. sending SIGHUP, and
the query immediately completed and went on to the next insert.

I also noticed that I am having the similar issue with SELECT
statements. Same SELECT statement will run fine 3-4 times in a row and
then get stuck and run forever with no results. But if I reload the
configuration like above, the query immediatelly returns with correct
result.

Why does this happen, and could this give any insight in my problem?
0
 
earth man2Commented:
I would suspect Windows specific problem or postgresql.jar.  Is Windows still considered as Trial status ?
0
 
ivanovnAuthor Commented:
Starting with version 8.0, PostgreSQL runs natively on MS Windows although they do caution that it's not as well supported by the extensive experience gained through years of running on Unix platforms (source: http://www.postgresql.org/docs/current/static/release-8-0.html).

For now I wrote a batch file executed through scheduler that signals SIGHUP to the DB once a minute. This seems to keep my application, but it is just a temporary hack. I am still looking for a real solution.

0
 
earth man2Commented:
Is this a threading issue, ie are your threads locking up because you or the driver are not doing yield ?
0
 
ivanovnAuthor Commented:
Possible, but unlikely. The only thread is spawned from the wrapper application, and for testing purposes I removed the wrapper and ran the application as a single thread and I still had the same problem.

In addition the SELECT statement issues come from pgsql, which eliminates possibility of Java implementation being the problem.
0
 
sibtayCommented:
>For now I wrote a batch file executed through scheduler that signals SIGHUP to the DB once a minute. This seems to keep my >application, but it is just a temporary hack. I am still looking for a real solution.

speaking of temporary hacks, try running each of your insert statements in an explict transaction block (BEGIN; END;).
0
 
ivanovnAuthor Commented:
Thanks for the suggestion.

I did try placing each insert into its own transaction, but that caused the same problem. I took even a more drastic measure and tried disconnecting after each insert and reconnecting for the next insert. That caused the same issue.

So far I haven't gotten a solution for it, but there were some suggestions of 8.1.3 SYSV semaphore bug as a possible cause. I haven't had a chance to apply a rather questionable patch.
0
 
sibtayCommented:

Here are a couple of more quick solutions, in case you have'nt tried them out yet

1) if constraint_exclusion variable is set to
true then disable it.

2) If you think its a locking issue then have you
tried to debug locks? If not then compile pgsql with LOCK_DEBUG compile time switch and then see whether there is a locking conflict or not.

However i doubt that its a locking conflict since there aren'nt any concurrent transactions around and pg_locks doest report any problems.
0
 
ivanovnAuthor Commented:
I haven't tried disabling the constraint_exclusion, so I will try that. However, that defeats the purpose of partitioning in the sense of performance since the planner would not eliminate partitions that do not contain the data satisfying the where clause.

I doubt that it's a locking issue since it's mostly one sessions accessing this table at all times. I guess I could check into it anyway.

Thanks.
0
 
gheistCommented:
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.
I will leave the following recommendation for this question in the Cleanup topic area:

Split between m1tk4 http:#16747538, earthman2 http:#16776435 and sibtay http:#16984256

Any objections should be posted here in the next 4 days. After that time, the question will be closed.

gheist
EE Cleanup Volunteer
0

Featured Post

Free learning courses: Active Directory Deep Dive

Get a firm grasp on your IT environment when you learn Active Directory best practices with Veeam! Watch all, or choose any amount, of this three-part webinar series to improve your skills. From the basics to virtualization and backup, we got you covered.

  • 9
  • 3
  • 3
  • +2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now