• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 23401

PostgreSQL performance -- VERY slow on INSERTing into tables without indexes in a small db

I have installed the latest PostgreSQL native Windows beta (8.0.0-beta4, using the Windows MSI installer) on a Windows XP Pro SP2 system.  I accepted all of the installation defaults, as far as I remember.  It is running successfully (I can store from my app, "pgAdmin III" shows the results, I can query using "psql"), and my application that stores data via ODBC (psqlodbc.dll, version 8.0.0.2) is working.

But it is VERY slow to store.  A single INSERT takes an elapsed time of one second according to the query log, although the actual query time is only 10 milliseconds (typically).  The tables being inserted into have no indexes.  The disk drive buzzes nonstop as many INSERTs are done.  It does not matter whether they are prepared or direct INSERTs.

I have increased most of the configuration parameters in the file "postgresql.conf" that relate to memory and buffer sizes to large numbers (see below), but nothing changes.  I have been sure to restart the server each time so that the new conf parameters take effect.  My app is explicitly beginning and committing transactions, one transaction over many INSERTs.  The size of "C:\Program Files\PostgreSQL\8.0.0-beta4\data\base" is 44 MB.  My computer has 512 MB of RAM.

Why is it so slow?  Why all the disk activity?  How can I speed it up?  I am not asking for real performance tuning.  I just want to fix what is "broken".

Customized parameters:
shared_buffers = 8000
work_mem = 10240
bgwriter_percent = 0
bgwriter_maxpages = 0
fsync = false
wal_buffers = 1000
checkpoint_timeout = 3600
checkpoint_warning = 0

log_min_error_statement = debug
log_min_duration_statement = 0

Asked by GomerJones
1 Solution
 
earth man2Commented:
An insert should not take that long.  Is logging/tracing turned on in your ODBC layer?  Turn it off.
If you are concerned about speed, maybe you should be using Linux to serve your PostgreSQL database.  You are using a beta version on Windows!
0
 
earth man2Commented:
To get the best performance, turn autocommit off and commit at the end of the transaction.
Using prepared INSERT statements is best; a minimal sketch follows below.
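For illustration only, a minimal JDBC sketch of that pattern, assuming a hypothetical table t(col1 int, col2 text) and an already-open Connection named con (names are placeholders, not from the original post):

// Assumes: import java.sql.*; and an open Connection "con".
con.setAutoCommit(false);                          // turn autocommit off
PreparedStatement ins = con.prepareStatement(
        "INSERT INTO t (col1, col2) VALUES (?, ?)");
for (int i = 0; i < 1000; i++) {
    ins.setInt(1, i);                              // bind this row's values
    ins.setString(2, "row " + i);
    ins.executeUpdate();                           // the same prepared plan is reused every time
}
con.commit();                                      // one commit at the end of the transaction
ins.close();

With autocommit left on, each executeUpdate() becomes its own transaction, which is typically much slower than committing once per batch.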
0
 
earth man2Commented:
In the postgresql.conf file, try increasing:
checkpoint_segments = 20
0
 
earth man2Commented:
In the postgresql.conf file, try increasing:
checkpoint_segments = 10
0
 
earth man2Commented:
You could try the postgresql.conf setting
fsync = false
JUST to see if this improves matters.  Leaving fsync disabled increases the likelihood that a power cut will result in a corrupted database.
For bulk inserts you could back up the data, then set fsync = false, do the bulk insert, then set fsync = true; a rough sketch follows below.
If anything goes awry you can restore the backup.
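As a rough sketch of that workflow (the setting lives in postgresql.conf and restarting the server makes it take effect), assuming the data has already been backed up, for example with pg_dump:

fsync = false    # only for the duration of the bulk load; a crash during this window can corrupt the database
# ... run the bulk INSERTs ...
fsync = true     # restore the safe setting afterwards and restart the server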
0
 
robbertCommented:
Just for the record, and as I'm interested in this question:

I recently did some rough benchmarking, comparing PostgreSQL 8.0 beta and SQL Server 2000 SP3a on Windows XP SP1, using JDBC (Java 1.5) to execute read and write statements on a database set up (slightly modified) according to the TPC-W benchmark standard. About 10 tables, some 700,000 records or so, and just one or two test runs per database and task (thus, not representative). Both databases had their tablespaces on the same partition. No stored procedure / function tests, just plain INSERT and SELECT statements. Default configuration for both databases.

Results:
Writing:  SQL Server is 25 times faster than PostgreSQL (and the hard disk didn't make those "buzzy" sounds the questioner mentioned)
Reading:  SQL Server is 100 times faster.

From what I've read on the web, I hadn't expected such a result.  Man, people are doing data warehousing with PostgreSQL, with database sizes of often several terabytes, or they're serving several thousand users...

Is it because it's on Windows?  I don't understand the hype, or else I don't understand the poor performance I've experienced.  I've seen benchmarks that appeared to prove PostgreSQL is as fast as Oracle... now it's slower than even Access?!
0
 
earth man2Commented:
You are using layers of products to evaluate database speed.  JDBC and ODBC drivers have not had the hours of development that mainstream products have.

Postgresql has been developed on Linux file systems and only recently ported to M$.  Performance per buck it is infinitely better than SQL Server.

To  compare use Linux with libpq calls to evaluate performance of Postgresql product.
0
 
earth man2Commented:
also
log_min_error_statement = PANIC
0
 
earth man2Commented:
I quote from the documentation.

Beginning in PostgreSQL 8.0, there is a separate server process called the background writer, whose sole function is to issue writes of "dirty" shared buffers. The intent is that server processes handling user queries should seldom or never have to wait for a write to occur, because the background writer will do it. This arrangement also reduces the performance penalty associated with checkpoints. The background writer will continuously trickle out dirty pages to disk, so that only a few pages will need to be forced out when checkpoint time arrives, instead of the storm of dirty-buffer writes that formerly occurred at each checkpoint. However there is a net overall increase in I/O load, because where a repeatedly-dirtied page might before have been written only once per checkpoint interval, the background writer might write it several times in the same interval. In most situations a continuous low load is preferable to periodic spikes, but the parameters discussed in this section can be used to tune the behavior for local needs.

bgwriter_delay (integer)

    Specifies the delay between activity rounds for the background writer. In each round the writer issues writes for some number of dirty buffers (controllable by the following parameters). The selected buffers will always be the least recently used ones among the currently dirty buffers. It then sleeps for bgwriter_delay milliseconds, and repeats. The default value is 200. Note that on many systems, the effective resolution of sleep delays is 10 milliseconds; setting bgwriter_delay to a value that is not a multiple of 10 may have the same results as setting it to the next higher multiple of 10. This option can only be set at server start or in the postgresql.conf file.
bgwriter_percent (integer)

    In each round, no more than this percentage of the currently dirty buffers will be written (rounding up any fraction to the next whole number of buffers). The default value is 1. This option can only be set at server start or in the postgresql.conf file.
bgwriter_maxpages (integer)

    In each round, no more than this many dirty buffers will be written. The default value is 100. This option can only be set at server start or in the postgresql.conf file.

Smaller values of bgwriter_percent and bgwriter_maxpages reduce the extra I/O load caused by the background writer, but leave more work to be done at checkpoint time. To reduce load spikes at checkpoints, increase the values.
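For reference, the questioner's posted configuration sets bgwriter_percent and bgwriter_maxpages to 0; restoring the defaults named in the passage above would look like this in postgresql.conf:

bgwriter_delay = 200        # milliseconds between background-writer rounds (default)
bgwriter_percent = 1        # max percentage of dirty buffers written per round (default)
bgwriter_maxpages = 100     # max number of dirty buffers written per round (default)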
0
 
earth man2Commented:
Of course, if you have a spare memory slot, add more memory.  XP is hungry for resources, so you may be swapping memory to disk.
0
 
rjkimbleCommented:
Make sure that you're vacuuming the databases periodically. It's a necessity due to the way that PostgreSQL handles deletes.
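For example, a manual vacuum can be issued from any client; a minimal JDBC sketch (assuming an open Connection named con, as in the earlier example) might look like:

con.setAutoCommit(true);                  // VACUUM cannot run inside a transaction block
Statement st = con.createStatement();
st.execute("VACUUM ANALYZE");             // reclaim space from dead rows and refresh planner statistics
st.close();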
0
 
robbertCommented:
Just for the record:

> Postgresql has been developed on Linux file systems and only recently ported to M$.

In the meantime, I also ran the aforementioned benchmark on a Linux box, partitioned with ReiserFS. There didn't seem to be any difference compared to running it on Windows 2000 or XP.

> To  compare use Linux with libpq calls to evaluate performance of Postgresql product.

Do you mean the PostgreSQL ODBC and JDBC drivers are that much slower than the native libpq interface?
0
 
earth man2Commented:
You should use ext2-type partitions for database files, else you are buffering changes twice.
0
 
robbertCommented:
Do you think performance would increase 25 or 50 times if I did?

Do you think performance would increase just 2 times?
0
 
earth man2Commented:
This is not even your question, robbert.  If you need help solving your problem, you need to post enough information so the problem can be replicated and a solution formulated.
0
 
rjkimbleCommented:
Just out of curiosity, have you flagged the data directory for compression? That could slow things down, although what you're describing seems way out of line for that.
0
 
robbertCommented:
rjkimble: No, there is nothing fancy.

earth man2: You're right, this question was not posted by me. Anyway, my (reproducible) findings fit it particularly well.

I executed TPCW_Populate.java (from http://www.ece.wisc.edu/~pharm/tpcw/tpcw.tar.gz) for the writing benchmarks. For the reading benchmarks I executed SELECT statements on every table, returning the first 1000 rows only ("SELECT TOP" and "SELECT ... LIMIT", respectively).
You can read about the software (and, partly, hardware) environments above. As for the hardware: ~400 MB free RAM, 1.8 GHz CPU (and lower: tested on 2 PCs).

After all, I wouldn't post a duplicate of this question here. From my findings, I'm disappointed myself:

at least on an x86 processor, PostgreSQL can't compete with SQL Server at any rate.
0
 
earth man2Commented:
Well, on a 1 GHz VIA Nehemiah the test took 45 minutes.  How fast was SQL Server?
0
 
rjkimbleCommented:
Here's an interesting article: http://www.lyris.com/lm_help/8.0/tuning_postgresql.html

In short, you might try setting fsync = false in postgresql.conf.
0
 
gheistCommented:
Basically, you need to decide how many resources you will assign to PostgreSQL.
Let's say you will run a web server, PostgreSQL, and Windows.
That makes three important applications, so we give one third of the resources to PostgreSQL:
512 MB = roughly 128,000 4 KB pages, or 64,000 8 KB pages.

shared_buffers = 20000      # 8 KB blocks here; this is the missing part
checkpoint_segments = 1     # do small (16 MB) chunks of work at a time: generally slower, but with shorter interruptions

Basically, do not customize the rest of the parameters.

This will make your massive loads of indexed inserts smooth.

sort_mem = 4096 # kilobytes here


Basically, you tune shared_buffers up until you get sustainable disk loads,
then you use sort_mem to tune indexed inserts and sorted SELECTs.

Vacuum often.

Common Windows tuning precautions apply:
IOPageLockLimit must cover at least a second of full-speed disk work.
The system must be tuned for a small disk cache and fast network servers, DMA disk access, etc.

I guess that *_maxpages = 0 makes the limits infinite and swapping inevitable, instead of allowing automatic tuning.

You are probably looking for these:

http://www.postgresql.org/docs/7.4/static/runtime-config.html

Please specify a more exact environment for your benchmarks (which drivers you used, etc.), since hsqldb completed the import in 20 minutes on a PIII 550.
0
 
gheistCommented:
Also, the PostgreSQL JDBC driver does not prepare statements server-side by default.

Try adding prepareThreshold=1 to the URL; an example follows below.
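For example, a connection URL carrying that parameter might look like the following (host and database name are placeholders; pgjdbc takes URL parameters after a "?"):

jdbc:postgresql://localhost:5432/tpcw?prepareThreshold=1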

0
 
earth man2Commented:
TPCW_Populate.java would benefit from using batch update processing and prepared statements. For example:

PreparedStatement stmt = con.prepareStatement("INSERT INTO X (COL1, COL2) VALUES (?, ?)");
stmt.setInt(1, 1001);                      // bind the integer parameter
stmt.setString(2, "Interesting");          // bind the string parameter
stmt.addBatch();                           // queue this row; repeat for further rows

int[] updateCounts = stmt.executeBatch();  // send the whole batch in one round trip
con.commit();
0
 
DonVFCommented:
>>>>Performance per buck it is infinitely better than SQL Server

Well, the fact that it is free kind of goes without saying, doesn't it?

PostgreSQL is high maintenance and has a high memory footprint...
I am forced to use it every day for my job...

There are a lot of SELECT and UPDATE statements in what it is used for...

And MySQL smokes PostgreSQL on inserts, selects, updates, etc.
The server is running SuSE, with both MySQL and PostgreSQL, and the database size is about 1.5 GB.
I ported the PostgreSQL database to MySQL just to test this, using normal query statements, and MySQL smokes PostgreSQL. So unless you need stored procedures and a lot of Oracle-style features, MySQL is the best choice, over MSSQL and PostgreSQL.
0
 
DonVFCommented:
Another hint, though: if you aren't running PostgreSQL on Linux or FreeBSD, forget using it; the Windows port sucks at the moment.
0
 
Anvesh PatelCommented:
Nice article!
It was a pleasure to read, and it will really help people in the PostgreSQL community.

I have also prepared an article about how to increase INSERT performance in PostgreSQL.
You are welcome to visit it; comments and reviews are welcome too.

http://www.dbrnd.com/2016/08/postgresql-how-to-increase-the-performance-of-bulk-insertion-insert-copy-command-disable-trigger-index-vacumming/
0
