Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 281
  • Last Modified:

Optimisation of a select statement on 1 table using GROUP & ORDER

Hi all,

I'm looking to get more performance out of a single query which I've built and tested with the help of guys on here over the past weeks.

So first the table and the indexes:

/*Column Information*/
----------------------

FIELD    TYPE                    COLLATION  NULL    KEY     DEFAULT  Extra   PRIVILEGES                       COMMENT
-------  ----------------------  ---------  ------  ------  -------  ------  -------------------------------  -------
link_id  INT(10) UNSIGNED        (NULL)     NO      PRI     (NULL)           SELECT,INSERT,UPDATE,REFERENCES         
word_id  MEDIUMINT(10) UNSIGNED  (NULL)     NO      PRI     (NULL)           SELECT,INSERT,UPDATE,REFERENCES         
cat_id   TINYINT(3) UNSIGNED     (NULL)     NO      PRI     (NULL)           SELECT,INSERT,UPDATE,REFERENCES         
pos      TINYINT(3) UNSIGNED     (NULL)     NO      MUL     (NULL)           SELECT,INSERT,UPDATE,REFERENCES         

/*Index Information*/
---------------------

TABLE      Non_unique  Key_name  Seq_in_index  Column_name  COLLATION  Cardinality  Sub_part  Packed  NULL    Index_type  COMMENT
---------  ----------  --------  ------------  -----------  ---------  -----------  --------  ------  ------  ----------  -------
linkindex           0  PRIMARY              1  word_id      A               (NULL)    (NULL)  (NULL)          BTREE              
linkindex           0  PRIMARY              2  link_id      A               (NULL)    (NULL)  (NULL)          BTREE              
linkindex           0  PRIMARY              3  cat_id       A             18512312    (NULL)  (NULL)          BTREE              
linkindex           1  pos                  1  pos          A                  110    (NULL)  (NULL)          BTREE              

/*DDL Information*/
-------------------

CREATE TABLE `linkindex` (
  `link_id` INT(10) UNSIGNED NOT NULL,
  `word_id` MEDIUMINT(10) UNSIGNED NOT NULL,
  `cat_id` TINYINT(3) UNSIGNED NOT NULL,
  `pos` TINYINT(3) UNSIGNED NOT NULL,
  PRIMARY KEY (`word_id`,`link_id`,`cat_id`),
  KEY `pos` (`pos`)
) ENGINE=MYISAM DEFAULT CHARSET=latin1

Open in new window


And a typical query used against this table:

SELECT link_id, SUM(cat_id) AS score 
FROM linkindex 
WHERE word_id IN (107, 27) 
GROUP BY link_id 
ORDER BY score DESC LIMIT 0,16;

Open in new window


Explain results from the query:

id      select_type      table      type      possible_keys      key      key_len      ref      rows      Extra
1       SIMPLE       linkindex       range       PRIMARY       PRIMARY       3       (NULL)       700079       Using where; Using index; Using temporary; Using filesort

And my.cnf file - (server has 1 core - 2 threads, 1GB memory)

max_allowed_packet      = 16M
thread_stack            = 192K
thread_cache_size       = 8

#This is for filesort operations
tmp_table_size = 48M
#Same thing as tmp_table_size to the HEAP type tables
max_heap_table_size = 48M
read_buffer_size = 256K
#Needs to be fairly large as lots of GROUP BY/ORDER BY queries
sort_buffer_size = 256K
read_rnd_buffer_size = 256K
join_buffer_size = 256K
#Can't be higher than the available memory (max prob 75% memory)
max_tmp_tables = 24
#Used for sorting indexes
myisam_sort_buffer_size = 2M
key_buffer_size=64M

Open in new window


I added comments to each of the vars as I learnt about them.

Now, the biggest time chunk is 'Copying to tmp table' - I believe this is in memory otherwise it would say 'to disk' and be much slower. I increased tmp_table_size to avoid this.

The table in question has 20 million rows but all fairly small integer fields.

I read from mysql it is possible to do a query using GROUP BY on 1 table without invoking tmp tables:

Reference: http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html

I'm not sure this can be applied but any help in improving the speed would be great.

Speeds are not so bad at the mo but they could be improved as common word_ids sometimes take 2 secs + which isn't the best.

Cheers
David

0
dolythgoe
Asked:
dolythgoe
  • 5
  • 2
  • 2
2 Solutions
 
Neil RussellTechnical Development LeadCommented:
Your BEST optimization would be to upgrade the server!
0
 
dolythgoeAuthor Commented:
That's kind of an obvious one that frames everything but I need to be sure I've done everything first regardless of server. More specifically, avoiding tmp tables if possbile with the query stated.
0
 
dolythgoeAuthor Commented:
And which bit of the server - open more cores?
0
NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

 
Aaron TomoskyTechnology ConsultantCommented:
I think it must be paging to disk as 1gb of ram cannot possibly be enough. The os takes almost half that and 20 million rows cant be tiny
0
 
dolythgoeAuthor Commented:
Well the table of 20 million rows is not actually as big as you might think as it's int,medium int, tiny int and tiny int.  The lookup at most is 1 million rows.

It's a db slave so just has ubuntu and mysql installed.

What I've requested is an additional 2GB ram and another 3 cores to increase to 8 threads. The threads negate the need for as much memory since it can create and destroy temp tables faster in memory reducing the concurrency window that can bloat the memory. That is, if mysql v5.56 makes good use of those threads which it appears to do so.

So from the initial question - does it appear that no more query optimisation can be done and it's down to raw power? Happy with that, if that's the answer just wanted to confirm with you guys before paying for more juice!
0
 
Aaron TomoskyTechnology ConsultantCommented:
Unsigned ints, with indexes on all fields.  Can't think of any way to optimize that. Query is simple with only a sum ona column. Unless you use this same query often and can just make a view off a summary table and update that table when data changes....looks clean to me.
0
 
dolythgoeAuthor Commented:
Thanks,

I just ran MySQL Tuner https://github.com/rackerhacker/MySQLTuner-perl

Which spat this out:

-------- Performance Metrics -------------------------------------------------
[--] Up for: 5m 57s (226 q [0.633 qps], 109 conn, TX: 104K, RX: 15K)
[--] Reads / Writes: 100% / 0%
[--] Total buffers: 128.0M global + 6.4M per thread (151 max threads)
[!!] Maximum possible memory usage: 1.1G (110% of installed RAM)
[OK] Slow queries: 0% (0/226)
[OK] Highest usage of available connections: 0% (1/151)
[!!] Key buffer size / total MyISAM indexes: 64.0M/685.3M
[!!] Key buffer hit rate: 94.0% (469K cached / 28K reads)
[!!] Query cache efficiency: 0.0% (0 cached / 99 selects)
[OK] Query cache prunes per day: 0
[!!] Sorts requiring temporary tables: 50% (2 temp sorts / 4 sorts)
[OK] Temporary tables created on disk: 22% (48 on disk / 212 total)
[OK] Thread cache hit rate: 99% (1 created / 109 connections)
[OK] Table cache hit rate: 23% (35 open / 147 opened)
[OK] Open file limit used: 7% (74/1K)
[OK] Table locks acquired immediately: 100% (62 immediate / 62 locks)

-------- Recommendations -----------------------------------------------------
General recommendations:
    MySQL started within last 24 hours - recommendations may be inaccurate
    Reduce your overall MySQL memory footprint for system stability
    Enable the slow query log to troubleshoot bad queries
Variables to adjust:
  *** MySQL's maximum memory usage is dangerously high ***
  *** Add RAM before increasing MySQL buffer variables ***
    key_buffer_size (> 685.3M)
    query_cache_limit (> 1M, or use smaller result sets)
    sort_buffer_size (> 2M)
    read_rnd_buffer_size (> 2M)

Open in new window


Looks like it needs more memory! A handy tool.

0
 
Neil RussellTechnical Development LeadCommented:
so like I said the optimization you need is a server upgrade. Your query is as simple as it gets nearly
0
 
dolythgoeAuthor Commented:
Thanks for confirming this guys.
0

Featured Post

NEW Veeam Backup for Microsoft Office 365 1.5

With Office 365, it’s your data and your responsibility to protect it. NEW Veeam Backup for Microsoft Office 365 eliminates the risk of losing access to your Office 365 data.

  • 5
  • 2
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now