Help optimising my complex mysql query

I'm trying to extract some data from a pretty large mySQL database for a report. I've got most of what I want, but when I add in an extra level of detail, my query takes too long to run. Here is the SQL: I've already rewritten it to take out the JOIN syntax and I've got indexes on the parts I am using:

SELECT o.name, o.id, c.name, c.id as cid, box.site_id as siteid
, (select count(bf.id)
FROM batches b, batchfile bf, files f, boxes bo  
where
bf.batch_id = b.id
and bf.file_id = f.id
and f.box_id = bo.id
and b.batchtype = 0 and b.requestDate between '2014-03-01' and '2015-03-01' and b.client_id = cid and ((siteid is null and bo.site_id is null) or (bo.site_id = siteid))) as 'file deposits'
FROM boxes box, clients c, organisations o, batches b, batchfile bf, files f
where bf.batch_id = b.id
and bf.file_id = f.id
and f.box_id = box.id
and b.client_id = c.id
and c.organisation_id = o.id
and b.requestDate between '2014-03-01' and '2015-03-01'
and batchtype = 0
group by siteid, c.id;

Explain query gives me this:
1      PRIMARY      b      range      PRIMARY,FKEC39FD68A2287183,requestDate,batchtype,id_requestDate,idx_requestYear,batchtype_requestDate      batchtype_requestDate      10            2302      Using where; Using temporary; Using filesort
1      PRIMARY      c      eq_ref      PRIMARY,FK334B86089EC80063      PRIMARY      8      fileexpress.b.client_id      1      
1      PRIMARY      o      eq_ref      PRIMARY      PRIMARY      8      fileexpress.c.organisation_id      1      
1      PRIMARY      bf      ref      FKC5B09FB681F11E23,FKC5B09FB682CEBEB1      FKC5B09FB682CEBEB1      9      fileexpress.b.id      8      Using where
1      PRIMARY      f      eq_ref      PRIMARY,FK5CEBA77DCAF4AD1      PRIMARY      8      fileexpress.bf.file_id      1      
1      PRIMARY      box      eq_ref      PRIMARY      PRIMARY      8      fileexpress.f.box_id      1      
2      DEPENDENT SUBQUERY      b      ref      PRIMARY,FKEC39FD68A2287183,requestDate,batchtype,id_requestDate,idx_requestYear,batchtype_requestDate      FKEC39FD68A2287183      9      func      25      Using where
2      DEPENDENT SUBQUERY      bf      ref      FKC5B09FB681F11E23,FKC5B09FB682CEBEB1      FKC5B09FB682CEBEB1      9      fileexpress.b.id      8      Using where
2      DEPENDENT SUBQUERY      f      eq_ref      PRIMARY,FK5CEBA77DCAF4AD1      PRIMARY      8      fileexpress.bf.file_id      1      
2      DEPENDENT SUBQUERY      bo      eq_ref      PRIMARY,FK59943B93218C643,site_id      PRIMARY      8      fileexpress.f.box_id      1      Using where

Any help would be greatly appreciated. I feel as if I'm missing something really obvious and stupid :-(

many thanks!
mercury1ltdAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

pcelbaCommented:
You should tell when box.site_id can be NULL.
And also, why do you calculate the COUNT() in a subquery?

The first simplification could look like this code:
SELECT o.name, o.id, box.site_id, c.id, COUNT(bf.id) AS 'file deposits'
  FROM boxes box, clients c, batches b, batchfile bf, files f, organisations o 
 WHERE bf.batch_id = b.id
   and bf.file_id = f.id 
   and f.box_id = box.id
   and b.client_id = c.id
   and c.organisation_id = o.id
   and b.requestDate between '2014-03-01' and '2015-03-01' 
   and batchtype = 0 
 GROUP BY box.site_id, c.id, o.name, o.id ;

Open in new window

The between should also be reviewed and adjusted depending on the requestDate data type (date should be handled differently from datetime).

Very important for optimization are indexes. In addition to primary keys I would recommend indexes on some foreign keys:
batchfile.batch_id (important)
batches.requestDate (important when the table contains data for many years)
batches.batchtype (important when the table contains data for many batchtypes)
files.box_id (just my guess)
box.site_id (just my guess)

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
mercury1ltdAuthor Commented:
Thanks very much. That helps. The reason that I had the count in a subquery was because this was only part of a bigger query which had various subqueries to form a longer report. I've now broken it down and used my Excel skills instead to combine :)
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
MySQL Server

From novice to tech pro — start learning today.