Link to home
Start Free TrialLog in
Avatar of Lakshmana S
Lakshmana S

asked on

I have a requirment to select n% of records for each customers for audit , is that possible to achieve in mYSQL query

I googled but i was only able to get top n rows or random rows is that possible to fetch it % of rows for every customers .
Avatar of Julian Hansen
Julian Hansen
Flag of South Africa image

Try using two queries

The first gets the total number of rows - then use that to calculate a percentage and use in a LIMIT clause
// ASSUME $mysqli HOLDS REFERENCE TOMySQLi OBJECT
$query = "SELECT COUNT(*) AS totalRows FROM yourtablename";
$result = $mysqli->query($query);
$totalrows = '';
If ($result) {
    $row = $result->fetchObject();
    $totalrows = $row->totalRows * PERC;
}
$query = "SELECT * FROM yourtablename LIMIT 1, {$totalrows}";

Open in new window

try something like this

SELECT *
  FROM ( SELECT *,
                @custid_rank := IF(@current_custid = custid, @custid_rank + 1, 1) AS custid_rank,
                @current_custid := custid
           FROM auditTable
          ORDER BY custid, eventdate DESC
       ) rn
 WHERE rn <= 10;

Open in new window


* this should give top 10 records per customer from auditTable
Avatar of Lakshmana S
Lakshmana S

ASKER

Hi , Thanks for assistance , the above queries i was able to fetch % of records but i need to group that every user . below is the scenario am expecting      

My table
=========
Issue Id      USER ID  
1                  A
2                  A
3                  A
4                  A
5                  B
6                  B
7                  C

output
===========
ISSUE ID   USER ID
1            A
4            A
6            B
7            C

Also I am trying to round of to minimum 1 record if tthe user has only one record . can anyone assist . I am stuck with this module
Don't understand the first two rows - why is A repeated?
Hi , I given sample record, Issue id is ticket id and user A is the one who resolved the Issue id . The User may resolve n number of issue ids , I am trying to get only certain percentage for each user for the purpose of auditing , hope i was clear . please let me know if it still unclear
So the N% is now to be applied to a grouping rather than the entire recordset?

Why a % and not a max value?
yes n% is applied to each user value  in column .  we use % since all users resolve different number of issues one migh resolve 25 and other user 10 and other just 2 so we feel good to keep as percent . if percent is complicated kindly share how max can help me to overcome
HainKurt's post touches on how to do this but it has errors.
This link discusses this and other methods
https://stackoverflow.com/questions/2129693/using-limit-within-group-by-to-get-n-results-per-group

The second answer seems to be a reasonable approach - adapted from this approach - you will need to customise for your table.
SET @currcount = NULL, @currvalue = NULL;
SELECT issueid, userid FROM (
    SELECT
        issueid, userid
        @currcount := IF(@currvalue = userid, @currcount + 1, 1) AS rank,
        @currvalue := userid AS whatever
    FROM YOURTABLE
    ORDER BY userid, issueid DESC
) AS whatever WHERE rank <= 5

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of HainKurt
HainKurt
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
The code is now working fine .  :)
Is there a way to round of the value to nearest integer if custid has only one rank.
maybe one of these fits:

WHERE rn < 1 + (select count(1)*0.20 from auditTable y where x.custid=y.custid);

Open in new window


WHERE rn <=  (select round(count(1)*0.20) from auditTable y where x.custid=y.custid);

Open in new window


WHERE rn <=  (select round(count(1)*0.20 +1) from auditTable y where x.custid=y.custid);

Open in new window


but round makes

ROUND(0.4512) >>> 0
ROUND(0.5276) >>> 1

Open in new window


maybe you are looking for CEIL

WHERE rn <=  (select CEIL(count(1)*0.20) from auditTable y where x.custid=y.custid);

Open in new window


CEIL(0.2536) >>> 1
CEIL(1.3136) >>> 2
CEIL(5.9999) >>> 6

Open in new window

Thank you so much . you made my day . Its working great
Hi when i use count it was fine but when i am using ceil it assign ranking incorrectly and fetches all data. Have attached out images for both the with and without ceil . Kindly help me with that
image-when-i-use-only-count.PNG
output-when-i-use-ceil.PNG
if you look at rn column they are all 1
meaning they are first records in each group
so it is fine to me...
you want at least 1 record from each group!

if result is not what you expected, then your grouping/ranking logic is wrong!
or your data is wrong!
check your query...
probably you changed query and put some invalid grouping!
the one accepted here, should give only 1 record per customer id

what query did you use?
yeah . I need to check the logic. but  feel updating the assigned rank would help me .

Is that possible update the table with assigned rank  from below query

SET @custid_rank = NULL, @current_custid = NULL;
SELECT *
  FROM ( SELECT *,
                @custid_rank := IF(@current_custid = custid, @custid_rank + 1, 1) AS rn,
                @current_custid := custid
           FROM auditTable
          ORDER BY custid, eventdate DESC
       ) x
 WHERE rn <= 10;
you can dump that result into a new table, say you have a temp table tmpRank

insert into tmpRank
(above query here)

then it is a simple join/update query

it may be possible to do all in one as well..

but... because underlying data may change any time (?) ranking columns should be updated whenever it happens...
better use this query and forget about fixed columns :)