# Access -- calculate "median" and "mode"

Experts:

Please find attached two files:

1. "Example Data -- Statistics" (Excel)

2. "Mean, StDev, Var" (Access)

The Excel file contains the following five stats:

- Mean/Average

- StDev

- Var

- Median

- Mode

I've imported the 10 values (B1:B10) into Table1 of the Access file. In Query1, the mean, stdev, and variance can be easily derived.

My question: How can I compute/calculate the **median** and **mode** in the database?

Note: Please keep in mind that, in the actual database, I'm currently using a SQL statement that regenerates queries for other input. That said, if VBA is required to calculate these two measures of central tendency, I need to be able to call that function from within Query1 in order to dynamically create/update other queries.

Ideally, any proposed solution includes a working Query1 that includes these two measures.

Thank you for your assistance in advance.

EEH

Example-Data----Statistics.xlsx

Mean--StDev--Var.accdb

Calculate Mode:

SELECT TOP 1 Amt, Count(*) AS Mode

FROM YourTable

GROUP BY Amt

ORDER BY Count(*) DESC, Amt DESC;

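This method ignores ties: when several values share the highest count, only the largest Amt is returned. Note that the Amt column holds the mode; the column aliased Mode holds its frequency.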

@Rey:

Thank you for the link... I've read the information. I wasn't successful calling the function "DMedian" from the query. I'm sure I did something wrong. Btw, the link to the sample db isn't working any longer.

@PatHartmann:

Thank you... I was able to calculate the mode in a query (based on your feedback). Ideally though, given my requirement for dynamic SQL statements, is there a way to put the SQL code in a module/function and then call the function in a query via, e.g., "ModeValue: Mode([LikertValue])"? If yes, how can this be done?

Thank you,

EEH

You can also set a reference to Excel and call its Median/Mode worksheet functions from a VBA function in Access.

I have a simple sample file I can upload if you like.
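For anyone who wants to try the Excel route before a sample file turns up, here is a rough sketch of what it might look like. It is not the sample mentioned above: the function name ExcelMedian is made up, it assumes Excel is installed on the machine, and it uses late binding (CreateObject) so no library reference is needed.

```vba
' Hypothetical helper: hands the field's values to Excel's MEDIAN function.
' Assumes Excel is installed and the project has the default DAO reference.
Public Function ExcelMedian(BaseObjectName As String, FieldName As String) As Variant
    Dim rs As DAO.Recordset
    Dim xl As Object
    Dim vals() As Double
    Dim n As Long

    ExcelMedian = Null
    Set rs = CurrentDb.OpenRecordset( _
        "SELECT [" & FieldName & "] FROM [" & BaseObjectName & "] " & _
        "WHERE [" & FieldName & "] IS NOT NULL;", dbOpenSnapshot)

    ' Copy the non-null values into an array Excel can consume.
    Do Until rs.EOF
        ReDim Preserve vals(0 To n)
        vals(n) = rs.Fields(0).Value
        n = n + 1
        rs.MoveNext
    Loop
    rs.Close

    If n > 0 Then
        Set xl = CreateObject("Excel.Application")
        ExcelMedian = xl.WorksheetFunction.Median(vals)
        xl.Quit
        Set xl = Nothing
    End If
End Function
```

Starting an Excel instance on every call is slow, so this is probably better suited to a single summary value than to a column evaluated once per row.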

There are several EE articles you might want to check out. Look in the right hand sidebar of this page under "Deeper MS Access Learning". Patrick Matthews has an article which covers Median and Mode thoroughly with sample code, queries and a sample database:

https://www.experts-exchange.com/Database/MS_Access/A_2529-Median-Mode-Skewness-and-Kurtosis-in-MS-Access.html

Jeffrey Coachman -- yes, I definitely would like to view your example. Thank you in advance for sharing it!

mbizup -- I'll check out the reading material. Thanks for posting the link.

I will see if I can dig up that sample tonight.

In the meantime, you can investigate the info that mbizup posted...

Will do... thank you, Jeffrey.

@ExpExchHelp

*Btw, the link to the sample db isn't working any longer.*

EE presently has a bug in downloading .accdb files.

If you right click them, change the dialog to 'All Files' and rename them to .accdb, they can be downloaded and run.

PITA, and I do have a bug notice in about it.

The reason ties are ignored is that

a) a SQL "Select Top 1 ..." is only going to return one value and

b) It's a bit hellish to code

If you have a set 1,2,2,2,3 your mode function can return something numeric, like an Integer or a Long.

If you have a set 1.1, 2.1, 2.2, 2.2, 2.2, 3.0 you can still return something numeric, like a Single.

If you allow ties and need them, well, a set like 1,2,2,2,3,3,3,4,4,4,5,5,5 has four modes (2, 3, 4 and 5), so the function has to return them all, most likely as a delimited string.

And that's really no fun.

Testing the string, is there more than one element, what's the delimiter, running a Split, coercing the fragments...

Well, you get the idea.

Not a lot of fun, all the way down the line.
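To make the point concrete, a tie-aware version might look something like the sketch below. The function name ModeList, the Freq alias and the comma delimiter are all assumptions for illustration; nothing like this was posted in the thread.

```vba
' Hypothetical sketch: returns every value tied for the highest count
' as a comma-delimited string, e.g. "2,3,4,5", or Null if there are no rows.
Public Function ModeList(BaseObjectName As String, FieldName As String) As Variant
    Dim rs As DAO.Recordset
    Dim topCount As Long
    Dim result As String

    ModeList = Null
    ' One row per distinct value: most frequent first, then smallest value first.
    Set rs = CurrentDb.OpenRecordset( _
        "SELECT [" & FieldName & "], Count(*) AS Freq " & _
        "FROM [" & BaseObjectName & "] " & _
        "WHERE [" & FieldName & "] IS NOT NULL " & _
        "GROUP BY [" & FieldName & "] " & _
        "ORDER BY Count(*) DESC, [" & FieldName & "];", dbOpenSnapshot)

    If Not rs.EOF Then
        topCount = rs!Freq
        ' Collect values until the count drops below the maximum.
        Do Until rs.EOF
            If rs!Freq < topCount Then Exit Do
            result = result & IIf(Len(result) > 0, ",", "") & rs.Fields(0).Value
            rs.MoveNext
        Loop
        ModeList = result
    End If
    rs.Close
End Function
```

The caller then has to run Split(ModeList(...), ",") and coerce each fragment back to a number, and on a locale that uses a comma as the decimal separator the delimiter itself becomes ambiguous. That is the extra work described above.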

Nick67:

I think this is an excellent solution... thank you for providing it.

In the test database, all values appear to be computed correctly. I will integrate your solution into the actual database early next week. Some of my existing queries do not use GROUP BY, so I will have to see whether that causes a conflict.

If I end up with a follow-up issue, I will post it as a new question and refer back to this one.

I'd like to thank everyone for contributing to this solution. Nick's proposed solution, however, best addresses the requirements. Again, thanks!!

EEH

Excellent solution!!

Nick67:

Quick follow-up question. As part of the dataset, a 5-point Likert scale is used. For survey respondents who didn't know the answer to a question, a value of 999 is automatically entered. That criterion (999 = null) cannot change.

In order to exclude any 999 values (for calculation of average, variance, and standard deviation), I have been using the following expressions in my queries:

Average: Avg(IIf([LikertValue]=999,Null,[LikertValue]))

Based on the module (calculation for median and mode), I've tried to use the same principle to change the expression for, e.g., the median from/to:

From:

Median: MedianValue("Table1","LikertValue")

To:

Median: IIf(MedianValue("Table1","LikertValue")=999,Null,MedianValue("Table1","LikertValue"))

Unfortunately, that did not work and it still shows, e.g., a median of "502" when using the following sample data:

LikertValue

1

2

2

5

5

999

999

999

999

999

My question: How can I modify the expression so that the 999 values are excluded from calculating the median and/or mode?

Thanks,

EEH

Alter the recordset-opening command as appropriate. This is for the mode:

Set rs = CurrentDb.OpenRecordset("SELECT TOP 1 " & FieldName & " , Count(*) AS Mode FROM " & BaseObjectName & " where " & FieldName & " is not null GROUP BY " & FieldName & " ORDER BY Count(*) DESC," & FieldName & " DESC;", dbOpenDynaset, dbSeeChanges)

We are already excluding Null (the "is not null" condition).

Now throw out 999 too by adding a second condition to the WHERE clause:

Set rs = CurrentDb.OpenRecordset("SELECT TOP 1 " & FieldName & " , Count(*) AS Mode FROM " & BaseObjectName & " where " & FieldName & " is not null AND " & FieldName & " <> 999 GROUP BY " & FieldName & " ORDER BY Count(*) DESC," & FieldName & " DESC;", dbOpenDynaset, dbSeeChanges)
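For readers who cannot see the full module, the Set rs line for the mode might sit inside a function roughly like this. Treat it as a sketch under assumptions: the name ModeValue, the optional ExcludeValue parameter, the Freq alias, the bracket quoting and the read-only snapshot are mine, not the posted code.

```vba
' Sketch only: name, optional parameter and snapshot type are assumptions;
' the SQL follows the Set rs line for the mode shown above.
Public Function ModeValue(BaseObjectName As String, FieldName As String, _
                          Optional ExcludeValue As Variant) As Variant
    Dim rs As DAO.Recordset
    Dim sql As String

    sql = "SELECT TOP 1 [" & FieldName & "], Count(*) AS Freq " & _
          "FROM [" & BaseObjectName & "] " & _
          "WHERE [" & FieldName & "] IS NOT NULL"
    ' Only filter out the N/A code (e.g. 999) when the caller supplies one.
    If Not IsMissing(ExcludeValue) Then
        sql = sql & " AND [" & FieldName & "] <> " & ExcludeValue
    End If
    sql = sql & " GROUP BY [" & FieldName & "] " & _
                "ORDER BY Count(*) DESC, [" & FieldName & "] DESC;"

    Set rs = CurrentDb.OpenRecordset(sql, dbOpenSnapshot)
    If rs.EOF Then
        ModeValue = Null            ' no qualifying rows
    Else
        ModeValue = rs.Fields(0).Value
    End If
    rs.Close
End Function
```

In a query this could be called as ModeValue("Table1","LikertValue") to keep 999, or ModeValue("Table1","LikertValue",999) to drop it. With the ten sample values from the follow-up question, the first call should return 999 and the second 5, because 2 and 5 tie and the DESC sort picks the larger.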

This is for the median:

Set rs = CurrentDb.OpenRecordset("SELECT " & FieldName & " FROM " & BaseObjectName & " where " & FieldName & " is not null AND " & FieldName & " <> 999 ORDER BY " & FieldName & ";", dbOpenDynaset, dbSeeChanges)
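It may help to see why the IIf wrapper could not work. Sorted, the ten sample values are 1, 2, 2, 5, 5, 999, 999, 999, 999, 999, so the two middle values are 5 and 999 and the median is (5 + 999) / 2 = 502. That result is never equal to 999, so the IIf test never fires; the 999s have to be filtered out before the median is taken. A median function built around the filtered recordset could look roughly like the sketch below. Only the name MedianValue and its two arguments come from the thread; the body and the optional ExcludeValue parameter are assumptions.

```vba
' Sketch only: the body is an assumption; the thread shows just the Set rs line.
Public Function MedianValue(BaseObjectName As String, FieldName As String, _
                            Optional ExcludeValue As Variant) As Variant
    Dim rs As DAO.Recordset
    Dim sql As String
    Dim n As Long
    Dim offset As Long
    Dim lowerMid As Double

    sql = "SELECT [" & FieldName & "] FROM [" & BaseObjectName & "] " & _
          "WHERE [" & FieldName & "] IS NOT NULL"
    If Not IsMissing(ExcludeValue) Then
        sql = sql & " AND [" & FieldName & "] <> " & ExcludeValue
    End If
    sql = sql & " ORDER BY [" & FieldName & "];"

    Set rs = CurrentDb.OpenRecordset(sql, dbOpenSnapshot)
    If rs.EOF Then
        MedianValue = Null
    Else
        rs.MoveLast                 ' populate the recordset so RecordCount is exact
        n = rs.RecordCount
        rs.MoveFirst
        offset = (n - 1) \ 2        ' zero-based position of the lower middle row
        rs.Move offset
        lowerMid = rs.Fields(0).Value
        If n Mod 2 = 1 Then
            MedianValue = lowerMid  ' odd count: the single middle value
        Else
            rs.MoveNext             ' even count: average the two middle values
            MedianValue = (lowerMid + rs.Fields(0).Value) / 2
        End If
    End If
    rs.Close
End Function
```

With the sample data, MedianValue("Table1","LikertValue") should give 502 and MedianValue("Table1","LikertValue",999) should give 2. The optional argument also covers the "median with N/A" and "median without N/A" variants from one function.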

Nick67:

Brilliant!! I've modified the one for the median first... it now computes "2" for the dataset. :)

For the mode, I actually realized that I may need to include the '999' values. Here's why: As mentioned, '999' is an arbitrary value that stands for "I don't know the answer to the question" (or "N/A").

If the majority of survey respondents chose 999 for a specific answer, then I think I should return the mode as 999. Do you agree with that assumption? Or do you think it also should be excluded?

EEH

It's your data, and you are the one analyzing it.

The mode function is tricky to use in programming.

We haven't built a mode function that returns data about multi-modal data.

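We have told the function to return the same result each time for multi-modal data: one of the tied values, picked by the secondary sort on the field.

2,2,3,3,4,4,5,5 has modes of 2, 3, 4 and 5.

The function returns just one of them: 5 with the DESC sort shown in the recordset SQL, or 2 if the field is sorted ascending instead.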

Does the 999 value say something useful?

Then leave it in for mode

Is it so much higher than the 'real' values that it would skew the daylights out of a median calculation? Then leave it out.

You could build functions that count 999 and total records and return the % of 999 in the sample.

It's your call
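As a rough idea of the percentage function mentioned above, something along these lines might do. The name PercentNA and the NAValue default are assumptions; DCount is the built-in Access domain function.

```vba
' Hypothetical helper: percentage of rows whose FieldName equals the N/A code.
Public Function PercentNA(BaseObjectName As String, FieldName As String, _
                          Optional NAValue As Long = 999) As Variant
    Dim total As Long
    Dim naCount As Long

    total = DCount("*", "[" & BaseObjectName & "]")
    If total = 0 Then
        PercentNA = Null            ' avoid division by zero on an empty table
    Else
        naCount = DCount("*", "[" & BaseObjectName & "]", _
                         "[" & FieldName & "] = " & NAValue)
        PercentNA = 100 * naCount / total
    End If
End Function
```

For the ten sample values, PercentNA("Table1","LikertValue") should come out at 50, which can sit next to the mode and median in the same query.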

Nick67:

Thank you for the additional feedback... I appreciate it.

999 is used for coding any N/A (or "I don't know the answer") responses. So, in the event the majority of participants selected this value, having such information (as part of the mode) is of value to us as well.

For the median, I may duplicate the function (obviously renaming function/variable names) and then have "median with N/A" and "median without N/A".

Either way, your solutions for calculating both median and mode are most elegant.

Cheers,

EEH

Glad to be of service

Nick67
