Solved

What is the proper method to compare histograms (or statistical data) between two data sets?

Posted on 2013-12-24
Last Modified: 2014-01-22
Hi,

I have a set of data for Company A and Company B which tells me the number of times people visited each company in the month of March.

I'm seeing a stark difference between the data for each company, but I don't know how to define it or explain it mathematically (or statistically?).

In my first column of data, I have the number of times a person visited the store.
In the second column of data, I have the total number of people who visited the store that many times.

When grouping these data points into buckets, I notice quite a different drop off rate (I don't think that's the correct term) from one bucket to the next. What I'd like to know is how I can better explain this phenomenon, with the right terms to use, and also would like guidance on the correct type of analysis to do.

I've included my spreadsheet in the attachment, and catch on to concepts quickly - just need some education and guidance here please.

Thank you!
ExpExchange-Question.xlsx
Question by:lizziesmalls23

Accepted Solution

by:
mlmcc earned 500 total points
Let me make certain I understand the data.

For visit number n, the visitors figure is the number of distinct visitors who visited exactly n times.

So for Store A you have 174,727 different people who visited exactly once and 121,058 different people who visited exactly twice. The 121,058 are not included in the 174,727.
So Store A had 1,345,997 different people visit.

Store B had 913,080 different people visit.
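
To make that interpretation concrete, here is a minimal Python sketch. It assumes the spreadsheet's two columns have been loaded as a mapping from visit count to number of distinct visitors. Only the first two Store A values come from the figures above; the entries for 3 and 4 visits and the helper names `total_visitors` and `step_ratios` are made up for illustration. The step ratio is just one simple way to put a number on the "drop off" described in the question, not the only possible measure.

```python
def total_visitors(counts):
    """Sum distinct visitors over all visit counts (the buckets do not overlap)."""
    return sum(counts.values())


def step_ratios(counts):
    """For each n, ratio of people with exactly n+1 visits to people with exactly n visits."""
    return {
        n: counts[n + 1] / counts[n]
        for n in sorted(counts)
        if n + 1 in counts and counts[n] > 0
    }


# Values for 1 and 2 visits are from the thread; 3 and 4 are hypothetical placeholders.
store_a = {1: 174727, 2: 121058, 3: 90000, 4: 70000}

print(total_visitors(store_a))
print(step_ratios(store_a))
```

Running the same two functions on each store's full column would give one total and one list of step ratios per store, which can then be compared side by side.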

The analysis you do is driven by the questions being asked and the information you need to provide.

One question to ask is why did one store (assuming my interpretation of the data is correct) have roughly 47% more visitors?
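
As a quick check on that figure, the sketch below computes the relative difference from the two totals quoted above. The function name `percent_more` is hypothetical, and the totals are valid only if my reading of the data is correct.

```python
def percent_more(a, b):
    """Percentage by which a exceeds b, relative to b."""
    return (a - b) / b * 100.0


store_a_total = 1_345_997  # total distinct visitors, Store A (from the thread)
store_b_total = 913_080    # total distinct visitors, Store B (from the thread)

print(f"Store A had {percent_more(store_a_total, store_b_total):.1f}% more visitors")
```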

What questions are you trying to answer?
What issues need to be addressed?

Other issues to consider that will affect the numbers:
Are these the same type of store?
  Different franchises of the same chain?
Are they in the same city?
  If not, are they in similar cities?
Are they in similar neighborhoods?
Are they catering to the same socioeconomic class?
What is the mix of competing stores in their neighborhoods?

Before commenting on the data itself, these types of questions need to be answered.

Example
Stores A and B are in the same city on opposite sides of town: same store (MyMart), same size, same basic inventory of goods. They are trying to provide inexpensive goods to the mass market.
Store A is in a lower to middle middle class neighborhood, heavily populated with blue collar workers. Store B is in an upper middle class neighborhood with a majority of professional workers.

MyMart may not have the same appeal in neighborhood B. Shoppers there may be looking for better merchandise: dress shirts and pants instead of khakis and jeans.
People in neighborhood B may think of their time as more valuable, so they try to get more out of each visit. They go only when they really need something. They plan better, so they get all the groceries in one trip per week instead of making daily trips. They have more money or credit, so they can buy for a full week or two at a time.

mlmcc