Filtering an exported Google Analytics report

Recently I recognized that my GA statistics were skewed by a few visitors whose visits accounted for a disproportionately large percentage of total sessions.  I've applied filters now, so that data going forward will exclude those visits.  I understand that filters can't be applied retroactively, but I've heard that data can be exported, and then filtered somehow.  If this is so, I'd appreciate step-by-step guidance on how to export data, and remove visits from certain cities, so that the bounce rate, duration of sessions and number of pages visited could be evaluated without the bias of those frequent visitors.
ddantesAsked:
Who is Participating?
 
Joe Winograd, Fellow&MVEConnect With a Mentor DeveloperCommented:
Hi David,

At GA, in the navigation on the left:

Audience
Geo
Location

Then in the links for the fields, click City:

ga cityAt the bottom, click the "Show rows" drop-down and select 5000 (unfortunately, there's no way to show all rows, so if there are more than 5,000, you'll have to piece them together):

ga show rowsThen at the top, click Export>Excel (XLSX):

ga exportYou'll now have an Excel spreadsheet with two tabs — Summary and Dataset1. Click Dataset1 and you'll have something like this:

ga spreadsheet citiesAt the bottom in the last row will be the totals:

ga spreadsheet bottomHowever, they are not formulas, so you'll need to build your own formulas, but that shouldn't be tough. Sort the spreadsheet by the City column and you'll be able to delete easily all the rows containing the cities that you want to exclude. Then delete the last row with the hard-coded totals and create formulas to give you the statistics that you want. Regards, Joe
0
 
Joe Winograd, Fellow&MVEDeveloperCommented:
Hi David,
I'm working on a solution for you. Should be able to post it here in less than a half-hour. Regards, Joe
0
 
ddantesAuthor Commented:
Hi Joe:  I had a premonition you might answer this question.  Thanks for your instructions!  Now that I've exported the data and removed rows from cities I wish to eliminate, I need to create formulas to replace the hard-coded data in the last row.  Sorry to admit, I'm clueless as to how to create those formulas.  I'm interested in bounce rate, time spent on site, and number of pages visited.  I increased the points for this question because the scope of it is more detailed than I had anticipated.
0
The IT Degree for Career Advancement

Earn your B.S. in Network Operations and Security and become a network and IT security expert. This WGU degree program curriculum was designed with tech-savvy, self-motivated students in mind – allowing you to use your technical expertise, to address real-world business problems.

 
Joe Winograd, Fellow&MVEDeveloperCommented:
Hi David,
The formulas below show the last row with a city as 73, meaning the Totals row is 74. This is arbitrary — just wanted a noticeable row number. Of course, you will change that number to whatever row in your spreadsheet has the last city (and plus one for the Totals row).

Bounce rate:

(1) At the end of Sessions (in cell B74), put in a Sum formula:

=SUM(B2:B73)

This will now be the total number of sessions with the "bad" cities removed.

(2) Add a new column (K) called Bounced Sessions. The formula for it is Sessions (column B) times Bounce Rate (column E). Put this in cell K2:

=B2*E2

(3) Do a Copy Down from K2 to K73. This will automatically adjust the formula, so K3 will be =B3*E3, K4 will be =B4*K4, etc. This means each cell in the K column will have the number of bounced sessions for each city.

(4) At the end of the Bounced Sessions column (in cell K74), put in a SUM formula:

=SUM(K2:K73)

This is the total number of bounced sessions with the "bad" cities removed.

(5) The overall Bounce Rate is the total number of Bounced Sessions (K74) divided by the total number of Sessions (B74):

=K74/B74

Put this in whatever cell you want, but, of course, after the last row or column so it doesn't interfere with calculations.

While you're working on Bounce Rate, I'll think about Pages/Session and Session Duration. Regards, Joe
0
 
Joe Winograd, Fellow&MVEDeveloperCommented:
Pages/Session:

(1) Add a new column (L) called Pages. The formula for it is Sessions (column B) times Pages/Session (column F). Put this in cell L2:

=B2*F2

(3) Do a Copy Down from L2 to L73. This will automatically adjust the formula, so L3 will be =B3*F3, L4 will be =B4*L4, etc. This means each cell in the L column will have the number of pages for each city.

(4) At the end of the Pages column (in cell L74), put in a SUM formula:

=SUM(L2:L73)

This is the total number of pages with the "bad" cities removed.

(5) The overall pages per session is the total number of Pages (L74) divided by the total number of Sessions (B74):

=L74/B74

Put this in whatever cell you want, but, of course, after the last row or column so it doesn't interfere with calculations.

Will work on Avg. Session Duration soon, but need to address an issue right now on the home front. Regards, Joe
0
 
ddantesAuthor Commented:
Perfect results so far, Joe.
0
 
Joe Winograd, Fellow&MVEDeveloperCommented:
> Perfect results so far, Joe.

Glad to hear it!

Avg. Session Duration:

(1) Add a new column (M) called Durations. The formula for it is Sessions (column B) times Avg. Session Duration (column G). Put this in cell M2:

=B2*G2

(3) Do a Copy Down from M2 to M73. This will automatically adjust the formula, so M3 will be =B3*G3, M4 will be =B4*G4, etc. This means each cell in the M column will have the duration of all sessions for each city.

(4) At the end of the Durations column (in cell M74), put in a SUM formula:

=SUM(M2:M73)

This is the total duration of all sessions with the "bad" cities removed.

(5) The overall average session duration is the total duration of all sessions divided by the total number of Sessions (B74):

=M74/B74

Put this in whatever cell you want, but, of course, after the last row or column so it doesn't interfere with calculations.

In the end, all three metrics could be handled with the same technique, but that wasn't obvious going in — I needed to look at exactly what data GA was capturing before I was able to reach that conclusion. Regards, Joe
0
 
ddantesAuthor Commented:
Thanks, Joe.  I've been relying on data which is skewed by sampling error, and drawing invalid inferences from it.  This will be a big help.
0
 
Joe Winograd, Fellow&MVEDeveloperCommented:
David,
You're very welcome. As always, happy to help — or at least try to. Regards, Joe
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.