• Status: Solved
• Priority: Medium
• Security: Public
• Views: 106

# Use Excel to solve somewhat complex data sorting and analysis

I am work for a fire department and we are trying to do some data analysis from our call information. We have been successful using this data in Excel to answer simple questions, but one question has been vexing us for some time. That's the ability to count how many times per day/month/year/etc we have certain units out of service all at the same time.

Attached is a small amount of data with some items redacted from our raw data dump (which is about 40k lines per year). Column U has a description of which unit is assigned, and columns V and Y have the start and end time of the calls. I'm trying to answer questions like, how many times in a given time period are BAM1, BAM5, BAM6 all unavailable at the same time (meaning that the start and stop times of the calls they are on all would have them all unavailable at the same time).

We know this happens a lot, and there are other sets of units we want to apply this logic to, but I just don't know how to use this data to answer this question. I'm hoping to get a number over a year for my larger data sets and then know many times per day different unit sets are all unavailable.
Sample-FD-Data-Set.xlsx
0
ssittig
• 13
• 11
• 11
2 Solutions

Commented:
Not clear can you further clarify what is needed ?

Mre your request is for units BAM1, BAM5, BAM6 but the data sample is missing BAM5 at least provide the minimum so if we build a formula we are bale to test it.

and appreciate in the sample you provide give us a clear example like as row so and so = so then we should get abc ... etc..

This would very much expedite an answer for you.

gowflow
0

Commented:
Also when you say all at the same time unavailable what exactly in this data shows this information. Col V and Y is dispatch time and finished time ?

gowflow
0

Commented:
if you put this formula in AP2 and drag it down
=COUNTIFS(U:U,"=BAM6",V:V,">=" & V2,Y:Y,"<=" & Y2)
it should give you a count each time BAM6 is not available

if you put this in AQ2 and drag down
=COUNTIFS(U:U,"=BAM1",U:U,"=BAM5",U:U,"=BAM6",V:V,">=" & V2,Y:Y,"<=" & Y2)
it should give you a count each time all of BAM1, BAM5 and BAM6 unavailable.

As you notice for the latter as there is not data for BAM5 you are getting a 0 all the way.
You may replace BAM6 by BAM1 in the first formula and see the results.

gowflow
0

Author Commented:
Gowflow,

Thanks for taking a shot at this. I can put up a bigger data sample if that's helpful. The whole data set for a year is about a 13MB excel file.

To clarify what is needed, here is what I'm looking for:

If BAM1 (Column U) goes out on a call (Column B) from 0900 (Column V) to 01000 (Column Y) and BAM6 goes out on a call from 0930 to 1030, then they are both unavailable (out of service in our terminology) from 0930 to 1000. This happens I would count this as 1 time that BAM1 and BAM6 are both unavailable. This takes calculating between multiple rows. I dont think you have to calculate Column B, only overlapping out of service times.

I'm hoping to use this logic to develop several counting situations (BAM1/BAM5/BAM6), (BAE1/BAE2), and (BAE4,BAE5,BAE6,BAE7) for starters and maybe some more. Those are the most important ones.
0

Commented:
Do you need the instances or the exact time slots where all three are out on call?

It would be nice to see some BAM5 rows.

Do you have MS Access?
0

Author Commented:
I just need the number of instances, not the actual time slot.

Attached is a months worth of data so you will have all kinds of stuff in there to play around with.

Yes, I have Access, and I realize that it may be the trick, but I similarly don't know what kind of Query i would use to produce this.
0

Commented:
0

Author Commented:
0

Commented:
I imported the Excel data into an Access database, creating an autonumber field (ID) in the process.  When I ran the following query, no rows were returned.
``````SELECT Q_28617288.[Inc Num], Q_28617288.[Apparatus Assigned], Q_28617288.[Dispatch Date], Q_28617288.[Available Date], Q_28617288_1.[Inc Num], Q_28617288_1.[Apparatus Assigned], Q_28617288_1.[Dispatch Date], Q_28617288_1.[Available Date], Q_28617288_2.[Inc Num], Q_28617288_2.[Apparatus Assigned], Q_28617288_2.[Dispatch Date], Q_28617288_2.[Available Date]

FROM Q_28617288, Q_28617288 AS Q_28617288_1, Q_28617288 AS Q_28617288_2

WHERE (
(Q_28617288.[Apparatus Assigned] In ("BAM1","BAM5","BAM6"))

AND Q_28617288_1.[Apparatus Assigned] In ("BAM1","BAM5","BAM6")
And Q_28617288.ID<[Q_28617288_1].ID
AND (
Q_28617288_1.[Dispatch Date] Between [Q_28617288].[Dispatch Date] And [Q_28617288].[Available Date]
OR
Q_28617288_1.[Available Date] Between [Q_28617288].[Dispatch Date] And [Q_28617288].[Available Date]
)

AND Q_28617288_2.[Apparatus Assigned] In ("BAM1","BAM5","BAM6")
And Q_28617288_1.ID<[Q_28617288_2].ID
AND (
Q_28617288_2.[Dispatch Date] Between [Q_28617288_1].[Dispatch Date] And [Q_28617288_1].[Available Date]
OR
Q_28617288_2.[Available Date] Between [Q_28617288_1].[Dispatch Date] And [Q_28617288_1].[Available Date]
)

)

ORDER BY Q_28617288.[Dispatch Date]
``````
0

Author Commented:
Just to validate your logic, if you run that query with just BAE1 and BAE2 do you get a count greater than zero?
0

Commented:
This two-item version of the query:
``````SELECT Q_28617288.[Inc Num], Q_28617288.[Apparatus Assigned],
Q_28617288.[Dispatch Date], Q_28617288.[Available Date],
Q_28617288_1.[Inc Num], Q_28617288_1.[Apparatus Assigned],
Q_28617288_1.[Dispatch Date], Q_28617288_1.[Available Date]

FROM Q_28617288, Q_28617288 AS Q_28617288_1

WHERE (
(Q_28617288.[Apparatus Assigned] In ("BAE1","BAE2"))

AND Q_28617288_1.[Apparatus Assigned] In ("BAE1","BAE2")
And Q_28617288.ID<[Q_28617288_1].ID
AND (
Q_28617288_1.[Dispatch Date] Between [Q_28617288].[Dispatch Date] And [Q_28617288].[Available Date]
OR
Q_28617288_1.[Available Date] Between [Q_28617288].[Dispatch Date] And [Q_28617288].[Available Date]
)

)

ORDER BY Q_28617288.[Dispatch Date]
``````
Yielded the following three row result:
``````Q_28617288.Inc Num	Q_28617288.Apparatus Assigned	Q_28617288.Dispatch Date	Q_28617288.Available Date	Q_28617288_1.Inc Num	Q_28617288_1.Apparatus Assigned	Q_28617288_1.Dispatch Date	Q_28617288_1.Available Date
BA13000092      	BAE1                        	1/2/13 17:07            	1/2/13 17:11            	BA13000092          	BAE2                        	1/2/13 17:07            	1/2/13 17:11
BA13000633       	BAE1                        	1/12/13 15:45            	1/12/13 15:53            	BA13000633          	BAE2                        	1/12/13 15:45            	1/12/13 15:47
BA13001653      	BAE1                        	1/31/13 11:44            	1/31/13 11:51            	BA13001653          	BAE2                        	1/31/13 11:44            	1/31/13 11:55
``````
0

Commented:
ok I am working for you on a different kind of solution that you will be able to apply for all your unit and that is flexible but I need sometime if you are interested.
Let me know
gowflow
0

Commented:
In this approach, I used a tally table (numbers table) to iterate the minutes in the time intervals.  In this example, the tallytable has values between 1 and 1000.  While I could have been as precise as one second in my iteration, performance and human readability (too many rows) were degraded.
``````SELECT DateAdd("n",[tallytable].[ID]-1,[dispatch date]) AS IntervalMin
FROM Q_28617288, tallytable
WHERE (((Q_28617288.[Apparatus Assigned]) In ("BAE1","BAE2")) AND ((DateAdd("n",[tallytable].[ID]-1,[dispatch date])) Between [dispatch date] And [available date]))
HAVING (((Count(Q_28617288.[Apparatus Assigned]))=2));
``````
The query produced the following results.
``````IntervalMin
1/2/2013 5:07:02 PM
1/2/2013 5:08:02 PM
1/2/2013 5:09:02 PM
1/2/2013 5:10:02 PM
1/12/2013 3:45:17 PM
1/12/2013 3:46:17 PM
1/31/2013 11:44:54 AM
1/31/2013 11:45:54 AM
1/31/2013 11:46:54 AM
1/31/2013 11:47:54 AM
1/31/2013 11:48:54 AM
1/31/2013 11:49:54 AM
1/31/2013 11:50:54 AM
``````
From these results, we see that both vehicles were out between these times.
1/2/2013 5:07:02 PM and 1/2/2013 5:10:02 PM
1/12/2013 3:45:17 PM and 1/12/2013 3:46:17 PM
1/31/2013 11:44:54 AM and 1/31/2013 11:50:54 AM

It would be fairly easy to create a VBA routine that would create and execute a similar query, based on the size of the array passed to it.
Note: the BAM1,5,6 test returned no rows.

0

Commented:
If you wanted to use a table for your vehicle list, this result might not need any VBA code.
0

Commented:
To illustrate this, I created a Q_28617288_Vehicles with a single text column for the vehicle name.  I populated the table with "BAE1" and "BAE2" row data.  The query can be simplified to the following and does not need to be created with VBA.
``````SELECT DateAdd("n",[tallytable].[ID]-1,[dispatch date]) AS IntervalMin
FROM Q_28617288, tallytable, Q_28617288_Vehicles
WHERE (((Q_28617288.[Apparatus Assigned]) = Q_28617288_Vehicles.VehicleName)
AND ((DateAdd("n",[tallytable].[ID]-1,[dispatch date])) Between [dispatch date] And [available date]))
HAVING (((Count(Q_28617288.[Apparatus Assigned]))=(Select Count(*) From Q_28617288_Vehicles)));
``````
0

Author Commented:
GowFlow,

I'm certainly interested in any Excel solution that is really simple to deploy. That's a little easier for me to use on the fly.

This being said:

Aikimark,

Your results are exactly what I'm looking to do. I'm not surprised that you don't have any results for BAM1/BAM5/BAM6, we've had some deployment changes that have that scenario occurring quite a bit more frequently but we have been unable to quantify what that looks like. This is Jan 13 data, the Jan 15 data is quite a bit different for this combination.

And bravo to you sir, that's amazing work. I'll read your article that you posted and see if I can figure it out. I'm not going to close this question quite yet as I'd like to see what GowFlow comes up with and I may have some follow up questions on how to get your queries into my Access.
0

Commented:
I looked at the max interval in the posted data and see it is less than 5000 seconds.  If you created a tally table with enough values to cover the worst/longest case dispatch, you could calculate with one second accuracy and not take too much of a performance hit.

I'm tempted to try to remove the seconds part of the datetime values, since we are at the minutes level of granularity.  I'll wait to see what your testing discovers.
0

Commented:
Here is my solution but for sure it is more visual and can be built on further but this will give you a quick overview on what are the units that overlap others per day. Based on this data one could just create graph or other formulas that would be more cater to specific needs.

This is meant for a very fast overview and time is up or minus the 5 minutes. like I put intervals be 5 minutes and tried to reduce the length so it would minimize as much as possible scrolling.

Simply run the macro called DisplayHoursMinutes and it will build the new layout basis the data in sheet1. Make sure sheet result is always there. It will clean it at every run.

gowflow
January-2013-Fire-Call-Data.xlsm
0

Commented:
This is an enhanced version whereas the unit dispatched will be colored in blue if it is simultaneously sent with an other unit you will see blue for those who time is overlapping the ones that are not blue means they were sent on time where no other unit during that specific day time overlapped.

gowflow
January-2013-Fire-Call-Data-V01.xlsm
0

Commented:
Here is a VBA routine to show the everyone-dispatched condition
``````Sub Q_28617288(parmVehicles)
'parmVehicles should be an array of vehicle names
Dim rng As Range
Dim dicMinutes As Object
Dim dicVehicles As Object
Dim vItem As Variant
Dim strTime As String
Dim lngMins As Long
Dim dtStart As Date, dtEnd As Date
Set dicMinutes = CreateObject("scripting.dictionary")
Set dicVehicles = CreateObject("scripting.dictionary")
For Each vItem In parmVehicles
If dicVehicles.exists(vItem) Then
Else
End If
Next

For Each rng In ActiveSheet.Range(ActiveSheet.Range("U1"), ActiveSheet.Range("U1").End(xlDown))
If dicVehicles.exists(rng.Value) Then
dtStart = DateAdd("s", -Second(rng.Offset(0, 1).Value), rng.Offset(0, 1).Value)
dtEnd = DateAdd("s", -Second(rng.Offset(0, 4).Value), rng.Offset(0, 4).Value)
For lngMins = 0 To DateDiff("n", dtStart, dtEnd)
strTime = Format(DateAdd("n", lngMins, dtStart), "yymmddhhnn")
If dicMinutes.exists(strTime) Then
dicMinutes(strTime) = dicMinutes(strTime) + 1
Else
End If
Next
End If
Next
'List the times (to the minute) when all vehicles were dispatched
For Each vItem In dicMinutes
If dicMinutes(vItem) = dicVehicles.Count Then
Debug.Print Format(DateSerial(Mid(vItem, 1, 2), Mid(vItem, 3, 2), Mid(vItem, 5, 2)) + TimeSerial(Mid(vItem, 7, 2), Mid(vItem, 9, 2), 0), "m/d/yyyy hh:nn")
End If
Next

End Sub
``````

When I invoke the routine like this:
Q_28617288 array("BAE1","BAE2")

It produces the following results in the Immediate window:
``````1/2/2013 17:07
1/2/2013 17:08
1/2/2013 17:09
1/2/2013 17:10
1/2/2013 17:11
1/5/2013 02:10
1/12/2013 15:45
1/12/2013 15:46
1/12/2013 15:47
1/12/2013 15:56
1/26/2013 17:24
1/31/2013 11:44
1/31/2013 11:45
1/31/2013 11:46
1/31/2013 11:47
1/31/2013 11:48
1/31/2013 11:49
1/31/2013 11:50
1/31/2013 11:51
``````
0

Author Commented:
Gowflow,

I like the idea of your spreadsheet, but I'm not sure that its giving me an a count of how many time this is happening. And I don't think the 5 min intervals work for what we are doing.

Aikimark,

You clearly have the solution. I'll admit that I'm not able to follow the entire path of how you get from the data table I gave you, to where you are able to calculate the number of instances. But you have exactly what I'm looking for. Would you mind sharing the DB you made so I can see if I can mentally reverse engineer it? I'm an IT guy, not a DB/programmer guy and I've learned just how amazingly far apart those two things are over the years.
0

Commented:
Well I already mentioned it this is not final this is a start this is more visual and this is not specific to certain units but gives an total overview of the whole data.

now if you want specifics we can build on that by adding a dropdown and combining more VBA to reach to what you specifically need. The +/- 5min is for visuals but not for specifics.

You had mentioned 1 scenario of BAM1,BAM5 and BAM6 but you had also mentioned couple more scenarios meaning that his was variable which for sure need catering for.

Here you go !!
I like challenges.

In this version I added a listbox in sheet result that will populate all the distinct units and you have 2 buttons
Button Clear:
Will clear selections that you do in a list box you can select several items to process if they follow press shift and select them if they don't follow keep pressing CTRL on every item you select.

Button Compute
When activate it will compute the data you selected in the list and here you have 2 options to display the data the check box that says exact selection if it is ticked then it will give you ONLY the instances of all items selected if found in sheet compute. If it is not selected then it will display all options at least where there are 2 units that overlap or more.

Pls play with it and let me know.
gowlfow
January-2013-Fire-Call-Data-V02.xlsm
0

Commented:
@ssittig

Have you tried my VBA code example?
Does it give you the correct results?

Do you want to see the database just for learning purposes?
0

Commented:
Here is a database with the tables and revised query.
Q-28617288.mdb
0

Author Commented:
Superb work.

You both have come up with great solutions for what I'm trying to do here. I'm still working on how to automate a count of those numbers of times. I'd love to award you both full points, but I'll have to settle for splitting them. Truly amazing work, and a nice foundation for me to learn how to use this stuff. Thank you.
0

Commented:

I'm still working on how to automate a count of those numbers of times.

What do you mean by count as number of times they all unavailable at the same time ? If in the solution I provided it is an easy formula to add at the end. Case you need this pls advise I will incorporate it or else then pls clarify.

gowflow
0

Author Commented:
I mean a number of instances that it occurs.

For example when I chose BAE1 and BAE2 on your spreadhsheet and hit compute, I have 3 results on the compute page where this phenomenon is occurring.

-S
0

Commented:
ok got it. Easy try this version will give you what you want.
gowflow
January-2013-Fire-Call-Data-V03.xlsm
0

Author Commented:
gowflow,

That's totally what I'm looking for. I'd like to ask if its possible for one additional refinement:

Attached is a larger subset of data. If you pick a combination of three units (lets just say BAM1, BAM5, BAM6), it will return results where all three are out, but also results where you have BAM6 out and then BAM5 out twice concurrently. (run this dataset against your spreadsheet and then see 1/11/2013 in the results).

How can we restrict the results to only show when the selected units are all out together?
2013-January-Unabridged.xlsx
0

Commented:
Yes I can see that but looking at the data they are legitimate. Ie it is not a fault but you now are saying that you only want to see the instance of each unit per day once ? or I am not reading you correctly ?

Lets take the examples:
BAM1      1/10/2013 10:27      1/10/2013 12:19
BAM5      1/10/2013 11:51      1/10/2013 11:56
BAM5      1/10/2013 11:56      1/10/2013 12:09

the second dispatch of BAM5 is still in the same slot as you can see BAM1 avail only at 12:19 and BAM5 out on 11:56 still legitimate as in 12:09 and still within the limit.
So what do you want to see here ? this one rejected coz BAM6 is not there I suppose right ?

This I would tend to agree that we need to look to make sure all selected items are out at least once. This would be ok with me to fix,

Now here comes the other scenario
BAM6      1/15/2013 18:45      1/15/2013 20:14
BAM5      1/15/2013 18:55      1/15/2013 19:07
BAM5      1/15/2013 19:14      1/15/2013 19:22
BAM1      1/15/2013 19:17      1/15/2013 19:31

what do you want to see here ? is it ok like this to show twice BAM5 as also legitimate here as well.

Let me know.
gowflow
0

Commented:
This version has been adjusted as found that there was a small hick-up in the previous as the date were wrongly compared.

Still the issue described in my previous post is not yet handled as waiting for your comments. Meanwhile you can see now that the results for BAM1, BAM5, BAM6 are now more precise (you had 29 occurrences now you have 45)

Visuals have been adjusted to prevent resizing of controls as well.
gowlfow
January-2013-Fire-Call-Data-V04.xlsm
0

Author Commented:
gowflow.

You are correct with your assumptions. Its OK to have one vehicle listed twice, but not having all of the selected vehicles would be grounds for that not to be an occurrence. I suspect there will be more than one occurrence a day for many of days, so I wouldn't want to limit it to once per day.
0

Commented:
ok great then we agree if we choose a selection of say 4 units and ask for all items check box then we will only list items where there are at least the 4 units (regardless if same unit is repeated twice in the time frame selected)

Prior comment I meant time frame selected and not same day actually it work by time frame.

I think you will like this version !!!
Let me know.
gowflow
January-2013-Fire-Call-Data-V05.xlsm
0

Author Commented:
Thats exactly right. Fantastic work gowflow. You are a true genius!

THANK YOU!
0

Commented:
Please feel free to put a link in here for any other question you may need help with.

BTW I was impressed with the figures when running BAM1,BAM5,BAM6 so many instances when same units go back and forth in the same time frame !!!

gowflow
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.