Link to home
Start Free TrialLog in
Avatar of Shino_skay
Shino_skayFlag for United States of America

asked on

VBA - Efficient Solution to count 2 rows based on multiple criterias.

Hi,
I am looking for an efficient VBA method to solve a problem at work. I enclosed a reference spreadsheet that illustrates what I'm trying to do.

We do a lot of trades with other local currencies and 99% of the time, 1 leg of the trade (pay or receive) is USD denomiated. Many of the local currencies have a synthetic currency such as Argentina Peso (ARS) and its synthetic CER counterparty.

The source data I work with has two columns, the PAY currency and the Rec currency. Last year, I came up with a silly logic that captures the true trades for the given period. The logic is below and also included in the refrence spreadsheet.

Group A (sum the total)
1) Count if all of Pay Currency Column = Prim CCY
2) Count if all of Rec Currency Column = Syn CCY

Group B (sum the total then subtract from Group A)
1) Sumproduct all of Pay Currency Column = Prim CCY & all of Rec Currency Column  = Prim CCY
2) Sumproduct all of Pay Currency Column = Syn CCY & all of Rec Currency Column = Syn CCY
3) Sumproduct all of Pay Currency Column = Prim CCY & all of Rec Currency Column = Syn CCY
4) Sumproduct all of Pay Currency Column = Syn CCY & all of Rec Currency Column = Prim CCY

The results are the true trade volume by region. I would then use a list of products (14 and counting) and do this same loop but this time by product. It takes about 4 mins with my formulas due to the fact I have to set the range to be over 5000 rows due to the fact that data range flucates so much.

I've gotten to the point in VBA where I can simply replace each formula in the cells with the identical VBA version but I feel this is not the best way to utilize VBA I am hoping someone on this forum can provide something that would be a bit more efficient and potentially more variable-based.

The second problem I have pertains to the 1% of the time where trades are not USD denominated on 1 side (pay or rec). There are trades that are let's say for example, GBP vs MXN or PEN vs EUR. In these cases, I have a list of 4 tiers where countries are assigned a # to identify its hiearchy. Argentina ARS are considered Tier 1 while the EUR is Tier 4, so if any trades that are Tier 4 vs Tier 1 (or Tier 1 vs Tier 4) the Tier 1 country is awarded the trade. I do not have a process setup for this and I just manually intervene when I notice a discrepancies in the figures.

Thanks everyone.

Sampe-migration.xls
Avatar of byundt
byundt
Flag of United States of America image

I suggest adding three auxiliary columns. The first would have a formula returning the currency region on the Pay side, regardless whether real or synthetic. The second would have a formula returning the currency region on the Rec side, again regardless whether real or synthetic. The third would have a formula that returns half the amount if the Pay and Rec currency regions are the same and the full amount if they are different. These formulas will be fast because they look only at the values on a single row.

To get your total amount of trade within a currency region, you would need two SUMIF formulas. The first one would be for the weighted amount and Pay currency region=target. The second would be for the weighted amount and Rec currency region = target. There would no longer be a need for the formulas in Group B. I bet these two formulas would be pretty fast, as well.

If you want to add in the complexity of handling the 1% of cases that aren't denominated in USD, then that would be incorporated into the formula in the third auxiliary column. In essence, you would evaluate the tier difference, then apply a conversion rate from a VLOOKUP. This would still be pretty fast, because all the grunting would be done only on a single row, and only in those cases that really need it.

As an alternative to the SUMIF, you could use a PivotTable. This will prepare the entire report for you.

Needless to say, either of the above approaches would be pretty simple to automate with VBA.
Avatar of Shino_skay

ASKER

Hi Brad, on the way home I realized I didn't fully describe the 2nd portion of the count which is to count each known product by each known region. Same method of looping except longer and I'm using sumproduct for Group A (product and 1 currency, then the other). So that's  14 products by 2x20 regions. Lots of cramming.

I'll quit whining here and start playing around with your 3 column idea and sumif. Anything better then the 20meg file I have currently. Thanks again. I also took a glance at your profile, MIT alum eh. That explains a lot.
After looking at your workbook, I see that you were building the table for each type of derivative. As a result, I switched from SUMIF to a SUMPRODUCT formula like:
=SUMPRODUCT((('Enter Source Data'!$G$2:$G$10000=$AP27)+('Enter Source Data'!$H$2:$H$10000=$AP27))*('Enter Source Data'!$C$2:$C$10000=SUBSTITUTE(AQ$26," ","")),'Enter Source Data'!$I$2:$I$10000)

A full recalc is under 5 seconds with your sample data.

The fly in the ointment is the fact that I am getting different results from you. I'm not quite sure what you are trying to calculate, but it appears as though you want to assign each trade to a single currency region.
Sample-migrationQ23567764.xls
I'm having trouble understanding the last part of your statement.

"but it appears as though you want to assign each trade to a single currency region."

After I used the the remove book vba method that you helped me with earlier, it was determined there are 679 unique trades (I verified that with my formula-based method). So now  I have to assign each trade to a specific region (Brazil, Japan, etc) and its associated product type. The output is on the "Results" tab.

If you're doing the same and getting different #'s, then I must have entered a wrong formula in my sample upload. I didn't know how to copy and paste the formulas without it actually referring to 20 meg file so I just did a quick type job. I'll double check the figures in the morning.

Since the last bit of data was on row 680, there must have been 679 unique trades.

If all you want to do is count trades, then does a Mexican Peso/US Dollar trade count as half of a trade for each currency? I didn't see any 1/2 values in your table.
oh haha I get it. Sorry for being really terrible at explaining things.

A mexican Peso Receive versus a USD Pay would be considered a mexican region trade. And of course what product it is. In the event there's no USD leg, it would be by the tiers, where the lowest # tier would be credited the trade.
there are no half trades haha, sorry for the confusion.
I feel another auxiliary column coming along--with a formula to define which currency region gets credit for the trade. Could you define the logic for classifying a trade? I'm thinking there might need to be a lookup table giving the tier number. This might be done using an extra column in the named range CurrencyRegion on worksheet Calculations in my posted version of your workbook.

In case I haven't made it explicit, my strategy on this question is to solve it first using worksheet formulas. Once that has been done, the next step is to decide how to automate it.

How many rows of data do you have, typically?
ASKER CERTIFIED SOLUTION
Avatar of byundt
byundt
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
rows can fluctuate depending on the time period. I use it weekly and sometimes monthly.  Weekly goes to about 1000 rows, monthly, 4500-5000. I haven't used it quarterly yet but I feel that if there was ever a need, the code should be flexible enough to capture it. The formula based version, I had 10,000 rows to make the scope broad enough.

The logic for classifying a trade would be based on the pay and rec currency columns. Regardless of which column, Tier 1 currencies have priorities over other tiers. There hasn't been a case yet when a tier 1 currency would be traded against another tier 1.

99% of the trades done are between a local currency and USD. Therefore, capturing the primary or synthetic would accurately reflect a true unique trade. The reconciliation process would be to ensure that all unique trades equal to the source data (minus all the books that were not on the list).

oh, i didn't see you just replied. Yea, the system generated source data has Swap_Options but in the presentation chart, my manager didn't want the underscore. Hey thanks a lot brad, you've been glancing at this since 5pm est. I wish there was something I could help you with besides giving you points.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi Brad, i must have been so out of it last night that I didn't notice you made an attachment with the formulas. Awesome! The formulas are a bit complex tho, it'll take me a bit of googling to understand the index and match. I remembered reading those 2 together can create a 2 directional vlookup.

Your formula looks so much cleaner then what I was using for the past year. I enclosed a sheet that I did all the product counting by. The 2 tabs are identical, one has only values and the other is formula based but I dont think the formulas will work  since it's still be referenced to the source data on my pc. But you can take a look and laugh at how I primitive I was doing it before.

I was thinking, after I understand your formulas, I can just write a code that'll enable VBA to paste these formulas in the 4 columns if the SourceData rows are not blank. The scope will be almost infinite. whoo!

Alright, time to get cracking. Appreciate it Brad.

product-count.xls
I think the MATCH you can figure out. The INDEX may be a little tougher, however.
=INDEX(Some2DTable,,3)          returns the third column of named range Some2DTable. The double comma is not a typo!

It is worth noting that the column J auxiliary formulas (the ones returning either 0.5 or 1.0) aren't used in the latest version of the spreadsheet--just the first three columns.

Brad
Bryan,
If you use the auxiliary columns, you can produce your report without need for SUMPRODUCT formulas at all. All you need to do is produce a simple PivotTable, with the CreditedRegion in the Row Items field and the Product horizontally arrayed. The Data field would be the CreditedRegion. See attached workbook. PivotTables are really fast--the recalc time when I refreshed the data was less than a second.

Brad
Sample-migrationQ23567764.xls
yeah Brad, I see it!  You're really onto something big here.

If a code is created to determines which credit regions each unique trades reside to, then a simple pivot would provide the same exact answer. ooooo, wow, this is really innovating. I started thinking about this problem believing I would have to replicate my original formulas into VBA somehow and then looping it. It would have taken quite some time with all those criterias. This out of the box thinking you showed me, it's sooo good. It's better then anything I could have ever dreamed of.

Now to code VBA to work with pivot tables. hahahha, this is just magnificent. Simply magnificent.
Bryan,
If you use a dynamic named range for the source of the PivotTable, then all you need to do is paste the new data in the "Enter Source Data" worksheet and refresh the PivotTable. No VBA at all would be required.

See attached workbook, which will allow up to 5000 rows of data in a 2.93 MB file. Most of the size is taken up by the formulas in the auxiliary columns, which extend to row 5000. If you did create those using VBA, then the workbook could be quite small.

Brad
Sample-migrationQ23567764.xls
Although I can get by with just the formulas you kindly provided,  I feel like my opportunities in life will be greatly expanded if I can somehow get a better grasp of VBA(I'm a recent college grad). The little taste I've gotten so far, has captivated my interest. I want to do everything in VBA just because the more hands on I get, the better I'll get. Plus, it's more impressive in the office if you can do coding (even though the results are the same). Thanks for your constant attention with this problem, you've really stepped beyond the requirements for helping a stranger.
I just have to say this one more time, this formula is BEYOND AWESOME, it's everything that I ever needed especially with the Tier system implemented. Wow. Just wanted to share some news with you, I've just received permission(finally!)  to take a 4 day VBA course from The Learning Tree. They specialize in IT training globally(supposedly at least). This is in late august so I hope to come back with broader sense of this world. Today has been a great day!
Just for grins, I copied the data to fill 5000 lines and wiped out all but one of the SUMPRODUCT formulas. The PivotTable refresh time was a fraction of a second. Not too shabby for a change in algorithm--from over 4 minutes to less than a second. FWIW, that's about the same degree of improvement I've seen in some of my own programs.

I also took a look at the syllabus for the Learning Tree class 209. A lot depends on the quality of the instructor. You'll probably find that in the features you already use, the class doesn't really add much depth. The value will really be in the features that you haven't used yet.

Brad
since I don't know too many features in general, I predict I will be spending a lot of time valuing....everything! I am excited!
Hi Brad, I think I figured out your lengthly equation.

Let's use the pay region as an example.

=IF(E2="","",IF(ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0)),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,3),0),1),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2),0),1)))

Broken down, there are 4 parts.
1) IF(E2="",""
If E2 is blank, then leave as blank, if false

2) IF(ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0))
If returned value is N/A for the If statment (find the position of "e2" in the currency region column 2)

If True then
3) INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,3),0),1),
return the value of (Currency Region, Row Position (find the position where E2 in "currency region",, column 3), Exactly match), Column Position 1 )

If False
4) INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2),0),1)))
Return the value of (CurrencyRegion, Row position (Find E2 in Currency Region column 2), Exact Match), Column Position 1)

I got this one down right Brad?
Bryan,
Full marks on your analysis!

In real world terms:
Broken down, there are 4 parts.
1) IF(E2="",""
If no data, then return an empty string (looks like a blank)

2) IF(ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0))
Test whether E2 contains a real currency. MATCH returns a #N/A error when E2 contains a synthetic currency.

If True then
3) INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,3),0),1),
Return the Currency Region by matching the currency to the real currency column

If False
4) INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2),0),1)))
Return the Currency Region by matching the currency in E2 to the synthetic currency column


If you don't want to paste data into a worksheet with formulas pasted down to row 5000, then the outer IF test is not required. And although more confusing, the following version of the formula is more compact:
=INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2+ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0))),0),1)


Brad
Ah, the file keeps getting smaller. I'll have to dissect your new formula. My heads spinning a bit right now. And I was thinking of a cheesy quick way to make this work with VBA. Something in the logic of

IF NOT ISEMPTY(SrcWorksheet.Cells(i,2) then
          srcworksheet.cells(i,2+i).value =  "=INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2+ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0))),0),1)"
End If

Putting the formula physically into the cell should give the same result and recalc speed? Even tho pasting down will take 5 seconds, I want VBA to paste down for me! It's a bit greedy minded but it's been my goal the entire time.

As a general rule, worksheet techniques (formulas, sorting, filtering, conditional formatting, etc.) are faster than VBA. The exceptions will be when there is really no good way to do something with worksheet techniques.

In your VBA logic, you should definitely be using the FillDown method rather than looping through cells as you insert formulas.

Brad
I see. That reminds me of a few articles I read such as

http://www.ozgrid.com/VBA/VBALoops.htm

which advised to avoid using loops to go through the data due to its lack of speed. I'll try what you suggested, tomorrow....It's been a long day here. Appreciate the extra tips as always.
hey brad, it seems that using cell.formula = "=IF(E2="","",IF(ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0)),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,3),0),1),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2),0),1)))"

returns the value of "false". I tried, cell.value = also, same result. Is there something I need to know about VBA to make worksheet functions work correctly?
You didn't double up on the double quotes. You need to do that inside a text string.
cell.formula = "=IF(E2="""","""",IF(ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0)),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,3),0),1),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2),0),1)))"
One thing I like to do when debugging formulas built by VBA is to put the formula in a string variable first. The advantage of this approach is that you can use ?frmla to display the value of frmla in the Immediate pane. You can also hold the cursor over frmla when you are stepping through the code.

Dim frmla As String
frmla = "=IF(E2="""","""",IF(ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0)),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,3),0),1),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2),0),1)))"
cell.formula = frmla
you explained this to me earlier when I had the sumproduct and evaluate question. I need to pay more attention here.  Let me give this a shot. Thanks.


P.S You're the MAN Brad!
Brad,

I got it to work beautifully with the .filldown after I paste the formulas into its respective columns. At first, I got stuck trying to add .cells(2, PayColumn) into the string and looping it instead of using "E2". I then remembered you saying to just use the filldown. The questoin I have is, would it be possible for VBA to evaluate a string, then fill in the relative cell instead of "e2" such as "cells (1+i)",  then proceed with the rest of string? I'm not sure if i'm explaining it right.

frmla = "=IF(E2="""","""",IF(ISNA(MATCH(E2,INDEX(CurrencyRegion,,2),0)),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,3),0),1),INDEX(CurrencyRegion,MATCH(E2,INDEX(CurrencyRegion,,2),0),1)))"

But all the E2's will be a vba code to refer to some cell. Thanks brad
Another trick for building formulas is to do it in steps:

Dim frmla As String, addr As String
addr = Cells(i+1,5).Address(RowAbsolute:=False, ColumnAbsolute:=False)         'Reference to cell E2
frmla = "=IF(" & addr & "="""","""",IF(ISNA(MATCH(" & addr & ",INDEX(CurrencyRegion,,2),0)),INDEX(CurrencyRegion,MATCH(" & addr & ",INDEX(CurrencyRegion,,3),0),1),INDEX(CurrencyRegion,MATCH(" & addr & ",INDEX(CurrencyRegion,,2),0),1)))"
Cell.Formula = frmla


You can also do it in R1C1 notation. It's convenient to use the equivalent of $E2 address style in this case.
Dim frmla As String
frmla = "=IF(R" & (i + 1) & "C5="""","""",IF(ISNA(MATCH(R" & (i + 1) & "C5,INDEX(CurrencyRegion,,2),0)),INDEX(CurrencyRegion,MATCH(R" & (i + 1) & "C5,INDEX(CurrencyRegion,,3),0),1),INDEX(CurrencyRegion,MATCH(R" & (i + 1) & "C5,INDEX(CurrencyRegion,,2),0),1)))"
Cell.FormulaR1C1 = frmla
Hi Brad, I finished coding this project (at least the part I've mentioned to you). The total size is (drum roll please.........798kb!)  OMG, this beyond awesome. It takes about 4 seconds to calculate. I haven't taken a look at your last code. But I'm simply blown away at this so far. 1 click processing and it's moron proof! Now I can go on vacation and someone can accurately fill in for me. I just wanted to share this news with you.

Hope you have a great weekend.
I can see why you are so excited about the project. If your boss isn't impressed by the reduction in size (96%) and increase in speed (6000%), you need to hire a new press agent.

Thanks for the feedback!

Brad
haha, I wish they knew about the technical details. I only demonstrated  the "1 click processing" with the usage of command button (after they drop in the source data). They were very impressed that it's more "streamlined". I doubt I'll get anything for it but at least I can pass the task to someone else and move onto bigger (more complicated things).  

I said this probably 100x over the past month but really, I appreciate your patience and insight. I don't think I would have ever thought to use index/match and filling down. I would have looped and looped everything until I got my desired results with the speed being marginally faster. I have to learn to keep in mind to always use excel functions and avoid vba unless it's absolutely necessary.

Going forward, I probably should make a pop up menu when a new currency is detected and have it drop into the currency region, something like that to prolong the shelf life.  I have 3 more similar type projects to convert from formula to code. I think with the power of the double quotes and the fill downs + the sumproduct, the transition shouldn't be painful at all :).

You Da MAN Brad!





You might want to consider writing a UDF to return the currency region. Doing so will eliminate the need for two of the three auxiliary columns, as you would be doing that logic within the UDF. You would then be left with a worksheet formula like:
=CreditedRegion(Pay Currency Code, Rec Currency Code)
where "Pay Currency Code" and "Rec Currency Code" would be replaced with the addresses of the cells that contain the abbreviation for the currency being used.

It may take a little longer with the UDF than with the INDEX and MATCH formulas, but the workbook will be easier to maintain.

Brad
okay, I 'll give it a shot at work. I may not use this particular UDF but the knowledge of knowing how to create one is invaluable. Can't hurt to try and fail then try and fail a few more times before getting it. This will add another wrinkle to my brain. :)
hey brad, here's a silly question. I now have 2 pivot tables with the ideal results. I want to format it for presenation purposes. Are there any dangers of writing a code to format the pivot table itself? Or should I use the tablerange2 property and copy the pivot table to another sheet and work off of it there? It seems like the latter is safer but it's more coding. I'm just wondering if its a necessity. Silly question. Thanks Brad, no rush on this one.
Unless you just want a couple of tweaks, I suggest copying the values in the PivotTable to a new range and then formatting that. Every time I try to format a PivotTable, the exercise drives me crazy. And then when I refresh the PivotTable, some of the formatting becomes lost.
if it drives you crazy, I don't want to know what it'll do to me. I will paste special the values to another sheet. thanks brad. Another great tip!
hey brad, another silly one. Is there a way to sort the PivotTable.RowGrand? I've tried every autosort possibility I can think of. I figured it'll be easier to sort results in the pivot table then special paste values to a new worksheet. If it's not possible, that just means more lines of code. Thanks!