?
Solved

SUMIF help

Posted on 2012-08-10
24
Medium Priority
?
414 Views
Last Modified: 2012-08-14
Looking for some help with a SUMIF formula

Given:
A               B
--------------
This         10
That        11
Or           15
The         20
Other     25

I have a formula that counts the number of times a string (H2) appears within a range (using the CTRL+SHIFT+ENTER array formula):

{=IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",A:A)),1,0)),"")}

This works fine.  If I enter "the" in H2, it returns a count of 2 (cells A4 & A5 contain this string).  A search value of "th" returns a count of 4 (A1, A2, A4 & A5).

What I'm looking for help with is a related formula.  I want to sum the values found in column B if a cell is included in the count above.  So, for the first example - a search value of "the" would return a sum of 45 (20+25).  A search of "th" would return a sum of 66 (10+11+20+25).

I tried the following, but I'm guessing the syntax isn't correct:
{=SUMIF($A$2:$A$250000,ISNUMBER(SEARCH("*"&H2&"*",A:A))&"=TRUE",$B$2:$B$250000)}

I'd prefer not to use VBA if possible, but can if that's the only way.
0
Comment
Question by:mark_harris231
  • 11
  • 7
  • 3
  • +1
23 Comments
 
LVL 24

Expert Comment

by:Steve
ID: 38282138
=IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",A:A)),1,0)*B:B),"")
0
 
LVL 24

Assisted Solution

by:Steve
Steve earned 200 total points
ID: 38282168
If using Excel 2007-2010 with it's large number of rows I would either:

name the ranges A:A and B:B as named ranges:
=IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",RangeA)),1,0)*RangeB),"")
or reduce the range to just the rows with data:
=IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",$A$2:$A$999)),1,0)*$B$2:$B$999),"")
this will reduce the "calculating time"
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38282198
Get a #VALUE! error with the first formula.  Good advice concerning the ranges.  I planned to "optimize" once I get something that gives me result.
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 10

Author Comment

by:mark_harris231
ID: 38282208
If I make it a discrete range (e.g., A2:A250000 & B2:B250000), it works.
0
 
LVL 24

Expert Comment

by:Steve
ID: 38282212
I had tested all formula... you are doing [ctrl]+[shift]+[enter]?

Is there any text or #NA in A:A or B:B?
0
 
LVL 93

Expert Comment

by:Patrick Matthews
ID: 38282219
In Excel 2003 and earlier, array formulas could not reference entire rows or columns.  This restriction was removed in Excel 2007.

Which version are you on?
0
 
LVL 24

Expert Comment

by:Steve
ID: 38282222
Ah, there is probably something in the header row causing the array to fail.
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38282232
Yes - entering as array formula

A:A is all text.  B:B has text header

This works in the first cell I place it, but if I drag it down, I get a Circular Reference error:
{=IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",$A$2:$A$250000)),1,0)*$B$2:$B$250000),"")}
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38282237
Excel 2010
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38282244
Have to leave off on this for a short bit.  I'll check back in a few hours.  Thanks for having a look.
0
 
LVL 24

Expert Comment

by:Steve
ID: 38282245
Could you post the workbook.
I am not sure why you would get a circular reference.
Will test it here, see if I get the same.
0
 
LVL 24

Expert Comment

by:Steve
ID: 38282262
OK, I am getting no circular reference errors.

Attached is workbook I am testing.

I am not sure why you would have circular references.
Book1.xlsx
0
 
LVL 50

Expert Comment

by:barry houdini
ID: 38282307
Simpler to use SUMIF with wildcards

=IF(H2="","",SUMIF(A:A,"*"&H2&"*",B:B))

regards, barry
0
 
LVL 50

Expert Comment

by:barry houdini
ID: 38282328
>In Excel 2003 and earlier, array formulas could not reference entire rows or columns

Hey Patrick!

In fact there's no problem using a whole row in an array formula, even in Excel 2003. Of course SUMIF can use a whole column in any version.

regards, barry
0
 
LVL 93

Expert Comment

by:Patrick Matthews
ID: 38282409
barry,

Interesting.  I always though whole rows were verboten as well.  I stand corrected :)

Patrick
0
 
LVL 24

Expert Comment

by:Steve
ID: 38282423
Trust in the Barry :)

kicking myself for just "fixing" the original formula and not trying to use sumif :)

live and learn.

Should the starting formula just be a simple Countif... wonder why it isn't.
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38287508
Sorry for the delay in responding, gents.

It appears that I neglected to include a criteria that appears to exclude the SUMIF solution.  The reason I was using the ISNUMBER(SEARCH()) combination was to make the search criteria case insensitive.

A little more backstory - I don't think it has any real bearing on the solution, but....

I'm involved in a product data taxonomy/schema development project in order to develop a unified web presentation across several companies within a division of our parent.  One factor that we want to weight heavily is the "voice of the consumer" (i.e., how do customers want and expect to see products listed).  One method to do this is to analyze search logs from our existing web site structures.  This is where this issue picks up.

I have 12-month list of search terms, along with a frequency (count) that they have been entered.  I'm trying to build a Search Term Aggregator within the spreadsheet that will let me develop a clearer picture of how customers are searching.  Column A contains the search terms.  Column B contains a count of the number of times a particular search term was entered.  E.g., A18: "SOCKET HEAD CAP SCREWS"; B18: 556)

In column H, I'm listing one or more search terms (e.g., H2: "Socket head"; H3: "socket+head", etc.).  In Column I, I'm displaying the count of the number of cells that the contain the search term (case insensitive).  In Column J, I want to show the sum of the values in B for each cell that matches the search term.

I'll post a scrubbed version of the spreadsheet shortly.
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38287639
On further testing, it looks like Barman's formula is correct, but Excel is choking on the number of cells being evaluated.  I can evaluate arrays up to 125,000 rows, but begin getting circular reference errors at some point above that.  Haven't dialed it in exactly, but 130,000 fails.

Works: =IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",$A$2:$A$125000)),1,0)*$B$2:$B$125000),"")

Fails: =IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",$A$2:$A$130000)),1,0)*$B$2:$B$130000),"")

At this point, I think I'll just chunk up the arithmetic into 125,000 row segments and add the results and award full points to Barman (unless anyone has any objections or other ideas).
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38287673
Segmentation and addition method works.  Awarding points to Barman.

Still curious as to why the formula breaks down after 125,000 records, but moving on.

Thanks for the attention and help.

Mark
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38287682
Thanks for your assistance!
0
 
LVL 50

Accepted Solution

by:
barry houdini earned 800 total points
ID: 38291572
Hello Mark,

I'm not really clear why SUMIF wouldn't be the best solution here as per my suggestion

=IF(H2="","",SUMIF(A:A,"*"&H2&"*",B:B))

SUMIF, like most Excel functions (SUBSTITUTE, FIND, EXACT and possibly others excepted) is not case-sensitive so I think it should give the same results. It would typically also be much quicker over large ranges (and I wouldn't expect there to be problems with 125,000+ rows)

If you were dealing with numeric values then SUMIF with wildcards might be problematic but as you seem to be dealing with text values I don't see a downside.

Note that with this formula:

=IF(H2<>"",SUM(IF(ISNUMBER(SEARCH("*"&H2&"*",$A$2:$A$125000)),1,0)*$B$2:$B$125000),"")

the wildcards in SEARCH are superfluous (SEARCH searches for the search value anywhere in the cell in any case), so this version should be sufficient

=IF(H2<>"",SUM(IF(ISNUMBER(SEARCH(H2,$A$2:$A$125000)),1,0)*$B$2:$B$125000),"")

regards, barry
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38291643
Barry - Thanks for the follow-up.  Quite frankly, I'm not sure what was going on with this sheet on Friday.  When I tried your formula on Monday AM, it was returning "0".  I just tried it again and it does, in fact, work and is faster than the other method.  Possibly some interference from other worksheets I had open, or...(?).

As you indicate, it does not have a problem with processing the entire range A:A and B:B.

Point noted concerning the case-insensitivity of SUMIF (and the other functions  you noted).  Thanks for the additional education in this regard.

I'll request a points adjustment.
0
 
LVL 10

Author Comment

by:mark_harris231
ID: 38292152
Points re-distributed.  Thanks, JARmod10
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article describes a serious pitfall that can happen when deleting shapes using VBA.
If you need to forecast numbers -- typically for finance -- the Windows and Mac versions of Excel 2016 have a basket of tools to get the job done.
This Micro Tutorial demonstrates using Microsoft Excel pivot tables, how to reverse engineer competitors' marketing strategies through backlinks.
This Micro Tutorial will demonstrate how to use longer labels with horizontal bar charts instead of the vertical column chart.

807 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question