Link to home
Start Free TrialLog in
Avatar of cricketman
cricketman

asked on

Need formula to extract numbers from alphanumeric string

Cells in column "O" contain alphanumeric strings from which I need to extract a number and place it into cell in column "P".

* Strings are of different overall lengths
* Strings contain HTML code
* Strings may have more than one number at different locations in string
* Number I need to extract may be one or two digits
* Number I need will always be followed by "-pack"

EXAMPLES

"11x11x7 Insulated Shipping Box with 3/4 Foam<br><font color=B40404>5-Pack</font>"
From this string I need to extract the "5" from "5-pack"

"9.75 96 oz Clear UN-Punched Cups <br>WITH LIDS<br><Font color=B40404>10-pack</Font>"
From this string I need to extract the "10" from "10-pack"

"Premium Natural Cotton Bag<br><font color=B40404>15.5 x 22.5 --- 10-Pack</font>"
From this string I need to extract the "10" from "10-pack"

4.5 32 oz Semi-Clear UN-Punched Cups <br>WITH LIDS - <Font color=B40404>25-pack</Font>"
From this string I need to extract the "25" from "25-pack"

If at all possible I prefer to stay away from UDF's or macros as I am not an terribly familiar with using them.  I HAVE used them, and COULD if that's the only way, but a FORMULA would be much better.

Thanks in advance to all who assist!!
Avatar of NBVC
NBVC
Flag of Canada image

Perhaps this may work... assuming first string is in A1:

=IFERROR(--MID(A1,SEARCH("-Pack",A1)-3,3),MAX(--RIGHT(MID(A1,SEARCH("-Pack",A1)-3,3),1),--RIGHT(MID(A1,SEARCH("-Pack",A1)-3,3),2)))

copied down
@NB_VC - your formula is limited to 3 numbers.  E.g., 1250-pack would result in 250.

@cricketman - is there a possibility of encountering commas in the value?  E.g., 1,000-pack
If this is a one-off or intermittent task, I'd suggest a different strategy: edit the flat file (CSV) using regex.

This one expression,  "(\d+,?\d*?)-[pP]ack", would find every instance, with or without commas.

I use Programmer's Notepad (pnotepad.org) for this kind of one-off cleanup/parsing work.
Avatar of cricketman
cricketman

ASKER

@NB_VC - your solution works for many of my cells, but some give errors.

"Premium Natural Cotton Bag<br><font color=B40404>11 x 16.5 --- 5-Pack</font>"
returns "-5" (it should return "5")

"11x11x7 Insulated Shipping Box with 3/4 Foam<br><font color=B40404>5-Pack</font>"
returns #Value!

"15x15x7 Insulated Shipping Box with 3/4 Foam<br><font color=B40404>1-Pack</font>"
returns #Value!

"7x7x7 Insulated Reptile Shipping Boxes with 3/4 Foam<br><font color=B40404>1-Pack</font>"
returns #Value!

"7x7x7 Insulated Reptile Shipping Boxes with 3/4 Foam<br><font color=B40404>5-Pack</font>"
returns #Value!

@mark_harris231 - yes, in some rare cases there may be 4 digits e.g. "1,000-pack"
One of your initial criteria was that the number would be 1 or 2 digits.  That is what my formula was based on.
Everything I have so far *is* only 1 or 2 digits, but there is a possibility that could increase to 3 (or 4) digits in the future.
SOLUTION
Avatar of krishnakrkc
krishnakrkc
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
And an extra SUBSTITUTE added to Kris' solution incase you have a --5-Pack  (i.e. dashes up against the 5)....


=--TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(REPLACE(A1,SEARCH("-Pack",A1),255,""),"-"," ")," ",REPT(" ",100)),">",REPT(" ",100)),100))
Both of the SUBSTITUTE formulas return the #Value! error, even in the cell that before returned "-5" where it should be "5".

@NB_VC - your first formula is very close!!!
I am sorry, it is hard keeping up with new comments in this forum....

I just tried my formula in every single sample you gave in both of your posts above and every single time got the correct response.

These are possible reasons for it giving you the error...

1. You are referencing the wrong cell in your formula
2. You don't have the exactly the phrase "-pack" directly following the number (no spaces between)
3. You don't have a number before the -pack

Most likely it's the second.. you may need to check that -pack is spelled exactly that way with no spaces after the number and after the dash....
Thanks for looking again!

Here are two of the exact scenarios that are giving the VALUE error:

=====================================
value in cell O6716: 7x7x7 Insulated Reptile Shipping Boxes with 3/4 Foam<br><font color=B40404>5-Pack</font>"

value in cell P6716: 1

formula I entered into cell Q6716: =IFERROR(--MID(O6716,SEARCH("-Pack",O6716)-3,3),MAX(--RIGHT(MID(O6716,SEARCH("-Pack",O6716)-3,3),1),--RIGHT(MID(O6716,SEARCH("-Pack",O6716)-3,3),2)))

result in cell Q6716: #VALUE!
=====================================

value in cell O6718: 11x11x7 Insulated Shipping Box with 3/4 Foam<br><font color=B40404>5-Pack</font>

value in cell P6718: 1

formula I entered into cell Q6718: =IFERROR(--MID(O6718,SEARCH("-Pack",O6718)-3,3),MAX(--RIGHT(MID(O6718,SEARCH("-Pack",O6718)-3,3),1),--RIGHT(MID(O6718,SEARCH("-Pack",O6718)-3,3),2)))
=====================================


I noticed two things that I thought might make a difference: 1) the value in cell O6716 has a quotation mark at the end of the value but not one at the beginning, but there are others that have that & they don't result in an error, and 2) in both of these cases the "P" in "Pack" is a capital P whereas in MOST other cases it's a small p... but then I found some instances where there is a capital P that also did not return an error.

I also checked the formatting of the cells with errors - it's GENERAL, but the other cells (that AREN'T giving an error) are General format also.

It's GOTTA be something simple, right??......

Thanks again for helping me solve this!!
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@NB_VC - that does it!

I'm awarding MOST points to you b/c you were first to respond, most active participant, and provided the final solution.

I'm awarding SOME points to kris b/c his suggestion was used to get to the final & best solution.

THANKS GUYS!!!!!