Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
Solved

# Data cleansing puzzle

Posted on 2014-03-11
Medium Priority
182 Views
See attached.

For the sake of the example.

Assume that I sell car parts.

Column A is the part number.
Column B is supposed to be a serial number - but there was no validation on this field and there is all sorts of funny values.

I want to extract potential VALID serial numbers data from column B and put it into column C.

See example in sheet.

A serial number is deemed to be valid if there are 4 (or more) numeric digits in a row.

E.g.
123456CATS is valid - ie. 123456
123CATS1234 is NOT valid
19090CATS123 is valid - i.e. 19090

It may be easier to follow in the attached sheet.

Objective: Populate column C with valid serial number that are found in column B.
SerialNumbersPuzzle.xlsm
0
Question by:Patrick O'Dea
[X]
###### Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points
• 3

LVL 46

Accepted Solution

aikimark earned 2000 total points
ID: 39921333
``````Option Explicit

Public Function GetPartNum(parmSN)
Static oRE As Object
Static oMatches As Object
If oRE Is Nothing Then
Set oRE = CreateObject("vbscript.regexp")
oRE.Global = True
oRE.Pattern = "(\d{4,9})"
End If
If oRE.test(parmSN) Then
Set oMatches = oRE.Execute(parmSN)
GetPartNum = oMatches(0).submatches(0)
Else
GetPartNum = vbNullString
End If
End Function
``````
Then add a formula that invokes the function into the cells you want populated with the part number.
Example:
``````=getpartnum(B2)
``````
0

LVL 33

Expert Comment

ID: 39921388
Why wouldn't example 2 123CATS1234 be valid? It has 4 digits in a row.

Or does the 4 digits have to be at the beginning?

If just the first 4 characters have to be a number then you can use the ISNUMBER function.

Assuming serial number in A2:

=ISNUMBER(LEFT(A2,4)*1)

The *1 forces excel to recognise the result as a number rather than a string of text that happens to look like a number.

You can then wrap that within an IF statement to get Valid / Invalid:

=IF(ISNUMBER(LEFT(A2,4)*1),"Valid","Invalid")

Thanks
Rob H
0

Author Comment

ID: 39921843

Rob H , you spotted an error in my sheet.  The digits do NOT have to be at t he beginning.  They can be anywhere.

I will evaluate aikimark's suggestion now.
0

Author Comment

ID: 39921873
aikimark,

I have done as you suggest but get #NAME?

See attached.

Cell D2

(I am obviously missing something ... the function does not seem to be recognised)??
SerialNumbersPuzzle.xlsm
0

Author Closing Comment

ID: 39922072
Thanks !

Works very well.
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This tutorial explains how to create a series of drop-down lists that are dependent upon prior selections to guide (“force”) the user to make the correct selection and reduce data errors within Microsoft Excel. Excel 2010 was used for this tutorial;…
If you need to forecast numbers -- typically for finance -- the Windows and Mac versions of Excel 2016 have a basket of tools to get the job done.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…
Many functions in Excel can make decisions. The most simple of these is the IF function: it returns a value depending on whether a condition you describe is true or false. Once you get the hang of using the IF function, you will find it easier to us…
###### Suggested Courses
Course of the Month7 days, 20 hours left to enroll