• Status: Solved
  • Priority: Medium
  • Security: Public

Data cleansing puzzle

See attached.

For the sake of the example, assume that I sell car parts.

Column A is the part number.
Column B is supposed to be a serial number, but there was no validation on this field and it contains all sorts of odd values.

I want to extract potentially VALID serial numbers from column B and put them into column C.

See example in sheet.

A serial number is deemed to be valid if there are 4 (or more) numeric digits in a row.

E.g.
123456CATS is valid - i.e. 123456
123CATS1234 is NOT valid
19090CATS123 is valid - i.e. 19090

It may be easier to follow in the attached sheet.

Objective: Populate column C with the valid serial numbers that are found in column B.
SerialNumbersPuzzle.xlsm
Patrick O'Dea
1 Solution
 
aikimark Commented:
Add this function to your project:
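Option Explicit

' Returns the first run of four or more consecutive digits found in parmSN,
' or an empty string when no such run exists.
Public Function GetPartNum(parmSN)
    Static oRE As Object    ' kept between calls so the regex is only built once
    Dim oMatches As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "(\d{4,})"    ' four or more digits in a row
    End If
    If oRE.test(parmSN) Then
        Set oMatches = oRE.Execute(parmSN)
        GetPartNum = oMatches(0).submatches(0)    ' first match only
    Else
        GetPartNum = vbNullString
    End If
End Function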


Then, in each cell you want populated with the extracted serial number, add a formula that calls the function.
Example:
=getpartnum(B2)
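
To check the function outside the worksheet, a small test routine along these lines can be run from the VBA editor. This is only a sketch: `TestGetPartNum` and the sample `"AB12"` are hypothetical additions, and it assumes `GetPartNum` has been pasted into a standard module (Insert > Module) of the same project. A function placed in a sheet module or in ThisWorkbook is not visible to worksheet formulas, which typically shows up as #NAME?.

```vba
' Hypothetical test harness; assumes GetPartNum lives in a standard module
' of the same VBA project.
Public Sub TestGetPartNum()
    Dim samples As Variant
    Dim i As Long

    ' The three examples from the question plus one value with no 4-digit run.
    samples = Array("123456CATS", "123CATS1234", "19090CATS123", "AB12")

    ' Results are written to the Immediate window (Ctrl+G in the VBA editor).
    For i = LBound(samples) To UBound(samples)
        Debug.Print samples(i) & " -> [" & GetPartNum(samples(i)) & "]"
    Next i
End Sub
```

Running it should print 123456, 1234, 19090 and an empty result in turn. Note that 123CATS1234 yields 1234, because the pattern accepts a run of four or more digits anywhere in the string, not only at the start.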

 
Rob Henson (IT & Database Assistant) Commented:
Why wouldn't example 2, 123CATS1234, be valid? It has 4 digits in a row.

Or do the 4 digits have to be at the beginning?

If just the first 4 characters have to be a number then you can use the ISNUMBER function.

Assuming serial number in A2:

=ISNUMBER(LEFT(A2,4)*1)

The *1 forces Excel to treat the result as a number rather than as a string of text that happens to look like a number.

You can then wrap that within an IF statement to get Valid / Invalid:

=IF(ISNUMBER(LEFT(A2,4)*1),"Valid","Invalid")

Thanks
Rob H
 
Patrick O'Dea (Author) Commented:
Thanks for the comments.

Rob H, you spotted an error in my sheet. The digits do NOT have to be at the beginning. They can be anywhere.

I will evaluate aikimark's suggestion now.
 
Patrick O'Dea (Author) Commented:
aikimark,

I have done as you suggest but get #NAME?


See attached.

Cell D2

(I am obviously missing something... the function does not seem to be recognised.)
SerialNumbersPuzzle.xlsm
 
Patrick O'Dea (Author) Commented:
Thanks !

Works very well.