Solved

Data cleansing puzzle

Posted on 2014-03-11
Last Modified: 2014-03-11
See attached.

For the sake of the example, assume that I sell car parts.

Column A is the part number.
Column B is supposed to be a serial number, but there was no validation on this field and it contains all sorts of odd values.

I want to extract potentially VALID serial numbers from column B and put them into column C.

See example in sheet.

A serial number is deemed to be valid if there are 4 (or more) numeric digits in a row.

E.g.
123456CATS is valid - i.e. 123456
123CATS1234 is NOT valid
19090CATS123 is valid - i.e. 19090

It may be easier to follow in the attached sheet.

Objective: Populate column C with the valid serial numbers that are found in column B.
SerialNumbersPuzzle.xlsm
Question by:Patrick O'Dea

Accepted Solution

by:
aikimark earned 500 total points
ID: 39921333
Add this function to a standard module in your VBA project:
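Option Explicit

' Returns the first run of 4 or more consecutive digits found in parmSN,
' or an empty string if there is no such run.
Public Function GetPartNum(parmSN)
    Static oRE As Object        ' kept between calls so the regex is created only once
    Static oMatches As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "(\d{4,})"    ' 4 or more digits in a row, no upper limit
    End If
    If oRE.test(parmSN) Then
        Set oMatches = oRE.Execute(parmSN)
        GetPartNum = oMatches(0).submatches(0)
    Else
        GetPartNum = vbNullString
    End If
End Function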


Then, in each cell you want populated with the serial number, add a formula that invokes the function.
Example:
=getpartnum(B2)
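
To check the rule against the three examples from the question without touching the sheet, a small routine along these lines may help. It is only a sketch: FirstDigitRun and CheckSerialExamples are hypothetical names that are not part of the posted solution, and FirstDigitRun is a standalone stand-in that assumes "valid" means the first run of 4 or more digits anywhere in the text.

```vba
' Hypothetical helper (not from the thread): returns the first run of
' 4 or more consecutive digits in txt, or an empty string if none exists.
Public Function FirstDigitRun(ByVal txt As String) As String
    Dim re As Object
    Dim found As Object
    ' Late binding, so no library reference has to be set.
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\d{4,}"
    Set found = re.Execute(txt)
    If found.Count > 0 Then
        FirstDigitRun = found(0).Value
    Else
        FirstDigitRun = vbNullString
    End If
End Function

' Prints each example from the question and the digits extracted from it
' to the Immediate window (Ctrl+G in the VBA editor).
Public Sub CheckSerialExamples()
    Dim samples As Variant
    Dim i As Long
    samples = Array("123456CATS", "123CATS1234", "19090CATS123")
    For i = LBound(samples) To UBound(samples)
        Debug.Print samples(i) & " -> [" & FirstDigitRun(CStr(samples(i))) & "]"
    Next i
End Sub
```

Run CheckSerialExamples from the VBA editor. Under the assumption above, the second example yields 1234, because the digits are matched wherever they appear in the text.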



Expert Comment

by:Rob Henson
ID: 39921388
Why wouldn't example 2, 123CATS1234, be valid? It has 4 digits in a row.

Or do the 4 digits have to be at the beginning?

If just the first 4 characters have to be a number then you can use the ISNUMBER function.

Assuming serial number in A2:

=ISNUMBER(LEFT(A2,4)*1)

The *1 forces Excel to recognise the result as a number rather than a string of text that happens to look like a number.

You can then wrap that within an IF statement to get Valid / Invalid:

=IF(ISNUMBER(LEFT(A2,4)*1),"Valid","Invalid")

Thanks
Rob H
 

Author Comment

by:Patrick O'Dea
ID: 39921843
Thanks for the comments.

Rob H, you spotted an error in my sheet. The digits do NOT have to be at the beginning. They can be anywhere.

I will evaluate aikimark's suggestion now.
 

Author Comment

by:Patrick O'Dea
ID: 39921873
aikimark,

I have done as you suggest but get #NAME?


See attached.

Cell D2

(I am obviously missing something: the function does not seem to be recognised.)
SerialNumbersPuzzle.xlsm
 

Author Closing Comment

by:Patrick O'Dea
ID: 39922072
Thanks !

Works very well.
