?
Solved

Custom delimiter (text to column)

Posted on 2014-08-04
18
Medium Priority
?
229 Views
Last Modified: 2014-08-05
Experts:

I need some assistance with developing a >> customer delimiter (text to columns) << function.

As you know, Excel provides a built-in capability that allows to use "text to columns" (either based on character delimitation or fixed width).

Based on the data set I have (with hundreds of records), the fixed width does not work for me... length of values varies for all records.   Also, a delimiter equal to "." (single character) won't work that great given that some customers have additional details in, e.g., their names (prefixes, suffixes, etc.).   Thus, alike values are across a number of columns.

Please see attached XLS with only 3 sample (mock) records.    My question/requirements:  

1. How can I develop a macro/module that would look at entire words/string vs. single-character delimitation?    
2. However, given the variety of, e.g., organization names, I would need to be able to have several string values in a single module.   That is, instead of running them sequentially, I would want to use some form of "OR" statement for a single run (e.g., "US Army" or "US Air Force" or "US Navy").
3. Then, once all records have been separated by these 3 values, I certainly could re-run the module based on different/multiple criteria.

Any idea as to how this could be accomplished?   Again, please see attached XLS for additional details.

Thanks,
EEH
Text-delimiter.xlsx
0
Comment
Question by:ExpExchHelp
  • 6
  • 5
  • 3
  • +2
17 Comments
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 40239599
You can use any character you want for the text to columns.  

For something like that I would normally use a pipe (|) or tilde (~).

Do you have the ability to change the dataset?
0
 

Author Comment

by:ExpExchHelp
ID: 40239692
Kyle:

Copy that... I don't want to update each of the organization and "prefix" them with a "~"... in order to separate by this character.    In a sense, this would be too inefficient to run all the find/replace and so forth.

Again, good idea but I'm hoping for a different solution.

Thanks,
EEH
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 40239696
Understood you don't want to do it manually, but if the person generating this can do it for you it would solve your issues.
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:ExpExchHelp
ID: 40239737
Kyle:

Small team effort here... we don't have such support system in place.

Going back to the drawing board w/ respect to creating a module/macro that would automate/semi-automate this process.

Thanks,
EEH
0
 
LVL 27

Expert Comment

by:Glenn Ray
ID: 40239738
I agree with Kyle; if this data is coming out of MS Exchange (which your example set looks like), it should be possible to export in a delimited format at the source.

If this isn't possible, there should still a possibility of devising an algorithm for intellegently parsing out this data.  The problem here is that this is a "forest-for-the-trees" problem.  I - as a person who has no familiarity with your data set - see some possible pattern matching, but not all.  For example, I might not have associated "LCDR" or "LTC" as part of an employee name, but rather the leading text for the organization. And while some internal delimiters such as the less-than sign (<) are obvious, the possibility of stray characters like a lone apostrophe make this tricky.

That all said, if you have a good understanding of the component data itself and could reasonably separate a large amount of variations to meet your desired solution, there's no reason that logic could not be encoded in a VBA routine or perhaps complex formulas.

Regards,
-Glenn
0
 
LVL 27

Expert Comment

by:Glenn Ray
ID: 40239751
Followup:  Your example only lists three branches of military service: how many could there possibly be?
0
 

Author Comment

by:ExpExchHelp
ID: 40239832
Glenn:

Thanks for chiming in... I appreciate it.

As mentioned in the original post, the XLS contains only sample (dummy) values.    These were not extracted from MS Exchange.    And maybe I was a bit "lazy" by only including 3 sample (and to some degree alike) records.

In actuality, while some records may follow a similar pattern, there will be -- more than likely -- as many records which don't follow the similar pattern.   Thus, although possible, it makes the creation of a "clean" dataset (for further manipulation purposes) quite time-consuming.

So, if there was some form of automation that looks at a specific string/word in a record -- no matter what the position is -- it would be quite valuable for parsing the records.

If you know of a process that allows me to add those "keywords" to a module, I'd greatly appreciate it.

Thanks,
EEH
0
 
LVL 46

Expert Comment

by:aikimark
ID: 40239857
I applied the following regex pattern to your sample data
(.*?) (U[A-Z]{1,3}[^<]*)(<[^>]+>)\r\n

Open in new window

and got the following
Name Part:
Smith, Mike E Jr LCDR
Jones, Jackie H LTC
Wolters, Tommy

Org Part:
USN Organization_X (US)'
USA Organization_Y (US)
USAF Organization_Z (US)

email Part:
<michael.e.smith@domain.name>
<jackie.h.jones@domain.name>
<tommy.wolters@domain.name>
0
 
LVL 46

Expert Comment

by:aikimark
ID: 40239862
Of course, we don't have to worry about CrLf terminated lines when we are working with cell data.
0
 
LVL 27

Accepted Solution

by:
Glenn Ray earned 2000 total points
ID: 40239868
I do indeed have a working example that may work, so long has you have key words that can help you identify/separate the data.

First, you set up a range of substrings that would be the delimiting values.  They can be anywhere in your workbook.  For example, based on your sample data, a range of abbreviations of branches of military service would be created in a column like so:
substring value set up
Then you insert two array functions that point to your original source strings.  If the first source string is in A2 and the substrings to check are on Sheet2 in cells A2:A7, the formulas are (entered with [Ctrl+[Shift]+[Enter]):
=LEFT(A2,SUMPRODUCT(IFERROR(FIND(Sheet2!$A$2:$A$7,A2)-2,0)))
=MID(A2,SUMPRODUCT(IFERROR(FIND(Sheet2!$A$2:$A$7,A2),0)),100)
These will return the left and right parts of the master string before any matching substring on Sheet2.

Example file attached.

-Glenn
EE-Text-delimiter.xlsx
0
 
LVL 24

Expert Comment

by:Ejgil Hedegaard
ID: 40239875
If all organisations start with US, then you could use 2 columns with formulas to split the text.
For the left part, before USN, USA, USAF
B2: =LEFT(A2,SEARCH("US",A2)-2)
For the rest
C2: =RIGHT(A2,LEN(A2)-LEN(B2)-1)
0
 

Author Closing Comment

by:ExpExchHelp
ID: 40239888
Glenn:

I've started "playing" with it and added new records (with existing organizations) into the source area.   It appears to be great.

I'll continue to test it out a bit longer but for right now (I think) you've provided an excellent solution.    In the event I need additional assistance (e.g., creating a secondary lookup areas with different string types), I hope you won't mind my following up.

Again, thanks!
EEH
0
 
LVL 46

Expert Comment

by:aikimark
ID: 40239901
Here is a sample routine to parse the data, using that regex pattern.
Sub Q_28490280()
    Dim oRE As Object
    Dim oMatches As Object
    Dim oM As Object
    Dim rng As Range
    Dim rngCell As Range
    
    Set rng = ActiveSheet.Range(ActiveSheet.Range("A2"), ActiveSheet.Range("A2").End(xlDown))
    
    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = False
    oRE.Pattern = "(.*?) (U[A-Z]{1,3}[^<]*)(<[^>]+>)"
    'NOTE: uncomment the following statement for best performance
    'Application.ScreenUpdating = False
    For Each rngCell In rng
        If oRE.test(rngCell.Value) Then
            Set oMatches = oRE.Execute(rngCell.Value)
            With oMatches(0)
                rngCell.Offset(0, 1).Value = .submatches(0)
                rngCell.Offset(0, 2).Value = .submatches(1)
                rngCell.Offset(0, 3).Value = .submatches(2)
                'Debug.Print .submatches(0), .submatches(1), .submatches(2)
            End With
        End If
    Next
    Application.ScreenUpdating = True
End Sub

Open in new window

0
 
LVL 46

Expert Comment

by:aikimark
ID: 40239919
@Glenn

Do not encourage or request that people include your name in their question text.  I have deleted your latest comment.

aikimark -- zone advisor
0
 

Author Comment

by:ExpExchHelp
ID: 40241008
aikimark -- impressive solution!!!    It was very easy to use and separated the content correctly based on the 3 records (in sample data set)..    I'm just not sure how it works (in case I need to modify it given the actual data set).

Would you mind offering additional info on the pattern identification?

Thanks,
EEH
0
 
LVL 46

Expert Comment

by:aikimark
ID: 40241021
@EEH

Does that mean you're using the regular expression solution?  You should open a new question and post the new question's URL in this thread.  I will help you in the new thread.
0
 

Author Comment

by:ExpExchHelp
ID: 40241154
@aikimark:

Thanks... I've opened a new question.    It'll be great if you could please offer additional comments on the logic.

http://www.experts-exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/Excel/Q_28490830.html

Cheers,
EEH
0

Featured Post

Vote for the Most Valuable Expert

It’s time to recognize experts that go above and beyond with helpful solutions and engagement on site. Choose from the top experts in the Hall of Fame or on the right rail of your favorite topic page. Look for the blue “Nominate” button on their profile to vote.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article descibes how to create a connection between Excel and SAP and how to move data from Excel to SAP or the other way around.
You need to know the location of the Office templates folder, so that when you create new templates, they are saved to that location, and thus are available for selection when creating new documents.  The steps to find the Templates folder path are …
This Micro Tutorial demonstrate the bugs in Microsoft Excel for Mac with Pivot Charts.
This Micro Tutorial will demonstrate how to create pivot charts out of a data set. I also added a drop-down menu which allows to choose from different categories in the data set and the chart will automatically update.

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question