Regex extract only the closest numbers at both sides of a word (word boundaries?)

Posted on 2014-08-04
Last Modified: 2014-08-04
I want to retrieve only the numbers (which may contain a dot or a comma) that come immediately before or after one specific word ("size" in this case), as in the examples below.

123 size 23456 bbb       ->  123 and 23456
aa 123size23456          ->  123 and 23456
1.234 size 55.567        ->  1234 (or 1.234) and 55567
1234 size forget  345    ->  1234
aa12 size 14587          ->  14587

Ideally, the expression would work for several words, e.g. (?:size|area|length), and would also ignore non-alphabetic characters such as spaces, "/" and "-" (for example, "1245/size/ 12" would return 1245 and 12), if that is not too complex.
Can you help me?
Question by: novreisb
    Assisted Solution

    This regex pattern seems to come close:

    With your data, it parsed the following:
    Match 0 Start(0) Length(14) 
    	SubMatch 0: 123
    	SubMatch 1: 23456
    Match 1 Start(23) Length(12) 
    	SubMatch 0: 123
    	SubMatch 1: 23456
    Match 2 Start(37) Length(17) 
    	SubMatch 0: 1.234
    	SubMatch 1: 55.567
    Match 3 Start(56) Length(10) 
    	SubMatch 0: 1234
    	SubMatch 1: 
    Match 4 Start(81) Length(13) 
    	SubMatch 0: 12
    	SubMatch 1: 14587

    Question: In the last line, why aren't you getting the 12?
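The pattern posted with this answer has not survived in the thread. As a rough illustration only, and not the answerer's actual pattern, the behaviour shown in the output above could be approximated in Python as follows. The names `NUMBER`, `PATTERN` and `numbers_around_size` are made up for this sketch, and Python is an assumption, since the thread does not say which regex engine is in use.

```python
import re

# Hypothetical pattern (not the one from the answer): an optional number,
# optional whitespace, the keyword, optional whitespace, an optional number.
NUMBER = r"\d+(?:[.,]\d+)*"  # digits, optionally grouped with dots or commas
PATTERN = re.compile(rf"({NUMBER})?\s*size\s*({NUMBER})?")

def numbers_around_size(line):
    """Return (before, after) for the first 'size' in line; a missing side is None."""
    match = PATTERN.search(line)
    return match.groups() if match else (None, None)

samples = [
    "123 size 23456 bbb",
    "aa 123size23456",
    "1.234 size 55.567",
    "1234 size forget  345",
    "aa12 size 14587",
]
for line in samples:
    print(line, "->", numbers_around_size(line))
```

Like the output above, this sketch returns 12 for the last line, because nothing stops the first group from matching digits glued to letters. If separators should be dropped (1.234 becoming 1234), the captured text can be cleaned afterwards, for example with `value.replace(".", "").replace(",", "")`.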
    Assisted Solution

    This pattern does not pick up the aa12 in the last line of your data.

    Please post some more examples of non-space characters used as delimiters.
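The revised pattern itself is missing from the thread. One way to get the described effect, sketched here in Python under the assumption that a number glued to preceding letters (as in "aa12") should be ignored, is a negative lookbehind in front of the first number. This is a guess at the technique, not the answerer's pattern, and all names are illustrative.

```python
import re

NUMBER = r"\d+(?:[.,]\d+)*"
# (?<![A-Za-z\d.,]) rejects a leading number that directly follows a letter,
# a digit, a dot or a comma, so the "12" in "aa12" is not captured.
PATTERN = re.compile(rf"(?:(?<![A-Za-z\d.,])({NUMBER}))?\s*size\s*({NUMBER})?")

def numbers_around_size(line):
    """Return (before, after) for the first 'size' in line; a missing side is None."""
    match = PATTERN.search(line)
    return match.groups() if match else (None, None)

print(numbers_around_size("aa12 size 14587"))     # (None, '14587')
print(numbers_around_size("aa 123size23456"))     # ('123', '23456')
```

The digit, dot and comma in the lookbehind matter as much as the letters: without them the engine could retry one character further right and capture the "2" of "aa12".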
    Assisted Solution

    When using this pattern:

    the space-delimited strings still parse correctly, and so do these:
    123 /size/ 23456 bbb
    1245/size/ 12
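Since the pattern for this step was not preserved either, here is a hedged Python sketch of one way to accept delimiters such as "/" or "-" around the keyword: replace the whitespace-only gap with a run of any non-alphanumeric characters. The `SEP` class and the function name are assumptions made for this example.

```python
import re

NUMBER = r"\d+(?:[.,]\d+)*"
SEP = r"[^A-Za-z\d]*"  # any run of characters that are neither letters nor digits
# The lookbehind keeps a number glued to letters (e.g. "aa12") from being captured.
PATTERN = re.compile(
    rf"(?:(?<![A-Za-z\d.,])({NUMBER}))?{SEP}size{SEP}({NUMBER})?"
)

def numbers_around_size(line):
    """Return (before, after) for the first 'size' in line; a missing side is None."""
    match = PATTERN.search(line)
    return match.groups() if match else (None, None)

print(numbers_around_size("123 /size/ 23456 bbb"))  # ('123', '23456')
print(numbers_around_size("1245/size/ 12"))         # ('1245', '12')
```

Because letters are excluded from `SEP`, a word between the keyword and a number still blocks the capture, so "1234 size forget  345" keeps returning only 1234.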


    Author Comment

    Hi aikimark,
    Thank you very much for the solution! Just one very small aspect was not covered: the possibility of having more than one "keyword". But that is not important!

    Author Comment

    I've requested that this question be closed as follows:

    Accepted answer: 0 points for novreisb's comment #a40239982

    for the following reason:

    The solution is not mine and this was my mistake, so I ask you to correct it!
    Accepted Solution

    As far as different keywords are concerned, you would just replace the "size" part of the pattern. If you have a limited number of such keywords, they can all be part of the regex pattern.
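A minimal Python sketch of that idea, assuming the keyword list from the question (size, area, length) and a number and separator definition invented for this example, builds the alternation from a list so that keywords can be added without editing the rest of the pattern:

```python
import re

KEYWORDS = ["size", "area", "length"]  # example list taken from the question
NUMBER = r"\d+(?:[.,]\d+)*"
SEP = r"[^A-Za-z\d]*"  # assumed delimiter class: anything but letters and digits
ALTERNATION = "|".join(re.escape(word) for word in KEYWORDS)
PATTERN = re.compile(
    rf"(?:(?<![A-Za-z\d.,])(?P<before>{NUMBER}))?{SEP}"
    rf"(?P<keyword>{ALTERNATION}){SEP}(?P<after>{NUMBER})?"
)

def numbers_around_keywords(text):
    """Return a (keyword, before, after) tuple for every keyword found in text."""
    return [(m["keyword"], m["before"], m["after"]) for m in PATTERN.finditer(text)]

print(numbers_around_keywords("1245/size/ 12"))
print(numbers_around_keywords("10 area 20"))
```

Note that this sketch also matches a keyword embedded in a longer word (for instance the "size" in "resize"). A plain `\b` would not help, because the example "aa 123size23456" has no word boundary between the digits and the keyword, so a lookaround on letters only would be needed if that case matters.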
