Solved

info between words

Posted on 2014-04-22
Last Modified: 2014-04-23
I have the following string

JE: ** Job dv:9987 . a1:311 a2:565 a3:1204 --- ty, 311762, --- a5, 31178747, --- b3, 31178384, ---a7, 31178381, --- a1, 15808387, --- ty, 3184, --- a5, 12045-05, ** Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384, --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **Job dv:456 . a1:31231 a2:565 a3:12054 --- ty, 3141762, --- a5, 4311748747, --- b3, 311784384, ---a7, 311478381, --- a1, 158048387, --- ty, 344184, --- a5, 1442045-05, ** Job dv:45654 a1:3441321 a2:56544 a3:124404 --- ty, 31174642, --- a2, 311767874447, ---a5, 311434378384, --- a1, 3117344658381, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05,

I have a variation of a question I asked before:

using this RegEx:

(?<=--- ty, ).*?(?=,)

How can I modify it to obtain the ty numbers corresponding to a specific dv?

Any idea?
Question by:joyacv2

22 Comments

Expert Comment
by:Dan Craciun
What is a div?

Author Comment
by:joyacv2
dv is the name of a device.

Expert Comment
by:Dan Craciun
OK. I read it as "div" :)

Expert Comment
by:Dan Craciun
It's too late for me tonight. Here's what I have so far:
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){0})
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1})
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2})
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){3})
---


This will exclude the last n dv's from the results.
So (?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1}) will exclude the last dv, (?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2}) will exclude the last 2 dvs and so on.
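
A minimal PHP sketch of this "exclude the last n groups" idea, assuming the groups are separated by ** and the string does not end with **. The helper name tyExcludingLast and the trimmed $data string are illustrative only; they do not appear in the thread.

```php
<?php
// Trimmed copy of the sample string from the question (shortened for readability).
$data = 'JE: ** Job dv:9987 . a1:311 --- ty, 311762, --- a5, 31178747, --- ty, 3184, '
      . '** Job dv:532 a1:31321 --- ty, 311762, --- a2, 3117678747, --- ty, 158533308387, --- ty, 334184, '
      . '**Job dv:456 . a1:31231 --- ty, 3141762, --- a5, 4311748747, --- ty, 344184, '
      . '** Job dv:45654 a1:3441321 --- ty, 31174642, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05, ';

// Hypothetical helper: returns the ty values that still have at least $n "**"
// markers somewhere after them, i.e. everything except the last $n groups.
function tyExcludingLast(string $data, int $n): array
{
    $pattern = '/(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){' . $n . '})/';
    preg_match_all($pattern, $data, $m);
    return $m[0];
}

print_r(tyExcludingLast($data, 1)); // ty values of every group except the last one
```

With n = 0 the trailing lookahead is always satisfied, so every ty value is returned.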

Will try again tomorrow, if no one manages to get a working regexp.

HTH,
Dan

Author Comment
by:joyacv2
Hi Dan,

I'll wait until tomorrow to continue with the problem. Thanks!

Expert Comment
by:Terry Woods
I'd build the device number into the pattern like this, and do a replace:
(?s)^.*?Job dv:532.*?--- ty, (\d+).*$


The (?s) activates single-line (dot-all) mode, so that . also matches newlines.

Replace with:
$1


If you're using PHP this would be:
$devicenum = "532";
$my_ty_number = preg_replace("#^.*?Job dv:{$devicenum}.*?--- ty, (\d+).*$#s", "$1", $data);


(Here single-line mode is activated with the s modifier after the pattern, rather than with (?s) inside the pattern.)

The result is 311762, the first ty after dv:532.

Expert Comment
by:Dan Craciun
@Terry: There can be multiple ty's in the same dv. For the 532 dv you chose, you have:
--- ty, 311762,
--- ty, 158533308387,
--- ty, 334184,

Expert Comment
by:Dan Craciun
So, next step. This will remove the first n-1 dv's from matching:
(?<=(\*\*[^*]*?){2})(?<=--- ty, ).*?(?=,)
(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,)
(?<=(\*\*[^*]*?){4})(?<=--- ty, ).*?(?=,)


(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,) will prevent the first 2 dvs from matching.

Now let's see how we can combine those...

Assisted Solution
by:Dan Craciun
Dan Craciun earned 334 total points
(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1}) will restrict the matching to the 3rd dv (456)

(?<=(\*\*[^*]*?){2})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2}) will restrict the matching to the 2nd dv (532)

So your problem is solvable if you know how many dv groups you have (basically how many "**" groups you have - note that I assumed the last group does not have a ** at the end, as per your sample).

If you have n groups and you want to only match the m-th one, then you would use:
(?<=(\*\*[^*]*?){m})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){n-m})
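
The unbounded lookbehind (?<=(\*\*[^*]*?){m}) is accepted by .NET-style engines, but PHP's PCRE functions normally reject a lookbehind of unlimited length. A rough PHP sketch of the same "m-th group" idea, swapping the lookbehind for a plain split on **, could look like this. The helper name tyOfGroup and the trimmed $data string are illustrative assumptions, not part of the thread.

```php
<?php
// Trimmed copy of the sample string from the question (shortened for readability).
$data = 'JE: ** Job dv:9987 . a1:311 --- ty, 311762, --- a5, 31178747, --- ty, 3184, '
      . '** Job dv:532 a1:31321 --- ty, 311762, --- a2, 3117678747, --- ty, 158533308387, --- ty, 334184, '
      . '**Job dv:456 . a1:31231 --- ty, 3141762, --- a5, 4311748747, --- ty, 344184, '
      . '** Job dv:45654 a1:3441321 --- ty, 31174642, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05, ';

// Hypothetical helper: ty values of the m-th "**" group (1-based).
function tyOfGroup(string $data, int $m): array
{
    // explode() puts the text before the first "**" at index 0,
    // so the m-th group sits at index $m.
    $groups = explode('**', $data);
    if ($m < 1 || !isset($groups[$m])) {
        return [];
    }
    // Same base pattern as in the question, applied to one group only.
    preg_match_all('/(?<=--- ty, ).*?(?=,)/', $groups[$m], $matches);
    return $matches[0];
}

print_r(tyOfGroup($data, 3)); // the 3rd group is dv:456
```

Unlike the pure-regex version, this does not need the total number of groups n.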

Assisted Solution
by:Terry Woods
Terry Woods earned 166 total points
If using PHP, I'd split the task into 2 parts, like this, as the patterns can be kept simpler and are thus easier to maintain if any changes are ever needed:
<?php
$data = 'JE: ** Job dv:9987 . a1:311 a2:565 a3:1204 --- ty, 311762, --- a5, 31178747, --- b3, 31178384, ---a7, 31178381, --- a1, 15808387, --- ty, 3184, --- a5, 12045-05, ** Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384, --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **Job dv:456 . a1:31231 a2:565 a3:12054 --- ty, 3141762, --- a5, 4311748747, --- b3, 311784384, ---a7, 311478381, --- a1, 158048387, --- ty, 344184, --- a5, 1442045-05, ** Job dv:45654 a1:3441321 a2:56544 a3:124404 --- ty, 31174642, --- a2, 311767874447, ---a5, 311434378384, --- a1, 3117344658381, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05, ';

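// Step 1: isolate the block for the wanted device.
// The leading greedy .* means the LAST "Job dv:532" block is kept if the id repeats.
$dv_to_get = '532';
$dv = preg_replace("#^.*(Job dv:{$dv_to_get}(?:(?!Job dv).)*).*$#s", '$1', $data);
print "\$dv: $dv\n";

// Step 2: collect every ty number inside that block.
preg_match_all('#--- ty, (\d+)#', $dv, $matches);
print "ty's:";
print_r($matches[1]);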

Result:
$dv: Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384, --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **
ty's:Array
(
    [0] => 311762
    [1] => 158533308387
    [2] => 334184
)


Expert Comment
by:Dan Craciun
Nice, Terry!

@joyacv2: So there you have it: two solutions, one that requires you to know the dv's position, and another that requires you to know the dv's name/id.
If you want to parse the whole string, my solution is easier to use in a loop, giving you one dv's ty's per iteration.
If you're only interested in the ty's for a particular dv whose id you know, Terry's solution gives you that without needing to know the position.

Expert Comment
by:Terry Woods
Thanks for your feedback Dan; you deserve a points split if my solution is accepted.

Author Comment
by:joyacv2
Hi Terry and Dan:

Hi Dan:

In these lines:
(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1}) will restrict the matching to the 3rd dv (456)

(?<=(\*\*[^*]*?){2})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2}) will restrict the matching to the 2nd dv (532)


What are the 3 and 1 in the first line, and the 2 and 2 in the second line? How can I modify it to always refer to the last item?

Hi Terry:

Will your code always work with the last match?

To both:

I ask because in some cases the Job dv: number is the same while the other numbers change. I always want the ty values of the last matching Job dv.


Thank you both very much!

Accepted Solution
by:Dan Craciun
Dan Craciun earned 334 total points
If you only want the last dv, regardless of dv's id, this will do it:
(?<=--- ty, )\d*(?!.*\*\*)



If you want the last dv of a certain id, I don't think there is a pure regexp solution. You're going to have to use php, powershell, python, perl or some other p-starting concoction :)
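
For the "last dv of a certain id" case, one possible PHP sketch splits the string on ** and remembers the last group whose id matches. It assumes every group starts with ** followed by "Job dv:<id>"; the helper name tyOfLastDv and the short sample string are hypothetical and only there for illustration.

```php
<?php
// Hypothetical sample in which dv:532 appears twice with different values.
$data = '** Job dv:532 a1:1 --- ty, 111, --- a5, 5, '
      . '** Job dv:9987 --- ty, 222, '
      . '** Job dv:532 a1:2 --- ty, 333, --- a2, 9, --- ty, 444, ';

// Hypothetical helper: ty values of the LAST group whose dv id equals $dvId.
function tyOfLastDv(string $data, string $dvId): array
{
    $last = null;
    foreach (explode('**', $data) as $group) {
        // \b stops dv:532 from also matching dv:5321.
        if (preg_match('/^\s*Job dv:' . preg_quote($dvId, '/') . '\b/', $group)) {
            $last = $group; // later matches overwrite earlier ones
        }
    }
    if ($last === null) {
        return []; // id not present
    }
    preg_match_all('/--- ty, (\d+)/', $last, $m);
    return $m[1];
}

print_r(tyOfLastDv($data, '532'));
```

Here the second dv:532 group wins, so only its ty values are returned.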

Author Comment
by:joyacv2
Hi Dan,

This line of code returns nothing. Can you check?

Expert Comment
by:Dan Craciun
It's returning this in RegexBuddy, using PCRE or .NET engines:
31174642
15855533308387
33434184


using this as test data:
JE: ** Job dv:9987 . a1:311 a2:565 a3:1204 --- ty, 311762, --- a5, 31178747, --- b3, 31178384, ---a7, 31178381, --- a1, 15808387, --- ty, 3184, --- a5, 12045-05, ** Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384, --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **Job dv:456 . a1:31231 a2:565 a3:12054 --- ty, 3141762, --- a5, 4311748747, --- b3, 311784384, ---a7, 311478381, --- a1, 158048387, --- ty, 344184, --- a5, 1442045-05, ** Job dv:45654 a1:3441321 a2:56544 a3:124404 --- ty, 31174642, --- a2, 311767874447, ---a5, 311434378384, --- a1, 3117344658381, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05, 


Author Comment
by:joyacv2
Sorry Dan, let me check again; I made a mistake.

Expert Comment
by:Dan Craciun
"What are the 3 and 1 in the first line, and the 2 and 2 in the second line? How can I modify it to always refer to the last item?"
See here: http://www.experts-exchange.com/Programming/Languages/Regular_Expressions/Q_28417719.html#a40016728

The first number is m (the position of the dv group you want) and the second is n-m. You have 4 dv groups in your sample data, so n is 4.

Author Comment
by:joyacv2
Yes, this returns the three ty values corresponding to the last Job dv.

I am using preg_match_all('/(?<=--- ty, )\d*(?!.*\*\*)/', $data, $res, PREG_PATTERN_ORDER);

but when I run

echo count($res);

the result is 1, even though the returned array gives me 3 ty's, as I tested with:

echo $res[0][0]."<br>";
echo $res[0][1]."<br>";
echo $res[0][2]."<br>";

Why does count() return an incorrect number when the test works well?

Expert Comment
by:Dan Craciun
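echo count($res) gives you 1 because $res is a two-dimensional array: with PREG_PATTERN_ORDER, $res[0] holds all the full-pattern matches, and the pattern has no capture groups.
To see the actual number of results, try: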
echo count($res[0]);
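
A small PHP sketch of the difference between the two counts, using a shortened, hypothetical sample string rather than the full data from the question:

```php
<?php
// Hypothetical two-group sample; only the last group should produce matches.
$data = '** Job dv:1 --- ty, 10, ** Job dv:2 --- ty, 20, --- a5, 7, --- ty, 30, ';

preg_match_all('/(?<=--- ty, )\d*(?!.*\*\*)/', $data, $res, PREG_PATTERN_ORDER);

// $res has one entry per pattern group; with no capture groups that is only $res[0].
echo count($res), "\n";    // 1

// $res[0] is the list of full matches, so this is the number of ty values found.
echo count($res[0]), "\n"; // 2

print_r($res[0]);
```

Adding a capture group to the pattern would add a $res[1] entry, so count($res) reflects the number of groups and not the number of matches.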

Author Closing Comment
by:joyacv2
I have to say that these solutions are excellent and work perfectly in my code!

Thank you very much, Dan and Terry!

Expert Comment
by:Dan Craciun
You're welcome and I'm glad I could help!