Solved

info between words

Posted on 2014-04-22
Last Modified: 2014-04-23
I have the following string

JE: ** Job dv:9987 . a1:311 a2:565 a3:1204 --- ty, 311762, --- a5, 31178747, --- b3, 31178384, ---a7, 31178381, --- a1, 15808387, --- ty, 3184, --- a5, 12045-05, ** Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384, --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **Job dv:456 . a1:31231 a2:565 a3:12054 --- ty, 3141762, --- a5, 4311748747, --- b3, 311784384, ---a7, 311478381, --- a1, 158048387, --- ty, 344184, --- a5, 1442045-05, ** Job dv:45654 a1:3441321 a2:56544 a3:124404 --- ty, 31174642, --- a2, 311767874447, ---a5, 311434378384, --- a1, 3117344658381, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05,

This is a variation of a question I asked before.

Using this regex:

(?<=--- ty, ).*?(?=,)

How can I modify it to obtain the ty numbers corresponding to a specific dv?

Any ideas?
Question by:joyacv2
22 Comments
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40015784
What is a div?
 
LVL 1

Author Comment

by:joyacv2
ID: 40015789
dv is the name of a device.
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40015792
OK. I read div :)
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40015909
Too late for me. Here's what I did so far:
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){0})
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1})
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2})
(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){3})

This will exclude the last n dv's from the results.
So (?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1}) will exclude the last dv, (?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2}) will exclude the last 2 dvs and so on.
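
As a rough illustration of the trailing lookahead, here is a minimal Python sketch. Python is not the language used in this thread, and the helper name `ty_excluding_last` and the shortened `SAMPLE` string are made up for the example; `SAMPLE` is a trimmed copy of the question's data with the same four `**` groups.

```python
import re

# Trimmed copy of the thread's sample string (assumption: same four "**" groups).
SAMPLE = (
    "JE: ** Job dv:9987 . a1:311 --- ty, 311762, --- a5, 31178747, --- ty, 3184, "
    "** Job dv:532 a1:31321 --- ty, 311762, --- a2, 3117678747, "
    "--- ty, 158533308387, --- ty, 334184, "
    "**Job dv:456 . a1:31231 --- ty, 3141762, --- ty, 344184, "
    "** Job dv:45654 a1:3441321 --- ty, 31174642, --- ty, 15855533308387, "
    "--- ty, 33434184, --- a5, 120456-05, "
)


def ty_excluding_last(text, n):
    """Return the ty values that still have at least n '**' markers after them.

    Hypothetical helper: it only plugs n into the pattern from the comment above.
    """
    pattern = r"(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){%d})" % n
    return re.findall(pattern, text)


if __name__ == "__main__":
    for n in range(4):
        print(n, ty_excluding_last(SAMPLE, n))
```

With this sample, n=0 keeps all ten ty values and each increment of n drops the ty values of one more group from the end.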

Will try again tomorrow, if no one manages to get a working regexp.

HTH,
Dan
 
LVL 1

Author Comment

by:joyacv2
ID: 40015931
Hi Dan,

I will wait until tomorrow to continue with the problem, thanks!
 
LVL 35

Expert Comment

by:Terry Woods
ID: 40016237
I'd build the device number into the pattern like this, and do a replace:
(?s)^.*?Job dv:532.*?--- ty, (\d+).*$

The (?s) flag activates single-line (dot-all) mode, so that . also matches line breaks.

Replacing with:
$1

If you're using PHP this would be:
$devicenum = "532";
$my_ty_number = preg_replace("#^.*?Job dv:{$devicenum}.*?--- ty, (\d+).*$#s", "$1", $data);

(Here single-line mode is activated with the s pattern modifier after the closing delimiter, rather than with (?s) inside the pattern.)

The result is 311762
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40016715
@Terry: There can be multiple ty's in the same dv. For the 532 dv you chose, you have:
--- ty, 311762,
--- ty, 158533308387,
--- ty, 334184,
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40016720
So, next step. This will remove the first n-1 dv's from matching:
(?<=(\*\*[^*]*?){2})(?<=--- ty, ).*?(?=,)
(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,)
(?<=(\*\*[^*]*?){4})(?<=--- ty, ).*?(?=,)

(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,) will prevent the first 2 dvs from matching.

Now let's see how we can combine those...
 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 334 total points
ID: 40016728
(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1}) will restrict the matching to the 3rd dv (456)

(?<=(\*\*[^*]*?){2})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2}) will restrict the matching to the 2nd dv (532)

So your problem is solvable if you know how many dv groups you have (basically how many "**" groups you have - note that I assumed the last group does not have a ** at the end, as per your sample).

If you have n groups and you want to only match the m-th one, then you would use:
(?<=(\*\*[^*]*?){m})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){n-m})
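
The position-based pattern above relies on a variable-length lookbehind, which Python's built-in re module rejects (it requires fixed-width lookbehind). A minimal sketch of the same idea without lookbehind, assuming every group starts with `**` as in the sample: split on `**` and read the ty values from the m-th piece. The helper name `ty_of_group` and the shortened `SAMPLE` string are made up for illustration, and `\d+` is borrowed from the other answers in this thread.

```python
import re

# Trimmed copy of the thread's sample string (assumption: same four "**" groups).
SAMPLE = (
    "JE: ** Job dv:9987 . a1:311 --- ty, 311762, --- ty, 3184, "
    "** Job dv:532 a1:31321 --- ty, 311762, --- ty, 158533308387, --- ty, 334184, "
    "**Job dv:456 . a1:31231 --- ty, 3141762, --- ty, 344184, "
    "** Job dv:45654 a1:3441321 --- ty, 31174642, --- ty, 33434184, "
)


def ty_of_group(text, m):
    """Return the ty values of the m-th '**' group (1-based), or [] if out of range."""
    # Everything before the first '**' (here "JE: ") is not a group, so drop it.
    groups = text.split("**")[1:]
    if not 1 <= m <= len(groups):
        return []
    return re.findall(r"--- ty, (\d+)", groups[m - 1])


if __name__ == "__main__":
    print(ty_of_group(SAMPLE, 2))
    print(ty_of_group(SAMPLE, 3))
```

Unlike the pure-regex form, this version does not need the total number of groups n, only the position m.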
 
LVL 35

Assisted Solution

by:Terry Woods
Terry Woods earned 166 total points
ID: 40016729
If using PHP, I'd split the task into 2 parts, like this, as the patterns can be kept simpler and are thus easier to maintain if any changes are ever needed:
<?php
$data = 'JE: ** Job dv:9987 . a1:311 a2:565 a3:1204 --- ty, 311762, --- a5, 31178747, --- b3, 31178384, ---a7, 31178381, --- a1, 15808387, --- ty, 3184, --- a5, 12045-05, ** Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384, --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **Job dv:456 . a1:31231 a2:565 a3:12054 --- ty, 3141762, --- a5, 4311748747, --- b3, 311784384, ---a7, 311478381, --- a1, 158048387, --- ty, 344184, --- a5, 1442045-05, ** Job dv:45654 a1:3441321 a2:56544 a3:124404 --- ty, 31174642, --- a2, 311767874447, ---a5, 311434378384, --- a1, 3117344658381, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05, ';

$dv_to_get = '532';
$dv = preg_replace("#^.*(Job dv:{$dv_to_get}(?:(?!Job dv).)*).*$#s", '$1', $data);
print "\$dv: $dv\n";

preg_match_all('#--- ty, (\d+)#', $dv, $matches);
print "ty's:";
print_r($matches[1]);

Result:
$dv: Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384,
 --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **
ty's:Array
(
    [0] => 311762
    [1] => 158533308387
    [2] => 334184
)

 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40016739
Nice, Terry!

@joyacv2: So there you have it: two solutions, one that requires you to know the dv's position, the other that requires you to know the dv's name/id.
If you want to parse the whole string, my solution is easier to use in a loop, giving you one dv's ty's per iteration.
If you're only interested in finding the ty's for a particular dv (for which you know the id), Terry's solution will give you that, without the need to know the position.
 
LVL 35

Expert Comment

by:Terry Woods
ID: 40016741
Thanks for your feedback Dan; you deserve a points split if my solution is accepted.
 
LVL 1

Author Comment

by:joyacv2
ID: 40017417
Hi Terry and Dan:

Hi Dan:

In these lines:
(?<=(\*\*[^*]*?){3})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){1}) will restrict the matching to the 3rd dv (456)

(?<=(\*\*[^*]*?){2})(?<=--- ty, ).*?(?=,)(?=.*(?:\*\*.*?){2}) will restrict the matching to the 2nd dv (532)

What are the 3 and 1 in the first line, and the 2 and 2 in the second line? How can I modify it to always refer to the last item?

Hi Terry:

Will your code always work with the last match?

To both:

I ask because in some cases the Job dv: number is the same but the other numbers change. I always need to select the ty's corresponding to the last Job dv.

Thank you very much to both!
 
LVL 34

Accepted Solution

by:Dan Craciun
Dan Craciun earned 334 total points
ID: 40017443
If you only want the last dv, regardless of dv's id, this will do it:
(?<=--- ty, )\d*(?!.*\*\*)

If you want the last dv of a certain id, I don't think there is a pure regexp solution. You're going to have to use php, powershell, python, perl or some other p-starting concoction :)
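
For the "last dv of a certain id" case, a minimal Python sketch of the non-regex glue might look like this. The function name `ty_of_last_dv` and the test string are hypothetical: the string repeats dv:532 to mimic the situation described in the question, and its second dv:532 block is invented for the example. The sketch assumes blocks are separated by `**` and that dv ids are plain numbers.

```python
import re

# Hypothetical data: dv:532 appears twice; the last block is made up for this example.
SAMPLE = (
    "JE: ** Job dv:532 a1:31321 --- ty, 311762, --- a2, 3117678747, --- ty, 334184, "
    "**Job dv:456 . a1:31231 --- ty, 3141762, --- ty, 344184, "
    "** Job dv:532 a1:999 --- ty, 42, --- a5, 120456-05, "
)


def ty_of_last_dv(text, dv_id):
    """Return the ty values of the last '**' block whose Job dv equals dv_id."""
    for block in reversed(text.split("**")):
        # \b keeps an id such as 45 from matching inside dv:456.
        if re.search(r"Job dv:%s\b" % re.escape(dv_id), block):
            return re.findall(r"--- ty, (\d+)", block)
    return []


if __name__ == "__main__":
    print(ty_of_last_dv(SAMPLE, "532"))
    print(ty_of_last_dv(SAMPLE, "456"))
```

Walking the blocks in reverse means the first hit is the last occurrence of that id, so earlier blocks with the same dv number are ignored.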
 
LVL 1

Author Comment

by:joyacv2
ID: 40017490
Hi Dan,

This line of code doesn't return anything; can you check?
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40017493
It's returning this in RegexBuddy, using PCRE or .NET engines:
31174642
15855533308387
33434184

using this as test data:
JE: ** Job dv:9987 . a1:311 a2:565 a3:1204 --- ty, 311762, --- a5, 31178747, --- b3, 31178384, ---a7, 31178381, --- a1, 15808387, --- ty, 3184, --- a5, 12045-05, ** Job dv:532 a1:31321 a2:5654 a3:1204 --- ty, 311762, --- a2, 3117678747, ---a5, 3113378384, --- a1, 3117yy658381, --- ty, 158533308387, --- ty, 334184, --- a5, 120456-05, **Job dv:456 . a1:31231 a2:565 a3:12054 --- ty, 3141762, --- a5, 4311748747, --- b3, 311784384, ---a7, 311478381, --- a1, 158048387, --- ty, 344184, --- a5, 1442045-05, ** Job dv:45654 a1:3441321 a2:56544 a3:124404 --- ty, 31174642, --- a2, 311767874447, ---a5, 311434378384, --- a1, 3117344658381, --- ty, 15855533308387, --- ty, 33434184, --- a5, 120456-05, 

 
LVL 1

Author Comment

by:joyacv2
ID: 40017495
Sorry Dan, let me check again; I made a mistake.
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40017503
What are the 3 and 1 in the first line, and the 2 and 2 in the second line? How can I modify it to always refer to the last item?
See here: http://www.experts-exchange.com/Programming/Languages/Regular_Expressions/Q_28417719.html#a40016728

You have 4 dv groups in your sample data, so n is 4.
 
LVL 1

Author Comment

by:joyacv2
ID: 40017521
Yes, this returns the last three ty's, corresponding to the last Job dv.

I am using preg_match_all('/(?<=--- ty, )\d*(?!.*\*\*)/',$data,$res,PREG_PATTERN_ORDER)

but when I run

echo count($res) the result is 1, even though the returned array gives me 3 ty's,

as I tested with:

echo $res[0][0]."<br>";
echo $res[0][1]."<br>";
echo $res[0][2]."<br>";

Why does the count function return an incorrect number when the test works well?
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40017672
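echo count($res) gives you 1 because $res is a two-dimensional array: with PREG_PATTERN_ORDER, $res[0] holds all the full-pattern matches, and since the pattern has no capturing groups, that is its only element.
To see the actual number of results, try:
echo count($res[0]);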
 
LVL 1

Author Closing Comment

by:joyacv2
ID: 40017685
I have to say that these solutions are excellent and work perfectly in my code!

Thank You very much to Dan and Terry!
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 40017691
You're welcome and I'm glad I could help!