Making a named group optional in a .NET regex so that it returns "" when its section is missing

Hi!

I have a small .NET regex problem. Consider the following text:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer nec odio. Praesent libero. Sed cursus ante dapibus diam. Sed nisi. Nulla quis sem at nibh elementum imperdiet.

Insured      Birth date / Organisation no
Name:      Smith, John Thomas
Address:      Main road 160      07.05.1965
Post no:      4053      Sted:      UPTOWN
Contact person      Account
Name:      XXX Insurance      for eventual payment
Telephone:
Telefax:      55262640
E-mail:      Answer the sent e-mail
MOTOR VEHICLE / BOAT
Policy no      Reg.no      Specificaton      Main due date      Removing date
unknown      BN12345      Merc 123      21.12.2017
OTHER INSURANCE
Policy no      Policy line no      Specificaton      Main due date      Removing date
unknown      Home      Main road 160      21.12.2017
unknown      Travel      Smith, John Thomas      21.12.2017
PROXY

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer nec odio. Praesent libero. Sed cursus ante dapibus diam. Sed nisi. Nulla quis sem at nibh elementum imperdiet.

To get what I want into named groups I can use e.g.:
Name:\t(?<owner>[\w, ]*)\r\nAddress:(?:[\w\n\t \r.:-]*)MOTOR VEHICLE / BOAT\r\n(?:[\w\t .]+)date\r\n(?<reg_and_date>[\w\t .,\r\n]+)OTHER INSURANCE\r\n(?:[\w\t .]+)date\r\n(?<other>[\w\t .,\r\n]+)PROXY


This gives me the following result:

Group $owner = "Smith, John Thomas"
Group $reg_and_date = "unknown      BN12345      Merc 123      21.12.2017 "
Group $other = "unknown      Home      Main road 160 21.12.2017
unknown      Travel      Smith, John Thomas      21.12.2017 "

which is exactly what I want. I receive this report several times a day and need to extract this information from it with a regex. Now to my problem: sometimes the following section is completely missing from the report, and in that case I want the corresponding group to still be there, but set to "":
OTHER INSURANCE
Policy no      Policy line no      Specificaton      Main due date      Removing date
unknown      Home      Main road 160      21.12.2017
unknown      Travel      Smith, John Thomas      21.12.2017

So when this section is missing, the result of my regex should be:

Group $owner = "Smith, John Thomas"
Group $reg_and_date = "unknown      BN12345      Merc 123      21.12.2017 "
Group $other = ""

I am sure it is a simple fix, but I don't seem to be able to achieve it. Can someone please help?

Thanks!

Brgds
IVer in Oslo
Asked by: Iver Erling Arva, Senior consultant

aikimark commented:
Try this pattern:
Name:\t(?<owner>[\w, ]*)\r\nAddress:(?:[\w\n\t \r.:-]*)MOTOR VEHICLE / BOAT\r\n(?:[\w\t .]+)date\r\n(?<reg_and_date>[\w\t .,\r\n]+)(OTHER INSURANCE\r\n(?:[\w\t .]+)date\r\n(?<other>[\w\t .,\r\n]+))?PROXY

Iver Erling Arva (Senior consultant, author) commented:
Hi, aikimark!

Thanks for that suggestion. That was one of the things I had already tried and thought would work. Unfortunately, although it gives a match, the result ends up in the wrong group: when I include the OTHER INSURANCE section, everything ends up in <reg_and_date> and the <other> group is empty.
I am using the tester at regexstorm.net/tester to test the regex. It shows the group contents as a tidy table, and its results match what I get in my Blue Prism application, so something is still wrong.

If you want to test for yourself, please be advised that the Experts Exchange web page replaces <tab> with <space>, so the included text doesn't quite work. E.g. after Name: there is a <tab>, not spaces. But I guess you can see where it is relevant from the \t's in the regex.

So, close, but no cigar! :-)

Brgds
IVer
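
The behaviour described in this comment can be reproduced outside the regex tester. What follows is only a rough sketch in Python (chosen because it is easy to run; the thread itself targets the .NET engine), so named groups are written as (?P<name>...) instead of .NET's (?<name>...). The sample report and its tab placement are assumptions based on the text in the question. It suggests why the pattern misbehaves: every character of the OTHER INSURANCE section is allowed by the greedy class in reg_and_date, so that group swallows the section and the optional group is simply skipped.

```python
import re

# The first suggested pattern, with .NET's (?<name>...) rewritten
# as Python's (?P<name>...). Everything else is unchanged.
GREEDY = re.compile(
    r"Name:\t(?P<owner>[\w, ]*)\r\nAddress:(?:[\w\n\t \r.:-]*)"
    r"MOTOR VEHICLE / BOAT\r\n(?:[\w\t .]+)date\r\n"
    r"(?P<reg_and_date>[\w\t .,\r\n]+)"
    r"(OTHER INSURANCE\r\n(?:[\w\t .]+)date\r\n(?P<other>[\w\t .,\r\n]+))?PROXY"
)

# Assumed sample report: CRLF line endings, a tab after each "label:".
REPORT = "\r\n".join([
    "Insured      Birth date / Organisation no",
    "Name:\tSmith, John Thomas",
    "Address:\tMain road 160      07.05.1965",
    "Post no:\t4053      Sted:\tUPTOWN",
    "Contact person      Account",
    "Name:\tXXX Insurance      for eventual payment",
    "Telephone:",
    "Telefax:\t55262640",
    "E-mail:\tAnswer the sent e-mail",
    "MOTOR VEHICLE / BOAT",
    "Policy no      Reg.no      Specificaton      Main due date      Removing date",
    "unknown      BN12345      Merc 123      21.12.2017",
    "OTHER INSURANCE",
    "Policy no      Policy line no      Specificaton      Main due date      Removing date",
    "unknown      Home      Main road 160      21.12.2017",
    "unknown      Travel      Smith, John Thomas      21.12.2017",
    "PROXY",
])

match = GREEDY.search(REPORT)

# True when reg_and_date has swallowed the whole OTHER INSURANCE section.
swallowed = "OTHER INSURANCE" in match.group("reg_and_date")

# The optional group never took part in the match.
other_group = match.group("other")

print(swallowed, other_group)
```

In Python a group that did not take part in the match is reported as None; in .NET its Value is an empty string, which matches the empty <other> group seen in the tester.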
aikimark commented:
Does PROXY exist when the optional text doesn't exist?

Use the CODE tag for your text to prevent the tab-to-space translation. Alternatively, upload a text file.

Iver Erling Arva (Senior consultant, author) commented:
Yes, PROXY always exists.

IVer
aikimark commented:
Please test this pattern:
Name:\t(?<owner>[^\r]*)\r\n(?:[\S\s]+?)Removing date\r\n(?<reg_and_date>[^\r]+)\r\n(OTHER INSURANCE\r\n(?:[\w\t .]+)date\r\n(?<other>[\w\t .,\r\n]+))?PROXY

This is the text that I tested against. Using a CODE tag preserves the tab characters.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer nec odio. Praesent libero. Sed cursus ante dapibus diam. Sed nisi. Nulla quis sem at nibh elementum imperdiet. 
Insured      Birth date / Organisation no
Name:	Smith, John Thomas
Address:	Main road 160      07.05.1965
Post no:	4053      Sted:	UPTOWN
Contact person      Account
Name:	XXX Insurance      for eventual payment
Telephone:
Telefax:	55262640
E-mail:	Answer the sent e-mail
MOTOR VEHICLE / BOAT
Policy no      Reg.no      Specificaton      Main due date      Removing date
unknown      BN12345      Merc 123      21.12.2017
OTHER INSURANCE
Policy no      Policy line no      Specificaton      Main due date      Removing date
unknown      Home      Main road 160      21.12.2017
unknown      Travel      Smith, John Thomas      21.12.2017
PROXY

Iver Erling Arva (Senior consultant, author) commented:
Thanks! That seems to work exactly as I wanted. Thanks a lot!

Brgds
IVer
Iver Erling Arva (Senior consultant, author) commented:
Great help as always! Thanks!
aikimark commented:
You're welcome.

This is a slightly simpler pattern:
Name:\t(?<owner>[^\r]*)\r\n[\S\s]+?Removing date\r\n(?<reg_and_date>[^\r]+)\r\n(OTHER INSURANCE\r\n(?:[\w\t .]+)date\r\n(?<other>[\w\t .,\r\n]+))?PROXY
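
To see this pattern handle both cases, here is a small Python sketch under stated assumptions: Python's (?P<name>...) syntax stands in for .NET's (?<name>...), the report lines and tab positions follow the CODE-tagged text earlier in the thread, and the names build_report and extract are made up for illustration only.

```python
import re

# The simpler pattern, with named groups in Python syntax.
PATTERN = re.compile(
    r"Name:\t(?P<owner>[^\r]*)\r\n[\S\s]+?Removing date\r\n"
    r"(?P<reg_and_date>[^\r]+)\r\n"
    r"(OTHER INSURANCE\r\n(?:[\w\t .]+)date\r\n(?P<other>[\w\t .,\r\n]+))?PROXY"
)

COMMON = [
    "Insured      Birth date / Organisation no",
    "Name:\tSmith, John Thomas",
    "Address:\tMain road 160      07.05.1965",
    "Post no:\t4053      Sted:\tUPTOWN",
    "Contact person      Account",
    "Name:\tXXX Insurance      for eventual payment",
    "Telephone:",
    "Telefax:\t55262640",
    "E-mail:\tAnswer the sent e-mail",
    "MOTOR VEHICLE / BOAT",
    "Policy no      Reg.no      Specificaton      Main due date      Removing date",
    "unknown      BN12345      Merc 123      21.12.2017",
]

OTHER = [
    "OTHER INSURANCE",
    "Policy no      Policy line no      Specificaton      Main due date      Removing date",
    "unknown      Home      Main road 160      21.12.2017",
    "unknown      Travel      Smith, John Thomas      21.12.2017",
]


def build_report(include_other):
    """Assemble an assumed sample report with CRLF line endings."""
    lines = COMMON + (OTHER if include_other else []) + ["PROXY"]
    return "\r\n".join(lines)


def extract(report):
    """Return the named groups as a dict, or None if nothing matches.

    Python reports a group that did not take part in the match as None,
    so default="" is used to get the empty string asked for in the question.
    """
    match = PATTERN.search(report)
    if match is None:
        return None
    return match.groupdict(default="")


print(extract(build_report(True)))
print(extract(build_report(False)))
```

In .NET no default is needed, because the Value of a group that did not take part in the match is already an empty string. Note also that [^\r]+ captures a single line, so this pattern assumes the MOTOR VEHICLE / BOAT section has exactly one data row, as in the sample.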

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.