Regular Expression - Matching HTML INPUT value - with no double quotes around it

aaron900 asked on
C#, Regular Expressions
So I guess this is why everybody advises against matching HTML with regexes ;-) Unfortunately, with my app, reading it from the DOM isn't an option. I have an IE add-on that reads the HTML and passes it to me to parse server-side, and I'm seeing some things I've never seen before. For some reason, IE at times returns names and values without double quotes around them.

<INPUT onchange=setdirty(0); value=555-555-1212 type=text name=phn_Agent_Phone_CF15>
<INPUT onchange=setdirty(0); value=test@test.com type=text name=email_Agent_Email_CF15>
<INPUT onchange=setdirty(0); value=1/1/2010 type=text name=dt_HFTrip_from_date_CF15>

I don't need one gigantic regex that captures all values; I need one in which I can specify the name and get back the value. The trick is that sometimes the name has quotes around it and sometimes it doesn't (I made the quotes optional for the name, see below). But the value of the tag is really throwing me off. I can't seem to define my capture so that, when there are no double quotes around the value, it still matches but stops capturing at the first space it encounters.

One regular expression that can capture the value of an HTML INPUT tag based on the name I specify, with double quotes being totally optional, would be awesome, but it's not necessary. I have a regular expression that works great when the value has quotes around it, but I can't get one to work when there are no double quotes around the value.

Here's what I'm trying for the email address, *when it has double quotes* (it works great):
<input [^>]*(?<=name="?email_Agent_Email_CF15"?[^<]*)(?<=value="([^"]*)"[^<]*)>

But no matter what I try to accommodate values with no quotes, stopping at the first space, I always end up capturing extra markup that shouldn't be returned.

Your help is GREATLY appreciated - thanks!
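
For illustration, here is one way the requirement above might be sketched in C#: a single pattern that uses a lookahead to confirm the name (quoted or not) and an alternation to capture the value, where the unquoted branch stops at the first whitespace or `>`. The class name `InputValueExtractor` and the method `GetValue` are hypothetical, not from the original post. The sketch assumes each tag's attributes are well-formed, that no attribute value contains a `>`, and that the text `value=` does not also appear inside an earlier quoted attribute.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InputValueExtractor
{
    // Hypothetical helper: returns the value attribute of the first INPUT tag
    // whose name attribute equals inputName, or null when no such tag exists.
    public static string GetValue(string html, string inputName)
    {
        string pattern =
            @"<input\b" +
            // Lookahead: the same tag must contain name=inputName, quotes optional.
            @"(?=[^>]*\sname\s*=\s*[""']?" + Regex.Escape(inputName) + @"[""']?(?=[\s/>]))" +
            // Move lazily through the tag up to the value attribute.
            @"[^>]*?\svalue\s*=\s*" +
            // Double-quoted, single-quoted, or unquoted (ends at whitespace or '>').
            @"(?:""(?<v>[^""]*)""|'(?<v>[^']*)'|(?<v>[^\s>]+))";

        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["v"].Value : null;
    }

    public static void Main()
    {
        string html =
            "<INPUT onchange=setdirty(0); value=555-555-1212 type=text name=phn_Agent_Phone_CF15>\n" +
            "<INPUT onchange=setdirty(0); value=test@test.com type=text name=email_Agent_Email_CF15>\n" +
            "<INPUT onchange=setdirty(0); value=1/1/2010 type=text name=dt_HFTrip_from_date_CF15>";

        // Prints: test@test.com
        Console.WriteLine(GetValue(html, "email_Agent_Email_CF15"));
    }
}
```

The three branches reuse the group name `v`, which .NET permits, so the caller reads one group no matter which quoting style matched. Because the name check is a lookahead, the `name` attribute may appear before or after `value` within the tag.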