AlexPonnath

How can I find all input names and the corresponding values in a text file?

I need to parse field names and values from an HTML form so I can add them to my database. I know I can do a find
for "input name='", then start another find for the closing "'" and pull the data out with the Mid function, then do the same
for the value by finding "value='".
I was wondering if there is an easier way to loop through the document and extract all input names and their associated values?

Below is a sample of the page I need to parse:

<input name='a_glare'
                        value='B'
                        class='inputbox-highlighted-false'
                        size='1'
                        maxlength='1'>  
        </td>



                 <td align="center">


                    <input name='a_testani'
                        value=''
                        class='inputbox-highlighted-false'
                        size='1'
                        maxlength='1'>  

                 </td>

                 <td align="center">

                    <input name='a_tksig'
                        value='EC'
                        class='inputbox-highlighted-false'
                        size='2'
                        maxlength='2'>  


                 </td>

                 <td align="center">

                    <input name='a_sacnon'
                        value=''
                        class='inputbox-highlighted-false'
                        size='1'
                        maxlength='1'>  

                 </td>

                 <td align="center">

                    <input name='a_ot'
                        value=''
                        class='inputbox-highlighted-false'
                        size='1'
                        maxlength='1'>  

                 </td>


                 <td align="center">

                    <input name='a_ovlp'
                        value=''
                        class='inputbox-highlighted-false'
                        size='1'
                        maxlength='1'>  
Visual Basic.NET, Visual Basic Classic, .NET Programming


8/22/2022 - Mon
Michael Fowler

ASKER CERTIFIED SOLUTION
Ian

Ian

Hi there AlexPonnath,

If you can get regular expressions going under program control, then you would just need to iterate over the HTML page, applying the first pattern (name=('|")(\w*)\1\s*value=('|")(\w*)\3.*$), and select match 2 and match 4 from each result. Note that, depending on the routines, a match number 0 is usually returned as well, which is the whole matched text (in addition to match 1, match 2, match 3 and match 4).
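
To make that concrete, here is a minimal sketch in Python (not the asker's VB.NET; in .NET the same groups should be reachable through Match.Groups(2).Value, though that is not tested here). The function name extract_inputs and the shortened sample are made up for illustration. One assumption worth flagging: the MULTILINE flag is switched on so that the trailing $ can match at the end of each line rather than only at the end of the whole document. Also note that \w* only captures letters, digits and underscores, so a value containing spaces or punctuation would not match.

```python
import re

# The pattern from the comment above. re.MULTILINE lets the trailing $
# match at the end of every line, not just at the end of the document.
PATTERN = re.compile(r"""name=('|")(\w*)\1\s*value=('|")(\w*)\3.*$""", re.MULTILINE)


def extract_inputs(html):
    """Return (name, value) pairs: group 2 is the name, group 4 the value."""
    return [(m.group(2), m.group(4)) for m in PATTERN.finditer(html)]


# Shortened version of the sample page from the question.
sample = """<input name='a_glare'
                        value='B'
                        class='inputbox-highlighted-false'>
<input name='a_testani'
                        value=''
                        class='inputbox-highlighted-false'>
<input name='a_tksig'
                        value='EC'>"""

pairs = extract_inputs(sample)
for name, value in pairs:
    print(name, repr(value))
```

Empty values come back as empty strings, so every input still produces a pair.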

Ian
AlexPonnath

ASKER
Thanks, I have UltraEdit, which supports regular expressions, and it works as advertised. Great job, I ended up with
exactly what you said after running the 3 passes over the file. Now I just have to figure out how I can do this in my code.

I am a bit confused by your last comment about match 2 and 4. I assume match 1 is the whole pattern match (name=('|")(\w*)\1\s*value=('|")(\w*)\3.*$), but I am not sure about 2 and 4 or how I would access them.

Thanks
Ian

Sorry, I didn't explain the numbering scheme very well.

Each set of parens (from the opening paren up to its matching closing paren) is a kept match, numbered 1, 2, ...

The numbering follows the order of the left parens, so there is a way of uniquely numbering the matches even when they are nested.

So in

name=   ('|")   (\w*)   \1   \s*   value=   ('|")   (\w*)   \3   .*   $
-----   =====   =====   --   ---   ------   =====   =====   --   --   -
          1       2                           3       4

The bits underlined with === are kept in numbered sequence; the bits underlined with --- are not kept (except that the whole matched string is available as number 0).
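
As a quick check of the numbering rule, here is a small Python sketch (Python is only a stand-in here; the input strings are made up). It lists groups 0 to 4 for the pattern above, then uses a deliberately nested pattern to show that groups are numbered by the position of their left paren.

```python
import re

# Groups 0..4 for the pattern discussed above, on a one-line example.
m = re.search(r"""name=('|")(\w*)\1\s*value=('|")(\w*)\3""",
              "name='a_tksig' value='EC'")
groups = [m.group(i) for i in range(5)]
print(groups)

# Nested parens: numbering follows the order of the LEFT parens.
#   group 1 = ((a)(b(c)))  whole outer group
#   group 2 = (a)
#   group 3 = (b(c))
#   group 4 = (c)
nested = re.search(r"((a)(b(c)))", "abc")
print(nested.groups())
```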

For better doco, you will need to search the web.  There are loads of docs about regular expressions.  Just be warned that the complicated bits can vary between implementations.  All the basic stuff is pretty much the same the world over!
Ian

If running under program control, I would just do the match

/name=('|")(\w*)\1\s*value=('|")(\w*)\3.*$/

and retrieve sub-match 2 and sub-match 4, iterating over the whole source HTML document.

[[[ Often the program functions want the string enclosed in  /  and  / as I have done here.  Read the notes on the PCRE functions you will use to see what they want. ]]]


The replacement

\n#$2\t$4
and successive matches
^[^#].*$\R
and
\R\R

were there because with an editor you don't have other storage to put found stuff. Under program control you can just pick the matches off and store in an array or whatever.
Ian

I'm not sure what form the result from the PCRE routine you use would take.

Maybe it will return an array of strings.

X[0]  ->  string which matches the whole name= ..... value= ...'   bit
X[1] ->   single/double quote
X[2]  ->  <name>
X[3]  ->  single/double quote
X[4]  ->  <value>

So for each iterated match, just get the returned X, save X[2] and X[4], and throw away the rest.
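
A rough Python sketch of that loop, under the assumption that the routine hands back something like the X array above (here X is built by hand from the match object, and the names html and fields are hypothetical). Each name/value pair goes into a dictionary, which could then be written to the database.

```python
import re

PATTERN = re.compile(r"""name=('|")(\w*)\1\s*value=('|")(\w*)\3.*$""", re.MULTILINE)

# Made-up fragment mixing single- and double-quoted attributes.
html = """<input name='a_sacnon'
    value=''
    size='1'>
<input name="a_tksig"
    value="EC"
    size='2'>"""

fields = {}
for m in PATTERN.finditer(html):
    # X mirrors the layout described above:
    # X[0] whole match, X[1] quote, X[2] name, X[3] quote, X[4] value.
    X = [m.group(0), *m.groups()]
    fields[X[2]] = X[4]  # keep name and value, throw away the rest

print(fields)
```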
