Need help with regular expression

I'm trying to extract information from this HTML fragment:

<ul>
    <li>
        <div class="col1"><a href="">http://filejungle.com/f/JHYjbd/RandomFileName.rar</a></div>
        <div class="col2">RandomFileName.rar</div>
        <div class="col3">277.52 MB</div>
        <div class="col4"><span class="icon approved">&nbsp;</span><span class="left">Available</span></div>
    </li>
</ul>

What I need is:

File name:
File size:
Available / Not Available.

How could this be done? Please explain your answer.
Asked by rotem156
käµfm³d 👽 commented:
Based on your example, you should be able to use something like:

<div\s+class="col2">(?<filename>[^<]+)</div>\s*<div\s+class="col3">(?<filesize>[^<]+)</div>\s*<div\s+class="col4"><span[^>]*>[^<]*</span><span[^>]*>(?<availability>[^<]+)

where each datum that you are seeking is in its own named capture group. You can see the names inside the angle brackets above (i.e. filename, filesize, and availability). You would access these groups in this manner:

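// Requires: using System.Text.RegularExpressions;
Match m = Regex.Match(input, pattern);
if (m.Success)  // the named groups are only meaningful when the pattern matched
{
    string filename = m.Groups["filename"].Value;
    string filesize = m.Groups["filesize"].Value;
    string availability = m.Groups["availability"].Value;
}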

If you need an explanation of the pattern, I am glad to accommodate.
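
In the meantime, here is a minimal sketch that puts the pieces together and annotates the pattern. It is the same pattern as above, only split across lines with RegexOptions.IgnorePatternWhitespace so that each part can carry a comment. The names FileListParser, Parse, and SampleHtml are made up for this illustration, and the sample input is simply the <li> from the question. Regex.Matches is used instead of Regex.Match on the assumption that the real page may contain more than one <li>.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FileListParser  // hypothetical helper, not part of any library
{
    // With IgnorePatternWhitespace, layout whitespace in the pattern is ignored
    // and everything after # on a line is a comment.
    const string Pattern = @"
        <div\s+class=""col2"">(?<filename>[^<]+)</div>  # file name: everything up to the next <
        \s*                                             # line breaks and indentation between divs
        <div\s+class=""col3"">(?<filesize>[^<]+)</div>  # file size, e.g. 277.52 MB
        \s*
        <div\s+class=""col4"">
        <span[^>]*>[^<]*</span>                         # skip the icon span and its &nbsp;
        <span[^>]*>(?<availability>[^<]+)               # status text, e.g. Available
    ";

    // Sample input: the <li> from the question.
    public const string SampleHtml = @"<li>
        <div class=""col1""><a href="""">http://filejungle.com/f/JHYjbd/RandomFileName.rar</a></div>
        <div class=""col2"">RandomFileName.rar</div>
        <div class=""col3"">277.52 MB</div>
        <div class=""col4""><span class=""icon approved"">&nbsp;</span><span class=""left"">Available</span></div>
    </li>";

    // Returns one (name, size, status) tuple per matching <li>.
    public static List<(string Name, string Size, string Status)> Parse(string html)
    {
        var rows = new List<(string Name, string Size, string Status)>();
        foreach (Match m in Regex.Matches(html, Pattern, RegexOptions.IgnorePatternWhitespace))
        {
            rows.Add((m.Groups["filename"].Value.Trim(),
                      m.Groups["filesize"].Value.Trim(),
                      m.Groups["availability"].Value.Trim()));
        }
        return rows;
    }

    public static void Main()
    {
        foreach (var row in Parse(SampleHtml))
        {
            Console.WriteLine("File name: " + row.Name);
            Console.WriteLine("File size: " + row.Size);
            Console.WriteLine("Status: " + row.Status);
        }
    }
}
```

Keep in mind that the pattern is tied to the exact markup in your example: if the class names, the order of the divs, or the number of spans in col4 change, it will stop matching.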
Question has a verified solution.
