Need Help With Regular Expression for Meta Tag Description

Hi guys-

I'm having a problem writing a regex to pull the Description Meta Tag out of the body of a CFHTTP request.

I'm working with..

#REFind("<meta.*content=(\"|\')(.*)(\"|\')\s.*>", cfhttp.fileContent)#

I'm thinking I need to escape the quote characters.  What am I missing?
Who is Participating?
trailblazzyr55Connect With a Mentor Commented:
in the case you may be using single or double quotes, you could do something like so...

<cfset myFind = refindnocase("<meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>",CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3])>

Give this a shot...

REReplaceNoCase(CFHTTP.FileContent, "<meta .*?>", "", "ALL")
Oops sorry you want to get it, not replace it....

REFind("<meta .*?>", "CFHTTP.FileContent")
Cloud Class® Course: CompTIA Healthcare IT Tech

This course will help prep you to earn the CompTIA Healthcare IT Technician certification showing that you have the knowledge and skills needed to succeed in installing, managing, and troubleshooting IT systems in medical and clinical settings.

you'd need to do something like so....

<cfset myFind = refindnocase('<meta.*content=\s?"(.*)"\s.*>',CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2])>

your regex was very close, however due to the checking for single and double quotes, you needed to break up the regex into portions so CF didn't get the quotes situation confused...

This regex is basically broke into parts and put together to form one regex...

"<meta.*content=\s?('|"      &      '")(.*)("|'       &      "')\s.*>" is broken into sections....

"<meta.*content=\s?('|"    and     '")(.*)("|'     and     "')\s.*>"

notice the quote usage, this helps to preserve the quotes in this situation...

hope that helps,
SiriusPhilAuthor Commented:
Thanks trailblazzyr55 -

Thats the solution I was looking for.  Specifically, I need to address sites that use single quotes and ones that use double quotes..

Thanks for the explanation.  I'm still learning RegEx.  Still looks greek to me though.

cool, glad that was able to help... also you may already be aware of this, but I thought I'd point it out...

it has to do with references to the regex and the [pos] and [len] used in the mid function....

with: <meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>
                                          1      2      3

notice in this regex there's three sections that use the rounded brackets ()'s

when you want to grab what's found in one of those sections

1) mid(CFHTTP.FileContent,myFind.pos[1],myFind.len[1]) ---- returns everything that's found with the regex
2) mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2]) ---- returns what's found in the first set of ()'s
3) mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3]) ---- returns what's found in the second set of ()'s
4) mid(CFHTTP.FileContent,myFind.pos[4],myFind.len[4]) ---- returns what's found in the third set of ()'s

similar to backreferences and the ()'s ... this greatly helps grab portions in what you're searching!

Anyway, glad I could help, good luck! I love regex's so I'm always up for questions on them ;o)
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.