Need Help With Regular Expression for Meta Tag Description

Posted on 2006-11-05
Medium Priority
Last Modified: 2013-12-24
Hi guys-

I'm having a problem writing a regex to pull the Description Meta Tag out of the body of a CFHTTP request.

I'm working with..

#REFind("<meta.*content=(\"|\')(.*)(\"|\')\s.*>", cfhttp.fileContent)#

I'm thinking I need to escape the quote characters.  What am I missing?
Question by:SiriusPhil
  • 4
  • 2

Expert Comment

ID: 17878306
Give this a shot...

REReplaceNoCase(CFHTTP.FileContent, "<meta .*?>", "", "ALL")

Expert Comment

ID: 17878312
Oops sorry you want to get it, not replace it....

REFind("<meta .*?>", "CFHTTP.FileContent")
LVL 20

Expert Comment

ID: 17881201
you'd need to do something like so....

<cfset myFind = refindnocase('<meta.*content=\s?"(.*)"\s.*>',CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2])>

Get 10% Off Your First Squarespace Website

Ready to showcase your work, publish content or promote your business online? With Squarespace’s award-winning templates and 24/7 customer service, getting started is simple. Head to Squarespace.com and use offer code ‘EXPERTS’ to get 10% off your first purchase.

LVL 20

Accepted Solution

trailblazzyr55 earned 2000 total points
ID: 17881290
in the case you may be using single or double quotes, you could do something like so...

<cfset myFind = refindnocase("<meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>",CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3])>

LVL 20

Expert Comment

ID: 17881335
your regex was very close, however due to the checking for single and double quotes, you needed to break up the regex into portions so CF didn't get the quotes situation confused...

This regex is basically broke into parts and put together to form one regex...

"<meta.*content=\s?('|"      &      '")(.*)("|'       &      "')\s.*>" is broken into sections....

"<meta.*content=\s?('|"    and     '")(.*)("|'     and     "')\s.*>"

notice the quote usage, this helps to preserve the quotes in this situation...

hope that helps,

Author Comment

ID: 17881417
Thanks trailblazzyr55 -

Thats the solution I was looking for.  Specifically, I need to address sites that use single quotes and ones that use double quotes..

Thanks for the explanation.  I'm still learning RegEx.  Still looks greek to me though.

LVL 20

Expert Comment

ID: 17881484
cool, glad that was able to help... also you may already be aware of this, but I thought I'd point it out...

it has to do with references to the regex and the [pos] and [len] used in the mid function....

with: <meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>
                                          1      2      3

notice in this regex there's three sections that use the rounded brackets ()'s

when you want to grab what's found in one of those sections

1) mid(CFHTTP.FileContent,myFind.pos[1],myFind.len[1]) ---- returns everything that's found with the regex
2) mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2]) ---- returns what's found in the first set of ()'s
3) mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3]) ---- returns what's found in the second set of ()'s
4) mid(CFHTTP.FileContent,myFind.pos[4],myFind.len[4]) ---- returns what's found in the third set of ()'s

similar to backreferences and the ()'s ... this greatly helps grab portions in what you're searching!

Anyway, glad I could help, good luck! I love regex's so I'm always up for questions on them ;o)

Featured Post

The 14th Annual Expert Award Winners

The results are in! Meet the top members of our 2017 Expert Awards. Congratulations to all who qualified!

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Meet the world's only “Transparent Cloud™” from Superb Internet Corporation. Now, you can experience firsthand a cloud platform that consistently outperforms Amazon Web Services (AWS), IBM’s Softlayer, and Microsoft’s Azure when it comes to CPU and …
This installment of Make It Better gives Media Temple customers the latest news, plugins, and tutorials to make their VPS hosting experience that much smoother.
The video provides a quick and easy steps to migrate MBOX file to well known Outlook PST and Office 365. Besides this, it also supports and migrates more than 20 email clients of MBOX which include AppleMail, Opera, Thunderbird and SeaMonkey effortl…
In the video, one can understand the process of resizing images in single or bulk. Kernel Bulk Image Resizer is an easy to use tool for resizing large number of images. One can add and resize multiple images with this tool in single go. The video sh…
Suggested Courses

624 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question