Solved

Need Help With Regular Expression for Meta Tag Description

Posted on 2006-11-05
Last Modified: 2013-12-24
Hi guys-

I'm having a problem writing a regex to pull the Description Meta Tag out of the body of a CFHTTP request.

I'm working with..

#REFind("<meta.*content=(\"|\')(.*)(\"|\')\s.*>", cfhttp.fileContent)#

I'm thinking I need to escape the quote characters.  What am I missing?
Question by:SiriusPhil

Expert Comment

by:aseusainc
ID: 17878306
Give this a shot...

REReplaceNoCase(CFHTTP.FileContent, "<meta .*?>", "", "ALL")

Expert Comment

by:aseusainc
ID: 17878312
Oops sorry you want to get it, not replace it....

REFind("<meta .*?>", CFHTTP.FileContent)

Expert Comment

by:trailblazzyr55
ID: 17881201
you'd need to do something like so....

<cfset myFind = refindnocase('<meta.*content=\s?"(.*)"\s.*>',CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2])>

<cfoutput>#myContent#</cfoutput>

Accepted Solution

by:
trailblazzyr55 earned 500 total points
ID: 17881290
In case the page may use either single or double quotes, you could do something like this...

<cfset myFind = refindnocase("<meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>",CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3])>

<cfoutput>#myContent#</cfoutput>
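
The pattern above matches any meta tag that has a content attribute, and its greedy .* can run across several tags. If only the Description tag is wanted, a narrower pattern may help. The following is a rough sketch, not a tested drop-in: it assumes name="description" appears before content= inside the tag, it stops at the first quote character of either kind (so a value containing an apostrophe would be cut short), and descPattern, descFind and myDescription are illustrative names that do not come from this thread.

```cfm
<!--- Sketch only. Inside a double-quoted CFML string, "" stands for one literal ". --->
<cfset descPattern = "<meta[^>]*name=\s?[""']description[""'][^>]*content=\s?[""']([^""']*)[""']">
<cfset descFind = REFindNoCase(descPattern, CFHTTP.FileContent, 1, true)>

<!--- Default to an empty string so the output below is safe when nothing matches. --->
<cfset myDescription = "">
<cfif descFind.pos[1] GT 0>
    <!--- pos[2]/len[2] describe the only parenthesized group: the content value. --->
    <cfif descFind.len[2] GT 0>
        <cfset myDescription = Mid(CFHTTP.FileContent, descFind.pos[2], descFind.len[2])>
    </cfif>
</cfif>

<cfoutput>#HTMLEditFormat(myDescription)#</cfoutput>
```

Because [^>]* cannot cross a closing angle bracket, the match stays inside a single tag.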

Expert Comment

by:trailblazzyr55
ID: 17881335
Your regex was very close. The problem is the quoting, not the pattern: CFML string literals don't treat a backslash as an escape character, so \" ends the string early instead of adding a literal quote. Because the pattern has to contain both single and double quotes, I broke it into pieces so CF doesn't get the quotes confused.

The regex is split into three string literals that are joined with & to form one pattern:

"<meta.*content=\s?('|"      &      '")(.*)("|'       &      "')\s.*>"

That is, these three pieces:

"<meta.*content=\s?('|"    and     '")(.*)("|'     and     "')\s.*>"

Notice the quote usage: each piece is wrapped in the quote character it doesn't contain, which preserves the literal quotes inside it.
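
As an alternative to concatenating pieces, CFML also lets you double a quote character inside a string delimited by that same character. A minimal sketch of the same pattern written as a single literal follows; metaPattern is an illustrative variable name, not something from the original code.

```cfm
<!--- Sketch: the same pattern as one double-quoted string literal.
      Writing "" inside a double-quoted CFML string produces a single literal ".
      Single quotes need no special treatment inside a double-quoted string. --->
<cfset metaPattern = "<meta.*content=\s?('|"")(.*)(""|')\s.*>">
<cfset myFind = REFindNoCase(metaPattern, CFHTTP.FileContent, 1, true)>
```

Either form hands the regex engine the same text, <meta.*content=\s?('|")(.*)("|')\s.*>, so the choice is mostly about which one is easier to read.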

hope that helps,
~trail

Author Comment

by:SiriusPhil
ID: 17881417
Thanks trailblazzyr55 -

That's the solution I was looking for.  Specifically, I need to handle sites that use single quotes as well as ones that use double quotes.

Thanks for the explanation.  I'm still learning RegEx.  It still looks Greek to me, though.

Phil

Expert Comment

by:trailblazzyr55
ID: 17881484
cool, glad that was able to help... also you may already be aware of this, but I thought I'd point it out...

It has to do with how the pos and len arrays used in the mid function map to the parts of the regex.

with: <meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>
                           1     2     3

Notice that this regex has three sections wrapped in parentheses.

When you want to grab what was found in one of those sections:

1) mid(CFHTTP.FileContent,myFind.pos[1],myFind.len[1]) ---- returns everything matched by the whole regex
2) mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2]) ---- returns what's found in the first set of ()'s (the opening quote)
3) mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3]) ---- returns what's found in the second set of ()'s (the content value)
4) mid(CFHTTP.FileContent,myFind.pos[4],myFind.len[4]) ---- returns what's found in the third set of ()'s (the closing quote)

This is similar to backreferences and their ()'s, and it makes it much easier to grab portions of what you're searching!
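
One thing worth guarding against: when nothing matches, REFindNoCase (as far as I know from the ColdFusion documentation) returns pos and len arrays that hold a single element with the value 0, so reading pos[3] directly would fail. The sketch below reuses the pattern from the accepted answer and checks for a match first. The empty-string fallback is an assumption about what you'd want in that case.

```cfm
<!--- Sketch only: check for a match before using the sub-expression positions. --->
<cfset myFind = REFindNoCase("<meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>", CFHTTP.FileContent, 1, true)>

<!--- Fall back to an empty string when there is no match or the value is empty. --->
<cfset myContent = "">
<cfif myFind.pos[1] GT 0>
    <!--- Index 1 is the whole match; 2 the opening quote; 3 the content; 4 the closing quote. --->
    <cfif myFind.len[3] GT 0>
        <cfset myContent = Mid(CFHTTP.FileContent, myFind.pos[3], myFind.len[3])>
    </cfif>
</cfif>

<cfoutput>#myContent#</cfoutput>
```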

Anyway, glad I could help, good luck! I love regex's so I'm always up for questions on them ;o)
Thanks,
~trail