Solved

Need Help With Regular Expression for Meta Tag Description

Posted on 2006-11-05
Last Modified: 2013-12-24
Hi guys-

I'm having a problem writing a regex to pull the Description Meta Tag out of the body of a CFHTTP request.

Here's what I'm working with:

#REFind("<meta.*content=(\"|\')(.*)(\"|\')\s.*>", cfhttp.fileContent)#

I'm thinking I need to escape the quote characters.  What am I missing?
Question by:SiriusPhil
7 Comments
 
LVL 7

Expert Comment

by:aseusainc
ID: 17878306
Give this a shot...

REReplaceNoCase(CFHTTP.FileContent, "<meta .*?>", "", "ALL")
 
LVL 7

Expert Comment

by:aseusainc
ID: 17878312
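Oops, sorry, you want to get it, not replace it. Pass the variable itself rather than a quoted string, otherwise you'd be searching the literal text "CFHTTP.FileContent":

REFind("<meta .*?>", CFHTTP.FileContent)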
 
LVL 20

Expert Comment

by:trailblazzyr55
ID: 17881201
You'd need to do something like this:

<cfset myFind = refindnocase('<meta.*content=\s?"(.*)"\s.*>',CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2])>

<cfoutput>#myContent#</cfoutput>
 
LVL 20

Accepted Solution

by:
trailblazzyr55 earned 2000 total points
ID: 17881290
In case the pages use either single or double quotes, you could do something like this:

<cfset myFind = refindnocase("<meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>",CFHTTP.FileContent,1,'yes')>
<cfset myContent = mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3])>

<cfoutput>#myContent#</cfoutput>
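
Since the original question was about the Description tag specifically, here is a minimal sketch of a narrower variant. It is not part of the accepted answer; the sample markup and the variable names (html, pattern, m, description) are made up for illustration. It assumes the name attribute comes before content inside the tag, uses a backreference (\1) so the closing quote has to match the opening one, and checks pos[1] first because REFindNoCase returns 0 there when nothing matches.

```cfm
<!--- Hypothetical sample markup standing in for CFHTTP.FileContent --->
<cfsavecontent variable="html">
<meta name="keywords" content="cf, regex">
<meta name="description" content='Widgets and more'>
</cfsavecontent>

<!--- A doubled "" inside a double-quoted CFML string is one literal double quote.
      Group 1 captures the opening quote, group 2 the description text,
      and \1 requires the same quote character to close the value. --->
<cfset pattern = "<meta[^>]*name=[""']description[""'][^>]*content=([""'])(.*?)\1">

<cfset m = reFindNoCase(pattern, html, 1, true)>
<cfset description = "">

<!--- pos[1] is 0 when there is no match, so test it before calling mid() --->
<cfif m.pos[1] GT 0>
    <!--- index 3 is the second set of ()'s, i.e. the text between the quotes --->
    <cfif m.len[3] GT 0>
        <cfset description = mid(html, m.pos[3], m.len[3])>
    </cfif>
</cfif>

<cfoutput>#htmlEditFormat(description)#</cfoutput>
```

The [^>]* pieces keep the match inside a single tag, and the lazy (.*?) stops at the first matching close quote instead of running on to a later one on the page.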
 
LVL 20

Expert Comment

by:trailblazzyr55
ID: 17881335
Your regex was very close. The problem is that the pattern has to contain both single and double quotes, so it needs to be split into pieces so CF doesn't get confused about where the string ends.

The regex is basically broken into parts that are joined with & to form one pattern:

"<meta.*content=\s?('|"      &      '")(.*)("|'       &      "')\s.*>" is broken into these sections:

"<meta.*content=\s?('|"    and     '")(.*)("|'     and     "')\s.*>"

Notice the quote usage: each piece is wrapped in the quote character it doesn't contain, so the quotes inside it are preserved.
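
For comparison, here is a small sketch of another way to get both quote characters into one string: in CFML, doubling the delimiter inside a string gives one literal quote. The variable names (concatenated, doubled) are just for illustration.

```cfm
<!--- Three pieces joined with &, each wrapped in the quote type it does not contain --->
<cfset concatenated = "<meta.*content=\s?('|" & '")(.*)("|' & "')\s.*>">

<!--- One double-quoted string; each "" stands for a single literal double quote --->
<cfset doubled = "<meta.*content=\s?('|"")(.*)(""|')\s.*>">

<cfoutput>
Pattern: #htmlEditFormat(concatenated)#<br>
<!--- compare() returns 0 when the two strings are identical --->
Same pattern: #yesNoFormat(compare(concatenated, doubled) EQ 0)#
</cfoutput>
```

Either form hands the same pattern to REFindNoCase, so it comes down to which one you find easier to read.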

hope that helps,
~trail
 

Author Comment

by:SiriusPhil
ID: 17881417
Thanks trailblazzyr55 -

That's the solution I was looking for.  Specifically, I need to handle sites that use single quotes as well as ones that use double quotes.

Thanks for the explanation.  I'm still learning RegEx.  It still looks like Greek to me though.

Phil
 
LVL 20

Expert Comment

by:trailblazzyr55
ID: 17881484
Cool, glad that helped. You may already be aware of this, but I thought I'd point it out anyway.

It has to do with how the ()'s in the regex line up with the [pos] and [len] indexes used in the mid function.

with: <meta.*content=\s?('|"&'")(.*)("|'&"')\s.*>
      set 1 = ('|"&'")     set 2 = (.*)     set 3 = ("|'&"')

Notice that this regex has three sections wrapped in rounded brackets ()'s.

When you want to grab what's found in one of those sections, remember that index 1 is the whole match, so each set of ()'s sits one index higher than its number:

1) mid(CFHTTP.FileContent,myFind.pos[1],myFind.len[1]) ---- returns everything that's found with the regex
2) mid(CFHTTP.FileContent,myFind.pos[2],myFind.len[2]) ---- returns what's found in the first set of ()'s
3) mid(CFHTTP.FileContent,myFind.pos[3],myFind.len[3]) ---- returns what's found in the second set of ()'s
4) mid(CFHTTP.FileContent,myFind.pos[4],myFind.len[4]) ---- returns what's found in the third set of ()'s

It's similar to backreferences: the ()'s make it much easier to grab just the portion you're after in what you're searching!
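
To see the index mapping above for yourself, here is a rough sketch that dumps every entry of the pos and len arrays. The sample tag and the variable names (sample, pattern, found, i) are invented for the example; swap in CFHTTP.FileContent for real use.

```cfm
<!--- Hypothetical one-tag sample so the output is easy to read --->
<cfset sample = '<meta name="description" content="Hello world" />'>

<!--- Same regex as in the accepted solution, written with doubled quotes --->
<cfset pattern = "<meta.*content=\s?('|"")(.*)(""|')\s.*>">

<cfset found = reFindNoCase(pattern, sample, 1, true)>

<cfif found.pos[1] GT 0>
    <cfoutput>
    <!--- index 1 is the whole match; index i is the (i - 1)th set of ()'s --->
    <cfloop from="1" to="#arrayLen(found.pos)#" index="i">
        [#i#] #htmlEditFormat(mid(sample, found.pos[i], found.len[i]))#<br>
    </cfloop>
    </cfoutput>
<cfelse>
    No match.
</cfif>
```

The pos[1] check matters because a failed search returns a single 0 in each array, and mid() with a start of 0 would throw.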

Anyway, glad I could help, good luck! I love regex's so I'm always up for questions on them ;o)
Thanks,
~trail