Regex To Return A Link From Some Embedded Video Code

I use CKEditor for articles that are posted on our website.

With IE 7 and 8, hyperlinks are automatically added to any URLs in the text. The problem comes when you embed video code and the URLs inside it get auto-linked too. The resulting code is a mess, e.g.:

<object width="480" height="300"><param name="movie" value="<a href="http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1"></cke:param><cke:param">http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1"></param><param</a> name="allowFullScreen" value="true"></param><param name="allowScriptAccess" value="always"></param><embed src="<a href="http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1">http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1</a>" type="application/x-shockwave-flash" allowfullscreen="true" allowScriptAccess="always" width="480" height="300"></embed></object>

What I'd like to do is have a RegEx that will strip all of the code within an OBJECT OR EMBED tag and just return the link. So the regex would take the messy code above and return:

http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1

I can write a function to do this, but I wanted to see if there was a quick way to do it with a RegEx.
maniadiggity Asked:

_agx_ Commented:
Maybe something like this
<cfset result = reFindNoCase('<a href="(.*?)">', messyText, 1, true)>
<!--- with no match, pos and len hold a single 0 entry, so check for a second element --->
<cfif arrayLen(result.len) gt 1>
      <cfset theURL = mid(messyText, result.pos[2], result.len[2])>
      <!--- display results --->
      <cfoutput>#HTMLEditFormat(theURL)#</cfoutput>
</cfif>
Terry Woods (IT Guru) Commented:
Use a pattern like this to capture an href from an object tag:
<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"(?:(?!</object>).)*</object>
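
For anyone who wants to see what each piece of this pattern does, here is a small sketch that runs the same expression through Python's `re` module. Python is only a stand-in for trying the regex out (the thread itself targets ColdFusion), and `messy` is a shortened, hypothetical version of the markup from the question. `re.DOTALL` is set because Python's `.` does not match newlines by default.

```python
import re

# The pattern from the post above, split into commented pieces with re.VERBOSE.
OBJECT_HREF = re.compile(
    r'''
    <object[^>]*>            # opening object tag with any attributes
    (?:(?!</object>).)*      # any text, never running past the closing object tag
    <a[^>]*href\s*=\s*"      # start of a link with an href attribute
    ([^"]*)                  # group 1: the URL, up to the next double quote
    "
    (?:(?!</object>).)*      # the rest of the object body
    </object>                # closing object tag
    ''',
    re.IGNORECASE | re.DOTALL | re.VERBOSE,
)

# Shortened stand-in for the auto-linked markup (assumption, not the full sample).
messy = (
    '<object width="480" height="300"><param name="movie" '
    'value="<a href="http://www.youtube.com/v/pCRpFNgiTtE&fs=1">'
    'http://www.youtube.com/v/pCRpFNgiTtE&fs=1"></param></a>'
    '<embed src="<a href="http://www.youtube.com/v/pCRpFNgiTtE&fs=1">'
    'http://www.youtube.com/v/pCRpFNgiTtE&fs=1</a>" width="480"></embed></object>'
)

match = OBJECT_HREF.search(messy)
url = match.group(1) if match else None
print(url)
```

Because the middle section is greedy, the captured href is the last one inside the object. In this sample every href inside the object is the same URL, so that makes no difference.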
_agx_ Commented:
This one's a bit better than the one in my first post:
<cfset result = reFindNoCase('<a href="([^"]+)', messyText, 1, true)>
<cfif arrayLen(result.len) gt 1>
      <cfset theURL = mid(messyText, result.pos[2], result.len[2])>
      <cfoutput>#HTMLEditFormat(theURL)#</cfoutput>
</cfif>
_agx_ Commented:
I didn't see TerryAtOpus's suggestion. I didn't test it, but if your text could contain other links, not inside the object tag, theirs might be more accurate.
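
To make that difference concrete, here is a rough comparison of the two patterns in Python's `re` (a stand-in for ColdFusion, used only so the sketch is easy to run). The `text` value, including the example.com link, is a made-up sample rather than anything from the question.

```python
import re

# The simple pattern: grabs the first href anywhere in the text.
SIMPLE = re.compile(r'<a href="([^"]+)', re.IGNORECASE)

# The object-scoped pattern: only matches an href inside <object>...</object>.
SCOPED = re.compile(
    r'<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"'
    r'(?:(?!</object>).)*</object>',
    re.IGNORECASE | re.DOTALL,
)

video = "http://www.youtube.com/v/pCRpFNgiTtE&fs=1"
# Hypothetical article text: an ordinary link first, then a shortened messy embed.
text = (
    '<p>See <a href="http://example.com/story">the story</a>.</p>'
    f'<object width="480"><embed src="<a href="{video}">{video}</a>"></embed></object>'
)

first_link = SIMPLE.search(text).group(1)
object_link = SCOPED.search(text).group(1)
print(first_link)   # the ordinary link, which is not the video
print(object_link)  # the link found inside the object tag
```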
maniadiggity (Author) Commented:
That's sort of what I was looking for, but I should have posted the whole code. For instance, the following HTML code:
---------------------------------
Check out the video below:

<object width="480" height="300"><param name="movie" value="<a href="http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1"></cke:param><cke:param">http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1"></param><param</a>  name="allowFullScreen" value="true"></param><param name="allowScriptAccess" value="always"></param><embed src="<a href="http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1">http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1</a>" type="application/x-shockwave-flash" allowfullscreen="true" allowScriptAccess="always" width="480" height="300"></embed></object>

There is a followup here:

<object width="480" height="300"><param name="movie" value="<a href="http://www.youtube.com/v/9rdVumYBujg&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1"></cke:param><cke:param">http://www.youtube.com/v/9rdVumYBujg&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1"></param><param</a>  name="allowFullScreen" value="true"></param><param name="allowScriptAccess" value="always"></param><embed src="<a href="http://www.youtube.com/v/9rdVumYBujg&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1">http://www.youtube.com/v/9rdVumYBujg&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1</a>" type="application/x-shockwave-flash" allowfullscreen="true" allowScriptAccess="always" width="480" height="300"></embed></object>

We will have more on this story later.
---------------------------------

should return....

---------------------------------
Check out the video below:

http://www.youtube.com/v/pCRpFNgiTtE&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1

There is a followup here:
http://www.youtube.com/v/9rdVumYBujg&color1=0xb1b1b1&color2=0xcfcfcf&hl=en_US&feature=player_embedded&fs=1

We will have more on this story later.
---------------------------------
Terry Woods (IT Guru) Commented:
Yes, my regex pattern will only pick up links in an object tag.
_agx_ Commented:
You mean you want to replace the <object> stuff with just the URL? Then try using @TerryAtOpus's expression with reReplaceNoCase():

<cfset result = reReplaceNoCase(messyText, '<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"(?:(?!</object>).)*</object>', "\1", "all")>
<cfoutput>#result#</cfoutput>

Or as a link...

<cfset result = reReplaceNocase(messyText, '<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"(?:(?!</object>).)*</object>',
            '<a href="\1">\1</a>', "all")>
<cfoutput>#result#</cfoutput>
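
For readers who want to try the replacement outside ColdFusion, here is a minimal sketch of the same idea in Python's `re`. The helper names `replace_objects` and `messy_object` are made up for illustration, and the markup is a shortened stand-in for the sample posted earlier.

```python
import re

# Same expression as above; DOTALL lets "." cross newlines in Python.
OBJECT_RE = re.compile(
    r'<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"'
    r'(?:(?!</object>).)*</object>',
    re.IGNORECASE | re.DOTALL,
)


def replace_objects(text, as_link=False):
    """Swap every <object>...</object> block for its captured URL (hypothetical helper)."""
    replacement = r'<a href="\1">\1</a>' if as_link else r"\1"
    # re.sub replaces every non-overlapping match, like scope "all" in the CF call.
    return OBJECT_RE.sub(replacement, text)


def messy_object(video_id):
    """Build a shortened stand-in for the auto-linked markup in the question."""
    url = f"http://www.youtube.com/v/{video_id}&fs=1"
    return (
        f'<object width="480" height="300"><param name="movie" value="<a href="{url}">'
        f'{url}"></param></a><embed src="<a href="{url}">{url}</a>"></embed></object>'
    )


article = (
    "Check out the video below:\n\n" + messy_object("pCRpFNgiTtE")
    + "\n\nThere is a followup here:\n\n" + messy_object("9rdVumYBujg")
    + "\n\nWe will have more on this story later."
)

print(replace_objects(article))
print(replace_objects(article, as_link=True))
```

The `(?!</object>)` check is what keeps each match inside its own object block, so the two videos are replaced separately rather than being swallowed as one long match.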
Terry Woods (IT Guru) Commented:
Replace:

<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"(?:(?!</object>).)*</object>

With:

$1

to get the result you want. (It's the same pattern as my previous post)
_agx_ Commented:
> With: $1

Yep. CF uses "\1" rather than "$1", though.
maniadiggity (Author) Commented:
TerryAtOpus, I'm not following. Are you saying to replace the entire string:
<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"(?:(?!</object>).)*</object>

with just: $1     ?

Also, agx, this seems to be working. How can it be altered to work for OBJECT or EMBED tags?
Terry Woods (IT Guru) Commented:
Yes, that's right - agx has put my pattern into CF code for you already.
Terry Woods (IT Guru) Commented:
To pick up embed tags in a similar way, it would pay to run 2 replacements to keep things a little simpler. Something like this might work (agx may be able to correct the CF code if it doesn't - I'm not a CF programmer):

<!--- First pass: replace each object block with its link --->
<cfset result = reReplaceNoCase(messyText, '<object[^>]*>(?:(?!</object>).)*<a[^>]*href\s*=\s*"([^"]*)"(?:(?!</object>).)*</object>', "\1", "all")>
<cfoutput>#result#</cfoutput>
<!--- Second pass: run the same kind of replacement on any remaining embed blocks --->
<cfset result2 = reReplaceNoCase(result, '<embed[^>]*>(?:(?!</embed>).)*<a[^>]*href\s*=\s*"([^"]*)"(?:(?!</embed>).)*</embed>', "\1", "all")>
<cfoutput>#result2#</cfoutput>
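
One thing that may be worth checking before relying on the second replacement: in the markup from the question, the auto-added link sits inside the embed tag's own src attribute. The sketch below uses Python's `re` as a stand-in, so ColdFusion behavior is not verified here. In it, `<embed[^>]*>` consumes that only `<a href`, and the embed pattern as posted finds nothing in an embed that is not wrapped in an object. `EMBED_VARIANT` is a hypothetical alternative, not something from this thread, and `lone_embed` is a shortened made-up sample.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Embed pattern as posted in the thread.
EMBED_POSTED = re.compile(
    r'<embed[^>]*>(?:(?!</embed>).)*<a[^>]*href\s*=\s*"([^"]*)"'
    r'(?:(?!</embed>).)*</embed>',
    FLAGS,
)

# Hypothetical variant (not from the thread): scan lazily from "<embed" to the
# first <a href, so a link injected into the src attribute is still found.
EMBED_VARIANT = re.compile(
    r'<embed\b(?:(?!</embed>).)*?<a[^>]*href\s*=\s*"([^"]*)"'
    r'(?:(?!</embed>).)*</embed>',
    FLAGS,
)

url = "http://www.youtube.com/v/pCRpFNgiTtE&fs=1"
# Shortened stand-in for an auto-linked <embed> that is NOT inside an <object>.
lone_embed = f'<embed src="<a href="{url}">{url}</a>" width="480"></embed>'

posted_result = EMBED_POSTED.sub(r"\1", lone_embed)
variant_result = EMBED_VARIANT.sub(r"\1", lone_embed)

print(posted_result == lone_embed)  # True here: the posted pattern found no match
print(variant_result)
```

When every embed is nested in an object, as in the samples in this thread, the first replacement has already removed it, so this case never comes up.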
maniadiggity (Author) Commented:
Great, thanks guys, this worked perfectly!
Terry Woods (IT Guru) Commented:
You should give some points to agx too! You can get the question re-opened with the "request attention" link on the question near the top of the page.
maniadiggity (Author) Commented:
I'm new to this site, how do I give points to both of you after it's reopened?
Terry Woods (IT Guru) Commented:
I think there's an option "Accept Multiple Answers", which then lets you assign points to each answer based on how helpful you find it.
_agx_ Commented:
Thanks guys.  TerryAtOpus should get most of the points because it was their regex ;-)