jitz
regEx not parsing correctly
I'm trying to write my own BBCode to HTML converter in VB.Net 2008 using some code I found as a base. So far everything is working well, but for some reason I can't get RegEx to replace the "Quote" with a "blockquote".
Can someone take a look at this code and see if you can tell me what I'm doing wrong? This is driving me nuts!
The BBCode looks something like:
[QUOTE=SomeDude]Some example text goes here[/QUOTE]
I used almost the exact same code for the Font Size and Color and it works fine.
I sure hope someone can help! Thanks!
Public Function BBtoHTML(BBCode as string) as String
Dim regExp As System.Text.RegularExpressions.Regex
Dim Ret as string = BBCode
regExp = New Regex("\[QUOTE=([^\]]+)\]([^\]]+)\[\/QUOTE\]")
Ret = regExp.Replace(Ret, "<blockquote style=""background-color: #CCCCCC; border-width: thin"">" & "Originally Posted by <strong>$1</strong><br /><em>$2</em></blockquote>")
Return Ret
End Function
Use nongreedy operator ? and [^\[] instead of [^\]] in the second bracket
\[QUOTE=([^\]]+?)\]([^\[+?)\[\/QUOTE\]
Sorry forgot a bracket
\[QUOTE=([^\]]+?)\]([^\[]+?)\[\/QUOTE\]
Adding ? to the pattern would suggest that [^\]]+ or [^\[]+ might overmatch or otherwise benefit from being lazy, and I don't see that being the case here.
The actual issue with the original pattern was that the second ([^\]]+) needed to be ([^\[]+)
Here's the working code with my pattern:
Imports System.Text.RegularExpressions
Module Module1
Sub Main()
Console.WriteLine(BBtoHTML("[QUOTE=SomeDude]Some example text goes here[/QUOTE]"))
End Sub
Public Function BBtoHTML(ByVal BBCode As String) As String
Dim regExp As System.Text.RegularExpressions.Regex
regExp = New Regex("\[QUOTE=([^\]]+)\]([^\[]+)\[\/QUOTE\]")
Return regExp.Replace(BBCode, "<blockquote style=""background-color: #CCCCCC; border-width: thin"">Originally Posted by <strong>$1</strong><br /><em>$2</em></blockquote>")
End Function
End Module
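To illustrate the point about the ? quantifier, here is a minimal sketch (the module and variable names are made up for illustration) that runs the greedy and the lazy versions of the pattern against the sample string and compares what each group captures. Because each negated character class cannot cross the bracket that follows it, both versions should capture the same text for this input.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuantifierCheck
    Sub Main()
        Dim sample As String = "[QUOTE=SomeDude]Some example text goes here[/QUOTE]"

        ' Greedy version (as in the working code above) and lazy version (with ?).
        Dim greedyRx As New Regex("\[QUOTE=([^\]]+)\]([^\[]+)\[\/QUOTE\]")
        Dim lazyRx As New Regex("\[QUOTE=([^\]]+?)\]([^\[]+?)\[\/QUOTE\]")

        Dim g As Match = greedyRx.Match(sample)
        Dim l As Match = lazyRx.Match(sample)

        ' Compare the captured author (group 1) and quoted text (group 2).
        Console.WriteLine(g.Groups(1).Value = l.Groups(1).Value) ' prints True
        Console.WriteLine(g.Groups(2).Value = l.Groups(2).Value) ' prints True
    End Sub
End Module
```

This only demonstrates the sample input from the question; other inputs may still be worth testing.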
ASKER
Sorry, I was sent out of town for some work. I'll try these suggestions and get back as soon as I can.
Thanks for all the help guys!
ASKER
I tried the example and it does work, but only if another quote isn't embedded within a quote. Does that make sense?
I've attached my test BBCode and my basic function for the conversion. There are still a few things in the BBCode that I haven't written any code for, but I'm still including it just in case. Like I said earlier, I found some of this code on the web and have been trying to add to it. Also, is there any way to use the regex without regard for case?
Here's my test BBCode (kept in an Access database, all one string):
[B]The[/B] [FONT="Comic Sans MS"][SIZE="1"]quick[/SIZE] [/FONT] [U]brown[/U] [I]fox [COLOR="Red"]jumped[/COLOR] [COLOR="DarkOrchid"]over[/COLOR] the lazy dogs back![/i]
[QUOTE=SomeDude]This is a quote![QUOTE=AnotherDude]This is a quote within a quote[QUOTE=ThirdDude]This is a quote within a quote within a quote[/QUOTE][/QUOTE][/QUOTE]
[IMG]http://upload.wikimedia.org/wikipedia/en/thumb/2/24/Lenna.png/200px-Lenna.png[/IMG]
[URL="http://www.microsoft.com"]Test Link[/URL]
[LIST=1]
[*]Item 1
[*]Item 2
[*]Item 3
[/LIST]
[LIST]
[*]Item 1
[*]Item 2
[*]Item 3
[/LIST]
[INDENT]Indented text[/INDENT]
Thanks!
Public Function ConvertBBCodeToHTML(ByVal BBCode As String) As String
Dim regExp As Regex
Dim Ret As String = BBCode
'//Regex for URL tag without anchor
regExp = New Regex("\[URL\]([^\]]+)\[\/URL\]")
Ret = regExp.Replace(Ret, "<a href=""$1"">$1</a>")
'//Regex for URL with anchor
regExp = New Regex("\[URL=([^\]]+)\]([^\]]+)\[\/URL\]")
Ret = regExp.Replace(Ret, "<a href=""$1"">$2</a>")
'//Image regex
regExp = New Regex("\[IMG\]([^\]]+)\[\/IMG\]")
Ret = regExp.Replace(Ret, "<img src=""$1"" />")
'//Bold text
regExp = New Regex("\[B\](.+?)\[\/B\]")
Ret = regExp.Replace(Ret, "<b>$1</b>")
'//Italic text
regExp = New Regex("\[I\](.+?)\[\/I\]")
Ret = regExp.Replace(Ret, "<i>$1</i>")
'//Underline text
regExp = New Regex("\[U\](.+?)\[\/U\]")
Ret = regExp.Replace(Ret, "<u>$1</u>")
'//Font size
regExp = New Regex("\[SIZE=([^\]]+)\]([^\]]+)\[\/SIZE\]")
Ret = regExp.Replace(Ret, "<font size=$1>$2</font>")
'//Font name
regExp = New Regex("\[FONT=([^\]]+)\]([^\]]+)\[\/FONT\]")
Ret = regExp.Replace(Ret, "<font face=$1>$2</font>")
'//Font color
regExp = New Regex("\[COLOR=([^\]]+)\]([^\]]+)\[\/COLOR\]")
Ret = regExp.Replace(Ret, "<font color=$1>$2</font>")
'//Quote
regExp = New Regex("\[QUOTE=([^\]]+)\]([^\[]+)\[\/QUOTE\]")
Ret = regExp.Replace(Ret, "<blockquote style=""background-color: #E4E4E4; border-width: thin"">Originally Posted by <strong>$1</strong><br /><em>$2</em></blockquote>")
Ret = Ret.Replace(vbNewLine, "<br />")
Return Ret
End Function
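On the question of matching without regard for case, here is a small sketch of two options that .NET's Regex supports: passing RegexOptions.IgnoreCase to the constructor, or putting the inline (?i) flag at the start of the pattern. The module name and the sample string are assumptions made up for illustration; the italic and bold patterns are the ones from the function above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module IgnoreCaseSketch
    Sub Main()
        ' Made-up sample with mismatched tag case, like [I]...[/i] in the test data.
        Dim source As String = "[I]fox[/i] and [b]dog[/B]"

        ' Option 1: pass RegexOptions.IgnoreCase to the constructor.
        Dim italicRx As New Regex("\[I\](.+?)\[\/I\]", RegexOptions.IgnoreCase)

        ' Option 2: use the inline (?i) flag inside the pattern itself.
        Dim boldRx As New Regex("(?i)\[B\](.+?)\[\/B\]")

        Dim result As String = italicRx.Replace(source, "<i>$1</i>")
        result = boldRx.Replace(result, "<b>$1</b>")

        Console.WriteLine(result) ' prints <i>fox</i> and <b>dog</b>
    End Sub
End Module
```

Either form could be applied to the other tag patterns in the same way.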
ASKER CERTIFIED SOLUTION
ASKER
I couldn't quite figure out how to get the MatchEvaluator to work in my situation, but I'm obviously not the greatest coder anyway. :)
This may not be the right way to do it, but it definitely works! Now I just have to fix a few font issues and such and the "Quote" looks like it's good to go!
Oh, Btw, I found the RegexOptions.IgnoreCase option and that helped tremendously!
Thanks again!
regExp = New Regex("\[QUOTE=([^\]]+)\]([^\[]+)\[\/QUOTE\]", RegexOptions.Multiline Or RegexOptions.IgnoreCase)
'//Each pass converts the innermost quotes; repeat until no quote tags are left
Do While regExp.IsMatch(Ret)
    Ret = regExp.Replace(Ret, "<blockquote style=""background-color: #E4E4E4; border-width: thin"">Originally Posted by <strong>$1</strong><br /><em>$2</em></blockquote>")
Loop
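For anyone who wants to try the MatchEvaluator route mentioned above, here is one possible sketch. It is not the accepted solution from this thread (that post is not shown here); the names QuoteToHtml and ConvertQuotes are hypothetical, and the blockquote style attribute is left out for brevity. It keeps the same loop, but builds each replacement in a function instead of a replacement string, which leaves room for extra processing such as trimming the author name.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuoteEvaluatorSketch
    ' Hypothetical helper: builds the HTML for one matched [QUOTE=...] block.
    Private Function QuoteToHtml(ByVal m As Match) As String
        Dim author As String = m.Groups(1).Value.Trim()
        Dim body As String = m.Groups(2).Value
        Return "<blockquote>Originally Posted by <strong>" & author & _
               "</strong><br /><em>" & body & "</em></blockquote>"
    End Function

    ' Hypothetical wrapper: converts nested quotes from the inside out.
    Public Function ConvertQuotes(ByVal bbCode As String) As String
        Dim quoteRx As New Regex("\[QUOTE=([^\]]+)\]([^\[]+)\[\/QUOTE\]", RegexOptions.IgnoreCase)
        Dim evaluator As New MatchEvaluator(AddressOf QuoteToHtml)
        Dim result As String = bbCode

        ' The second group cannot contain "[", so each pass only matches quotes
        ' with no unconverted tags inside them; loop until none remain.
        Do While quoteRx.IsMatch(result)
            result = quoteRx.Replace(result, evaluator)
        Loop
        Return result
    End Function

    Sub Main()
        Console.WriteLine(ConvertQuotes("[QUOTE=A]outer [quote=B]inner[/quote][/QUOTE]"))
    End Sub
End Module
```

With the made-up input in Main, the inner quote should be converted on the first pass and the outer quote on the second, leaving one blockquote nested inside the other.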
Thanks for the question and the points.