Need help with TCL's regexp and regsub

Posted on 2004-09-17
Medium Priority
Last Modified: 2013-12-26
I code Tcl scripts for eggdrops on IRC.
I am quite good with Tcl scripting, such as string matching and most of the list and string functions, and I get ideas from reading pre-written scripts.

I am having trouble with regexp, regsub, and complex regular-expression matching. I don't really know how to use regexp to match certain types of patterns. Could anyone give me examples or tutorials on how to match things such as specific numbers, words, or letters in a text, or all numbers/words in a text, while using wildcards as well?

Question by:awwyeah

Author Comment

ID: 12082728
Also, don't give me lame websites or stuff; I have searched a lot. Plus, the sucky Tcl manual only shows the syntax and command usage, not detailed examples.

Author Comment

ID: 12091556
I presume no one's good with TCL on this forum. :o|

Accepted Solution

fridom earned 600 total points
ID: 12158111
Well, have you checked http://www.tcl.tk/doc/howto/regexp81.html ?

A very good book about regular expressions is "Mastering Regular Expressions".

I don't know if you have found http://aspn.activestate.com/ASPN/Cookbook/Tcl/
which gives some good tips.

You have to be more specific about what kind of data you want to match, otherwise one just has to guess...

The bottom line is: the better you know your data and the more restricted it is, the easier it is to shape a good regular expression.
Here's one example (removing words that contain a digit from a line):

proc filter_string {string pattern} {
    # Start with an empty list so the join below works even if nothing is kept
    set result {}
    foreach element [split $string] {
        if {![regexp $pattern $element]} {
            lappend result $element
        }
    }
    return [join $result]
}

set line "A line with so3me words with and with0ut number2s"

 filter_string $line {\d}
A line with words with and
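
The same filtering can probably be done without an explicit loop. This is only a sketch of an alternative, not what fridom posted; it assumes Tcl 8.4 or later, where lsearch accepts the -all, -inline and -not options:

```tcl
set line "A line with so3me words with and with0ut number2s"

# Keep every element of the split line that does NOT match the regexp \d
set kept [lsearch -all -inline -not -regexp [split $line] {\d}]
puts [join $kept]   ;# A line with words with and
```

Whether this reads better than the foreach version is a matter of taste; the proc is easier to extend if you later need more than one test per word.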



Assisted Solution

s_federici earned 600 total points
ID: 12352605
Well, as fridom said, it is not easy to give a comprehensive list of good examples on how to handle regexps. I use them practically every day, and by reading the manual page you can see that there is a lot to say about them. So I'll start with a short list; let me know how they work for you. Just one note: since you say that string matching is well known to you, in these examples I won't go into details that are too similar to common string matching (e.g. glob matching patterns).

A) regexp

1. specific numbers

> set match "NO ANSWER"; regexp "123" "890 123 456" match; set match
123

The advantage over string matching is that, if you want to find a complete number (one that is not part of a longer number), you don't have to use "tricks" such as putting spaces around all the numbers in the string you search in. Let me give an example:

> set match "NO ANSWER"; regexp "123" "890 1237 456" match; set match
123

That is, even though "123" is just part of the number "1237", it is still reported as a match. A viable string-matching-style solution is to pad both the pattern and the string with spaces:

> set match "NO ANSWER"; regexp " 123 " " 890 1237 456 " match; set match
NO ANSWER

With regexp you can anchor the pattern to word boundaries instead:

> set match "NO ANSWER"; regexp {\m123\M} "890 123 456" match; set match
123
> set match "NO ANSWER"; regexp {\m890\M} "890 123 456" match; set match
890

That is, the escape sequences "\m" and "\M" make the enclosed pattern match only at the beginning and at the end of a whole word, respectively. Note that I replaced the double quotes around the pattern with braces this time. Inside double quotes, Tcl would perform backslash substitution and turn the escapes into the plain chars "m" and "M"; inside braces no substitution happens before the pattern reaches regexp. I know this is just "plain tcl", but some less experienced reader could check this answer.
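
To make the word-boundary and quoting points concrete, here is a small sketch. The proc name whole_word_match is made up for illustration, and it assumes the word contains only letters, digits or underscores, so it needs no regexp escaping:

```tcl
# Hypothetical helper, not from the thread: return the first whole-word
# match of $word in $text, or "NO ANSWER" when nothing matches.
proc whole_word_match {word text} {
    set match "NO ANSWER"
    # Inside double quotes, "\\m" yields the two chars \m that regexp expects
    regexp "\\m${word}\\M" $text match
    return $match
}

puts [whole_word_match 123 "890 123 456"]    ;# 123
puts [whole_word_match 123 "890 1237 456"]   ;# NO ANSWER

# Quoting check: double quotes collapse \m to m, braces keep it intact
puts [string length "\m123\M"]   ;# 5
puts [string length {\m123\M}]   ;# 7
```

The two string length calls show why the braces matter: the double-quoted pattern has already lost its backslashes before regexp ever sees it.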

2. specific words

The situation is pretty similar to matching specific numbers.
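
A short sketch for words, along the same lines as the number examples. The sample text is invented; -nocase is a standard regexp switch that makes the match case-insensitive, which tends to be handy for IRC text:

```tcl
set text "I like Tcl, tclsh and concatenation"

# Whole word "tcl", ignoring case: finds "Tcl" but not the "tcl" inside "tclsh"
set match "NO ANSWER"
regexp -nocase {\mtcl\M} $text match
puts $match          ;# Tcl
set word_match $match

# "cat" only occurs inside "concatenation", so there is no whole-word match
set match "NO ANSWER"
regexp {\mcat\M} $text match
puts $match          ;# NO ANSWER
set cat_match $match
```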

3. More than a number/word

With regexp you can also look for more than one word/number at once. Here are a few examples:

> set match "NO ANSWER"; regexp {\m(123|890)\M} "890 123 456" match; set match
890
> set match "NO ANSWER"; regexp {\m(123|456)\M} "890 123 456" match; set match
123

Here you can see that the "(...|...)" notation matches whichever of the two alternatives (the one on the left or the one on the right of the "|" char) occurs first in the string.
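
As a sketch of the point above: the order of the alternatives inside the parentheses does not decide the result, the position in the string does. The second part also shows the extra variables regexp can fill with the parenthesized sub-matches (the variable names are arbitrary):

```tcl
set text "890 123 456"

# 456 is listed first in the pattern, but 123 comes first in the string
set match "NO ANSWER"
regexp {\m(456|123)\M} $text match
puts $match                      ;# 123

# Variables after the match variable receive the parenthesized sub-matches
regexp {\m(123|456) (\d+)} $text whole first next
puts "$whole / $first / $next"   ;# 123 456 / 123 / 456
```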

B) regsub

With regsub you can replace occurrences of a given pattern.

4. All occurrences

By default, regsub replaces only the first occurrence of a pattern in a string:
> regsub {\m(123|456)\M} "890 123 456" "xxx"
890 xxx 456

But with regsub you can also specify that you want to replace ALL occurrences of a pattern, using the -all switch. (By default regexp likewise matches only the first occurrence of the pattern in the string; it has its own -all switch, usually combined with -inline, to report every match.)

> regsub -all {\m(123|456)\M} "890 123 456" "xxx"
890 xxx xxx
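
A few related variations, as a sketch only (the sample strings are invented). When regsub is given a variable name it stores the result there and returns the number of replacements; "&" in the replacement stands for the matched text; and regexp -all -inline returns every match as a list, which covers the "all numbers in a text" case:

```tcl
set text "890 123 456"

# With a variable name, regsub returns how many replacements it made
set count [regsub -all {\m(123|456)\M} $text "xxx" replaced]
puts "$count -> $replaced"     ;# 2 -> 890 xxx xxx

# & in the replacement string is the text that was matched
set wrapped [regsub -all {\d+} $text {<&>}]
puts $wrapped                  ;# <890> <123> <456>

# Collect every number in a text as a list
set numbers [regexp -all -inline {\d+} "a1 b22 c333"]
puts $numbers                  ;# 1 22 333
```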

5. Wildcards

There are more wildcards than in string matching. With regexp and regsub you have the following:

i) "." matches any single char. Similar to "?" in string matching

> set match "NO ANSWER"; regexp "a.c" "abc" match; set match
abc

ii) "*" matches any number (0 or more) of occurrences of the previous char (here ".", i.e. any char)

> set match "NO ANSWER"; regexp "a.*c" "abbbbbbc" match; set match
abbbbbbc
> set match "NO ANSWER"; regexp "a.*c" "ac" match; set match
ac

iii) "+" matches any number (1 or more) of occurrences of the previous char

> set match "NO ANSWER"; regexp "a.+c" "abbbbbbc" match; set match
abbbbbbc

but at least one char is required between "a" and "c":

> set match "NO ANSWER"; regexp "a.+c" "ac" match; set match
NO ANSWER

iv) "?" matches 1 or no occurrences of the previous char

> set match "NO ANSWER"; regexp "a.?c" "abc" match; set match
abc
> set match "NO ANSWER"; regexp "a.?c" "ac" match; set match
ac
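
To compare the four wildcards side by side, something like the following could be used. The proc name first_match is made up for this sketch; it just wraps the "NO ANSWER" idiom used in the examples above:

```tcl
# Hypothetical helper: first match of $pattern in $text, or "NO ANSWER"
proc first_match {pattern text} {
    set match "NO ANSWER"
    regexp $pattern $text match
    return $match
}

# Try every pattern against strings with 0, 1 and 2 chars between a and c
foreach pattern {a.c a.*c a.+c a.?c} {
    foreach text {ac abc abbc} {
        puts [format "%-5s %-5s -> %s" $pattern $text [first_match $pattern $text]]
    }
}
```

Only "a.*c" matches all three strings; "a.c" needs exactly one char in between, "a.+c" at least one, and "a.?c" at most one.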

Ok, I guess there is a lot more to say (classes, atoms, non-greedy quantifiers, etc.). Let me know if this helped you.
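
Just to give a taste of two of those extra topics, here is a minimal sketch with invented sample strings. It only shows the simplest cases of a non-greedy quantifier and of bracketed classes, and is not meant as a full treatment:

```tcl
# Greedy: .* grabs as much as possible
regexp {a.*c} "abcabc" greedy
puts $greedy     ;# abcabc

# Non-greedy: .*? stops at the first possible "c"
regexp {a.*?c} "abcabc" lazy
puts $lazy       ;# abc

# Bracketed classes: a run of digits, and a run of letters
regexp {[0-9]+} "abc 42 def" digits
puts $digits     ;# 42
regexp {[[:alpha:]]+} "42 def" letters
puts $letters    ;# def
```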

Expert Comment

ID: 12352628
Last note. By "(e.g. glob matching patterns)" I just meant bracketed expressions. Sorry for the possible misunderstanding.

Expert Comment

ID: 14362660
Well, I have given a few general examples of the patterns he mentioned, but receiving no reply from the person who asked the question didn't help me give him exactly what he wanted.

Expert Comment

ID: 14362861
s_federici is right, it's annoying to be asked, give some answer to often very vague questions, and then never hear anything again. So I vote for sharing the points between me and s_federici, or just taking the points away from the original poster.


Expert Comment

ID: 14366935
I agree with fridom, both alternatives are ok to me.

Expert Comment

ID: 14368554
A points refund is NOT an option at all. :) So I need to know whether there is something valuable here or whether we should go for a delete.

Expert Comment

ID: 14369038
Of course there is, in both postings from s_federici and me. But there has been no feedback. The OP does not care to say anything.


Expert Comment

ID: 14369082
Yes, both answers address the subject of the question. They can be of help to anyone looking for help on this subject.
