+ operator in awk FS string

Posted on 2008-11-13
Last Modified: 2012-06-21
I'm parsing an HTML file with awk, and I want my field separator to be any sequence of <> tags, leaving the text outside the tags as the fields.

Currently I have
FS = "[ \t]*<[^>]*>[ \t]*"
which works well for a single tag (and any surrounding whitespace), but gives lots of empty fields when several tags (<b><i> etc.) are next to each other.

I'd like something like
FS = "[[ \t]*<[^>]*>[ \t]*]+"
but that doesn't work. I've tried a number of permutations with brackets, parentheses and plus signs.

What's the proper way of doing this?
Question by:loveslave

Accepted Solution

ozo earned 500 total points
ID: 22953237
([ \t]*<[^>]*>[ \t]*)+
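
For anyone trying this out, here is a minimal sketch of the accepted pattern used as FS. The sample HTML line and the strip_tags function name are made up for illustration, and it assumes an awk that treats FS as an extended regular expression (gawk, mawk and POSIX awk do; very old awks may not). The attempt in the question failed because an outer [...] is a bracket expression (a character class), not a group, so the + had no group to repeat. Parentheses do the grouping.

```shell
# strip_tags is a hypothetical helper name, not from the original thread.
# FS is the grouped pattern from the accepted answer: one or more tags,
# each with optional surrounding spaces/tabs, count as ONE separator.
strip_tags() {
  awk 'BEGIN { FS = "([ \t]*<[^>]*>[ \t]*)+" }
       {
         # A line that starts or ends with a tag still yields an empty
         # first or last field, so skip empty fields when printing.
         for (i = 1; i <= NF; i++)
           if ($i != "") print $i
       }'
}

# Hypothetical sample input with adjacent tags.
printf '%s\n' '<td><b><i>first</i></b> second <a href="x">third</a></td>' | strip_tags
# prints:
# first
# second
# third
```

Note that a run of adjacent tags now collapses into a single separator, but a leading or trailing tag still produces one empty field at that end of the record, which is why the loop filters on $i != "".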
