Solved

Gawk - Find all occurrences of a string within XML - Also include 100 bytes before and after the match

Posted on 2015-02-09
Last Modified: 2015-02-10
On Windows, I am currently using gawk to find the first occurrence of a string, plus the 100 bytes that follow it, in every XML file within a directory:

gawk "/[some string]/ { match($0, /[some string]/); print substr($0, RSTART, RLENGTH + 100) FILENAME; }" C:\XML*.xml > C:\Results.txt


What I would like to do now is output all the matches (not just the first) to C:\Results.txt for each XML and also include 100 characters before the match + 100 characters after the match.

Is it possible to easily change this to get the desired results?

I understand that gawk might not be the best tool for the job, but this is a one-time task, and if it is slow I can let it run overnight.
Question by:Mr P

Accepted Solution

by: ozo
If the 100 characters are on the same line as the match, you can use
match ( $0, /some string/){print substr($0,RSTART-100,RLENGTH + 200)FILENAME; }

If there can be more than one match on a line, and the matches are at least 100 characters apart, you might use
/some string/{ while (match($0, /some string/)) { print substr($0, RSTART-100, RLENGTH + 200) FILENAME; $0 = substr($0, RSTART+1) } }
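The loop above shortens $0 after each match, which is why it needs the matches to be at least 100 characters apart. A minimal sketch of one way around that, assuming non-overlapping matches are wanted: search a copy of the line and keep a running offset, so the context is always cut from the untouched $0. The function name context_matches, the variables pat, ctx, rest and pos, and the tab before FILENAME are illustrative choices, not part of the original commands. It is wrapped in a POSIX shell function here; it uses only standard awk features, so gawk should run the same program.

```sh
# Hypothetical helper: context_matches PATTERN CTX FILE...
# Prints every match with up to CTX characters on each side, then a tab
# and the file name. Uses awk; gawk can be substituted.
context_matches() {
    pattern=$1
    ctx=$2
    shift 2
    awk -v pat="$pattern" -v ctx="$ctx" '
    {
        rest = $0   # unsearched tail of the current line
        pos = 0     # number of characters of $0 already consumed
        # RLENGTH > 0 guards against an endless loop on empty matches
        while (match(rest, pat) && RLENGTH > 0) {
            start = pos + RSTART                  # match position within $0
            first = (start > ctx) ? start - ctx : 1
            len = (start - first) + RLENGTH + ctx
            print substr($0, first, len) "\t" FILENAME
            pos += RSTART + RLENGTH - 1           # skip past this match
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' "$@"
}
```

For example, context_matches 'some string' 100 *.xml > Results.txt. Two caveats: a pattern passed with -v has its backslash escapes processed by awk, and a pattern anchored with ^ would be tested against the shortened copy rather than the full line.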

Author Closing Comment

by:Mr P
This worked great.  Thank you, Ozo.