Bob

Capturing Data In Large Text File Between A Pattern

I have a very large file that contains a random number of lines (strings) between occurrences of a pattern, for example INS. I need to capture all the lines starting with the pattern "INS" up to, but not including, the next INS. I'm not sure whether Select-String or Where-Object is the proper method, or where to start with this one.
PowerShell, Scripting Languages


8/22/2022 - Mon
footech

What do you want to do with the captured text?  Best to supply a sample file for input (doesn't have to be real, just representative of what to expect), and then a file (or other description) of what the output should be given the input file.

How many instances of the pattern will there be in the file?  Always two?  Or more?
Bob

ASKER
One file could have as many as 300 INS instances. I need to capture all the lines after the initial INS up to the next. I have attached a small sample to explain my issue.
sample.txt
aikimark

Use the regex object from the .NET Framework with this pattern:
$re = [regex]'(?:^|\n)(INS(?:.|\n)+?)(?=\nINS|\z)'

You can use the object's Matches method to do the parsing you desire.  Then iterate over the resulting matches and their groups/captures.
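
As a rough illustration of that suggestion, here is a minimal sketch that runs a pattern along the lines of the one above through the .NET `Matches` method. The sample text and the variable names `$text` and `$blocks` are made up for the example; with a real file the text would more likely come from something like `Get-Content -Raw`. Reading the whole file into one string may not suit very large inputs.

```powershell
# Hypothetical sample standing in for the real file; only the INS marker comes from the thread.
$text = "INS*A`nline 2`nline 3`nINS*B`nline 2`nINS*C`nline 2`nline 3`nline 4"

# The \z alternative lets the final block, which has no following INS line, match too.
# \r? tolerates Windows (CRLF) line endings.
$re = [regex]'(?:^|\n)(INS(?:.|\n)+?)(?=\r?\nINS|\z)'

# Matches() returns every match; group 1 holds the block without the newline before it.
$blocks = @(foreach ($m in $re.Matches($text)) {
    $m.Groups[1].Value
})

$blocks.Count   # number of INS blocks found
$blocks[0]      # text of the first block
```

Each element of `$blocks` is one multi-line string that starts at an INS line and ends just before the next one.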
footech

I'm still not clear on what you want to do with the captured text.  In what form should it be passed down the pipeline?
The way I'm reading it, given your sample input file the output would just be the entire file.  Reading your description another way I would just exclude lines that have "INS".  Sorry, it's just not clear to me.
Bob

ASKER
I need to break the entire file up into blocks of data with the "INS" as the delimiter. Not separate files, just one large file but with a start (INS) and finish (capturing all lines up to the next INS) for the accounts in the text file.

INS (start of first account)
second
third
fourth
fifth
INS (start of the next account)

There could be five lines or 20 lines; it all depends on the data that has changed for the user's demographic information.
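
One possible way to get blocks like this, sketched here as an illustration and not taken from the accepted answer, is to stream the file line by line and start a new block whenever a line begins with INS. The function name `Get-InsBlock`, its parameters, and the demo data are all hypothetical.

```powershell
function Get-InsBlock {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $Marker = 'INS'          # assumed: the marker sits at the start of a line
    )
    $current = $null
    # ReadLines streams one line at a time, so the whole file is never held in memory.
    # Convert-Path is used because .NET does not share PowerShell's current location.
    foreach ($line in [System.IO.File]::ReadLines((Convert-Path -LiteralPath $Path))) {
        if ($line.StartsWith($Marker, [System.StringComparison]::Ordinal)) {
            # A new marker closes the previous block (if any) and opens a new one.
            if ($null -ne $current) { , $current.ToArray() }
            $current = [System.Collections.Generic.List[string]]::new()
        }
        # Lines that appear before the first marker are skipped.
        if ($null -ne $current) { $current.Add($line) }
    }
    # Emit the final block, which has no following marker.
    if ($null -ne $current) { , $current.ToArray() }
}

# Demo with made-up data written to a temporary file.
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $tmp -Value 'INS first', 'second', 'third', 'INS next', 'second'
$blocks = @(Get-InsBlock -Path $tmp)
$blocks.Count            # number of blocks
$blocks[0] -join ' | '   # lines of the first block, joined for display
Remove-Item -LiteralPath $tmp
```

Each block comes back as a string array, so whatever runs next in the pipeline can handle one account at a time.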
ASKER CERTIFIED SOLUTION
footech

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
Bob

ASKER
Works exactly as needed.

Love this place, constantly learning and constantly impressed with the depth of knowledge here.  Thanks again!
footech

OK, great!  Glad it's working for you.