D B asked:

File Processing - Extract a Portion of a File

I'll bet there is a way to do this without having to examine a file line-by-line.
I want to read in a file that has data in this format:
- pre-data
- key-phrase
- post-data

I want to exclude all pre-data and create a new file that has only the key-phrase and post-data. Thus, given the following file contents:

this is pre-data
this is also pre-data
and so is this
Once upon a time, in a land far, far away
there was a fairy princess ....

If the key-phrase is "once upon a time" then the output would look like:
Once upon a time, in a land far, far away
there was a fairy princess ....

Some givens:
- The key-phrase may be preceded by whitespace (tabs or spaces) but other than that will always be at the start of a new line
- I know I can read the file using $file = Get-Content "c:\temp\file.txt" -Raw
- I know I can write the contents out to a new file using $file | Out-File -FilePath "c:\temp\newfile.txt" -Force
- What I need to know is how to exclude everything preceding the key-phrase.
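
One way to drop everything before the key-phrase without looping over lines is a single regex match against the raw text. The following is only a sketch under the givens above: the function name Get-TextFromKeyPhrase is hypothetical, the match is assumed to be case-insensitive (as in the example), and leading spaces or tabs on the key-phrase line are kept in the output.

```powershell
# Hypothetical helper (not from the thread): returns the text starting at the
# line that begins with the key phrase, or $null if the phrase is not found.
function Get-TextFromKeyPhrase {
    param(
        [Parameter(Mandatory)] [string] $Text,
        [Parameter(Mandatory)] [string] $KeyPhrase
    )

    # (?i) ignores case; (?m) lets ^ match at the start of every line.
    # [ \t]* allows spaces or tabs before the key phrase.
    $pattern = '(?im)^[ \t]*' + [regex]::Escape($KeyPhrase)
    $match = [regex]::Match($Text, $pattern)

    if ($match.Success) {
        # Keep everything from the start of the matching line to the end.
        return $Text.Substring($match.Index)
    }
    return $null
}
```

With the paths from the question, it could be used as: Get-TextFromKeyPhrase -Text (Get-Content "c:\temp\file.txt" -Raw) -KeyPhrase "once upon a time" | Out-File -FilePath "c:\temp\newfile.txt" -Force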
footech

How far does post-data extend?

this is pre-data
 this is also pre-data
 and so is this
 Once upon a time, in a land far, far away
 there was a fairy princess ....
is this post-data?
how about this?
and this?

Would it go all the way to the last line, or just the line after "once upon a time"?
Basically, you can adjust how many lines count as post-data by changing the -Context parameter.
# -Context 0,4 captures no lines before the match and 4 lines after it
Select-String -Path ".\#ee-sample1.txt" -Pattern "once upon a time" -Context 0,4 |
 ForEach-Object { $_.Line; $_.Context.PostContext } |
 Out-File newfile.txt

D B (ASKER)

Actually, it extends through a defined end-keyword (inclusive). So, if the end-keyword was 'the end' and the text contained:
this is pre-data
 this is also pre-data
 and so is this
 Once upon a time, in a land far, far away
 there was a fairy princess ....
...
the end
Index
apples...1
bananas....2
...
In this case I would want to capture everything from the key-phrase (inclusive) up to the end-keyword (inclusive).
D B (ASKER)

Also, there could be hundreds of lines following the key-phrase.
D B (ASKER)

footech: thanks for bringing that up. I suppose in my case, I can make -Context be 1000 (that should be enough) but now the challenge is how to make it 'stop' after reading the end-keyword?
ASKER CERTIFIED SOLUTION
footech

This solution is only available to members of Experts Exchange.
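
The accepted answer's text is not shown in this thread. As a rough sketch of one possible approach, and not necessarily what footech posted, a lazy regex over the raw text can capture from the key-phrase line through the end-keyword line. The function name Get-TextBetweenKeywords is hypothetical, and the sketch assumes the end-keyword, like the key-phrase, starts a line (optionally after spaces or tabs) on a later line than the key-phrase.

```powershell
# Hypothetical helper (not from the thread): returns the text from the line
# starting with $StartPhrase through the line starting with $EndPhrase,
# both inclusive, or $null if no such range exists.
function Get-TextBetweenKeywords {
    param(
        [Parameter(Mandatory)] [string] $Text,
        [Parameter(Mandatory)] [string] $StartPhrase,
        [Parameter(Mandatory)] [string] $EndPhrase
    )

    # (?i) ignores case, (?m) makes ^ match at each line start,
    # (?s) lets . match newlines so .*? can span multiple lines.
    # .*? is lazy, so the match stops at the first end-keyword line.
    # [^\r\n]* keeps the remainder of the end-keyword line.
    $pattern = '(?ims)^[ \t]*' + [regex]::Escape($StartPhrase) +
               '.*?^[ \t]*' + [regex]::Escape($EndPhrase) + '[^\r\n]*'
    $match = [regex]::Match($Text, $pattern)

    if ($match.Success) {
        return $match.Value
    }
    return $null
}
```

For example: Get-TextBetweenKeywords -Text (Get-Content "c:\temp\file.txt" -Raw) -StartPhrase "once upon a time" -EndPhrase "the end" | Out-File -FilePath "c:\temp\newfile.txt" -Force. Because the whole range is one match, there is no need to guess a -Context value large enough to cover hundreds of lines.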
D B (ASKER)

Works like a charm!!! There needs to be an A++ grade! :-)
Thanks.