Parse plain text log file and extract specific values

I have an application that performs an action on pairs of files found in the current folder and writes some statistics to a plain text log file. The application can process thousands of file pairs in a run, so the log file can be very long. I am looking for a way to parse this log file, pull out specific pieces of information, and write them to a comma-delimited text file so I can view them in table format.

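An example of the output from a single run of the application is below. The items I need to extract are the first input file name and the four values under "Read combination statistics" (total pairs, combined pairs, uncombined pairs, and percent combined). This block of information is repeated many times in the log file, once for each pair of files processed. I'd like each pair of files to be a separate line in the CSV file.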

[FLASH] Starting FLASH v1.2.11
[FLASH] Fast Length Adjustment of SHort reads
[FLASH]  
[FLASH] Input files:
[FLASH]     880B-plate-1-H12_S96_L001_R1_001.fastq
[FLASH]     880B-plate-1-H12_S96_L001_R2_001.fastq
[FLASH]  
[FLASH] Output files:
[FLASH]     880B-plate-1-H12_S96_L001/MERGED.extendedFrags.fastq
[FLASH]     880B-plate-1-H12_S96_L001/MERGED.notCombined_1.fastq
[FLASH]     880B-plate-1-H12_S96_L001/MERGED.notCombined_2.fastq
[FLASH]     880B-plate-1-H12_S96_L001/MERGED.hist
[FLASH]     880B-plate-1-H12_S96_L001/MERGED.histogram
[FLASH]  
[FLASH] Parameters:
[FLASH]     Min overlap:           20
[FLASH]     Max overlap:           160
[FLASH]     Max mismatch density:  0.250000
[FLASH]     Allow "outie" pairs:   false
[FLASH]     Cap mismatch quals:    false
[FLASH]     Combiner threads:      16
[FLASH]     Input format:          FASTQ, phred_offset=33
[FLASH]     Output format:         FASTQ, phred_offset=33
[FLASH]  
[FLASH] Starting reader and writer threads
[FLASH] Starting 16 combiner threads
[FLASH] Processed 25000 read pairs
[FLASH] Processed 50000 read pairs
[FLASH] Processed 75000 read pairs
[FLASH] Processed 82099 read pairs
[FLASH]  
[FLASH] Read combination statistics:
[FLASH]     Total pairs:      82099
[FLASH]     Combined pairs:   65751
[FLASH]     Uncombined pairs: 16348
[FLASH]     Percent combined: 80.09%
[FLASH]  
[FLASH] Writing histogram files.
[FLASH]  
[FLASH] FLASH v1.2.11 complete!
[FLASH] 3.136 seconds elapsed

Can someone help with writing a Windows batch or VBS program that can do this?  I've attached a copy of one of the full log files with about 10 sets of data in it.
flash.log
Asked by I_play_with_DNA

NVIT commented:
Make a search.txt file:
Total pairs
Combined pairs
Uncombined pairs
Percent combined
_R1_

Enter this in a CMD window, from the folder that contains flash.log and search.txt:
findstr /r /g:search.txt flash.log
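
The findstr output is still one value per line rather than one row per file pair. As a rough sketch of the remaining step, here is how those filtered lines could be pivoted into CSV rows. It is written in Python, which is not what the question asked for, so treat it as an alternative rather than a batch answer. It assumes every block lists its `_R1_` file name before the four statistics, as in the sample log; the function name `pivot_findstr_output` is made up for illustration.

```python
import csv
import io

# Statistic labels as they appear in the log, in the desired column order.
LABELS = ["Total pairs", "Combined pairs", "Uncombined pairs", "Percent combined"]


def pivot_findstr_output(lines):
    """Turn the filtered log lines into one row (a list) per file pair."""
    rows = []
    current = None
    for line in lines:
        # Drop the "[FLASH]" prefix and the padding around the text.
        text = line.replace("[FLASH]", "", 1).strip()
        label, _, value = text.partition(":")
        if label in LABELS:
            # A statistic line: append its value to the row being built.
            if current is not None:
                current.append(value.strip())
        elif "_R1_" in text:
            # An R1 file name starts a new row.
            current = [text]
            rows.append(current)
    return rows


# Example with the lines findstr would keep from the sample block.
filtered = [
    "[FLASH]     880B-plate-1-H12_S96_L001_R1_001.fastq",
    "[FLASH]     Total pairs:      82099",
    "[FLASH]     Combined pairs:   65751",
    "[FLASH]     Uncombined pairs: 16348",
    "[FLASH]     Percent combined: 80.09%",
]
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["FileName"] + LABELS)
writer.writerows(pivot_findstr_output(filtered))
print(out.getvalue())
```

To use it on real data, redirect the findstr output to a file and pass that file's lines to the function instead of the hard-coded list.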

oBdA commented:
A PowerShell solution:
$FlashLog = "D:\Temp\flash.log"
$ResultFile = "D:\Temp\flash.csv"
# True when the previous line was the "Input files:" header, so the next line is the R1 file name.
$ExpectFileName = $False
Get-Content -Path $FlashLog |
	ForEach-Object {
		# Trim so that trailing whitespace in the log does not break the comparison.
		If ($_.Trim() -eq "[FLASH] Input files:") {
			$ExpectFileName = $True
		} Else {
			If ($ExpectFileName) {
				# Start a new result row for this pair of files.
				$Result = "" | Select-Object -Property FileName, TotalPairs, CombinedPairs, UncombinedPairs, PercentCombined
				$Result.FileName = $_.Replace("[FLASH]", "").Trim()
				$ExpectFileName = $False
				"Processing '$($Result.FileName)'" | Write-Host
			} ElseIf ($_ -match '\[FLASH\]\s+Total pairs:\s*(?<TotalPairs>\d+)') {
				$Result.TotalPairs = $Matches["TotalPairs"]
			} ElseIf ($_ -match '\[FLASH\]\s+Combined pairs:\s*(?<CombinedPairs>\d+)') {
				$Result.CombinedPairs = $Matches["CombinedPairs"]
			} ElseIf ($_ -match '\[FLASH\]\s+Uncombined pairs:\s*(?<UncombinedPairs>\d+)') {
				$Result.UncombinedPairs = $Matches["UncombinedPairs"]
			} ElseIf ($_ -match '\[FLASH\]\s+Percent combined:\s*(?<PercentCombined>[\d.]+)') {
				# [\d.]+ keeps the decimals (80.09); \d+ alone would stop at 80.
				$Result.PercentCombined = $Matches["PercentCombined"]
				# "Percent combined" is the last statistic in a block, so emit the finished row.
				$Result | Write-Output
			}
		}
	} | Export-Csv -Path $ResultFile -NoTypeInformation
"Results exported to '$($ResultFile)'" | Write-Host

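For comparison, the same line-by-line approach can be sketched in Python for anyone who has it available; this is an alternative, not the batch or VBS program originally requested. It parses the full log directly rather than a pre-filtered copy. It assumes that the first line after "Input files:" is the R1 file name and that "Percent combined" is the last statistic in each block, as in the sample log. The names `parse_flash_log` and `write_csv` are hypothetical.

```python
import csv
import re

FIELDS = ["FileName", "TotalPairs", "CombinedPairs", "UncombinedPairs", "PercentCombined"]

# Maps each statistic label in the log to its CSV column name.
COLUMN = {
    "Total pairs": "TotalPairs",
    "Combined pairs": "CombinedPairs",
    "Uncombined pairs": "UncombinedPairs",
    "Percent combined": "PercentCombined",
}

# Captures the label and its numeric value; [\d.]+ keeps decimals such as 80.09.
STAT_RE = re.compile(
    r"^\[FLASH\]\s+(Total pairs|Combined pairs|Uncombined pairs|Percent combined):\s*([\d.]+)"
)


def parse_flash_log(lines):
    """Yield one dict per file pair found in an iterable of log lines."""
    record = None
    expect_file_name = False
    for line in lines:
        line = line.rstrip()
        if line == "[FLASH] Input files:":
            expect_file_name = True
        elif expect_file_name:
            # First line after the header: the R1 file name.
            record = {"FileName": line.replace("[FLASH]", "", 1).strip()}
            expect_file_name = False
        elif record is not None:
            match = STAT_RE.match(line)
            if match:
                record[COLUMN[match.group(1)]] = match.group(2)
                if match.group(1) == "Percent combined":
                    # Last statistic of the block: the row is complete.
                    yield record
                    record = None


def write_csv(log_path, csv_path):
    """Read the log at log_path and write one CSV row per file pair."""
    with open(log_path, encoding="utf-8", errors="replace") as log, \
            open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(parse_flash_log(log))
```

Calling `write_csv("flash.log", "flash.csv")` from the folder that holds the log would produce the table; both paths are placeholders.
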
Martin Liss commented:
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.