I have an application that performs an action on pairs of files found in the current folder and writes some statistics to a plain text log file. The application can process thousands of file pairs in a single run, so the log file can be very long. I am looking for a way to parse this log file, pull out specific pieces of information, and write them to a comma-delimited text file so that I can view them as a table.
An example of the log output for a single pair of files is below, with the items I need to extract in bold. This block is repeated many times in the log file, once for each pair of files processed, and I'd like each pair of files to become a separate line in the CSV file.
[FLASH] Starting FLASH v1.2.11
[FLASH] Fast Length Adjustment of SHort reads
[FLASH] Input files:
[FLASH] Output files:
[FLASH] Min overlap: 20
[FLASH] Max overlap: 160
[FLASH] Max mismatch density: 0.250000
[FLASH] Allow "outie" pairs: false
[FLASH] Cap mismatch quals: false
[FLASH] Combiner threads: 16
[FLASH] Input format: FASTQ, phred_offset=33
[FLASH] Output format: FASTQ, phred_offset=33
[FLASH] Starting reader and writer threads
[FLASH] Starting 16 combiner threads
[FLASH] Processed 25000 read pairs
[FLASH] Processed 50000 read pairs
[FLASH] Processed 75000 read pairs
[FLASH] Processed 82099 read pairs
[FLASH] Read combination statistics:
[FLASH] Total pairs: 82099
[FLASH] Combined pairs: 65751
[FLASH] Uncombined pairs: 16348
[FLASH] Percent combined: 80.09
[FLASH] Writing histogram files.
[FLASH] FLASH v1.2.11 complete!
[FLASH] 3.136 seconds elapsed
Can someone help with writing a Windows batch or VBS program that can do this? I've attached a copy of one of the full log files with about 10 sets of data in it.
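To make the goal concrete, here is a rough sketch of the kind of parsing I have in mind. It is written in Python rather than batch or VBS, purely for illustration. It assumes that every block begins with a "[FLASH] Starting FLASH" line and that the fields to extract are the four read combination statistics; the field list and the function names (parse_flash_log, write_csv, convert) are placeholders of mine, not anything provided by FLASH.

```python
import csv
import re

# Assumed fields of interest; adjust this list to the items actually needed.
FIELDS = ["Total pairs", "Combined pairs", "Uncombined pairs", "Percent combined"]

# Matches "name: value" lines such as "[FLASH] Total pairs: 82099".
LINE_RE = re.compile(r"^\[FLASH\]\s+([^:]+):\s*(.*)$")


def parse_flash_log(lines):
    """Return one dict per block of FLASH output found in the log lines."""
    records = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith("[FLASH] Starting FLASH"):
            # Assumption: this line marks the start of a new block.
            current = {}
            records.append(current)
            continue
        match = LINE_RE.match(line)
        if current is not None and match and match.group(1) in FIELDS:
            current[match.group(1)] = match.group(2)
    return records


def write_csv(records, out):
    """Write the records as comma-delimited rows under a header line."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(records)


def convert(log_path, csv_path):
    """Read a log file and write the extracted values to a CSV file."""
    with open(log_path) as log:
        records = parse_flash_log(log)
    with open(csv_path, "w", newline="") as out:
        write_csv(records, out)
```

Calling convert("flash.log", "stats.csv") (both file names are hypothetical) would produce one CSV row per block. A field missing from a block is left as an empty cell.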