I'm looking to parse some PDF documents, or the text files I get by saving each PDF as text from Acrobat Reader, and output an XML file. I'm not sure which language is best suited to this task, but I'm guessing it's possible with Perl, Ruby or Python...
Here is what the documents look like. I want to parse all of the fields except for the tabular data.
For example, I guess the "key:" labels could be used to create each section, with "Vulnerability Key: " marking where each nest should begin.
Some Key: V0012345
Some ID: CTX0100
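
To make the idea concrete, here is a minimal Python sketch of the kind of thing I'm after. It assumes the text export has one "Label: value" pair per line and that a "Vulnerability Key" line opens each nest. The function name `text_to_xml` and the element names `report`, `vulnerability` and `field` are made up for illustration; they are not from the documents. Lines without a colon are skipped, so table rows that happen to contain a colon would probably need extra filtering.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: every field sits on its own line as "Label: value".
FIELD = re.compile(r"^\s*([^:]+?)\s*:\s*(.*?)\s*$")


def text_to_xml(text, section_label="Vulnerability Key"):
    """Turn "Label: value" lines into nested XML (hypothetical layout)."""
    root = ET.Element("report")  # element names are illustrative only
    current = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue  # not a "Label: value" line, e.g. most table rows
        label, value = match.groups()
        # Open a new nest at each section label (or for leading fields).
        if label == section_label or current is None:
            current = ET.SubElement(root, "vulnerability")
        # Keep the label as an attribute so it need not be a valid XML name.
        ET.SubElement(current, "field", name=label).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "Vulnerability Key: V0012345\nSome ID: CTX0100\n"
    print(text_to_xml(sample))
```

Storing the label in a `name` attribute rather than using it as the tag name sidesteps labels that contain spaces or start with a digit.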