I have some html files that show the difference between two different schema definitions (xsds). The attached Word file is an excerpt of a sample html file and depicts how it looks. The attached .txt file shows the underlying html source code for that excerpt. What I would like to do is write a Java program that would create a new text file in the following format:
Line 1 "OLD line(s): 235 [EMPTY] [becomes] New line(s): 236,246 [DATA] name = OtherSupportSumAmt
Line 2 "OLD line(s): 1759 [DATA] name=OrganizationTypeDesc [becomes] New line(s): 1770 [DATA] name=OrganizationTypeCd
Line 3 "OLD line(s): 1763 [N/A] [becomes] New line(s): 1774
Basically, I want
1) the old and new information on the same line of text with [becomes] (or some other delimiter) in between.
2) the word [DATA] where the substring "name=" exists followed by name=[whatever is in the quotes that follow].
3) the word [EMPTY] or [NULL] if no data follows the line(s) number.
4) the phrase "not applicable" or N/A where the substring "name=" does not exist.
In other words a new line of text for each old and new pairs of html data. DiffWebpage.docx DiffHTML.txt