So here i the situation:
At work we use a custom piece of software that does document watermarking for us. Now the way this works for the end user is that there are two shares. Share #1 is "Dropoff" and Share #2 is "Pickup". The users drop off an office document in the "dropoff" share and within 15-20 minutes an xps document with watermark appears in the pickup folder. (The time delay is due to a scheduled task set to run every 20 minutes).
Now here is the issue. We have noticed that watermarking fails for certain documents. The common theme is any document with a hyperlink, specifically a malformed hyperlink. For example if someone types in http://*:3389
word converts this to a hyperlink.
Now to give you some more info on the process. The way the documents are watermarked is via a powershell script. What happens is this script runs every 20 minutes and checks for any files added to the "dropoff" folder. It then converts them to xps documents as the watermarking software can only accept xps for input. Now these xps are then run through a command line app that is the watermarking software.
What I am trying to do is find some way to either disable hyperlinks that are invalid in either word or the xps document or simply disable all hyperlinks in the document. Unfortunately due to the org size simply telling people to remove hyperlinks is not a solution especially since Management uses it the most.
Now I originally thought the issue was with Powershell converting the document to XPS so I wrote a C# console app to handle this. However that is when I discovered that it was actually malformed hyperlinks causing the issue.
What I would like to do is one of the folllowing:
1.) "read" through the word document and disable invalid URL's i.e. convert them to strings
2.) Disable all URL's in the word document
3.) Disable invalid URL's (convert to string) in XPS Document
4.) Disable all URL's in XPS document
Any thoughts or suggestions regarding how to handle this issue.
Thanks for the help!