Solved

Hyperlinks no longer work after converting .wps files to .docx

Posted on 2013-01-10
Last Modified: 2013-01-11
Screenshot: Error opening embedded URL

Recently I converted over a thousand .wps files to .docx, using a VB macro.  Now I have discovered that embedded URL shortcuts no longer open their respective web pages.  These shortcuts were embedded in the .wps files, and there are still shortcut icons in the .docx files, but when they are clicked, there is an error message (see screenshot).  I hope I won't need to manually recreate every embedded URL!
Question by:ddantes
14 Comments
 
LVL 20

Expert Comment

by:wolfcamel
ID: 38766042
Check the shortcut properties - you may see where it has screwed up and be able to do a search/replace to fix it.
 

Author Comment

by:ddantes
ID: 38766056
Thank you for your comments.  Right-clicking a shortcut, the menu has cut, copy, paste, package object, hyperlink, insert caption, borders & shading, and format object.  There is no option for Properties.   I tried installing Microsoft Office Compatibility Pack, but it hasn't changed the behavior.
 
LVL 20

Expert Comment

by:wolfcamel
ID: 38766060
what are the properties in "hyperlink" as this is the bit that is wrong
 

Author Comment

by:ddantes
ID: 38766084
The "hyperlink" context menu does not have properties.  It asks for an address to insert a hyperlink.  I attached a sample docx file (Test.docx) with non-functioning embedded URL shortcuts.
 
LVL 20

Expert Comment

by:wolfcamel
ID: 38766103
looks like your conversion has stripped all the hyperlink info, however not sure how you managed to get the above error in the first place
 

Author Comment

by:ddantes
ID: 38766124
I just found that these links still work on a different machine running Windows 7.  The machine with the error message is running Windows XP SP-3.
 
LVL 38

Accepted Solution

by:
BillDL earned 500 total points
ID: 38768057
Hi ddantes

You may or may not be aware that the Office 2007 file types (docx and xlsx), as opposed to the pre-2007 file types (doc, xls), are really just ZIP files that contain layout files in XML format plus the image files and some binary data files for some embedded content.  Office 2003 and earlier doc files had everything embedded in binary.  When renamed to a ZIP file, the separate files in a docx file can be extracted to a folder using WinZip, 7-Zip, or the inbuilt Windows unzipping function.

In your case, even before I open the docx file, I can see that your "hyperlinks" are actually embedded "packages" in the form of binary OLE data.  When you insert some types of  "objects" in a Word document, for instance a media clip, it creates a package that can optionally be displayed as an icon.

The embedded object in a docx file when extracted is in a file
\extracted_folder\Word\embeddings\oleObject1.bin
and additional content will be oleObject2.bin, etc.
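
For anyone who would rather not rename and unzip by hand, the same inspection can be sketched in a few lines of Python. This is only an illustration, not something used in this thread: the function names are made up, and the sample archive built in memory is a stand-in, not a valid Word document.

```python
import io
import zipfile


def list_embedded_objects(docx_source):
    """Return the archive paths of OLE objects embedded in a .docx file.

    docx_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_source) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().startswith("word/embeddings/")
        )


def build_sample_docx():
    """Build a tiny stand-in archive in memory (NOT a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr("word/embeddings/oleObject1.bin", b"placeholder")
        archive.writestr("word/embeddings/oleObject2.bin", b"placeholder")
        archive.writestr("word/media/image1.wmf", b"placeholder")
    buffer.seek(0)
    return buffer


print(list_embedded_objects(build_sample_docx()))
```

With a real file, the call would simply be list_embedded_objects("Test.docx"); no renaming to .zip is needed because the zipfile module only looks at the content.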

You have two packages displayed as icons, and here's how they show in oleObject1.bin and oleObject2.bin when viewed in a text viewer:
The icons that display for the packages are in *.WMF (Windows Metafile) image format in the folder:
\extracted_folder\Word\media\image1.wmf
\extracted_folder\Word\media\image2.wmf
They are just the IE URL icon captioned "Haiku.url" and "BB Online -- Tradewinds.url".

The basic text formatting for an Internet Shortcut file (a *.URL file) as you would find in your Favorites folder if you browsed to it and opened one in a text editor, is like this, although Internet Explorer adds a whole load of other crap to it:

[DEFAULT]
BASEURL=http://www.bbonline.com/hi/haiku.html
[InternetShortcut]
URL=http://www.bbonline.com/hi/haiku.html
IDList=
IconFile=http://www.bbonline.com/favicon.ico
IconIndex=1
[{000214A0-0000-0000-C000-000000000046}]
Prop3=19,2
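
Because a *.URL file is laid out like an INI file, the target address can be read with a standard INI parser. The sketch below uses Python's configparser on the sample text above; the function name is hypothetical, and it assumes the file has an [InternetShortcut] section with a URL key, as in that sample.

```python
import configparser

# The sample Internet Shortcut text quoted above.
SAMPLE_SHORTCUT = """\
[DEFAULT]
BASEURL=http://www.bbonline.com/hi/haiku.html
[InternetShortcut]
URL=http://www.bbonline.com/hi/haiku.html
IDList=
IconFile=http://www.bbonline.com/favicon.ico
IconIndex=1
[{000214A0-0000-0000-C000-000000000046}]
Prop3=19,2
"""


def read_shortcut_url(text):
    """Return the URL= value from Internet Shortcut text, or None if absent."""
    parser = configparser.ConfigParser(
        delimiters=("=",),   # only "=" separates key and value, so "http:" is safe
        interpolation=None,  # URLs may contain "%", which must not be expanded
        strict=False,        # tolerate duplicate keys or sections
    )
    parser.read_string(text)
    return parser.get("InternetShortcut", "URL", fallback=None)


print(read_shortcut_url(SAMPLE_SHORTCUT))
```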

From what I can see and guess, you dragged and dropped a Favorite from Internet Explorer or from your Desktop or other folder right into a Microsoft Works document and it embedded it as a packaged object, or else whatever software converted it has converted a standard hyperlink to a packaged binary object.  I think it is the first possibility.

Look at the screenshots above and you will see "Package" mentioned, then the path to your "Haiku.url" file:
C:\DOCUME~1\ADMINI~1\FAVORI~1\Haiku.url
R:\Temp\Haiku (6).url

Here's the object details if I open your Test.docx in LibreOffice writer and double-click on the Tradewinds packaged icon.  LibreOffice doesn't know what to do with it, so it opens it as plain text:

BB On Line -- Tradewinds.urlC:\WINDOWS\FAVORI~1\BBONLI~1.URL  R:\Temp\BBONLI~1 (9).URL
[InternetShortcut]
URL=http://www.bbonline.com/hi/tradewinds/index.html
Modified=C0AD07D923D2BE0101
IconFile=http://www.bbonline.com/favicon.ico
IconIndex=1
R:\Temp\BBONLI~1 (9).URLBB On Line -- Tradewinds.url  C:\WINDOWS\FAVORI~1\BBONLI~1.URL
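
Since the shortcut text appears readable inside the package, one rough way to recover the address is to search the raw bytes for a line starting with URL=. This is a sketch under a big assumption: that the text is stored as contiguous ASCII inside the oleObject*.bin file. OLE compound files can split a stream across sectors, so a proper OLE parser would be more robust. The function name and the binary framing bytes in the sample are invented for illustration.

```python
import re

# "URL=" at the start of a line, then everything up to the line end or a NUL.
# Anchoring to the line start avoids matching keys such as BASEURL=.
URL_LINE = re.compile(rb"(?:^|[\r\n])URL=([^\r\n\x00]+)")

# Made-up stand-in for package bytes; only the shortcut text mirrors the thread.
SAMPLE_BLOB = (
    b"\x02\x00BB On Line -- Tradewinds.url\x00"
    b"C:\\WINDOWS\\FAVORI~1\\BBONLI~1.URL\x00"
    b"[InternetShortcut]\r\n"
    b"URL=http://www.bbonline.com/hi/tradewinds/index.html\r\n"
    b"Modified=C0AD07D923D2BE0101\r\n"
    b"IconFile=http://www.bbonline.com/favicon.ico\r\n"
    b"IconIndex=1\r\n"
)


def find_shortcut_urls(blob):
    """Return every URL= value found in a block of raw bytes."""
    return [
        match.group(1).decode("ascii", errors="replace").strip()
        for match in URL_LINE.finditer(blob)
    ]


print(find_shortcut_urls(SAMPLE_BLOB))
```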

It looks as though your "Temp" folder is redirected to R:\Temp and that's what was used as a temporary cache when you originally dragged and dropped it from your Favorites folder:
"C:\Documents and Settings\Administrator\Favorites"
or Internet Explorer's Address bar into MS Works.

If I look at the "relationships" file from the extracted docx:
\extracted_folder\Word\_rels\document.xml.rels
I see the following:
<Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.wmf"/>

<Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" Target="embeddings/oleObject1.bin"/>

<Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image2.wmf"/>

<Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" Target="embeddings/oleObject2.bin"/>



If I compare that to the "relationships" file that would be generated if I typed the word "Haiku" in Word and turned it into a hyperlink to http://www.bbonline.com/hi/haiku.html, I would see this:
<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" Target="http://www.bbonline.com/hi/haiku.html" TargetMode="External"/>



The two are quite different.  Clearly the hyperlinks in your converted documents are embedded packages displayed as "icons", whereas the text link in my test document is a standard hyperlink.
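
The same comparison can be made programmatically by reading the relationship types out of document.xml.rels. The following is a minimal sketch, not a tool used in this thread: the function name is hypothetical, and the sample combines entries from both documents above into one file purely for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by relationship parts in Office Open XML packages.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
OFFICE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Illustrative sample: two entries from the converted document plus the
# hyperlink entry from the test document, merged into one file.
SAMPLE_RELS = f"""
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId6" Type="{OFFICE}/image" Target="media/image1.wmf"/>
  <Relationship Id="rId7" Type="{OFFICE}/oleObject" Target="embeddings/oleObject1.bin"/>
  <Relationship Id="rId4" Type="{OFFICE}/hyperlink"
                Target="http://www.bbonline.com/hi/haiku.html" TargetMode="External"/>
</Relationships>
""".strip()


def summarize_relationships(rels_xml):
    """Return (Id, kind, Target) tuples, where kind is the last part of Type."""
    root = ET.fromstring(rels_xml)
    summary = []
    for rel in root.iter(RELS_NS + "Relationship"):
        kind = rel.get("Type", "").rsplit("/", 1)[-1]
        summary.append((rel.get("Id"), kind, rel.get("Target")))
    return summary


for rel_id, kind, target in summarize_relationships(SAMPLE_RELS):
    print(rel_id, kind, target)
```

A document whose links show up as oleObject rather than hyperlink entries holds embedded packages, which matches what was found in Test.docx.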

Do you recognise the name "SAX XML Reader 5.0"?

That's what shows as the packaged objects' properties when I open your Test.docx file, "SAX XML Reader 5.0 Object".  I think this is probably the file parser version that was used by the software to convert your WPS files to DOCX.

Out of curiosity, what version of Word are you using to open the converted documents?

I'm not sure that I can even begin to suggest a way of converting your embedded *.URL files to standard text hyperlinks, but can I suggest that if you are using Office 2003 then you try this on 2007 and 2010 before you try anything else.

 

Author Comment

by:ddantes
ID: 38768098
Thank you for your thorough analysis.  Originally, the URL shortcuts were embedded by copying a Windows Favorite shortcut to Wordpad, and from there, pasting it into a wps file.  I could not paste it directly without using Wordpad as an intermediate step.  

I am using Word 2007, having recently converted all my wps files to Word.  Much to my relief, after posting this question I found that one of my computers still opens web pages when the shortcuts in Word are clicked.  That is a Windows 7 desktop.  My Windows XP laptop won't open the URL, and displays the error message.  So, I'm not in as desperate a situation as I first thought.  Still, if there is a way to be able to work with these files on my laptop, that would be helpful.

I'm not really eager to convert the embedded shortcuts into text hyperlinks, and would prefer to leave them as icons, if there is a way to make them useful.

A sample of the original wps file with embedded shortcuts can be downloaded from www.mauitradewinds.com/Experts   I can't embed it here, because .wps is not an accepted file extension for Experts Exchange attachments.
 
LVL 38

Expert Comment

by:BillDL
ID: 38768152
Aaah, good.  Thanks for that link.  I was going to ask for an example.  I don't have Works, but can remote into someone else's computer that has Works and open the WPS file natively.  It looks like it was created in Works 2000.  I'll see what I can figure out.  Incidentally, if you ever need to attach a file that isn't accepted by Experts-Exchange, you can always change the extension to  .TXT  as long as you indicate what it needs to be renamed back to.  It may show a file mismatch notification here.

Here is the attached WPS file, so you can always remove the "Experts" folder from your site if you wish.
<EDIT: attachment removed meantime>

I found what appears to be some leftover metadata that does not show as text and may be something that you aren't really keen on others seeing.  Open the file in Notepad and you'll see what I mean.

Do all of your converted files contain links in this format, and is that the ONLY content in all of them, or is there text also?
 

Author Comment

by:ddantes
ID: 38768370
Thank you for your comments.   Works 2000 is the application which created those files.  The files contain text as well as links in that format.  I initially uploaded the wps file after converting it to .txt format.  When I downloaded it from this site and converted it back to wps, it would not open.  So I edited my comment and removed the embedded file.   I see what you mean about the metadata.  I removed the text from that file, thinking all that was left was the embedded shortcuts and their icons.  Anyhow, the remaining data is not confidential, but thank you for the heads-up.
 

Author Comment

by:ddantes
ID: 38768400
I installed Windows 7 and Office 2007 on the laptop, and the links open now.  It's only under Windows XP that the error message appears.
 
LVL 38

Expert Comment

by:BillDL
ID: 38768525
Yes, there's something screwy with XP.  It's probably the way Object Linking and Embedding (OLE) is handled in XP, which has apparently been improved in later Windows versions.

I had hoped that perhaps the WPS documents were just a method of archiving links and didn't contain text, because I saw a way that I could probably have extracted the hyperlinks from each document and created new Word documents with standard hyperlinked images.  My idea wouldn't work with additional text in the documents.  I'm presuming that your preference for retaining hyperlinked images is probably so that the original layout (page breaks, etc) is maintained.

I think I'm going to have to throw in the towel for this contest.  I posted a link to this question in your other question where the expert named terencino provided you with the Macro code in the hope that he will be alerted and visit here with some ideas.

By the way, if you see a UK IP Address beginning 79.78 in your website logs it was just me wondering if I could afford one night in the Star Wind ;-)
 
LVL 38

Expert Comment

by:BillDL
ID: 38768538
Thank you David.  I wasn't expecting points for throwing in the towel.  It's appreciated.  I hope you can get this all sorted out painlessly.
 

Author Comment

by:ddantes
ID: 38768604
Thank you Bill.  Guests from Oxford just left this morning, and we got a new reservation yesterday, from a honeymoon couple in York.  So, it can be done!