Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Word Document Hyperlink Checker

Posted on 2010-11-11
3
Medium Priority
?
403 Views
Last Modified: 2013-12-24
I have thousands of word documents to check for broken links. Going through these word documents manually is tedious and error prone.

I have tried AbleBits.com's free Document Hyperlink Checker, but it is not able to detect broken links, although it claims it does.

Any ideas?

Thanks.
0
Comment
Question by:jeremyll
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
3 Comments
 
LVL 20

Expert Comment

by:Proculopsis
ID: 34118628

Batch convert all the documets into HTML file format (Save As...) and then subject them to one of the many Web Link Validator tools.
0
 

Author Comment

by:jeremyll
ID: 34129846
Thanks for the suggestion.

I tried that and tried to find the most popular link validator tool which I think is, LinkChecker 0.6.6
by Kevin Freitas. I found this difficult to use because while it highlighted broken links, it's very difficult to do for the documents that I was checking. The documents contained about 5000-10000 words and probably only 10 links. So scrolling every line for any highlighted links can take a very long time.

Any other suggestions?
0
 
LVL 20

Accepted Solution

by:
Proculopsis earned 2000 total points
ID: 34136797

This won't help much but it's a Word document saved as html format and with the inclusion of the small script at the beginning, all the links are easily exposed.  

<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>

<script>
window.onload = function()
{
  var link = document.getElementsByTagName('A');
  for ( var i = 0; i < link.length; i++ )
  {
    alert( link[i].innerHTML );
  }
};
</script>

<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 11">
<meta name=Originator content="Microsoft Word 11">
<link rel=File-List href="Expert_files/filelist.xml">
<title>Experts Exchange</title>
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author>Keith</o:Author>
  <o:LastAuthor>Keith</o:LastAuthor>
  <o:Revision>1</o:Revision>
  <o:TotalTime>3</o:TotalTime>
  <o:Created>2010-11-15T14:36:00Z</o:Created>
  <o:LastSaved>2010-11-15T14:39:00Z</o:LastSaved>
  <o:Pages>1</o:Pages>
  <o:Words>27</o:Words>
  <o:Characters>159</o:Characters>
  <o:Company>.</o:Company>
  <o:Lines>1</o:Lines>
  <o:Paragraphs>1</o:Paragraphs>
  <o:CharactersWithSpaces>185</o:CharactersWithSpaces>
  <o:Version>11.9999</o:Version>
 </o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:SpellingState>Clean</w:SpellingState>
  <w:GrammarState>Clean</w:GrammarState>
  <w:PunctuationKerning/>
  <w:ValidateAgainstSchemas/>
  <w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
  <w:IgnoreMixedContent>false</w:IgnoreMixedContent>
  <w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
  <w:Compatibility>
   <w:BreakWrappedTables/>
   <w:SnapToGridInCell/>
   <w:WrapTextWithPunct/>
   <w:UseAsianBreakRules/>
   <w:DontGrowAutofit/>
  </w:Compatibility>
  <w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
 </w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:LatentStyles DefLockedState="false" LatentStyleCount="156">
 </w:LatentStyles>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{mso-style-parent:"";
	margin:0cm;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:12.0pt;
	font-family:"Times New Roman";
	mso-fareast-font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;
	text-underline:single;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;
	text-underline:single;}
@page Section1
	{size:595.3pt 841.9pt;
	margin:72.0pt 90.0pt 72.0pt 90.0pt;
	mso-header-margin:35.4pt;
	mso-footer-margin:35.4pt;
	mso-paper-source:0;}
div.Section1
	{page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
 /* Style Definitions */
 table.MsoNormalTable
	{mso-style-name:"Table Normal";
	mso-tstyle-rowband-size:0;
	mso-tstyle-colband-size:0;
	mso-style-noshow:yes;
	mso-style-parent:"";
	mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
	mso-para-margin:0cm;
	mso-para-margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:10.0pt;
	font-family:"Times New Roman";
	mso-ansi-language:#0400;
	mso-fareast-language:#0400;
	mso-bidi-language:#0400;}
</style>
<![endif]--><!--[if gte mso 9]><xml>
 <o:shapedefaults v:ext="edit" spidmax="2050"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <o:shapelayout v:ext="edit">
  <o:idmap v:ext="edit" data="1"/>
 </o:shapelayout></xml><![endif]-->
</head>

<body lang=EN-GB link=blue vlink=purple style='tab-interval:36.0pt'>

<div class=Section1>

<p class=MsoNormal><a href="http://time.gov/">http://time.gov</a></p>

<p class=MsoNormal><a href="http://www.experts-exchange.com/">http://www.experts-exchange.com</a></p>

<p class=MsoNormal><a href="http://www.w3c.org/">http://www.w3c.org</a></p>

<p class=MsoNormal><o:p>&nbsp;</o:p></p>

</div>

</body>

</html>

Open in new window

0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

You need to know the location of the Office templates folder, so that when you create new templates, they are saved to that location, and thus are available for selection when creating new documents.  The steps to find the Templates folder path are …
This article is written by John Gates, CISSP. Gates, the SNUG President-Elect, currently holds the position of Manager of Information Systems at Lake Park High School in Roselle, Illinois.
Articles on a wide range of technology and professional topics are available on Experts Exchange. These resources are written by members, for members, and can be written about any topic you feel passionate about. Learn how to best write an article t…
Learn how to create and modify your own paragraph styles in Microsoft Word. This can be helpful when wanting to make consistently referenced styles throughout a document or template.

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question