  • Status: Solved

Word Document Hyperlink Checker

I have thousands of Word documents to check for broken links. Going through them manually is tedious and error-prone.

I have tried AbleBits.com's free Document Hyperlink Checker, but it fails to detect broken links, although it claims to.

Any ideas?

Thanks.
Asked by: jeremyll
Proculopsis commented:

Batch convert all the documents into HTML file format (Save As...) and then run them through one of the many web link validator tools.
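
Once the documents are saved as HTML, the links can also be pulled out with a few lines of script before any validator is involved. The following is only a rough sketch for Node.js, not a finished tool: the function name `extractLinks` is made up for illustration, and it assumes that a regular expression is good enough for Word's fairly regular HTML output (a real HTML parser would be more robust).

```javascript
// Hypothetical helper: collect the href of every <a> tag in a Word document saved as HTML.
// Assumption: a regex is sufficient for Word's output; it is not a general HTML parser.
function extractLinks(html) {
  const links = [];
  // Matches <a ... href=...> where the value is double-quoted, single-quoted, or unquoted.
  const pattern = /<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // Exactly one of the three capture groups is set, depending on the quoting style.
    links.push(match[1] ?? match[2] ?? match[3]);
  }
  return links;
}

// Possible usage (the file name is a placeholder):
//   const fs = require("fs");
//   const html = fs.readFileSync("Expert.htm", "latin1"); // Word saved this file as windows-1252
//   console.log(extractLinks(html).join("\n"));
```

Bookmarks such as `<a name="...">` have no `href` and are skipped, and `<link href=...>` tags in the header are ignored because the pattern only accepts tags named `a`.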
jeremyll (author) commented:
Thanks for the suggestion.

I tried that and looked for the most popular link validator tool, which I think is LinkChecker 0.6.6 by Kevin Freitas. It does highlight broken links, but it was difficult to use on the documents I was checking. Each document contains about 5,000-10,000 words and probably only 10 links, so scrolling through every line looking for highlighted links can take a very long time.

Any other suggestions?
Proculopsis commented:

This won't help much, but here is a Word document saved in HTML format. With the small script included at the beginning, all the links are easily exposed.

<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>

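<script>
// After the page loads, pop up the text and target of every hyperlink in the document.
window.onload = function()
{
  var links = document.getElementsByTagName('a');
  for ( var i = 0; i < links.length; i++ )
  {
    // innerHTML is the visible link text; href is the address it actually points to.
    alert( links[i].innerHTML + ' -> ' + links[i].href );
  }
};
</script>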

<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 11">
<meta name=Originator content="Microsoft Word 11">
<link rel=File-List href="Expert_files/filelist.xml">
<title>Experts Exchange</title>
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author>Keith</o:Author>
  <o:LastAuthor>Keith</o:LastAuthor>
  <o:Revision>1</o:Revision>
  <o:TotalTime>3</o:TotalTime>
  <o:Created>2010-11-15T14:36:00Z</o:Created>
  <o:LastSaved>2010-11-15T14:39:00Z</o:LastSaved>
  <o:Pages>1</o:Pages>
  <o:Words>27</o:Words>
  <o:Characters>159</o:Characters>
  <o:Company>.</o:Company>
  <o:Lines>1</o:Lines>
  <o:Paragraphs>1</o:Paragraphs>
  <o:CharactersWithSpaces>185</o:CharactersWithSpaces>
  <o:Version>11.9999</o:Version>
 </o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:SpellingState>Clean</w:SpellingState>
  <w:GrammarState>Clean</w:GrammarState>
  <w:PunctuationKerning/>
  <w:ValidateAgainstSchemas/>
  <w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
  <w:IgnoreMixedContent>false</w:IgnoreMixedContent>
  <w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
  <w:Compatibility>
   <w:BreakWrappedTables/>
   <w:SnapToGridInCell/>
   <w:WrapTextWithPunct/>
   <w:UseAsianBreakRules/>
   <w:DontGrowAutofit/>
  </w:Compatibility>
  <w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
 </w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:LatentStyles DefLockedState="false" LatentStyleCount="156">
 </w:LatentStyles>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{mso-style-parent:"";
	margin:0cm;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:12.0pt;
	font-family:"Times New Roman";
	mso-fareast-font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;
	text-underline:single;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;
	text-underline:single;}
@page Section1
	{size:595.3pt 841.9pt;
	margin:72.0pt 90.0pt 72.0pt 90.0pt;
	mso-header-margin:35.4pt;
	mso-footer-margin:35.4pt;
	mso-paper-source:0;}
div.Section1
	{page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
 /* Style Definitions */
 table.MsoNormalTable
	{mso-style-name:"Table Normal";
	mso-tstyle-rowband-size:0;
	mso-tstyle-colband-size:0;
	mso-style-noshow:yes;
	mso-style-parent:"";
	mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
	mso-para-margin:0cm;
	mso-para-margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:10.0pt;
	font-family:"Times New Roman";
	mso-ansi-language:#0400;
	mso-fareast-language:#0400;
	mso-bidi-language:#0400;}
</style>
<![endif]--><!--[if gte mso 9]><xml>
 <o:shapedefaults v:ext="edit" spidmax="2050"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <o:shapelayout v:ext="edit">
  <o:idmap v:ext="edit" data="1"/>
 </o:shapelayout></xml><![endif]-->
</head>

<body lang=EN-GB link=blue vlink=purple style='tab-interval:36.0pt'>

<div class=Section1>

<p class=MsoNormal><a href="http://time.gov/">http://time.gov</a></p>

<p class=MsoNormal><a href="http://www.experts-exchange.com/">http://www.experts-exchange.com</a></p>

<p class=MsoNormal><a href="http://www.w3c.org/">http://www.w3c.org</a></p>

<p class=MsoNormal><o:p>&nbsp;</o:p></p>

</div>

</body>

</html>
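
The script in the page above only displays the links; it does not test them. As a possible next step, here is a hedged sketch of how the addresses could be checked from Node.js. The function name `findBrokenLinks` and the `fetchFn` parameter are assumptions made for this example. It relies on the global `fetch` that ships with Node 18 and later, and it would not work as-is inside the saved page itself, because browsers block most cross-site requests.

```javascript
// Sketch: return the URLs that fail to respond, each with a short reason.
// `fetchFn` is injectable so the logic can be tried without network access;
// by default it is the global fetch (assumed available, e.g. Node 18+).
async function findBrokenLinks(urls, fetchFn = fetch) {
  const broken = [];
  for (const url of urls) {
    try {
      // HEAD avoids downloading the body. Some servers reject HEAD requests,
      // so a failure here is a hint to look closer, not proof the link is dead.
      const response = await fetchFn(url, { method: "HEAD" });
      if (!response.ok) {
        broken.push({ url, reason: `HTTP ${response.status}` });
      }
    } catch (error) {
      // DNS failures, refused connections, and similar errors end up here.
      broken.push({ url, reason: String(error.message || error) });
    }
  }
  return broken;
}

// Possible usage with the three links from the sample document:
//   findBrokenLinks(["http://time.gov/", "http://www.experts-exchange.com/", "http://www.w3c.org/"])
//     .then((broken) => console.log(broken));
```

Because only the failing links are returned, a document with thousands of words and a handful of links produces a short list instead of a page to scroll through.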
