Solved

Word Document Hyperlink Checker

Posted on 2010-11-11
3
359 Views
Last Modified: 2013-12-24
I have thousands of word documents to check for broken links. Going through these word documents manually is tedious and error prone.

I have tried AbleBits.com's free Document Hyperlink Checker, but it is not able to detect broken links, although it claims it does.

Any ideas?

Thanks.
0
Comment
Question by:jeremyll
  • 2
3 Comments
 
LVL 20

Expert Comment

by:Proculopsis
Comment Utility

Batch convert all the documets into HTML file format (Save As...) and then subject them to one of the many Web Link Validator tools.
0
 

Author Comment

by:jeremyll
Comment Utility
Thanks for the suggestion.

I tried that and tried to find the most popular link validator tool which I think is, LinkChecker 0.6.6
by Kevin Freitas. I found this difficult to use because while it highlighted broken links, it's very difficult to do for the documents that I was checking. The documents contained about 5000-10000 words and probably only 10 links. So scrolling every line for any highlighted links can take a very long time.

Any other suggestions?
0
 
LVL 20

Accepted Solution

by:
Proculopsis earned 500 total points
Comment Utility

This won't help much but it's a Word document saved as html format and with the inclusion of the small script at the beginning, all the links are easily exposed.  

<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>

<script>
window.onload = function()
{
  var link = document.getElementsByTagName('A');
  for ( var i = 0; i < link.length; i++ )
  {
    alert( link[i].innerHTML );
  }
};
</script>

<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 11">
<meta name=Originator content="Microsoft Word 11">
<link rel=File-List href="Expert_files/filelist.xml">
<title>Experts Exchange</title>
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author>Keith</o:Author>
  <o:LastAuthor>Keith</o:LastAuthor>
  <o:Revision>1</o:Revision>
  <o:TotalTime>3</o:TotalTime>
  <o:Created>2010-11-15T14:36:00Z</o:Created>
  <o:LastSaved>2010-11-15T14:39:00Z</o:LastSaved>
  <o:Pages>1</o:Pages>
  <o:Words>27</o:Words>
  <o:Characters>159</o:Characters>
  <o:Company>.</o:Company>
  <o:Lines>1</o:Lines>
  <o:Paragraphs>1</o:Paragraphs>
  <o:CharactersWithSpaces>185</o:CharactersWithSpaces>
  <o:Version>11.9999</o:Version>
 </o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:SpellingState>Clean</w:SpellingState>
  <w:GrammarState>Clean</w:GrammarState>
  <w:PunctuationKerning/>
  <w:ValidateAgainstSchemas/>
  <w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
  <w:IgnoreMixedContent>false</w:IgnoreMixedContent>
  <w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
  <w:Compatibility>
   <w:BreakWrappedTables/>
   <w:SnapToGridInCell/>
   <w:WrapTextWithPunct/>
   <w:UseAsianBreakRules/>
   <w:DontGrowAutofit/>
  </w:Compatibility>
  <w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
 </w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:LatentStyles DefLockedState="false" LatentStyleCount="156">
 </w:LatentStyles>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{mso-style-parent:"";
	margin:0cm;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:12.0pt;
	font-family:"Times New Roman";
	mso-fareast-font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;
	text-underline:single;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;
	text-underline:single;}
@page Section1
	{size:595.3pt 841.9pt;
	margin:72.0pt 90.0pt 72.0pt 90.0pt;
	mso-header-margin:35.4pt;
	mso-footer-margin:35.4pt;
	mso-paper-source:0;}
div.Section1
	{page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
 /* Style Definitions */
 table.MsoNormalTable
	{mso-style-name:"Table Normal";
	mso-tstyle-rowband-size:0;
	mso-tstyle-colband-size:0;
	mso-style-noshow:yes;
	mso-style-parent:"";
	mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
	mso-para-margin:0cm;
	mso-para-margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:10.0pt;
	font-family:"Times New Roman";
	mso-ansi-language:#0400;
	mso-fareast-language:#0400;
	mso-bidi-language:#0400;}
</style>
<![endif]--><!--[if gte mso 9]><xml>
 <o:shapedefaults v:ext="edit" spidmax="2050"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <o:shapelayout v:ext="edit">
  <o:idmap v:ext="edit" data="1"/>
 </o:shapelayout></xml><![endif]-->
</head>

<body lang=EN-GB link=blue vlink=purple style='tab-interval:36.0pt'>

<div class=Section1>

<p class=MsoNormal><a href="http://time.gov/">http://time.gov</a></p>

<p class=MsoNormal><a href="http://www.experts-exchange.com/">http://www.experts-exchange.com</a></p>

<p class=MsoNormal><a href="http://www.w3c.org/">http://www.w3c.org</a></p>

<p class=MsoNormal><o:p>&nbsp;</o:p></p>

</div>

</body>

</html>

Open in new window

0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Suggested Solutions

The Quality Assurance engineer of an Agile scrum team must "own" the acceptance criteria for sprint tasks.
CSS is a visual language used to classify objects and define rules about how they should be displayed. CSS skills aren’t restricted to developers anymore, there is a big benefit to having a basic understanding of the language, regardless of your occ…
This video walks the viewer through the process of creating an MLA formatted document, as well as a bibliography with citations.
This Micro Tutorial well show you how to find and replace special characters in Microsoft Word. This is similar to carriage returns to convert columns of values from Microsoft Excel into comma separated lists.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now