Question

Automatic Linking of Pre-defined Strings on HTML page (500pts)

Asked by: daniel_sirvera

This may be a rather tall order, but I've not found a request phrased quite like this in the archives and I wish to be as descriptive as I can with what I'm looking for...  In short, I'd like to make use of a feature very similar to the "Intellitext" technology found on this and various other sites for an integrated glossary I'm trying to build for educational purposes.

Suppose we could upload a certain source file onto the server (I've no idea what scripting language would be best for this); this file would contain an easily editable listing of pre-defined WORDS or PHRASES along with URLs to their definitions located elsewhere on the web site. Now further suppose a snippet of code pointing to this source file could be inserted between the <head> or <body> tags of each HTML page on the web site. The end effect would be a scan of each HTML page for those predefined strings and automatically linking each WORD or PHRASE to its anchor on another HTML web page.  This feature would also be smart enough to "know" that if a WORD or PHRASE is found already within an HTML tag, it won't mangle that code by attempting to nest another link within a link, as it were... (for this reason, I've found various Search & Replace utilities to be inadequate)

I'd also like to be able to edit the "source file" to include a unique tooltip element for each WORD or PHRASE.

This would save me HOURS worth of work if I could make use of this resource. Thank you in advance for helping me with this!

This question has been solved and verified by the asker.

Asked On: 2005-08-05 at 13:30:03 (ID: 21517532)
Tags: html, linking
Topic: Web Languages/Standards
Participating Experts: 1
Points: 500
Comments: 9

Answers

 

by: rdivilbiss  Posted on 2005-08-05 at 17:19:50  ID: 14612571

What I would do is create a list of these words and their associated tooltips, which I would keep in a database and retrieve into a client side array when the page loads.

However, let's assume you do not have a database...then you can do this all client side and you won't have to modify all your pages, except to add a couple of lines to each.

See live example: http://www.rodsdot.com/ee/highlightGlossaryTerms.asp

You can put all the search terms in a single file along with the JavaScript function to highlight those terms.  That file can be included in each page with:

<script type="text/javascript" src="glossary.js"></script>

then at the bottom of each page, just before the closing body tag </BODY> add,

<script type="text/javascript">highlightGlossaryTerms();</script>

Just two lines.

If you want to, you can use server side code to retrieve the glossary terms from a database, but really, that probably will not be faster, as a JavaScript include file is just text and loads very fast, even if you had thousands of terms.
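
For illustration, here is a rough sketch of what such a glossary.js include might contain. The actual file behind the live example is not shown in this post, so the layout is an assumption: the entries are borrowed from the version posted later in this thread, and the helper name linkTerms is made up for this sketch.

```javascript
// glossary.js (hypothetical contents; the real include file is not shown here)
// Each entry: [term, tooltip text, URL of the definition].
var terms = [
  ['FBI', 'Federal Bureau of Investigation', 'definition.asp#fbi'],
  ['e-mail', 'electronic mail', 'definition.asp#email']
];

// Pure helper (name is an assumption): returns html with every occurrence
// of each term wrapped in a link that carries the tooltip as its title.
function linkTerms(html, termList) {
  for (var i = 0; i < termList.length; i++) {
    var anchor = '<a href="' + termList[i][2] + '" class="terms" title="' +
                 termList[i][1] + '">' + termList[i][0] + '</a>';
    // split on the term, then glue the pieces back together with the link
    html = html.split(termList[i][0]).join(anchor);
  }
  return html;
}

// Called from the bottom of each page. 'theExample' is the container ID
// used in the live example; substitute your own.
function highlightGlossaryTerms() {
  var el = document.getElementById('theExample');
  if (el) el.innerHTML = linkTerms(el.innerHTML, terms);
}
```

Editing the glossary then means editing only the terms array. Note that this simple approach matches plain text anywhere in the container, including inside existing tags and inside links added for an earlier term.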

Regards,
Rod

 

by: daniel_sirvera  Posted on 2005-08-05 at 19:17:24  ID: 14612796

Wow!

This is incredible...  This alone completely satisfies the question!  I've got to give this an A... Thank you so much!

Two more things though if you don't mind...

How should I edit the javascript code in your example if I wanted to forgo the tooltip altogether?
How should I edit the code if I wanted to highlight only the first instance of the term within an article?

Again, thanks for your help!

 

by: rdivilbiss  Posted on 2005-08-05 at 20:16:46  ID: 14612898

To eliminate the tool tip, we remove  title="'+terms[idx][1]+'"

function highlightGlossaryTerms(){
      var temp = document.getElementById('theExample').innerHTML;
      var tmpBody;
      for (var idx=0;idx<terms.length;idx++){
            tmpBody=temp.split(terms[idx][0]);
            temp=tmpBody[0];
            for (var jdx=1;jdx<tmpBody.length;jdx++)
                  temp+='<a href="'+terms[idx][2]+'" class="terms">'+terms[idx][0]+'<\/a>'+tmpBody[jdx];
      }
      document.getElementById('theExample').innerHTML=temp;
}

To highlight the first instance only we will just add a check to the code where we put the page back together:

just new code:
if (jdx==1)
    // first term only
    temp+='<a href="'+terms[idx][2]+'" class="terms" title="'+terms[idx][1]+'">'+terms[idx][0]+'<\/a>'+tmpBody[jdx];
else
    // have to put the term back because we split on it.
    temp+=terms[idx][0] + tmpBody[jdx];

function highlightGlossaryTerms(){
      var temp = document.getElementById('theExample').innerHTML;
      var tmpBody;
      for (var idx=0;idx<terms.length;idx++){
            tmpBody=temp.split(terms[idx][0]);
            temp=tmpBody[0];
            for (var jdx=1;jdx<tmpBody.length;jdx++) {
                  if (jdx==1)
                        // first occurrence only: wrap it in a link
                        temp+='<a href="'+terms[idx][2]+'" class="terms" title="'+terms[idx][1]+'">'+terms[idx][0]+'<\/a>'+tmpBody[jdx];
                  else
                        // put the term back because we split on it
                        temp+=terms[idx][0] + tmpBody[jdx];
            }
      }
      document.getElementById('theExample').innerHTML=temp;
}
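
As an alternative way to link only the first occurrence, the term can be located with indexOf instead of splitting on it. This is only a sketch, not the code from the live example; the function name and its parameters are made up for illustration.

```javascript
// Hypothetical helper: links only the first occurrence of one term.
// Returns the html unchanged when the term does not appear.
function linkFirstOccurrence(html, term, tooltip, url) {
  var pos = html.indexOf(term);
  if (pos === -1) return html;
  var anchor = '<a href="' + url + '" class="terms" title="' + tooltip + '">' +
               term + '</a>';
  // text before the term + link + everything after the term
  return html.substring(0, pos) + anchor + html.substring(pos + term.length);
}
```

Calling it once per glossary entry gives the same "first instance only" behaviour without needing a loop counter.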

 

by: rdivilbiss  Posted on 2005-08-05 at 20:35:07  ID: 14612941

The modifications to the original routine were written in this form without testing, but I don't think I made a typo. (But you never know until you test it <smile>).

If you confine the search to the content and put your content inside a div, like I did, this will be faster.  Change the line:

var temp = document.getElementById('theExample').innerHTML;

so 'theExample' is replaced by the ID of the container for your content.

e.g.

<HTML>
<HEAD>
...
</HEAD>
<BODY>
... menu stuff
... more navigation
<div id="mainContent">
the bulk of your page content
</div>
... footer
</BODY>
</HTML>

If you do not segregate your content to search this way, you'll be searching and replacing on the entire document and finding a search term in the head would not be good.

So, then you would have to modify that line to

var tempCollection = document.getElementsByTagName('BODY');
(and assuming only one body tag)
var temp = tempCollection[0].innerHTML;

Not pretty.  So, if you have a lot of existing pages, and you don't want to add a container div for your main content...at least add an ID to the body.

e.g. <body id="main" ....>

then you can just do:

var temp = document.getElementById('main').innerHTML;

Whatever ID you use...you need to use the same ID for each page if you want to call this as an include, and not have to modify it for each page, or modify the function to take a parameter for the container's ID.
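
A minimal sketch of the "take a parameter for the container's ID" option might look like the following. The function name, the termList argument, and the optional doc argument (a stand-in for the browser's document, handy for testing) are all assumptions made for this example.

```javascript
// Hypothetical variant: the container ID is passed in instead of hard-coded.
// termList uses the same [term, tooltip, url] layout as the terms array.
function highlightGlossaryTermsIn(containerId, termList, doc) {
  doc = doc || document; // fall back to the real page when no stand-in is given
  var el = doc.getElementById(containerId);
  if (!el) return false; // no such container on this page, nothing to do
  var temp = el.innerHTML;
  for (var idx = 0; idx < termList.length; idx++) {
    var anchor = '<a href="' + termList[idx][2] + '" class="terms" title="' +
                 termList[idx][1] + '">' + termList[idx][0] + '</a>';
    // split on the term and rejoin with the link in between
    temp = temp.split(termList[idx][0]).join(anchor);
  }
  el.innerHTML = temp;
  return true;
}
```

Each page would then call highlightGlossaryTermsIn('mainContent', terms) (or whatever ID that page uses) just before the closing body tag, and the include file itself never needs to change.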

A note about the original question....you said you didn't want to mess up links, and the original does not look for <a> tags.  The way I write my pages it is very unlikely there would be a link in the body (but not impossible).

I'll post a modification in a little while (little while meaning when I finish writing it) to avoid anchors in the body.

Regards,
Rod

 

by: daniel_sirvera  Posted on 2005-08-05 at 20:44:57  ID: 14612957


"I'll post a modification in a little while (little while meaning when I finish writing it) to avoid anchors in the body."

Amazing... you totally anticipated my next question.  I'm calling the javascript as an include, and it works like a dream.  But yes, I did find that some key terms within the titles of book recommendations are coming out somewhat mangled...  Perhaps some sort of exclusionary feature could be made.

Again, thank you for your time in this!

 

by: rdivilbiss  Posted on 2005-08-06 at 15:39:55  ID: 14615743

Skip hyperlinks:
http://www.rodsdot.com/ee/highlightGlossaryTerms1.asp

<script type="text/javascript">
<!--
var terms = new Array();
terms[0] = ['FBI','Federal Bureau of Investigation','definition.asp#fbi'];
terms[1] = ['e-mail','electronic mail','definition.asp#email'];
terms[2] = ['phony','not true, fake.','definition.asp#phony'];

function highlightGlossaryTerms(){
    // AVOID <a...>..</a> tags
      var temp = document.getElementById('theExample').innerHTML;
      var tmpBody = new Array();
      // segregate the anchors
      var cnt=0;
      var inA=false;
      var lookIn='';
      var subBody;
      var tmpStr;
      
      while (temp.length>0) {
            if ((temp.indexOf('</A>')>-1)||(temp.indexOf('<A')>-1)) {
                  if (inA) {
                        tmpBody[cnt]=temp.substr(0,temp.indexOf('</A>')+5);
                        temp=temp.substr(temp.indexOf('</A>')+4);
                        inA=false;
                        cnt+=1;
                  }else{
                        tmpBody[cnt]=temp.substr(0,temp.indexOf('<A')-1);
                        temp=temp.substr(temp.indexOf('<A')-1);
                        inA=true;
                        lookIn+=''+cnt+',';
                        cnt+=1;
                  }
            }else{
                  tmpBody[cnt]=temp;
                  temp='';
                  lookIn+=''+cnt+',';
            }      
      }            
      lookIn=lookIn.substr(0,lookIn.length-1);
      
      for (var idx=0;idx<terms.length;idx++){
            for (var hdx=0;hdx<tmpBody.length;hdx++) {      
                  if (lookIn.indexOf(hdx)>-1) {
                        subBody=tmpBody[hdx].split(terms[idx][0])
                        tmpStr=subBody[0];
                        for (var jdx=1;jdx<subBody.length;jdx++) {
                              tmpStr+='<a href="'+terms[idx][2]+'" class="terms" title="'+terms[idx][1]+'">'+terms[idx][0]+'<\/a>'+subBody[jdx];
                        }
                        tmpBody[hdx]=tmpStr;
                  }
            }
      }
      tmpStr='';
      for (var idx=0;idx<tmpBody.length;idx++)
            tmpStr+=tmpBody[idx];
      document.getElementById('theExample').innerHTML=tmpStr;
}
//-->
</script>
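
For comparison, the same "skip existing links" idea can be sketched with a regular-expression split rather than manual indexOf bookkeeping. This is a different technique from the routine above, not a cleaned-up copy of it, and the function name is made up. It assumes a JavaScript engine whose split keeps captured groups in the result, which is not true of every old browser.

```javascript
// Hypothetical alternative: link terms only in text outside <a>...</a>.
// termList uses the same [term, tooltip, url] layout as the terms array.
function linkOutsideAnchors(html, termList) {
  // The capturing group makes split() keep each anchor in the result,
  // so even indexes hold plain markup and odd indexes hold whole anchors.
  var anchorPattern = /(<a\b[^>]*>[\s\S]*?<\/a>)/i;
  for (var t = 0; t < termList.length; t++) {
    var link = '<a href="' + termList[t][2] + '" class="terms" title="' +
               termList[t][1] + '">' + termList[t][0] + '</a>';
    // Re-split for every term so links added for earlier terms are skipped too.
    var pieces = html.split(anchorPattern);
    for (var p = 0; p < pieces.length; p += 2) {
      pieces[p] = pieces[p].split(termList[t][0]).join(link);
    }
    html = pieces.join('');
  }
  return html;
}
```

Because the pattern is case-insensitive, it does not depend on whether the browser reports tag names in upper or lower case. It still matches terms inside other tags and attributes, so it is only a partial safeguard.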

 

by: daniel_sirvera  Posted on 2005-08-07 at 04:10:42  ID: 14617028

Mr Divilbiss,

This has been absolutely the most helpful javascript code I've seen yet.  I know you've been most patient with me, but I hope to share a work-around to one of the expected idiosyncrasies, and then after ask something else about this very latest feature you've incorporated into the original version.

Suppose one glossary entry was the word "format".  The "format" in inFORMATion would be linked.  One quick improvisation was to mangle the HTML text with an arbitrary (and inert) tag to render the Key Term "invisible" to the javascript function.  Something like:

info<null>rma</null>tion

I was immensely pleased to see this work. But now suppose I were to manually edit the HTML to read:

<i><a href="somerandompage.html">"The Ingenius Format:  Why This is The Best Script."</a></i>

The new version you've just posted works wonderfully in ignoring the "Format" within the title (provided there's a case-sensitive match), but the rendered page looks like this (simulating italics, of course):

The Ingenius Format: Why This is The Best Script."<

I wondered where the hanging < came from, and later discovered that whatever character immediately followed the </a> tag would be rendered visible on my browser (IE 6 & Firefox 1.03), even if that character is part of another HTML tag. If instead I placed a non-HTML related character following the </a> tag, it would simply be replicated.  For example, if the HTML were manually coded this way:

"<a href="somerandompage.html">The Ingenius Format:  Why This is The Best Script.</a>"

The rendered page would display this instead:

"The Ingenius Format: Why This is The Best Script""

The only way I've found to remove any lingering text is to just leave an extra space after the </a>.   It looks like this is something that might be fixed just by changing one small aspect of the javascript, but I can't seem to pinpoint exactly where.

Thanks for all your help.  Had I been given the option I would have done better than offer 500 pts. for your efforts.

 

by: daniel_sirvera  Posted on 2005-08-07 at 05:15:40  ID: 14617159

Does this do the trick properly?


FROM THIS:


      while (temp.length>0) {
            if ((temp.indexOf('</A>')>-1)||(temp.indexOf('<A')>-1)) {
                  if (inA) {
                        tmpBody[cnt]=temp.substr(0,temp.indexOf('</A>')+5);
                        temp=temp.substr(temp.indexOf('</A>')+4);
                        inA=false;
                        cnt+=1;
                  }else{
                        tmpBody[cnt]=temp.substr(0,temp.indexOf('<A')-1);
                        temp=temp.substr(temp.indexOf('<A')-1);
                        inA=true;
                        lookIn+=''+cnt+',';
                        cnt+=1;


TO THIS:

      while (temp.length>0) {
            if ((temp.indexOf('</A>')>-1)||(temp.indexOf('<A')>-1)) {
                  if (inA) {
                        tmpBody[cnt]=temp.substr(0,temp.indexOf('</A>')+4);
                        temp=temp.substr(temp.indexOf('</A>')+4);
                        inA=false;
                        cnt+=1;
                  }else{
                        tmpBody[cnt]=temp.substr(0,temp.indexOf('<A')-1);
                        temp=temp.substr(temp.indexOf('<A')-1);
                        inA=true;
                        lookIn+=''+cnt+',';
                        cnt+=1;

 

by: rdivilbiss  Posted on 2005-08-07 at 08:17:46  ID: 14618121

>Suppose one glossary entry was the word "format".  The "format" in inFORMATion would be linked.  One quick improvisation was to mangle the HTML text with an arbitrary (and inert) tag to render the Key Term "invisible" to the javascript function.  Something like:

You could also make your entry 'format ' (trailing space) thus forcing it to break on a whole word boundary.
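
A trailing space only guards the end of the term, so "format " would still match inside "reformat ". Another option is a regular expression with word boundaries on both sides. The sketch below is an assumption-laden illustration, not part of the original routine: the function names are made up, and \b only behaves as expected when the term begins and ends with a letter, digit, or underscore.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: links a term only where it stands as a whole word.
function linkWholeWord(html, term, tooltip, url) {
  var pattern = new RegExp('\\b' + escapeRegExp(term) + '\\b', 'g');
  var anchor = '<a href="' + url + '" class="terms" title="' + tooltip + '">' +
               term + '</a>';
  // A function is used as the replacement so that any "$" in the link
  // text is inserted literally instead of being treated as a pattern.
  return html.replace(pattern, function () { return anchor; });
}
```

With this approach the glossary entry stays as plain 'format', and the inert-tag workaround inside "information" is no longer needed.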

>The only way I've found to remove any lingering text is to just leave an extra space after the </a>.   It looks like this is something that might be fixed just by changing one small aspect of the javascript, but I can't seem to pinpoint exactly where.

Yes, you found the right area...

The Best Script."</a></i>
                 ^^^^^
                 12345

tmpBody[cnt]=temp.substr(0,temp.indexOf('</A>')+5);
temp=temp.substr(temp.indexOf('</A>')+4);

might need to be

tmpBody[cnt]=temp.substr(0,temp.indexOf('</A>')+4);
temp=temp.substr(temp.indexOf('</A>')+4);
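
The offsets can be checked in isolation. The closing tag '</A>' is four characters long, so cutting at indexOf('</A>') + 4 in both calls keeps every character exactly once: the first piece ends with the tag and the remainder starts right after it. A small sketch, with a made-up function name:

```javascript
// Hypothetical helper: splits html just after the first closing anchor tag.
// Returns [text up to and including '</A>', everything after it].
function splitAfterClosingAnchor(html) {
  var close = '</A>';
  var idx = html.indexOf(close);
  if (idx === -1) return [html, '']; // no closing tag: nothing to split off
  var end = idx + close.length;      // close.length is 4
  // Same cut point for both pieces, so no character is lost or repeated.
  return [html.substr(0, end), html.substr(end)];
}
```

Using close.length instead of a literal number avoids the off-by-one: a larger offset in the first call copies the next character twice, and a larger offset in the second call drops it.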



