I have transcribed a novel of about 400 pages into an ordinary Word document. (In fact, I have dozens of such transcribed novels.)
I have a program that searches for occurrences of certain keywords in the text. The program then records (a) the word (with punctuation), (b) a few preceding words and (c) a few following words for context. I would also like to record where those keywords appear in the printed book that is the source of the Word document. I have put marks (e.g., |p 2;) in the Word document to show the start of each printed page, but those marks could interfere with searches for strings that span more than one page.
I have therefore written a program that goes through the document and records the number of characters between the top of the document and the start of each new page. That program searches for each page mark, notes the page number, deletes the mark, and uses “Selection.Range.Start” to get the distance between the top of the document and the start of the given page. (Note: It also records the first few words on the given page just to confirm the data are correct, but that is a separate task from the one being described here.) I then have two pieces of information to store for each page of the original text: the page number and its location.
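To make that procedure concrete, here is a minimal sketch of the same idea in Python rather than Word VBA. It assumes the marks always look like the example above (|p 2;) and that the text is available as a plain string; the names PAGE_MARK and strip_page_marks are made up for illustration. Offsets in a Python string will not necessarily match what Selection.Range.Start reports in Word, so this only illustrates the bookkeeping.

```python
import re

# Assumed mark format, based on the example "|p 2;" given above.
PAGE_MARK = re.compile(r"\|p\s*(\d+);")

def strip_page_marks(text):
    """Remove page marks and return (clean_text, pages).

    pages is a list of (page_number, start_offset) pairs, where
    start_offset is measured in the text AFTER the marks are removed.
    """
    pages = []
    pieces = []
    clean_len = 0   # length of the mark-free text built so far
    last_end = 0    # position just past the previous mark
    for m in PAGE_MARK.finditer(text):
        chunk = text[last_end:m.start()]
        pieces.append(chunk)
        clean_len += len(chunk)
        # The page begins where its mark used to be.
        pages.append((int(m.group(1)), clean_len))
        last_end = m.end()
    pieces.append(text[last_end:])
    return "".join(pieces), pages

sample = "Title |p 2;second page |p 3;third page"
clean, pages = strip_page_marks(sample)
```

Because each offset is counted in the text with the marks already removed, the recorded locations stay valid for later searches in the cleaned document.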
I would then like to set up some sort of look-up table to find the page on which any given word appears. For example,
If page 2 starts at 1,200 characters from the top of the document and
Page 3 starts at 2,300 characters,
Then if Selection.Range.Start tells me that a word begins at 2,000 characters from the top of the document, I would like the computer to be able to say the word can be found on page 2 of the original text, because its location is at least 1,200 but less than 2,300 characters.
I would like to avoid having to go through up to 400 searches for any given location just to find the correct page number.
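One common way to avoid checking every page is a binary search over the sorted list of page starts, which needs roughly nine comparisons for 400 pages instead of up to 400. The sketch below is in Python, not Word VBA, and uses the standard bisect module; the function name page_for_offset and the two parallel lists are assumptions for illustration, filled with the example figures from above.

```python
from bisect import bisect_right

# Parallel lists, sorted by starting offset (example figures from the text).
page_starts = [1200, 2300]   # character offset where each page begins
page_numbers = [2, 3]        # printed page number for each offset

def page_for_offset(offset, starts, numbers):
    """Return the page whose start is the largest one <= offset.

    Returns None if the offset lies before the first recorded page start.
    """
    i = bisect_right(starts, offset) - 1
    if i < 0:
        return None
    return numbers[i]

page = page_for_offset(2000, page_starts, page_numbers)  # page 2
```

The same logic could be written as a short loop over an array in VBA: keep a low and a high index, compare the word's location with the start of the middle page, and halve the range each time.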
Thanks for any easy solution to the problem. If that is impossible, thanks for any moderately difficult solution.
JRA in Priddis