Shyamulee Das asked:

Find and Replace Words (Abbreviations) in MS Word

Hi,

I want to create a macro in Word that finds the words listed in an Excel sheet and replaces them in an MS Word document.
For example, my Excel sheet contains 2 columns: the 1st column is Words_to_Find and the 2nd column is Words_to_Replace. The words in the list are abbreviations (e.g. DBS - Deep brain stimulation). I want to replace the first instance in the document: if the full-form is not written in 1st instance, then replace it.
For example: [ (DBS) has shown wide applications for treating various disorders in the central nervous system by using high frequency stimulation (HFS) sequences of electrical pulses. ]
In the line above, DBS should be replaced by Deep brain stimulation, but following the pattern of the underlined word in that sentence: the word is replaced with "full-form (abbreviation)".
I am attaching the Excel and Word documents.

Thank you.
Pinapple Pink

Are you looking for a dynamic macro, or a macro with set word functionality?
Shyamulee Das (ASKER)

I just want to find words from MS Excel and replace them in MS Word using a Word macro. I have attached the Excel sheet and the Word document above.
aikimark
I have attached

File missing
Sorry for not uploading the files!!
Abbrevation.xlsx
Doc1.docx
if the full-form is not written in 1st instance
Does the expanded text need to exist immediately before or can it exist anywhere in the document?

Do you need to replace "(DBS)" with "(Deep brain stimulation)" or prepend the replacement text?  I'm thinking your parentheses are key to proper replacement logic.
It should be Deep brain stimulation (DBS) after replacement. I need the "full-form (abbreviation)" at the first instance and the short form in the rest of the document.
So, you need to prepend the longer text when it doesn't already exist?  Please confirm my assertion.  You asked for text replacement in your problem description, but it seems that you really want to conditionally prepend the text.

What if the first instance of "DBS" is not in parentheses?  This gets back to my hunch that the parentheses matter to the proper solution.
Agreed! So wherever there is (DBS), it should become "Deep brain stimulation (DBS)", and wherever there is "DBS", only "deep brain stimulation". Is this possible? It would be a great help.

Thank you.
I've added this routine to a module in the attached document.  In order to run VBA code, the file type had to change to docm.

Please test this.
Sub Q_29118168()
    Dim oXL As Object
    Dim wkb As Object
    Dim wks As Object
    Dim vData As Variant
    Dim lngLoop As Long
    Dim rng As Range
    Dim rng2 As Range
    Dim lngFullTextStart As Long
    
    Set oXL = CreateObject("excel.application")
    Set wkb = oXL.workbooks.Open(ActiveDocument.Path & "\Abbrevation.xlsx")
    Set wks = wkb.worksheets("sheet1")
    
    vData = wks.usedrange.Value
    
    For lngLoop = 2 To UBound(vData)
        Set rng = ActiveDocument.Content
        
        With rng.Find
            '(try to) find the full text
            .Execute _
                findtext:=vData(lngLoop, 2) & " (" & vData(lngLoop, 1) & ")", _
                MatchCase:=False
            lngFullTextStart = rng.Start
            'Debug.Print .Found, lngFullTextStart, vData(lngLoop, 2) & " (" & vData(lngLoop, 1) & ")"
            
            If .Found Then
                'Is this the first abbrev instance?  Compare start values
                Set rng2 = ActiveDocument.Content
                With rng2.Find
                    .Execute _
                        findtext:="(" & vData(lngLoop, 1) & ")", _
                        MatchCase:=False
                    Debug.Print .Found, rng2.Start, "(" & vData(lngLoop, 1) & ")"
                End With
                
                If rng2.Start < lngFullTextStart Then
                    'not the first instance, so we prepend the full text to the first
                    'instance of the abbreviation
                    Set rng2 = ActiveDocument.Content
                    With rng2.Find
                        .Execute _
                            findtext:="(" & vData(lngLoop, 1) & ")", _
                            MatchCase:=False, _
                            replacewith:=vData(lngLoop, 2) & " (" & vData(lngLoop, 1) & ")", _
                            Replace:=wdReplaceOne
'                        Debug.Print .Found, rng2.Start, "(" & vData(lngLoop, 1) & ")"
                    End With
                End If
            Else
'                'replace the first abbrev instance, if it exists
                .Execute _
                    findtext:="(" & vData(lngLoop, 1) & ")", _
                    MatchCase:=False, _
                    replacewith:=vData(lngLoop, 2) & " (" & vData(lngLoop, 1) & ")", _
                    Replace:=wdReplaceOne
            End If
        End With
    Next
    
    'Stop
    
    Set wks = Nothing
    wkb.Close False     'close the workbook without saving or prompting
    Set wkb = Nothing
    oXL.Quit
    Set oXL = Nothing
    
End Sub

Q_29118168.docm
But all my files are in docx format and i don't want to change the format..
It is only replacing the first instance? and it is showing error and the macro breaks!! Can you please highlight the change?
I want to change the first instance for now in the doc!! Which ever word is listed in the excel their 1st instance should be changed and rest of the word should be in abbreviation format!!
Sorry for the confusion!

The output I am getting is "Deep brain stimulation (DBS)(DBS)" & I want it as "Deep brain stimulation (DBS)" .
The last word  listed in the excel is not changing in the doc!!
i don't want to change the format
So, your problem scope has expanded to multiple Word documents  :-(
Please explain all the constraints in future problem descriptions

Once we agree on the behavior of the code, I can add some code to iterate the docx files in a folder.  You will need some VBA or VBScript environment for any solution.  This could even be the Excel workbook, which would need to be an xlsm format.
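
The folder iteration mentioned above might look roughly like the following sketch. It is not code from this thread: `ProcessFolder`, `folderPath` and the example path are hypothetical. It assumes the `Q_29118168` macro lives somewhere that stays loaded (for example Normal.dotm, or a docm that remains open) and that Abbrevation.xlsx sits in the same folder as the documents, because the macro builds the workbook path from `ActiveDocument.Path`.

```vba
' Sketch only: ProcessFolder and folderPath are hypothetical names.
Sub ProcessFolder(ByVal folderPath As String)
    Dim docNames As New Collection
    Dim docName As String
    Dim entry As Variant
    Dim doc As Document

    If Right$(folderPath, 1) <> "\" Then folderPath = folderPath & "\"

    ' Collect the file names first, so that saving documents
    ' cannot disturb the Dir enumeration
    docName = Dir(folderPath & "*.docx")
    Do While Len(docName) > 0
        docNames.Add docName
        docName = Dir()
    Loop

    For Each entry In docNames
        Set doc = Documents.Open(FileName:=folderPath & entry)
        doc.Activate                    ' the macro works on ActiveDocument
        Q_29118168
        doc.Close SaveChanges:=wdSaveChanges
    Next entry
End Sub
```

It could be called from the Immediate window with something like `ProcessFolder "C:\Docs\Abbreviations"`. Because the macro opens and closes Excel on every call, a larger batch would probably want the workbook read once and passed in instead.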

It is only replacing the first instance? and it is showing error and the macro breaks!! Can you please highlight the change?
In the sample files you provided, the code only changes the first instance.

What error messages are you seeing?
What do you mean by "breaks"?  (when, where, how, what input file(s))

What purpose is highlighting?

and rest of the word should be in abbreviation format
Please post two representative Word documents of the "before" (current state) and "after" (desired state)

The output I am getting is "Deep brain stimulation (DBS)(DBS)"
I did not get that in my run.  Did you run the code twice?  It is possible that the code might behave differently with a different document, one that was changed by a prior invocation of the code.  I didn't test this.  Once the code runs, it shouldn't need to alter that document again.  If you don't have backup copies of the document, please download the one I posted, make a copy, and test with copy.  If you want to save yourself some time, delete the copy from the prior test and make a new copy of the downloaded file.

You can compare the copy with the downloaded document using Word's compare function.  That might be the highlighting you seek.

The last word  listed in the excel is not changing in the doc!!
What is the "last word" in the workbook?
Are you using the same workbook as the one you posted earlier?

===========================
You're using a lot of exclamation marks (!).  Is that to express surprise or anger?  I realize we're using different English dialects, so I didn't want to miss any subtext or hidden meaning with your punctuation.
First of all, sorry for the exclamation marks (!); it's just my habit. :)

The macro is running great without changing the format of the Word document.
I have fixed the error of getting "Deep brain stimulation (DBS)(DBS)". It was from my end.

 I want to highlight the replaced word i.e "Deep brain stimulation (DBS)" so that it is easy to find the change.

So I will brief you again about my problem: I want to replace the first instance of every abbreviated word in the document that is listed in the Excel sheet (which is what the given macro already does).

I want you to help me out with this:
Other instances of the same words in the doc should be in abbreviated form, i.e. if it is "Deep brain stimulation" then it should become "DBS".

The last word in my excel is "APS" and now it is replacing as I refreshed everything and started the macro again.

So my only problem is replacing instances into abbreviated format.

I hope I have cleared all your doubts.

Thank you so much.
If you are using the same document as you posted and the docm version that I posted, please post a desired-state version of the original document.  Do the changes manually.  I can perform a document comparison on my PC.
I am working on the same document that I have posted.
DBS has shown wide applications for treating various disorders in the central nervous system by using HFS sequences of electrical pulses. However, upon the onset of HFS sequences, the narrow pulses could induce synchronous firing of action potentials among large populations of neurons and cause a transient phase of “onset response” that is different from the subsequent steady state. To investigate the transient onset phase, the APS were used as an electrophysiological marker to evaluate the synchronous neuronal reactions to axonal HFS in the hippocampal CA1 region of anesthetized rats. New stimulation paradigms with time-varying intensity and frequency were developed to suppress the “onset responses”. Results show that HFS paradigms with ramp-up intensity at the onset phase could suppress large APS potentials. In addition, an intensity ramp with a slower ramp-up rate or with a higher pulse frequency had greater suppression on APS amplitudes. Therefore, to reach a desired pulse intensity rapidly, a stimulation paradigm combining elevated frequency and ramp-up intensity was used to shorten the transition phase of initial HFS without evoking large APS potentials. The results of the study provide important clues for certain transient side effects of DBS and for development of new adaptive stimulation paradigms.


The words which are in bold need to be replaced like this: DBS to "Deep brain stimulation(DBS)", HFS to "high frequency stimulation(HFS)".
I am working on the same document that I have posted.
Oh.  NO.  :-(

Posting text in a comment is not the same as posting documents.  Please do not make this more difficult than it already is (has become).  We are still at the problem definition stage.

Don't present a moving target for experts.  I will wait for you to finish working on the document and post before and after versions for me.  Don't worry about the code, just post the two docx documents.
I have just posted the text inside the document for explanation. It's the same content that is available in the document.

For clarity I am uploading the document again. There is only one Word document and one Excel sheet.

Thank you.
Doc1.docx
taking deep breath....

I have asked you to post a before/current document and an after/desired document.  Do you understand this request?  Do you understand WHY I've made this request?

The document you posted was similar to the original document -- but different.  Most notably, the parentheses are missing.

If you change the conditions (document), you are moving the target and obfuscating the actual problem definition.  Don't do this.

If your most recently posted document is a representational sample of the actual documents, then post a manually edited version of this document as you need it to be.
I guess we both are not understanding each other.

The recent document that I sent is the only document I am working on.
the parentheses are missing
From the start I wanted it this way: "Deep brain stimulation(DBS)".
 
Parenthesis will not be there for 1st instances.
Input = DBS
My output = "Deep brain stimulation(DBS)".

I hope this is understandable?
In the original document (docx) you posted, there were parentheses around several of the abbreviation instances.  In the most recent version of the document, these were missing.

Let's use an analogy.  You show a picture of your kitchen to a designer or contractor and describe, in words, what you would like.  The internal image of your beautiful, newly renovated, kitchen is clear in your mind.  The designer/contractor comes back to you with a drawing of what they think you want.  You are upset because their drawing doesn't match your description.  They ask you to draw up a picture of what you want.  Instead, you produce a picture of your kitchen as it is right now, which is different than the picture you originally showed them.  The designer/contractor is very confused - your kitchen (their starting point) has magically changed.  Their plans for transitioning your existing kitchen to the kitchen you described in words won't work because the starting conditions have changed.

When I ask you to manually edit a document and post the before and after documents, I am asking you to provide me with a non-moving starting condition and desired-end-state.  If you can not do that, then let me know that you can't (and, preferably, why).  I will be unable to work on your problem for the next six hours.  Hopefully, you can edit-and-post the desired-state document.  You are welcome to craft a new "special" document with as many variations on the text as is necessary to cover all the conditions that will be found in your directory.  If you create a new document, then copy and edit the new document to your desired state.

It is perfectly fine for you to ask questions.  This question thread is a dialog for people located on opposite sides of the world, speaking different English dialects.
(Starting Point)>> Document Before Replacing  - "DBS has shown wide applications for treating various disorders"
(Desired End point)>> Document After Replacing - "Deep brain stimulation(DBS) has shown wide applications for treating various disorders"

I am attaching two documents -

The first doc contains non-edited abbreviated words eg: DBS, HFS    (Starting Point)
The second doc contains edited abbreviated words eg : Deep brain stimulation(DBS), High frequency stimulation(HFS).   (End Point)
Starting-Point.docx
End-Point.docx
Thank you.
Comments and questions:
1. The first occurrence of "DBS" in the Abstract section is clear and unambiguous.
2. Later, in the INTRODUCTION section, you repeat the association with "deep brain stimulation (DBS)".  I thought you only wanted to see the long version text the first time it appears in the document and only see the abbreviation all subsequent times.  Please clarify.
3. The first time "High Frequency Stimulation" is used in (what I assume is) the title paragraph, even though it has a normal paragraph style.  According to your stated rules, this should appear as "High Frequency Stimulation (HFS)".  Please clarify.  Does the long text have to match the case of existing text?
4. The second time "high frequency stimulation" appears is in the Keywords: section.  Just like #3, it has not been changed.  Please clarify.
5. In the INTRODUCTION section, your change of "HFS" to "high frequency stimulation" is confusing.  You are changing the abbreviation to the full text.  This was not one of your stated objectives.  I would have expected the "HFS" to be changed to "high frequency stimulation (HFS)".  Please clarify.
6. Similar to #2, the string "high frequency stimulation (HFS)" appears in the INTRODUCTION section.  Please clarify.
7. The treatment of "APS" and its long string form is clear and unambiguous.
8. I expected to see at least one instance of a long form being changed to its abbreviation.  I didn't see that.  Have I misunderstood that requirement?  Please clarify.
Only for the 1st instance I want you to change to "Deep brain stimulation(DBS)", and if and only if there is "Deep brain stimulation (DBS)" elsewhere, other than the 1st instance, then change it to "DBS". Title, Keywords section can have abbreviation. 1st instance in text i.e paragraphs(Abstract, Introduction) should be changed.
Same goes for others..
If there are "DBS" elsewhere in the doc other than 1st instance then not to change them , that is fine.
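
The rule as clarified here can be sketched on a plain string, independent of Word's Find object. This is an illustration under assumptions, not the accepted solution: `ApplyAbbreviationRule` and its parameters are hypothetical names, the expanded form is written with a space before the parenthesis (as the posted macro does), and the abbreviation is matched as a raw substring with no whole-word check, so "APS" would also match inside a longer word.

```vba
' Sketch only: hypothetical helper, works on a String rather than a Range.
Function ApplyAbbreviationRule(ByVal bodyText As String, _
                               ByVal abbr As String, _
                               ByVal longForm As String) As String
    Dim fullForm As String
    Dim pos As Long

    fullForm = longForm & " (" & abbr & ")"

    ' Step 1: collapse every existing "long form (ABBR)" to the bare
    ' abbreviation, ignoring the case of the long form
    bodyText = Replace(bodyText, fullForm, abbr, , , vbTextCompare)

    ' Step 2: expand only the first abbreviation (case-sensitive);
    ' later bare abbreviations are left untouched
    pos = InStr(1, bodyText, abbr, vbBinaryCompare)
    If pos > 0 Then
        bodyText = Left$(bodyText, pos - 1) & fullForm & _
                   Mid$(bodyText, pos + Len(abbr))
    End If

    ApplyAbbreviationRule = bodyText
End Function
```

Collapsing first and expanding second means a document that already spells out the term at its first mention ends up unchanged there apart from the capitalization taken from the workbook, which is the capitalization concern raised later in the thread.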
Isn't Ctrl H the smartest way to do this. Have 2 windows open and just go down the list.

If it is a long list or lots of docs, then use one of the many key / mouse logging macros to automate it as much as you want. This is my current favorite: www.mouserecorder.com. I haven't downloaded your docs for a look at the problem but for this kind of issue I would have two windows open and start in the top lookup value being put in A1 in Excel. Ctrl C (copy), focus the word doc, Ctrl H (find & replace), Ctrl V (paste) in top field, focus Excel, hit Right (the replacement column info) Ctrl C, focus Word, Down, Ctrl V, Enter. Focus Excel, Down, Repeat.

Bottom line is if you can do it with key strokes (preferably) and mouse movements (not so reliable) once, you can do an easy macro to repeat it. You can add a test condition to stop it, or just watch it and break / swap docs when it gets to the bottom. If there were lots of docs I would have another column with their names and have them auto opened and processed. I have left PCs running for hours doing this menial but effective housekeeping on masses of files in the past. It may be a bit kludgy but, when you get it to work, it works really well.

Just have a think about boundary conditions. Obviously getting to the end of the list, but what does it do if it doesn't find the term either once or at all going down the list (like you opened the wrong doc entirely) and other mistakes in that reality gap between belief and instructions, as guess which one the PC will do...!
@Shyamulee

Thank you for that clarification.

Please address #5 (very important) and #8 (less important).
My bad!
#5
It should be "high frequency stimulation(HFS)"
I didn't get #8; if you can elaborate, that would be good.

Thank you.
How long will this take?

8. I expected to see at least one instance of a long form being changed to its abbreviation.  I didn't see that.  Have I misunderstood that requirement?  Please clarify.
Your before and after documents show us representative samples of your replacement rules applied to document text.

From your problem descriptions, I expected that you would have included an instance in your text of full text being translated into an abbreviation.  Since you didn't include an example of this in your before/after documents, I wanted you to clarify my interpretation of your rule.

In this comment, "long form" is the (full text) string associated with an abbreviation in your Excel workbook.
Example: "high frequency stimulation" is the long form of the HFS abbreviation.

How long will this take?
Shouldn't take very long.  I think we're almost finished defining the rules you've described.  I am getting ready to leave for work, so it will be 11 hours before I can work on this.
Is there a rush?  The EE experts are volunteers.

I have a new question:
9. Are you sure that you want to apply your abbreviation substitution in the Keywords section?  This would result in a document that starts like this:

Novel Stimulation Paradigms with Temporally-Varying Parameters to Reduce Synchronous Activity at the Onset of high frequency stimulation (HFS) in Rat Hippocampus

Running title:  Novel Stimulation Paradigms for HFS
Keywords: HFS, pulse stimulation, intensity transition, frequency transition, axonal conduction block, LabVIEW software
Notice that the long text in the title is lower case.  The table of values in the Excel workbook has a mix of upper case and lower case long text strings.  In some cases, the resulting document might have undesired capitalization.  This is something I just thought of.  Sometimes I will show a client something like this in order to 'play dumb' and spark a detailed conversation.
I don't want anything to be done to the Title, Running title and Keywords.
The 1st instance should be in the Abstract or whichever paragraph comes first in the text; for example, you can set a range such as paragraphs of more than 40 words or something.

#8 The file that I sent you has Track Changes on; that's the reason.
The actual sentence is like this>> "Most stimulation therapies utilize high-frequency stimulation (HFS), though the frequency ranges of HFS have different definitions. For brain stimulations, a pulse frequency greater than 50 Hz is defined as HFS (Durand and Bikson 2001)."
It will be really good if I get the code as soon as possible.

Thank you.
The Excel sheet can be changed accordingly.
As DBS is at the starting of the sentence that is why I have kept it in Upper case.
Can I get the code today?
Can I get the code today?
not likely.  I'm at work. Just checking messages/new comments on my lunch break.
How long will the process take ? I have something important to do and because of the macro I am stuck.

Thank you.
I don't want anything to be done to the Title, Running title and Keywords.
and
Title, Keywords section can have abbreviation. 1st instance in text i.e paragraphs(Abstract, Introduction) should be changed.
So the abbreviations can exist in the Title and Keywords sections, but the code should not create those abbreviations by any actions.  Not sure what a "Running title" is.

Will ALL titles be formatted the way that your posted documents have?

If you have to do something pressing, you should tell your client that you will be delayed or else do these changes manually.

Doing your own capitalizations may be tricky.  If the replacement happens in the middle of a sentence, you are introducing unwanted caps.
So the abbreviations can exist in the Title and Keywords sections, but the code should not create those abbreviations by any actions.  Not sure what a "Running title" is.
That's why I had told you to set a Range for Text.
E.g. if paragraph.range.words.count >30  then do the changes...
As the Title, Running title (a one-line title description) and Keywords will never exceed 30 words, no changes will be made to them.
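
A minimal sketch of that word-count filter, assuming the 30-word threshold suggested above; `IsBodyParagraph`, `ListBodyParagraphs` and `minWords` are hypothetical names. Note that Word's `Words` collection also counts punctuation marks and the paragraph mark as items, so a long title containing hyphens or parentheses can come close to the threshold and the number may need tuning.

```vba
' Sketch only: hypothetical helpers for skipping short paragraphs
' such as the Title, Running title and Keywords lines.
Function IsBodyParagraph(ByVal para As Paragraph, _
                         Optional ByVal minWords As Long = 30) As Boolean
    ' Words.Count includes punctuation and the paragraph mark
    IsBodyParagraph = (para.Range.Words.Count > minWords)
End Function

Sub ListBodyParagraphs()
    Dim para As Paragraph

    For Each para In ActiveDocument.Paragraphs
        If IsBodyParagraph(para) Then
            ' Print the start position and the first 40 characters
            Debug.Print para.Range.Start, Left$(para.Range.Text, 40)
        End If
    Next para
End Sub
```

Running `ListBodyParagraphs` on a sample document before changing anything is a cheap way to check which paragraphs the threshold would treat as body text.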
Can you please explain the flow of the code that you sent me before?
ASKER CERTIFIED SOLUTION
aikimark
This solution is only available to members.
Time to go do things for which I'm paid.  Bye for now
Thank you so much..