Luiza1


How to replace %20 with "-" in Dreamweaver, only in the internal links?

I have a problem in Dreamweaver with documents that have blank spaces in their names. Dreamweaver puts %20 in the links to those documents and then no longer recognizes those links: it reports the links as broken and the files as orphaned. This is a problem because as soon as you move the document, Dreamweaver does not update the link, and then the link really stops working on the page. What is the best solution for this? I would like to replace every %20 in the internal links with "-". How can I do this in one go without changing %20 in external links?

example of internal link to be modified:
<a href="_doc/_doc/grants/grants_after_revis07/conv_cadre_partenariat_EN_no%20modif%20LS_TC%20version.DOC">[en]</a>
silemone

Search and replace.
Search and replace across the entire site, and when you come upon a match, change it to what you want. You could use a regular expression like (.)%20(.), or just searching for %20 should work too...
Luiza1

ASKER

Yes there is a find and replace option in Dreamweaver, I know that already. But we have a large website with over 4000 pages, and as I cannot do a replace all because the external links have to stay unchanged, this would mean that I have to go page by page, link by link. I am looking for a way to select only broken internal links and then do a replace all. Can anyone tell me how to do this?

example of external link NOT to be modified by the replace:
<A class=bleulien href="https://intracomm.cec.eu.int/budg/budgacc/en/accounting-modern/implementationDG/transition/guarantees%20Note%20for%20Balance%20Validation.doc"></A>
Yes, again, you could use a regular expression that only changes %20 when it is found in a relative address, as opposed to an address that starts with http: or https: (if these signify external sites...). Or, if you're using absolute addresses for internal links as well, search for a pattern that includes your site name, e.g. https://mysite/(.)%20(.)
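As a rough illustration of the "relative address" idea, here is a minimal Python sketch (Python rather than Dreamweaver's own find dialog, purely for demonstration). It assumes every external link starts with http: or https: and that href values are wrapped in double quotes; the names RELATIVE_WITH_SPACE and find_relative_links_with_spaces are made up for this example.

```python
import re

# Assumption: external links always start with http: or https:, and href
# values are double-quoted. The negative lookahead (?!https?:) rejects those.
RELATIVE_WITH_SPACE = re.compile(r'href="(?!https?:)([^"]*%20[^"]*)"', re.IGNORECASE)

# The two example links quoted in the thread.
internal = '<a href="_doc/_doc/grants/grants_after_revis07/conv_cadre_partenariat_EN_no%20modif%20LS_TC%20version.DOC">[en]</a>'
external = '<A class=bleulien href="https://intracomm.cec.eu.int/budg/budgacc/en/accounting-modern/implementationDG/transition/guarantees%20Note%20for%20Balance%20Validation.doc"></A>'


def find_relative_links_with_spaces(html):
    """Return the href values that contain %20 and are not http(s) links."""
    return RELATIVE_WITH_SPACE.findall(html)


print(find_relative_links_with_spaces(internal))  # the internal href is reported
print(find_relative_links_with_spaces(external))  # nothing is reported
```

This only finds the candidate links; it does not change anything yet.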

ASKER

OK, but what exactly does a regular expression that matches only relative links look like?
If you don't know how to write regular expressions, I will help you, but you must be able to identify a pattern that differs between internal and external links; otherwise, whatever method you choose will be page by page.

ASKER

Ok, I realise we must find a pattern. I heard of regular expressions before but I just don't see the pattern. Let's try with some examples perhaps.

External (not to be modified)

https://intracomm.cec.eu.int/budg/budgacc/en/accounting-modern/implementationDG/transition/guarantees%20Note%20for%20Balance%20Validation.doc

https://intracomm.cec.eu.int/budg/budgacc/fr/comptabilite/manual%20comptable/version%20html/Fiches%20du%20manuel/sommaire-immob-incorp.htm

Faulty links (%20 not needed at all, to be deleted)
../../leg/ir/leg-030-25_ir2003_en.html#93%20%20

Internal links (%20 to be changed to "-")

_doc/_pdf/D%2051%20du%2012012jan07_controles_comptables.pdf

_doc/_doc/closure2007/DG%20general%20closure%20instructions%202007.doc

_doc/2006/lignes%20directrices_de.doc

the _doc folders exist in many subfolders across the entire site. Normally we have a rule to keep all documents of each section in its own _doc subfolder, but sometimes people forget to do that.

Is this information clear enough, can you create a regular expression from this?
Well, you may have to do two search-and-replaces...

one with a pattern that starts like

[.][.]/(.)%20(.)

and one that starts like

_doc(.)%20(.)
If I'm correct, (.) matches any single character... it may need a + behind it.
Can you test one of your internal links using that?

ASKER

Ok, sorry but I got no results. I tried the first one: [.][.]/(.)%20(.)
and I did not get any results. I used the page with a lot of faulty links:
../../leg/ir/leg-030-13_ir2003_fr.html#43%20%20

Then I tried _doc(.)%20(.) and got no results either.

The problem with (.) is that it stands for any character, but for exactly one character and no more, while we need something that stands for any character repeated a varying number of times. Do you know if something like that exists in regular expressions?
so as my earlier post stated, you may need a + behind the (.)...

[.][.]/(.)+%20(.)+
hopefully we get something this time

ASKER

Ok, with + it finds them all.
_doc(.)+%20 also works well.
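The difference between (.) and (.)+ can be checked outside Dreamweaver. The following Python sketch uses one of the internal links listed earlier in the thread; the variable names are illustrative only.

```python
import re

link = "_doc/2006/lignes%20directrices_de.doc"

# (.) matches exactly one character, so "_doc" would have to be followed by
# a single character and then immediately by %20. That never happens here.
one_char = re.search(r"_doc(.)%20(.)", link)

# (.)+ matches one or more characters, so it can span "/2006/lignes".
many_chars = re.search(r"_doc(.)+%20", link)

print(one_char)             # None
print(many_chars.group(0))  # _doc/2006/lignes%20
```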

So we are getting the right results, but how do we do a replace all now? The search selects the whole link, from _doc up to %20, and we only need to replace the %20 part.
good question...lol...well now we have to use the regular expression substitution method...this could be tricky...

ASKER

Yes, I agree, this whole operation is tricky. Couldn't Dreamweaver provide a simpler way? I know there is a link checker option; if there were a way to select only the internal broken links there and then do a replace all, that would be ideal. Do you know if something like this is possible?

ASKER

OK, that's very good. I tried it and it works. There is still one small problem: when a link contains multiple %20, only the last one is replaced and the rest stay unmodified.
Example:
_doc/_doc/grants/grants_after_revis07/conv_sub_fonct_EN_with%20modif%20LS_TC%20version.DOC
will become:
_doc/_doc/grants/grants_after_revis07/conv_sub_fonct_EN_with%20modif%20LS_TC-version.DOC

This has to do with the search (_doc(.)+)(%20) that selects the whole link until the last %20
Any ideas how to solve this so that all %20 are immediately replaced?
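The behaviour described here follows from greedy matching: (.)+ grabs as much as it can and only backs off far enough to find one %20, which is the last one. The Python sketch below reproduces this with the example link. Note that Python writes the back-reference as \1 where Dreamweaver's replace field uses $1, and that the plain string replacement at the end is a Python-side alternative, not something offered by the Dreamweaver dialog.

```python
import re

link = "_doc/_doc/grants/grants_after_revis07/conv_sub_fonct_EN_with%20modif%20LS_TC%20version.DOC"

# Greedy (.)+ runs to the end of the text and backtracks to the LAST %20,
# so a single substitution only touches that one.
last_only = re.sub(r"(_doc(.)+)(%20)", r"\1-", link)
print(last_only)

# Once the link text itself has been isolated, a plain string replacement
# changes every %20 in one step.
every_space = link.replace("%20", "-")
print(every_space)
```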
OK... I didn't take multiple %20 into account. Back to the drawing board...
If I'm not mistaken, try this on one line:

[(_doc(.)+)(%20)]*
and see if it picks up anything in the search.
If it does, try a replace on one page with $1TheTextWeWant (obviously, change that to what you were using) and see what it does.
[(_doc(.)+)(%20)]+   Use plus instead of *.

ASKER

No, I get no results with that one.

ASKER

No, with this one [(_doc(.)+)(%20)]+ I get all the characters on the page selected that include d, o, c...
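This result is what one would expect from the syntax: square brackets define a character class, so everything between [ and ] is read as a set of single characters ( _ d o c . + % 2 0 and the parentheses) rather than as groups. A short Python check on an arbitrary phrase with no link in it (the phrase is just an example):

```python
import re

# Inside [...] the parentheses, dot and plus lose their special meaning.
# The class matches any ONE of: ( ) _ d o c . + % 2 0
pattern = r"[(_doc(.)+)(%20)]+"

# Ordinary words match wherever they contain d, o or c.
matches = re.findall(pattern, "second payment")
print(matches)  # ['co', 'd']
```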
uhhhmmmm....I'll have to review my regular expression...unfortunately i don't have dreamweaver to test the RE, but i shall return with a solution.

ASKER

Ok, thank you for your help so far. I shall check back tomorrow, hopefully to find a solution to this tricky problem.
gotcha...
I have an idea that might work, but first can you confirm the following?

All external links are http links.  (Do you have any ftp: mail: etc...)

I need to craft it first, but the idea would be to build a command file that loops through all link nodes and checks whether each one is an http link. If it is, ignore it; if not, replace every %20 with one space.

ASKER

OK, but not all external links are http links. Some link to another intranet site within our company and just use that site's folder names. Here is what our external links start with:

http:
https:
mailto:
/home/
/../home/
/sg_vista/
/security/

Does this help or is it too much to use it in the find command?
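A prefix list like this is awkward to express in a single find pattern, but it is easy to check in a script. The following Python sketch is one possible approach, not the solution used in this thread. It assumes href values are double-quoted, treats anything starting with one of the listed prefixes as external, replaces %20 with "-" in the path of every other link, and deletes %20 in the part after # (based on the "faulty links" example given earlier). The names EXTERNAL_PREFIXES, fix_href and fix_internal_links are invented for the example.

```python
import re

# Prefixes the asker listed as marking external links.
EXTERNAL_PREFIXES = ("http:", "https:", "mailto:", "/home/", "/../home/",
                     "/sg_vista/", "/security/")

# Assumption: href values are always wrapped in double quotes.
HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)


def fix_href(match):
    """Rewrite one href value; external links are returned unchanged."""
    url = match.group(2)
    if url.lower().startswith(EXTERNAL_PREFIXES):
        return match.group(0)
    path, sep, fragment = url.partition("#")
    path = path.replace("%20", "-")         # spaces in file names become "-"
    fragment = fragment.replace("%20", "")  # stray %20 after # is dropped
    return match.group(1) + path + sep + fragment + match.group(3)


def fix_internal_links(html):
    """Apply fix_href to every double-quoted href in the given HTML text."""
    return HREF.sub(fix_href, html)


print(fix_internal_links('<a href="_doc/2006/lignes%20directrices_de.doc">x</a>'))
```

Because the replacement runs on the isolated href value, every %20 in a link is handled in a single pass.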
I just got in to work...I'm back on the case...i can finish reg exp this morning...
I feared that might be the case Luiza.  My idea won't work then.  I bet you Silemone will come through with a home run though!  Good luck.
trying...

ASKER

Hi, how are you doing? Have you made any progress on a solution that replaces multiple %20 at once?

I also have a small question concerning the current regular expression: (_doc(.)+)(%20)
It works fine as long as the different links are separated by a space or enter. But as soon as there are 2 links right next to each other, they will be both selected together by the find, example:
<p> [<a href="_doc/_doc/grants/grants_after_revis07/specificactions.DOC">en</a>]</p><p><a href="_doc/_doc/grants/grants_after_revis07/second%20grant%20payment.DOC">[en]</a></p>

I don't know if that can cause problems, so I'm just letting you know. What do you think?
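The two links are selected together because (.) also matches quotes and angle brackets, so the match can run from the first _doc across the tags to a %20 in the next link. One common way to keep a match inside a single attribute value is to replace (.) with [^"], meaning "any character except a double quote". A Python sketch using the line above (assuming double-quoted href values):

```python
import re

html = ('<p> [<a href="_doc/_doc/grants/grants_after_revis07/specificactions.DOC">en</a>]</p>'
        '<p><a href="_doc/_doc/grants/grants_after_revis07/second%20grant%20payment.DOC">[en]</a></p>')

# (.)+ crosses the closing quote of the first link and ends in the second one.
greedy = re.search(r"(_doc(.)+)(%20)", html).group(0)

# [^"]* cannot cross a double quote, so the match stays inside one href.
bounded = re.search(r'(_doc[^"]*)(%20)', html).group(0)

print(greedy)
print(bounded)  # _doc/_doc/grants/grants_after_revis07/second%20grant%20
```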

ASKER

Yes I tried that on a few pages and it seemed to work fine, but I did have to run it 5 times in total. There are a lot of pages in total, so it will take a lot longer if I have to do it 5 times. Are you sure there isn't a better way to replace them all at once, like maybe by using a windows script? That's how I managed to remove all the spaces from the documents names. Do you think using Dreamweaver is the best option here?
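The need for repeated runs is consistent with a pattern that replaces only the last %20 of each link per pass: a link with five spaces then needs five passes. The accepted pattern is not visible in this thread, so the Python sketch below assumes a greedy pattern of the kind discussed earlier and simply repeats the replacement until nothing changes. LAST_SPACE and replace_until_stable are hypothetical names.

```python
import re

# Assumed pattern: greedy, so each pass replaces only the last %20 per link.
LAST_SPACE = re.compile(r'(_doc[^"]*)%20')


def replace_until_stable(html):
    """Repeat the one-per-link replacement until no %20 is left to replace.

    Returns the rewritten text and the number of passes that changed it.
    """
    passes = 0
    while True:
        html, changed = LAST_SPACE.subn(r"\1-", html)
        if changed == 0:
            return html, passes
        passes += 1


text, passes = replace_until_stable('href="_doc/_doc/closure2007/DG%20general%20closure%20instructions%202007.doc"')
print(text)
print(passes)  # 4, one pass per %20 in the link
```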

ASKER

thank you very much for the solution, it's just a shame we could not find a way to immediately replace all the %20 at once.
Agreed... I'm sorry I couldn't find more time to finish this. Even my friend who's a regular expression guru was having problems with it. I apologize for not being able to solve the task completely.
And yes, using a Windows script or a programming language would have been awesome. I just thought we were restricted to the tools in Dreamweaver. That would have been ten times easier. I didn't know you were a coder. But at least you did learn how to use regular expressions in Dreamweaver. Again, my apologies.
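For reference, a script-based approach could look roughly like the Python sketch below. It is an illustration under stated assumptions, not a tested tool for this site: it assumes pages end in .htm or .html, that href values are double-quoted, and that external links start with the prefixes the asker listed. Files are decoded as latin-1 only so that every byte survives the round trip unchanged, whatever the real encoding is. The name rewrite_site is invented here; run anything like this on a copy of the site first.

```python
import re
from pathlib import Path

# Prefixes the asker listed as marking external links.
EXTERNAL_PREFIXES = ("http:", "https:", "mailto:", "/home/", "/../home/",
                     "/sg_vista/", "/security/")
HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)


def _fix(match):
    """Replace %20 with "-" in one href value unless it is external."""
    url = match.group(2)
    if url.lower().startswith(EXTERNAL_PREFIXES):
        return match.group(0)
    return match.group(1) + url.replace("%20", "-") + match.group(3)


def rewrite_site(root, suffixes=(".htm", ".html")):
    """Rewrite internal links in every page under root; return changed files."""
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # latin-1 maps every byte to one character, so bytes round-trip intact.
        original = path.read_bytes().decode("latin-1")
        updated = HREF.sub(_fix, original)
        if updated != original:
            path.write_bytes(updated.encode("latin-1"))
            changed.append(path)
    return changed
```

Calling rewrite_site on the site's root folder returns the list of files it modified, which can be reviewed before the site is uploaded.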

ASKER

Sure no problem. That's a great idea, it's still not too late. I have to perform this at the end of this week so there is still time to find a better solution. I will launch a new question asking for a windows script solution. Thanks.