asgarcymed
asked on
Script to Remove Unicode/Illegal Characters from File Names
Many files downloaded by eMule (ed2k/Kad) contain Unicode characters in their names (such as Chinese, Japanese, Korean, Arabic, Hebrew, or Russian), which the English version of Windows XP's explorer.exe treats as "illegal characters". This causes serious trouble when managing such files.
So I would like a script that automatically deletes such characters from file names, to avoid problems when trying to access the files.
PS: even when we download an eBook written entirely in English, the file names, absurdly, contain such Unicode/illegal characters.
Thanks.
Regards.
Use ReNamer: http://www.den4b.com/ (it supports Unicode)
ASKER
Although ReNamer is a superb renaming tool, I still want a script as a lightweight, quick, "on-the-fly" solution.
Someone outside EE told me about regular expressions. Please take a look at:
http://www.autoitscript.com/forum/index.php?showtopic=58848&st=0&gopid=444082&#entry444082
and
http://www.isthisthingon.org/unicode/allchars1.php
My problem is that I do not know how to write the regex.
I can use VBScript, AutoIt, Perl, Python, Ruby, Tcl-Tk, or whatever, but I need some help...
Thanks.
Regards.
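One way the requested script could look, as a minimal Python 3 sketch rather than a definitive answer. It assumes "illegal" means anything outside printable ASCII; the names `clean_name`, `rename_all`, and the `unnamed` fallback are made up for this illustration. The default `dry_run=True` only reports what would be renamed.

```python
import os
import re

# Assumption: every character outside printable ASCII (space to ~) is unwanted.
NON_ASCII = re.compile(r"[^\x20-\x7E]+")


def clean_name(name):
    """Return name with non-ASCII runs removed; the extension is kept."""
    stem, ext = os.path.splitext(name)
    stem = NON_ASCII.sub("", stem)
    ext = NON_ASCII.sub("", ext)
    # Collapse leftover whitespace and trim stray separators at the ends.
    stem = re.sub(r"\s+", " ", stem).strip(" ._-")
    # Hypothetical fallback for names that were entirely non-ASCII.
    return (stem or "unnamed") + ext


def rename_all(folder, dry_run=True):
    """Plan (and optionally perform) renames for the files in folder."""
    # Lower-cased because Windows file names are case-insensitive.
    taken = {n.lower() for n in os.listdir(folder)}
    changes = []
    for old in sorted(os.listdir(folder)):
        src = os.path.join(folder, old)
        if not os.path.isfile(src):
            continue
        new = clean_name(old)
        if new == old:
            continue
        # Never overwrite an existing file: append a counter on collision.
        stem, ext = os.path.splitext(new)
        n = 1
        while new.lower() in taken:
            new = "%s_%d%s" % (stem, n, ext)
            n += 1
        taken.add(new.lower())
        changes.append((old, new))
        if not dry_run:
            os.rename(src, os.path.join(folder, new))
    return changes
```

Calling `rename_all(r"C:\Incoming")` (a placeholder path) returns the list of planned `(old, new)` pairs; passing `dry_run=False` performs the renames. Matching only the stem and extension separately keeps the dot of the extension intact, which a character class that includes `.` would strip.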
ASKER
I am now using RegexBuddy, a superb Win32 app for working with and learning about regular expressions.
Using Google, I found a text file (see http://www.xys.org/xys/netters/others/net/wiki2.txt) that contains a great many Chinese characters and a few English ones. I opened it in RegexBuddy and tested both regexes:
[\x10-\x1F\x21-\x2F\x3A-\x40\x5B-\x60\x80-\xFF]
and
[^\u0000-\u024F]+
But the test/debug results were very confusing.
On top of that, I got the Windows XP MUI (Multilingual User Interface) pack and installed all the languages I mentioned (Chinese/Japanese/Korean/Arabic/Hebrew/Russian).
Now I am even more confused: some apps load the Chinese characters correctly (for example), but most apps still cannot handle them. They show squares, "???????????", or garbled characters like those you see when opening a binary file in a text editor.
I am thoroughly confused. Must I have MUI installed? What is the best regex to strip such characters from file names? If I have MUI installed, do I still need such a regex/script? What should I do to solve this once and for all?
Is there any Chinese/Japanese/Korean/Arabic/Hebrew/Russian speaker here? If so, how do you manage the character conflicts between your native language and English?
Help is much appreciated!
Thanks in advance.
Regards.
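The two character classes quoted above match very different things, which may explain part of the confusion. The first only lists code points up to 0xFF (control characters, ASCII punctuation, and Latin-1), so it cannot match Chinese or Cyrillic letters at all, and it does match `.` and `_`. The second matches any run of characters above U+024F, which covers CJK, Arabic, Hebrew, and Cyrillic. A small Python 3 sketch to compare them; the sample string and the function names are invented for illustration, and RegexBuddy's behaviour may differ depending on the regex flavor and file encoding selected there.

```python
import re

# First regex from the post: control chars, ASCII punctuation, Latin-1 range.
BYTE_CLASS = re.compile(r"[\x10-\x1F\x21-\x2F\x3A-\x40\x5B-\x60\x80-\xFF]")

# Second regex from the post: runs of anything above U+024F.
BEYOND_LATIN = re.compile(r"[^\u0000-\u024F]+")


def strip_byte_class(text):
    """Delete every character matched by the first class."""
    return BYTE_CLASS.sub("", text)


def strip_beyond_latin(text):
    """Delete every run matched by the second class."""
    return BEYOND_LATIN.sub("", text)


# Made-up file name mixing Chinese, ASCII, Cyrillic and an accented letter.
sample = "维基 wiki_2.txt (Русский) café"

print(repr(strip_byte_class(sample)))
print(repr(strip_beyond_latin(sample)))
```

On this sample the first class leaves the Chinese and Russian text in place while removing the underscore, the dot, the parentheses, and the `é`; the second removes the Chinese and Russian text and keeps everything else, including the `é`. For file names, the second form is closer to what the question asks for.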
ASKER
Thank you very much!!
Regards