converting to html entities

hi,
Here is what I want to do: convert words/characters to HTML entities, i.e. email = emal, and also a webpage or file (which is encoded in HTML, not text). Is there a program out there that does this, so you don't have to do it manually?
What I want to do is type in a word or point to a URL or file and have the program convert it for me automatically.

If no program is available, maybe I can make a Perl script (though I'm still learning Perl/CGI), submit the input via a form, and have the script do it for me...

well, any help appreciated. thanks.
sayhi Asked:
 
dmaryakh Commented:
Check out http://www.omnimark.com
You can definitely do this with the OmniMark language. It is free, easy to learn, and (in most cases) more useful than Perl, especially for HTML/SGML/XML processing.



Here is a complete Unicode <=> character entity conversion utility, written in OmniMark, that you can use:
http://www.xmeta.com/omlette/index.html


http://www.w3.org/TR/WD-html40-970708/sgml/entities.html#h-10.5.1
 
chewymon Commented:
 
sayhi (Author) Commented:
Nah, I've seen that already. It only does decoding, and it requires JavaScript.
 
sayhi (Author) Commented:
@ dmaryakh

Hey, I think this may be what I'm looking for! BUT I don't know how to use it or make it work. I downloaded OmniMark and unimap.zip from the site... I just don't know how to start converting...
 
dmaryakh Commented:
unimap.xin is the library that has all the functions needed for conversion. I haven't worked with it yet, but from a brief look I would say you need to use the get-unicode-values function. Let me play with this library a little, and I will post some code that you can use for the conversion later today.
 
sayhi (Author) Commented:
Adjusted points to 150
 
sayhi (Author) Commented:
Increased points to 160
 
dmaryakh Commented:
OK, sorry it took me longer than I anticipated, but here it is. The script below converts a file containing named character entities into numeric HTML entities. For example, "some &hellip; &tilde; data &amp; more &hellip;" becomes:
some &#8230; &#732; data &#38; more &#8230;




Place the code below into a file named converter.xom:
-------------------------------

cross-translate
include "unimap.xin"


        find "&" ((LOOKAHEAD NOT WORD-END)ANY)+  => temp ";"
            local counter tester
            local counter unicode-values variable initial-size 0
            set tester to get-unicode-values for "%x(temp)" into unicode-values
            output "&#%d(unicode-values);"

        find any=>tmp
            output "%x(tmp)"
   

To execute, from the command line type:
omnimark -s converter.xom input.file -of output.file -l log.file

where:
  input.file - your original file
  output.file - converted file
  log.file  - log of any omnimark messages (shouldn't be any)
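
For comparison, roughly the same named-to-numeric conversion can be sketched in Python. This is not OmniMark and not what unimap.xin does internally; it is only an illustration that assumes the standard library's html.entities.name2codepoint table is an acceptable stand-in for the unimap.xin mapping. The function name named_to_numeric is made up for this example.

```python
import re
from html.entities import name2codepoint

# Matches a named entity such as &hellip; (ampersand, name, semicolon).
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text: str) -> str:
    """Replace known named entities with decimal numeric references."""

    def replace(match: re.Match) -> str:
        codepoint = name2codepoint.get(match.group(1))
        # Unknown names are left untouched rather than guessed at.
        return match.group(0) if codepoint is None else f"&#{codepoint};"

    return NAMED_ENTITY.sub(replace, text)


print(named_to_numeric("some &hellip; &tilde; data &amp; more &hellip;"))
# prints: some &#8230; &#732; data &#38; more &#8230;
```

Unlike the OmniMark script, this sketch works on a string in memory; reading input.file and writing output.file would have to be added around it.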


----------------------------------

P.S. I understand that this is not 100% what you were asking for, but the script could be modified to do it as follows:
1) Locate the following declaration at the beginning of unimap.xin:
global counter unicode-entities variable initial {
    "57928" with key "angzarr",
     ...

Add the Unicode values for the ASCII characters in the same manner:

"101" with key "e",
....


2) Locate the following rule in converter.xom:
        find any=>tmp
            output "%x(tmp)"

Replace it with:
        find any=>tmp
            local counter tester
            local counter unicode-values variable initial-size 0
            set tester to get-unicode-values for "%x(tmp)" into unicode-values
            output "&#%d(unicode-values);"
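
The effect of that change, turning every single character into a numeric entity, can be illustrated with a short Python sketch. It is a stand-in for the modified OmniMark rule, not a translation of it: it assumes the code point of each character is all that is needed (so no lookup table), and the function name to_numeric_entities is hypothetical.

```python
def to_numeric_entities(text: str) -> str:
    """Encode every character as a decimal numeric character reference."""
    # ord() gives the Unicode code point, e.g. ord("e") == 101.
    return "".join(f"&#{ord(ch)};" for ch in text)


print(to_numeric_entities("email"))
# prints: &#101;&#109;&#97;&#105;&#108;
```

Note that this encodes everything it is given, including spaces and any markup characters, so for a whole HTML page it would presumably need to be applied only to the text you want obscured and not to the tags.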

       
============================

 
dmaryakh Commented:
I forgot to mention that, the way the scripts are set up right now, all the files (input.file, converter.xom, unimap.xin) need to be in the same directory.
 
sayhi (Author) Commented:
Thanks a million! I greatly appreciate all your help! Thanks again.