Sivakatirswami
asked on
Problem with CharEncoding for Data from a GetViaAjax call
We are developing a revision of our old web site. Books are important to us, and we are attempting to use Monocle as a web-based ePub reader. See:
http://dev.himalayanacademy.com/view/dancing-with-siva
and click on the right-hand side of the cover image to flip pages. There are a couple of blank pages, then you land on the copyright page. Note that curly quotes are not rendered. Keep flipping and you will see that some Unicode for Sanskrit/Devanagari is lost and does not render as expected. The doctype is simple HTML5:
<!doctype html>
<html>
<head>
<meta charset="utf-8">
which in theory should work. But even curly quotes in the page HTML template/wrapper were not rendering. We use RevIgniter (like CodeIgniter, but it uses LiveCode instead of PHP), so I set the engine to force the charset in the HTTP Content-Type header to UTF-8. I also decided to cover all bases and added this line to the httpd.conf directives for this site:
AddDefaultCharset utf-8
Restarted Apache... OK, so far so good. Now the curly quotes (and curly apostrophes) on the wrapper layer of the page are all happily rendering as expected.
But inside the Monocle block, the copyright page of the book is still showing characters that are not rendered properly.
I did some research and there are issues relating to AJAX data streams losing their character encoding specification. There were also references to the server default, which is why I added AddDefaultCharset utf-8 to the httpd.conf for this site. I thought that would take care of it, but apparently that doesn't apply to an AJAX data stream.
If you look at the generated source you will see the JSON var with the paths to the HTML files for the book.
If I create a URL from one of the files, the copyright page in this example:
http://dev.himalayanacademy.com/media/books/dancing-with-siva/web/ops/xhtml/fm_03.html
note that it renders as expected: curly quotes are delivered.
Similarly, if I go to a page with both Devanagari and Tamil, both work just fine:
http://dev.himalayanacademy.com/media/books/dancing-with-siva/web/ops/xhtml/fm_02.html
But when called using the JSON var and AJAX we lose the encoding. I am not the developer doing the coding. He is also working on this, but it costs me money for him to lose time on this type of thing when I need his brains for the bigger framework.
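To make the symptom concrete, here is a small sketch of what "losing the encoding" usually looks like. It assumes (this is a guess about the cause, not something confirmed for this site) that the UTF-8 bytes arrive intact but are decoded one byte per character, as a Latin-1 style fallback would do. The variable names are made up for the illustration.

```javascript
// UTF-8 bytes for a left curly quote (U+201C), as a server would send them.
const utf8Bytes = new TextEncoder().encode('\u201C');

// Correct: decode the three bytes together as UTF-8.
const decodedAsUtf8 = new TextDecoder('utf-8').decode(utf8Bytes);

// Wrong: treat every byte as its own character (Latin-1 style fallback).
const decodedAsLatin1 = String.fromCharCode(...utf8Bytes);

console.log(Array.from(utf8Bytes, (b) => b.toString(16))); // [ 'e2', '80', '9c' ]
console.log(decodedAsUtf8 === '\u201C'); // true
console.log(decodedAsLatin1.length);     // 3 (one quote became three characters)
```

If the garbled text in the Monocle block shows two or three odd characters wherever one curly quote or Devanagari letter should be, that pattern points to a wrong charset on the AJAX response rather than damaged files.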
In his library, I think this is where he builds his AJAX call. It looks like this:
put "getViaAjax: function (path) {" & cr after tObj
put "var ajReq = new XMLHttpRequest();" & cr after tObj
put "path = '/cloudreader/slurp' + path;" & cr after tObj
put "ajReq.open('GET', path, false);" & cr after tObj
put "ajReq.send(null);" & cr after tObj
put "return ajReq.responseText;" & cr after tObj
put "}" & cr after tObj
I saw over on Stack Overflow that there is some way to add a content specification line to an AJAX POST to ensure the character encoding is preserved. Can we do that with a getViaAjax function?
Something like:
contentType: "application/json; charset=utf-8",
Where does it go? Can someone supply me with a getViaAjax model with this included that we can follow?
Or is there another server setting that will force all AJAX data to UTF-8? (CentOS 6, Apache)
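For reference, a `contentType` option of that kind (as in jQuery's `$.ajax`) describes the body of the request being sent, so on a GET it does not change how the response is decoded. The response charset normally comes from the server's Content-Type response header. A client-side alternative is `XMLHttpRequest.overrideMimeType()`, which must be called before `send()`. Below is a minimal sketch of the getViaAjax function with that call added. The `XhrCtor` parameter is hypothetical, added only so the function can be exercised outside a browser, and `text/plain` is an arbitrary choice since only `responseText` is read.

```javascript
// Sketch only: a getViaAjax variant that asks the browser to decode the
// response as UTF-8 regardless of the charset the server declares.
// XhrCtor is a hypothetical parameter for testing; in a page it falls
// back to the built-in XMLHttpRequest.
function getViaAjax(path, XhrCtor) {
  const Ctor = XhrCtor || XMLHttpRequest;
  const ajReq = new Ctor();
  path = '/cloudreader/slurp' + path;
  ajReq.open('GET', path, false); // synchronous, as in the original function
  // Must come before send(): forces UTF-8 decoding of responseText.
  ajReq.overrideMimeType('text/plain; charset=utf-8');
  ajReq.send(null);
  return ajReq.responseText;
}
```

In the page itself this would be called exactly as before, for example `getViaAjax('/media/books/dancing-with-siva/web/ops/xhtml/fm_03.html')`. Fixing the header on the server side is generally the cleaner option, since it covers every client; the override is a fallback for when the server cannot be changed.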
ASKER
Well we put this into another handler, and my developer had it in before I got this answer but I don't like to be stingy so I'll give @kozaiwaniec the points anyway :-)
:)
cheers.
ASKER
this works now... see:
http://dev.himalayanacademy.com/view/loving-ganesha
marvelous, Devanagari, Tamil, all ANSI chars, come thru perfectly
(disclaimer, I have no idea how this performs on IE and I hear that it won't work and, well, that's life, maybe IE 10? We long ago stopped trying to make anything compatible with Internet Explorer. We just can't afford it.)
Jai Ganesha!
I'm pasting this here for my own knowledge base:
#### RECEIVE GET REQUEST:
file: /public_html/system/applic
command slurp
-- figure out what file we need
put rigRuriString() into tURL
set the itemdel to slash
put item 4 to -1 of tURL into tFileToSend
-- read content of file
put url ("binfile:" & $_SERVER["DOCUMENT_ROOT"] & "/" & tFileToSend) into tData
-- escape content
replace "/" with "\/" in tData
replace "svg:" with empty in tData
replace format("\r") with empty in tData
-- send it back
put header "Content-Type: application/json; charset=utf-8"
put tData
quit
end slurp
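The handler above declares the charset in its Content-Type header, so one way to confirm the fix from the browser side is to read that header back (for example via `ajReq.getResponseHeader('Content-Type')`) and check the charset. A small sketch of such a check follows; `charsetFromContentType` is a hypothetical helper name, not part of RevIgniter or Monocle.

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type
// header value. Returns the charset in lower case, or null if none is given.
function charsetFromContentType(headerValue) {
  if (!headerValue) return null;
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue);
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetFromContentType('application/json; charset=utf-8')); // utf-8
console.log(charsetFromContentType('text/html'));                       // null
```

A `null` result for the slurp response would suggest the header is not reaching the browser as intended, in which case the browser picks its own decoding for `responseText`.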