Link to home
Start Free TrialLog in
Avatar of onyourmark
onyourmark

asked on

Excel formatting and special characters

Translated.mht Translated.mht

I have an Excel file (attached). It seems to have some special characters. I am trying to translate this file. I am using Google's auto translate. What I do is save it as an htm file. Then open it in Chrome. Chrome will offer to translate it. Then I save the translated version. Then copy that and paste it back into Excel. The problem is that I am getting far more cells than were in the original document. From this I am assuming that there are some special characters in the original Excel file that are somehow being read (through this process) as new cell indicators.
I was wondering if anyone knows how to identify and remove such possible characters?
Oh, I forgot to say, the original language is Japanese :-)
Original.xlsx
Avatar of gbanik
gbanik
Flag of India image

Check this file that I created for someone else. See if it can help you...

It translates the text from within Excel using Google API. You can specify the From and To Languages.
=FetchTranslation (InputText, FromLanguage, ToLanguage)
 
FetchInfo.xlsm
Avatar of onyourmark
onyourmark

ASKER

Hi and thanks very much. I don't know how to run it. I see the folder fetchinfo.xlsm
but not sure what to do from here.
FetchInfo.xlsm is an Excel 2007 file. Download it. Enable the macros. Type something in the Column B. Fill the "from" and the "to" languages (check the correct language code from http://www.loc.gov/standards/iso639-2/php/code_list.php). Copy the formula from Column D last cell to your row. The code should automatically bring in the translated text without running a macro. (It uses a custom function to fetch the translations through Google).

I am converting the 2007 file to 2003 format just in case.
FetchInfo.xls
Avatar of byundt
gbanik,
I tried using the code in the original workbook and set the VBA references appropriately. The Japanese characters were translated as ? when using a formula like:
=FetchTranslation(H2,"ja","en")

It is also exceedingly slow (several minutes on my laptop) when function calls are applied to the two columns of cells with Japanese text. It might be faster if the function were called by a sub and the translation object created and destroyed only once.

Brad
Yeah I noticed that Brad :) about the ?? ... I am guessing we may not have the Japanese fonts.
You are right, it can be optimized by creating the object globally once. The function here was just to illustrate the objective.
Meanwhile, the speed (of the function return) at my end seems to be roughly between 0.5 sec to 1 sec. I guess it is connection speed dependent. Could u try again?
I just tested with 250 formulas (2 times) ... it returned all in roughly within 127 seconds ... thats about 0.5 seconds each.
gbanik,
Recalculating 200 rows x 2 columns took 24:15 on my laptop, or 3.7 seconds per translation. You must have a faster computer or internet connection than me.

Brad
Brad .... no issues .... maybe someone else could validate my comment.

Just for curiosity, mine is a Sony Vaio i3 with 3.1 MBPS wireless. Processor may not be an issue, as there is not much local processing in that function. It maybe becos of the connection speed.

Believe me I just tried that again at my end :)
According to Speakeasy.net, I have 0.94 Mbit/s down and 0.20 up.

According to my ISP (ATT.net), I have 1.5 Mbit/s down and 0.128 up.
:) Mine actually is lower than yours at Speakeasy.net .... 0.64 Mbps download and 0.4 upload
I cant figure out what the issue is...
The real problem, however, is the fact that Google really isn't performing any translation. Unless of course you have figured how to get something other than question marks.

Brad
Hi. Thanks. I get a #VALUE! error. I am attaching the workbook.
FetchInfo.xlsm
gbanik's code is working for me in your sample workbook.

Did you enable macros?

In Excel 2007:
1) Click the Office icon at top left of window
2) Click the Excel Options button at bottom of resulting dialog
3) Click Trust Center tab, then click the link for Trust Center Settings
4) On the Macro Settings tab, choose the option for "Disable all macros with notification"
5) Close the workbook
6) Reopen the workbook
7) Click the warning message at top of your worksheet and choose to enable macros
Hi I did that but I still get Value error. I am attaching again.
FetchInfo.xlsm
In the VBA Editor, open the Tools...References menu item. There should be six items with checked at the very top of the list. The first four are standard. The next two are Microsoft HTML Object Library and Microsoft XML, v5.0. Are one or more items labeled (MISSING)?

If you have a missing reference, that could explain the #VALUE! error value. You can probably change to a different version of Microsoft XML--scroll down the alphabetical listing to see what you have, and check a different version if you need to.

If you don't have anything close to the missing references, try unchecking them. I copied gbanik's settings and haven't tried doing without--but it doesn't hurt to try.
Hi. I changed it. I was missing the 2 boxes. But still get the error. Then I changed from v5 to v6 but still have the error
Just to be sure onyourmark... I am assuming you are connected to the internet while accessing the file.
Yes. I am. Thanks
I am opening a new discussion here...

The problem seems to be of unicode characters. Excel handles them inside cells for display but we need some unicode mechanism to transport the text further.

Break point at the function entry and check the value of sText. They are all "???????".
I tried to find out a function to get UNICODE rather than just CODE... or equivalent. The closest was ChrW()... see http://www.oraxcel.com/projects/litlib/help/chrw.html

Another article http://teaandbiscuits.org.uk/drupal/node/77 may show light.

Meanwhile I am attaching a file below which shows the same problem.

In conclusion, the above translation code would work for ANSI based languages... not for Unicode based ones. Some rework will be required.
Unicode.xlsx
Hi. Well thanks a lot anyway!
Hi Brad, I am still working on it ... and I "HAVE" to get a solution on this. Is there a way to keep this question open? Or maybe open a new question... cos I may want participation. Do reply to this post.
Hi all. I am still here. Thanks.
Finally....
onyourmark see your data ... is it translated correctly?

If yes... i would explain in my next post.
FetchInfo.xls
onyourmark ,
Google API fails to support beyond a certain length of text for translation (at least in the current api that i have used). So I have split up your text into multiple rows.

Now the explanation...
First... the problem was indeed of unicode characters. ????? was returned by Google API because WE had passed it that way for translation. The text was unicode, hence could not be handled directly for sending to the API (all data including source text, source language, to language are sent as URL). There may be many ways to URL Encode .... but none worked for me ... finally used javascript to do the same (got the idea from google's tranlation page by HTTP tracking).

Hope it help!
Gautam,
Well done!

Brad
By the way.... if you need a code to dynamically split a long text into parts (in such a way that the split is based on a space) here is an Excel file. It contains a very "dynamic" formula to get the parts....
(from another question https://www.experts-exchange.com/questions/26525126/EXCEL-FORMULA-SPLIT-ADDRESS.html
see my comment ID:33844305 just below the accepted answer)
I am attaching the file for your convenience.
Split-Text.xlsm
:) thx Brad. I really appreciate it :)
Gautam,
I don't know if you have discovered it yet, but we run a thread for the "Excel regulars" at http:/Q_26493646.html   I thought your code in this thread was worth mentioning to the group and posted a link to it there.

Brad
Wow Brad... thanks :) ... I am quite a newbie here in EE (active only from the last month).
Now this seems to be a great place ... the playground of the seniors. I will learn a lot from here. Thanks again :)
Hi. Thanks! It is working. Where or what is FetchTranslation()?
It is something you wrote but where is it located?
ALT + F11 to open the VBA Editor. If it doesn't open to Module1, then click the + to the left of Modules and then double-click Module1 in the pane on the left. You will find the VBA code for FetchTranslation there.
ASKER CERTIFIED SOLUTION
Avatar of gbanik
gbanik
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Great. Thanks!
By the way, does this kind of thing also work with other APIs?
Different APIs work differently. But more or less could be invoked similarly.
Hi. I am the one who asked about this question before. Thanks very much for the sheet. I was wondering if there is a limitation on using this. I have read that the google API limits characters to 100000/per day for translation. I suppose that applies to this as well. Any comments?
Thanks again.