Strange codec error

I have been playing around with pyGoogle (a Python wrapper for the Google API), and I have come across a strange error that I have never seen before.  Here is a code snippet:

for r in data.results:
    print 'Title: ',r.title
    print 'URL: ',r.URL
    print 'Summary: ',r.snippet
    print

And here is the error:
UnicodeEncodeError: 'ascii' codec can't encode character '\ua9' in position 119: ordinal not in range(128)

This error specifically happens when the script tries to print r.snippet (the summary snippet of the site that google returns).

I have found very little on the net about this error message or how to correct it.  Any thoughts?

Thanks,
Brian
bnblazer Asked:
 
mish33 Commented:
Well, that one was a unicode object (as it was in my run).
Then:
for r in data.results:
  if isinstance(r.snippet, unicode):
    print r.snippet.encode('latin1')
  else:
    print r.snippet
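
The same check can be wrapped in a small helper so every field goes through it. This is only a sketch: the name to_bytes and the sample snippet text are made up for illustration and are not part of pyGoogle, and it is written to run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper, not part of pyGoogle.
text_type = type(u'')  # `unicode` on Python 2, `str` on Python 3


def to_bytes(value, encoding='utf-8', errors='replace'):
    """Encode text to a byte string; pass byte strings through unchanged."""
    if isinstance(value, text_type):
        # errors='replace' turns unencodable characters into '?'
        # instead of raising UnicodeEncodeError.
        return value.encode(encoding, errors)
    return value


# Made-up sample standing in for r.snippet.
snippet = u'Copyright \u00a9 2005 Example'
encoded = to_bytes(snippet, 'latin1')
```

On Python 2 the result can be handed straight to print, e.g. print to_bytes(r.snippet, 'latin1').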
 
bnblazer (Author) Commented:
OK, I think I have a fix for this, but it requires that I know what the text encoding of the data is:

print unicode(r.snippet, 'whateverTheEncodingIs').encode('whateverYouWantItEncodedInto')

So here is the next question.  How in Python can I determine what the encoding is?  Is there a function for this?

Thanks,
brian
 
mish33 Commented:
According to http://www.google.com/apis/reference.html#2_5
r.snippet (as well as all other results) is UTF-8.
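
A minimal sketch of the decode-then-encode pattern from the previous comment, assuming the snippet really arrives as a UTF-8 byte string. The sample bytes are made up for illustration. If the wrapper already hands back a unicode object, the decode step should be skipped.

```python
# -*- coding: utf-8 -*-
# Made-up sample: UTF-8 bytes containing the copyright sign (0xC2 0xA9).
raw = b'Copyright \xc2\xa9 2005'

# Decode the bytes into unicode text using the known source encoding.
text = raw.decode('utf-8')

# Re-encode into whatever the terminal expects.
latin1_bytes = text.encode('latin1')

# Lossy fallback for ASCII-only output: unencodable characters become '?'.
ascii_bytes = text.encode('ascii', 'replace')
```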
 
bnblazer (Author) Commented:
Then doesn't python handle UTF-8 well?  \ua9 is the copyright sign, right?  Do I need to encode it over to ASCII?
 
mish33 Commented:
u'\u00a9' is the Unicode copyright sign (©), and it doesn't exist in ASCII.
In Windows encodings, '\xa9' is ©.
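
A short sketch of what that means in practice: the code point is 0xA9 (169), which is above the ASCII limit of 127, so the ascii codec refuses it while the 8-bit and UTF-8 codecs accept it. The variable names are only for illustration.

```python
# -*- coding: utf-8 -*-
copyright_sign = u'\u00a9'
code_point = ord(copyright_sign)  # 0xA9 == 169, outside range(128)

# The ascii codec cannot represent it, which is the error from the question.
try:
    copyright_sign.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

latin1 = copyright_sign.encode('latin1')  # single byte 0xA9
cp1252 = copyright_sign.encode('cp1252')  # same byte in the Windows code page
utf8 = copyright_sign.encode('utf-8')     # two bytes: 0xC2 0xA9
```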
 
mish33 Commented:
To print it in a Windows environment, use:
  print u'\u00a9'.encode('latin1')
 
bnblazer (Author) Commented:
I should say that this is happening on Mac OS X 10.3.7.  I am beginning to wonder if this is an issue with the wrapper around the original Java-based Google API.

Brian
 
mish33 Commented:
Try .encode('mac_roman')
 
bnblazer (Author) Commented:
OK, tried the encoding and no help - same error.  Just tried it on a Windows box and a Linux box and got the same error message.

I am really beginning to think I may have stumbled on a bug here.

Brian
 
mish33 Commented:
Yes, I saw your bug #1177232. It's not a bug, it's a feature. ;)
Please post data.results[3].encode('latin1') from a Mac run, as I don't have a Mac around.
 
bnblazer (Author) Commented:
Here is the output from using that line....

Traceback (most recent call last):
  File "/Users/brianblazer/Documents/workspace/pythonStuff/search.py", line 24, in ?
    print data.results[3].encode('latin1')
AttributeError: SearchResult instance has no attribute 'encode'
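
The traceback shows that encode was called on the result object itself rather than on one of its string attributes. Presumably the intended expression was data.results[3].snippet.encode('latin1'). A sketch of the difference follows; the SearchResult class here is a hypothetical stand-in with made-up values, not the real pyGoogle class.

```python
# -*- coding: utf-8 -*-
class SearchResult(object):
    """Hypothetical stand-in for the wrapper's result object."""

    def __init__(self, title, URL, snippet):
        self.title = title
        self.URL = URL
        self.snippet = snippet


# Made-up sample values for illustration.
result = SearchResult(u'Example', u'http://example.com/',
                      u'Copyright \u00a9 2005')

# encode() is a string method; the result object does not have it.
try:
    result.encode('latin1')
    failed = False
except AttributeError:
    failed = True

# Calling it on the string attribute works.
encoded_snippet = result.snippet.encode('latin1')
```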
 
bnblazer (Author) Commented:
You are the winner!!!!  That one solved it all!

Thank you for your help.

Brian