Python with Oracle encoding issue

I am using Python 3.3 on a Windows 7 x64 machine to connect to an Oracle server over a VPN tunnel and dump data locally to a CSV file. I am using the cx_Oracle module to connect to Oracle. Everything is fine until the cursor hits a certain character in a string column, which causes the module to fail with this error:

...
    for row in orcl_cur.execute(sql_select):
  File "C:\Python33\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 426: character maps to <undefined>

At this point the dump stops, leaving only the rows that were saved so far.

How can I avoid this?
Zberteoc Asked:

pepr Commented:
If the database content is correct, then it does not use the cp1252 encoding. The byte 0x8f is not defined in cp1252; see https://en.wikipedia.org/wiki/Windows-1252 or http://www.cp1252.com/. First, I suggest double-checking whether the database really uses that encoding.

Another possibility is that the database is supposed to contain cp1252 data, but the data itself is corrupt.

If you need to work around the situation, you can wrap the decoding in a try-except construct and replace the offending character either with something very visible (so that you can see the problem) or with something invisible (to mask the problem). You can also write a warning to a log when that happens.

It is difficult to say more without seeing more details.
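The workaround described above can be sketched in a few lines. This is only an illustration, not code from the original script: the sample bytes and the helper name `decode_lenient` are made up. It first confirms that byte 0x8f has no mapping in Python's cp1252 codec, then decodes the same bytes with a visible replacement character and logs a warning.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("dump")

# Hypothetical raw column value containing the offending byte 0x8f.
raw = b"caf\x8f au lait"

# 1) Strict decoding fails, the same way as in the traceback.
try:
    raw.decode("cp1252")
    strict_ok = True
except UnicodeDecodeError as exc:
    strict_ok = False
    log.warning("cannot decode as cp1252: %s", exc)


def decode_lenient(data, encoding="cp1252"):
    """Decode bytes; on failure, substitute U+FFFD and log a warning."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        log.warning("bad byte at position %d, replaced", exc.start)
        return data.decode(encoding, errors="replace")


# 2) Lenient decoding keeps going and marks the bad byte visibly.
text = decode_lenient(raw)
```

This only helps where the raw bytes are available to your own code. If the decoding happens inside the database driver, the encoding has to be fixed at the connection or client level instead.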
Zberteoc (Author) Commented:
It appears that Python uses that codec file, cp1252.py, by default. Can I change that?
pepr Commented:
Is the database column of a Unicode type (nvarchar or similar)?
Zberteoc (Author) Commented:
Only VARCHAR2.
slightwv (Netminder) Commented:
If that character is in the database, I'm betting the database is Unicode. You can check with:
select value from nls_database_parameters where parameter='NLS_CHARACTERSET';

If you cannot get Python to handle that character set, you might look at encoding the column. I've not done a lot with multi-byte character sets, so I'm not sure of the "official" Oracle way.

Off the top of my head there is utl_i18n.escape_reference:
http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/u_i18n.htm#ARPLS71120
pepr Commented:
If it is VARCHAR2, then you probably have to specify the encoding explicitly somehow. cp1252 is probably the default. However, I do not know the library. I guess it will be either a connection object parameter or a cursor parameter. The only other possibility is that cursor.execute can also return a binary type in the row (if so, it would probably be configured in a similar way to the encoding). I do not know the answer.
Zberteoc (Author) Commented:
The only way I could get past querying the Oracle server without failing was to change the NLS_LANG registry key value to AMERICAN_AMERICA.UTF8 on the Windows machine where I was running the Python module, and then restart it. Before that, it was some Canadian setting. The key is here:

HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_XE\NLS_LANG

After that it no longer gave that error, and I could deal with the "strange" characters inside Python. They looked like: Ï¿Ï¿½ , and they were what was breaking the cx_Oracle cursor when querying the Oracle server.
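A per-process alternative to editing the registry, offered as a hedged sketch rather than a tested fix for this setup: the Oracle client generally also honors an `NLS_LANG` environment variable, as long as it is set before the client library is loaded. The helper `write_rows`, the file name, and the sample rows below are hypothetical. In the real script, `rows` would be the cursor returned by `orcl_cur.execute(sql_select)`.

```python
import csv
import os
import tempfile

# Assumption: this must run before cx_Oracle (and the Oracle client) is imported.
os.environ["NLS_LANG"] = "AMERICAN_AMERICA.UTF8"


def write_rows(rows, path):
    """Write an iterable of row tuples to a UTF-8 encoded CSV file."""
    # An explicit encoding avoids falling back to the Windows default (cp1252).
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        count = 0
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


# Stand-in data; the real script would pass the cx_Oracle cursor here.
sample_rows = [(1, "Montr\u00e9al"), (2, "\u0141\u00f3d\u017a")]
out_path = os.path.join(tempfile.gettempdir(), "dump_sample.csv")
written = write_rows(sample_rows, out_path)
```

Opening the output file with an explicit `encoding="utf-8"` matters as much as the client setting: otherwise Python on Windows may try to encode the CSV with cp1252 and fail on characters that codec cannot represent.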

Zberteoc (Author) Commented:
The solution was suggested by a co-worker of mine, and I found a reference to that NLS_LANG Windows registry key on the net.