Python unicode problem

I'm experimenting with Scrapy, but the Chinese characters in the stored result appear as \u4e00\u8d77\u6e38\u5427(\u5168\u7403\u65c5\u884c\u8d34\u8eab\u4f34\u4fa3... What should I do? Thanks.
fxp007 Asked:

pepr Commented:
What do you expect to be stored? Where is the problem?
fxp007 Author Commented:
I want to see the actual Chinese characters instead of \u....\u...
pepr Commented:
How do you display the result? I guess that you print it to a console window that is not capable of displaying the characters. Because of this, it prints the symbolic representation of the characters in the form of Unicode escape sequences. Try the following code to store the value in an HTML file, then display the resulting file in a browser:

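import codecs

# The escaped text from the question, as a unicode string.
s = u'\u4e00\u8d77\u6e38\u5427(\u5168\u7403\u65c5\u884c\u8d34\u8eab\u4f34\u4fa3'

# codecs.open() encodes everything written to the file as UTF-8.
f = codecs.open('test.html', 'w', encoding='utf-8')

f.write('''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Chinese test</title>
</head>
<body>
<p>The value displayed via HTML browser: ''')
# Write the real characters; the browser renders them as Chinese text.
f.write(s)
f.write('''</p>
<p>Representation of the same value using escape sequences: ''')
# repr() gives the \uXXXX escapes in Python 2; in Python 3 use ascii(s) instead.
f.write(repr(s))
f.write('''</p>
</body>
</html>''')

f.close()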

It displays in my case:

[Snapshot of the browser window content]
In other words, your problem may actually be no problem. The representation can be correct; probably only your output device is not capable of displaying the Chinese characters.
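
To illustrate that point, here is a minimal sketch, assuming Python 3 (the code in this thread was written for Python 2, where repr() plays the role that ascii() plays here). The variable names are only illustrative. It shows that the escaped form is just one way of displaying the string, and that the stored UTF-8 bytes decode back to the same characters:

```python
# The first four characters of the string from the question.
s = '\u4e00\u8d77\u6e38\u5427'

# The string holds four characters, not the backslash sequences themselves.
length = len(s)

# ascii() returns the escaped, ASCII-only representation of the string.
escaped = ascii(s)

# Encoding to UTF-8 produces the bytes that actually end up in a file.
stored = s.encode('utf-8')

# Decoding the stored bytes gives back the identical string.
round_trip = stored.decode('utf-8')

print(length)            # number of characters
print(escaped)           # escaped view, safe for any console
print(len(stored))       # number of bytes in the UTF-8 form
print(round_trip == s)   # True when nothing was lost
```

If the escaped form shows up only when printing or inspecting a value, the data itself is most likely intact.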

jpg526 Commented:
Chinese characters cannot be stored directly; they must be converted to Unicode first. You can use one of the available tools to look up the Unicode value of each Chinese character (e.g. http://weber.ucsd.edu/~dkjordan/resources/unicodemaker.html).

The solution from pepr should be useful for you.
pepr Commented:
There are also encodings older than Unicode, so there are more ways to store Chinese characters in a file. However, the Unicode way should be preferred these days.

In the Unicode standard, each character (Chinese or any other) is assigned one unambiguous numeric value, called a code point. In other words, each character is tied to one concrete number, and that number is a purely abstract integer. When you want to store Unicode text in a file, you have to choose how those integers are stored as bytes. UTF-8 is one of several possible ways.
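
As a small sketch of the distinction between the abstract number and its stored form (assuming Python 3; the variable names are illustrative only), the first character of the string from the question can be inspected like this:

```python
# First character of the string from the question.
ch = '\u4e00'

# The abstract integer assigned to the character by Unicode.
code_point = ord(ch)

# Three different ways to store that same integer as bytes.
utf8 = ch.encode('utf-8')
utf16 = ch.encode('utf-16-be')
utf32 = ch.encode('utf-32-be')

print(hex(code_point))
for name, data in (('utf-8', utf8), ('utf-16-be', utf16), ('utf-32-be', utf32)):
    # Show the byte values in hex and how many bytes each encoding needs.
    print(name, data.hex(), len(data))
```

The integer stays the same in every case; only the byte layout chosen for storage differs, which is why a file must be read back with the same encoding it was written with.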

Have a look at Chapter 4, Strings, of Dive Into Python 3 by Mark Pilgrim (http://diveintopython3.org/strings.html). It starts with the problems of encodings and continues with an explanation of Unicode and the Unicode encodings.
fxp007 Author Commented:
Thanks, pepr. I'm reading that.