30/09/2018, 23:47

How to convert a list to a Unicode list

Hey guys,
I need to know the best way to convert a list of strings to a list of Unicode strings.
Code:
nameStaff = ['Huey', 'Dewey', 'Cộng']
The third element contains a non-ASCII character, so whenever I try to convert the list to Unicode, I get the error message below.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 3: ordinal not in range(128)

I searched on Stack Overflow but still haven't found a solution that works. These are the ones I tried.
Solution 1:
answer = [unicode(item) for item in x]
Solution 2:
map(unicode,lst)

The expected result:
nameStaff = [u'Huey', u'Dewey', u'Cộng']

Do you have any ideas about this issue?

Lương Quang Mạnh wrote at 01:54 on 01/10/2018

Have you tried unicode(some_str, 'utf-8')? It worked for me.
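
A minimal sketch of applying that to the whole list, assuming every byte string in it is UTF-8 encoded. The helper name `to_unicode_list` is made up for illustration. In Python 2, `s.decode('utf-8')` is the method form of `unicode(s, 'utf-8')`, and the byte literals keep the snippet valid on Python 3 as well.

```python
# -*- coding: utf-8 -*-

def to_unicode_list(items, encoding='utf-8'):
    """Decode each byte string in items (hypothetical helper, not a library function)."""
    return [item.decode(encoding) for item in items]

# b'C\xe1\xbb\x99ng' is "Cộng" encoded as UTF-8 (precomposed form assumed).
name_staff = [b'Huey', b'Dewey', b'C\xe1\xbb\x99ng']
name_after = to_unicode_list(name_staff)
```

Passing the encoding explicitly is the point here: nothing is left to the interpreter's default codec.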

Jack Vo wrote at 02:02 on 01/10/2018

Problem solved. Thank you for this solution. Could you help explain why it works?
Code:

#coding:utf-8
import sys, string
nameStaff = ['Huey', 'Dewey', 'Cộng']
nameAfter = []
print "The current default encoding Python: ", sys.getdefaultencoding()
for s in nameStaff:
    # Decode each UTF-8 byte string into a unicode object
    nameAfter.append(unicode(s, 'utf-8'))
print nameAfter
# Turn the \x and \u escapes in the repr back into readable characters
msg = repr(nameAfter).decode('unicode-escape')
print msg

Output:

The current default encoding Python: ascii
[u'Huey', u'Dewey', u'C\xf4\u0323ng']
[u'Huey', u'Dewey', u'Cộng']

But if I delete the first line, #coding:utf-8, the script fails with an error on line 4:

SyntaxError: Non-ASCII character '\xc3' in file E:/ex20.py on line 4, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Lương Quang Mạnh wrote at 01:56 on 01/10/2018

First of all, you should wrap your code in a pair of ``` to syntax highlight it. If you show your code like that, I don’t know where a code block starts and ends. Furthermore, it looks like the first line starts with a #, which the markdown parser identifies as a header.

Edited:
# encoding: utf-8 on the first (or second) line of a Python source file declares the encoding of that file, which defaults to ASCII in Python 2. Without the declaration, the interpreter can't make sense of 'Cộng', because the ASCII table doesn't contain 'ộ'.
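
The declaration only covers how the source file itself is read. The original UnicodeDecodeError is a separate, runtime matter: in Python 2, `unicode(item)` with no encoding argument decodes with the default codec, which the output above reports as ascii. A rough sketch of that difference, using explicit `.decode()` calls so it behaves the same on Python 2 and 3. The variable names are only illustrative, and the bytes assume the precomposed form of 'ộ'.

```python
# -*- coding: utf-8 -*-

# UTF-8 bytes for "Cộng": every byte of the non-ASCII character is above 0x7f.
raw = b'C\xe1\xbb\x99ng'

try:
    # Roughly what unicode(raw) attempts when the default encoding is ascii.
    raw.decode('ascii')
    ascii_ok = True
    failure = None
except UnicodeDecodeError as exc:
    ascii_ok = False
    failure = str(exc)

# Naming the real encoding lets the same bytes decode cleanly.
text = raw.decode('utf-8')
```

So the file needs the coding line to be parsed at all, and each string needs an explicit 'utf-8' to be decoded at runtime.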
