python: json.dumps can't handle utf-8?

Below is the test program, including a Chinese character:

# -*- coding: utf-8 -*-
import json

j = {"d":"中", "e":"a"}
json = json.dumps(j, encoding="utf-8")

print json

Below is the result. Look, json.dumps converts the UTF-8 character to its numeric code point!

{"e": "a", "d": "\u4e2d"}

Why is this broken? Or am I doing something wrong?


You should read json.org. The complete JSON specification is in the white box on the right.

There is nothing wrong with the generated JSON. Generators are allowed to generate either UTF-8 strings or plain ASCII strings in which non-ASCII characters are escaped with the \uXXXX notation. In your case, the Python json module chose escaping, and 中 has the escaped notation \u4e2d.

By the way: Any conforming JSON interpreter will correctly unescape this sequence again and give you back the actual character.
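As a rough illustration of that round trip, here is a minimal sketch using the standard library json module. The variable names are made up for this example, and the character is written as u"\u4e2d" only so that the snippet does not depend on the source file encoding.

```python
# -*- coding: utf-8 -*-
import json

original = {"d": u"\u4e2d"}          # u"\u4e2d" is the character 中

# Default behaviour (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(original)

# Parsing the escaped text restores the actual character.
decoded = json.loads(escaped)

print(escaped)
print(decoded == original)
```

The escaped text and the original data carry the same information; only the textual representation differs.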

Looks like valid JSON to me. If you want json to output a string that has non-ASCII characters in it then you need to pass ensure_ascii=False and then encode manually afterward.
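A minimal sketch of what "pass ensure_ascii=False and then encode manually" might look like with the standard library json module. The names text and raw are hypothetical, and the input is a unicode string so that no encoding argument is needed (Python 3's json.dumps does not accept one).

```python
# -*- coding: utf-8 -*-
import json

data = {"d": u"\u4e2d"}              # u"\u4e2d" is the character 中

# Keep non-ASCII characters as they are instead of escaping them.
text = json.dumps(data, ensure_ascii=False)

# Encode manually when bytes are needed, e.g. for a file or a socket.
raw = text.encode("utf-8")

print(repr(raw))
```

Whether the character then shows up correctly when printed still depends on the encoding of the terminal or file it is written to.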

Use simplejson with the mentioned options:

# -*- coding: utf-8 -*-
import simplejson as json

j = {"d":"中", "e":"a"}
# Use a separate name so the json module is not shadowed by the result
s = json.dumps(j, ensure_ascii=False, encoding="utf-8")

print s


{"e": "a", "d": "中"}
