Django legacy database encoding

Question:

I'm sure this question is not specific to Django, but since I couldn't find a solution in other questions about Python and encodings, I'm asking it here. I need to add new features to an existing website written in PHP with MySQL as the backend. I inspected the database and created models for the tables I am going to use. However, there is a problem with the existing data: half of it is in Russian, and (at least it seems to me) it is UTF-8 encoded. When I show that data in Django's admin, it doesn't appear correctly.

    In [52]: p.name
    Out[52]: u'\xd0\u02dc\xd0\xb3\xd0\xbe\xd1\u20ac\xd1\u0152 '

    In [53]: repr(p.name)
    Out[53]: "u'\\xd0\\u02dc\\xd0\\xb3\\xd0\\xbe\\xd1\\u20ac\\xd1\\u0152 '"

In the Django admin it displays like this:

Ð˜Ð³Ð¾Ñ€ÑŒ

Encodings are still a bit of a mystery to me, but if I understand this output correctly, these are basically UTF-8 bytes stored in a unicode object.
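A minimal sketch of that reading, written for Python 3 rather than the Python 2 session above. It assumes the text was UTF-8 that got decoded one byte at a time as cp1252 (which is what MySQL calls latin1); the helper names `to_original_bytes` and `repair_mojibake` are made up for illustration. If the assumption holds, mapping each character back to its byte and decoding as UTF-8 should recover the name:

```python
# Value copied from the IPython session above, as a Python 3 str literal.
garbled = '\xd0\u02dc\xd0\xb3\xd0\xbe\xd1\u20ac\xd1\u0152 '


def to_original_bytes(text):
    """Map each character back to the single byte it was decoded from."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode('cp1252')
        except UnicodeEncodeError:
            # Assumption: MySQL's latin1 maps the five bytes that Python's
            # cp1252 codec leaves undefined (0x81, 0x8d, 0x8f, 0x90, 0x9d)
            # to the code points with the same number, so fall back to latin-1.
            out += ch.encode('latin-1')
    return bytes(out)


def repair_mojibake(text):
    """Reinterpret mis-decoded text as the UTF-8 it originally was."""
    return to_original_bytes(text).decode('utf-8')


print(repr(repair_mojibake(garbled)))  # 'Игорь ' (trailing space kept)
```

This only diagnoses the value in Python; it does not change what is stored in the database or what the PHP front-end sees.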

The question: is it possible to fix this in Django's database layer? I'm going to update existing content in these tables, and I need the existing PHP front-end to stay compatible with both the new data and the old.

When I add these database options, the data is displayed correctly in the admin; however, I get a UnicodeEncodeError when saving something.

    DATABASE_OPTIONS = {
        'charset': 'latin1',
        'use_unicode': False,
    }

The name returned in this case is:

    In [2]: p2.name
    Out[2]: '\xd0\x9b\xd0\xae\xd0\xa1\xd0\xaf'

I checked against a UTF-8 character table, and those are the correct characters for the data stored in that row.
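The same check can be done in code instead of with a character table. The sketch below (Python 3) decodes the byte string from the session above as UTF-8, then tries to encode the result as latin-1. The second step is an assumption about where the save error comes from: with 'charset': 'latin1', the driver presumably has to encode outgoing text as latin1, and Cyrillic has no latin1 representation.

```python
# Byte string copied from the session above.
raw = b'\xd0\x9b\xd0\xae\xd0\xa1\xd0\xaf'

# Decoding as UTF-8 yields readable Cyrillic, so the stored bytes are UTF-8.
name = raw.decode('utf-8')
print(name)  # ЛЮСЯ

# Assumed cause of the error on save: the text cannot be encoded as latin-1.
try:
    name.encode('latin-1')
    save_would_fail = False
except UnicodeEncodeError:
    save_would_fail = True

print(save_would_fail)  # True
```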

Answer1:

Check your MySQL connection parameters. You can also specify DATABASE_OPTIONS:

    DATABASE_OPTIONS = {
        "charset": "utf8",
        "init_command": "SET storage_engine=InnoDB",
    }

But check whether the data really is UTF-8. Also note that the connection and server encodings must be in sync.

Answer2:

Actually, the problem was the database's previous character set and collation: it was latin1, but the data had been inserted using the utf-8 charset. It was solved by exporting the data using the latin1 charset, replacing all occurrences of latin1 with utf8, and importing the data again. This answer shows how to do this: MySQL Convert latin1 data to UTF8 (https://stackoverflow.com/questions/1440837/mysql-convert-latin1-data-to-utf8/1939896#1939896)
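The replacement step could be scripted roughly as follows. This is only a sketch, not the procedure from the linked answer: `retag_dump` is a hypothetical helper, and the sample dump text is invented. Instead of replacing every occurrence of latin1, it rewrites only charset and collation declarations, so a row that happens to contain the word "latin1" is left alone. It assumes every latin1 collation in the dump has a utf8 counterpart with the same suffix, which is worth verifying before importing.

```python
import re

# Matches "latin1" only where it follows a charset or collation keyword.
_DECLARATION = re.compile(
    r'((?:CHARSET|CHARACTER\s+SET|SET\s+NAMES|COLLATE)\s*=?\s*)latin1',
    re.IGNORECASE,
)


def retag_dump(sql_text):
    """Relabel latin1 declarations in a SQL dump as utf8; data is untouched."""
    return _DECLARATION.sub(r'\1utf8', sql_text)


# Invented sample standing in for a dump exported with the latin1 charset.
sample = (
    "/*!40101 SET NAMES latin1 */;\n"
    "CREATE TABLE t (name varchar(50) CHARACTER SET latin1 "
    "COLLATE latin1_swedish_ci) DEFAULT CHARSET=latin1;\n"
    "INSERT INTO t VALUES ('latin1 text');\n"
)

print(retag_dump(sample))
```

For a real dump, read the file as bytes and decode it as UTF-8 first, work on a copy, and keep the original export as a backup.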
