
Convert HTML character code to char in Java

Question:

Our XML feed gives us encoded UTF-8 characters inside an ISO-8859-1 file. This is being fed into the database, so the text is ISO-8859-1 encoded and contains the following:

&#37329;&#34701;&#24066;&#22330;

Is there a way to convert that into a normal Java string? Similar to:

String str = fromHtmlUtf8("&#37329;&#34701;&#24066;&#22330;");

Where the resulting str will contain normal UTF-8 characters: Chinese in this case, but it can be quite mixed.

Thanks.

Answer1:

You can use StringEscapeUtils from Apache Commons Lang: http://commons.apache.org/lang/api-2.6/org/apache/commons/lang/StringEscapeUtils.html

Please search before asking next time: How to convert from HTML to UTF-8 in java (https://stackoverflow.com/questions/2825985/how-to-convert-from-html-to-utf-8-in-java/2826064#2826064)
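With Commons Lang 2.x on the classpath, the call would be StringEscapeUtils.unescapeHtml(text) (unescapeHtml4 in Commons Lang 3). If adding a dependency is not an option, the same idea can be sketched with the standard library alone. The following is only a minimal sketch, assuming the feed contains decimal or hexadecimal numeric character references (such as &#37329; or &#x91D1;) and no named entities; the class name NumericEntityDecoder is hypothetical, and fromHtmlUtf8 is simply the helper name used in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericEntityDecoder {

    // Matches hexadecimal (&#x91D1;) and decimal (&#37329;) numeric character references.
    // Named entities such as &amp; are NOT handled by this sketch.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Hypothetical helper, named after the method the question asks for.
    static String fromHtmlUtf8(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Group 1 holds hex digits, group 2 holds decimal digits.
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            // Leave the reference untouched if it is not a valid Unicode code point.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = fromHtmlUtf8("&#37329;&#34701;&#24066;&#22330;");
        // Compare against Unicode escapes so the check does not depend on console encoding.
        System.out.println(decoded.equals("\u91D1\u878D\u5E02\u573A"));
    }
}
```

Character.toChars is used rather than a plain (char) cast so that code points outside the Basic Multilingual Plane come out as proper surrogate pairs. A Java String is UTF-16 internally; UTF-8 only comes into play when the decoded string is written out or stored, so the database connection and column also need an encoding that can represent these characters.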

Answer2:

If you need a small library for this, you can use HTMLEntities:

http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=htmlentities
