Question:

What is the difference between UTF-8 and Unicode? How do I convert a file from UTF-8 to Unicode?


2 ANSWERS


  1. Unicode is a character encoding standard that, according to the rules of the Unicode Consortium, can be implemented in three encoding forms: UTF-8, UTF-16, and UTF-32, in which “UTF” stands for “Unicode Transformation Format”. These forms are actually implemented as one of seven different encoding schemes (UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and UTF-32LE), depending on whether the most significant or least significant byte of each code unit comes first, and whether the file is preceded by a BOM (byte-order mark), that is, the Unicode character U+FEFF.

    All encodings are equally official “Unicode”, but Microsoft Windows tends to wrongly use “Unicode” to mean its own internal form of Unicode, which is UTF-16 Little Endian, as contrasted with UTF-8, UTF-16 Big Endian, and UTF-32.

    Normally your browser should convert properly, if necessary. If it doesn’t, usually because the encoding used is not labeled properly, you can select a different encoding (code page) in the browser. Since there are numerous code pages available, it may take several attempts before you find one that works.

    If you have a file in UTF-8 text format and want to convert it to UTF-16 text format, you can do so with many of the editors and word processors available today. For example, if you are using Windows, open the file in Notepad, identify the file in the Open dialog box as text, and identify its type as “Unicode Text Documents (*.txt)”. Then save it in that same format. It will now actually be in UTF-16 format.

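    Some other editors and word processors will make you specifically choose UTF-8 or UTF-16, and which byte order (big endian or little endian) you want. For most purposes the choice won’t matter much, as long as whatever reads the file is told, or can detect from the BOM, which one was used.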
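For readers who would rather script the conversion than go through an editor, here is a minimal sketch in Python. It is not part of the original answer. The function name `convert_utf8_to_utf16` and the temporary file names are made up for illustration, and the sketch assumes the input really is valid UTF-8.

```python
import codecs
import os
import tempfile


def convert_utf8_to_utf16(src_path, dst_path):
    """Re-encode a UTF-8 text file as UTF-16 (hypothetical helper)."""
    # "utf-8-sig" reads plain UTF-8 and also strips a leading UTF-8 BOM if present.
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # Python's "utf-16" codec writes a BOM first, then uses the machine's
    # native byte order (little endian on typical PCs).
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    src = os.path.join(folder, "input_utf8.txt")    # illustrative file name
    dst = os.path.join(folder, "output_utf16.txt")  # illustrative file name
    with open(src, "wb") as f:
        f.write("héllo €\n".encode("utf-8"))
    convert_utf8_to_utf16(src, dst)
    with open(dst, "rb") as f:
        data = f.read()
    print("starts with a UTF-16 BOM:", data.startswith(codecs.BOM_UTF16))
    print("text survived the round trip:", data.decode("utf-16") == "héllo €\n")
```

To force a specific byte order without a BOM, the codec names `"utf-16-le"` and `"utf-16-be"` can be passed as the output encoding instead of `"utf-16"`.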


  2. Well, I'm not sure how you are planning to use the answer, because UTF-8 and Unicode are not really rivals to compare.

    Unicode is just code tables that assign integer numbers to characters.

    e.g., A = U+0041 (hexadecimal 41, decimal 65)

    UTF-8, on the other hand, is an implementation of Unicode. There exist several alternatives for how a sequence of characters and their respective integer values can be represented as a sequence of bytes, and UTF-8 is one of them.

    So basically, if you have a UTF-8 file, it's already a Unicode file!
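The distinction drawn in this answer, between code points (integers) and encodings (bytes), can be seen in a few lines of Python. This is only an illustrative sketch and not part of the original answer. The helper names `code_points` and `encoded_forms` are invented for the example.

```python
def code_points(text):
    """Return the Unicode code point of each character, e.g. 'U+0041'."""
    return [f"U+{ord(ch):04X}" for ch in text]


def encoded_forms(text):
    """Serialize the same text under several Unicode encodings."""
    names = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")
    return {name: text.encode(name) for name in names}


if __name__ == "__main__":
    sample = "A€"
    # Unicode itself: characters mapped to integers.
    print(code_points(sample))  # ['U+0041', 'U+20AC']
    # Encodings: the same integers written out as different byte sequences.
    for name, raw in encoded_forms(sample).items():
        print(f"{name:10} {raw.hex(' ')}")
    # utf-8      41 e2 82 ac
    # utf-16-le  41 00 ac 20
    # utf-16-be  00 41 20 ac
    # utf-32-le  41 00 00 00 ac 20 00 00
    # utf-32-be  00 00 00 41 00 00 20 ac
```

Every one of these byte sequences decodes back to the same two code points, which is why a UTF-8 file already counts as a Unicode file.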
