Monday, June 25, 2018

How do you print a Kannada (South Indian Language) word in Python using UTF-8?

Kannada (South Indian language) characters occupy the Unicode range U+0C80 to U+0CFF.

In Python 3, UTF-8 is the default source encoding, and every string is a sequence of Unicode code points.


Let us take a simple word:
ಕಲಹ
which means 'quarrel'. The letters are simple, and each one can be written using its Unicode code point.
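The original code for this step is not available, so here is a minimal sketch of what it might look like, assuming Python 3 and a terminal that can display UTF-8. The variable names are illustrative only.

```python
# Each Kannada letter written as a Unicode escape (code points in U+0C80..U+0CFF).
ka = "\u0c95"  # ಕ  KANNADA LETTER KA
la = "\u0cb2"  # ಲ  KANNADA LETTER LA
ha = "\u0cb9"  # ಹ  KANNADA LETTER HA

# Print each letter on its own line.
for letter in (ka, la, ha):
    print(letter)
```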

To form the word, we concatenate the characters in order.
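A sketch of the concatenation, again assuming Python 3 with UTF-8 output (names are illustrative):

```python
# The three letters of the word, as Unicode escapes.
ka = "\u0c95"  # ಕ
la = "\u0cb2"  # ಲ
ha = "\u0cb9"  # ಹ

# Ordinary string concatenation builds the word.
word = ka + la + ha
print(word)
```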



What about a word such as ಅರಮನೆ (which means 'palace')?
The first three letters are simple, but the fourth is the simple letter 'ನ' rendered with a diacritic (a vowel sign). Forming it requires concatenating two code points: the base letter ನ followed by the vowel sign ೆ.
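As a rough illustration, assuming Python 3 with UTF-8 output, the fourth letter could be built like this (variable names are hypothetical):

```python
# Base consonant and the vowel sign that attaches to it.
na = "\u0ca8"      # ನ  KANNADA LETTER NA
sign_e = "\u0cc6"  # ೆ  KANNADA VOWEL SIGN E

# The vowel sign must come after the base letter it modifies.
ne = na + sign_e
print(ne)

# One visible letter on screen, but two code points in the string.
print(len(ne))
```

Note that `len()` counts code points, not rendered letters, which is why the length here is larger than the single letter seen on screen.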

To form the word ಅರಮನೆ, we concatenate all the code points in order, with the vowel sign placed immediately after ನ.
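A minimal sketch of the full word, under the same assumptions (Python 3, UTF-8 capable terminal, illustrative names):

```python
# The simple letters.
a = "\u0c85"   # ಅ  KANNADA LETTER A
ra = "\u0cb0"  # ರ  KANNADA LETTER RA
ma = "\u0cae"  # ಮ  KANNADA LETTER MA

# The fourth letter: base consonant followed by its vowel sign.
na = "\u0ca8"      # ನ  KANNADA LETTER NA
sign_e = "\u0cc6"  # ೆ  KANNADA VOWEL SIGN E

palace = a + ra + ma + na + sign_e
print(palace)
```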



