Utf8

From Linuxmemo.

Code in Vim

For instance, in the Vim text editor you would enter insert mode, press CTRL+V, then u, and then type the code point as a 4-digit hexadecimal number (pad with zeros if necessary; an uppercase U accepts up to 8 digits instead). So you would type

CTRL+V u 2 6 2 0

Code in the shell

In a terminal running Bash you would type CTRL+SHIFT+U followed by the hexadecimal code point of the character you want. (The shortcut is handled by the terminal's input method, typically in GTK-based terminals, not by Bash itself.) During input your cursor should show an underlined u. The first non-digit you type ends the input and renders the character. So you should be able to print U+2620 in Bash using the following:

echo CTRL+SHIFT+U 2620 ENTER ENTER

(The first enter ends Unicode input, the second runs the echo command)

Code with echo

In UTF-8, U+2620 is actually encoded as 3 bytes (6 hexadecimal digits): E2 98 A0.

$ echo -e "\xE2\x98\xA0"
☠

To check how it is encoded by your console, you can use hexdump (the -C option lists the bytes one by one, in order):

echo -n ☠ | hexdump -C
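
The three bytes can also be derived by hand from the code point. The following is a minimal sketch, assuming Bash; the function name codepoint_to_utf8_hex is hypothetical and simply applies the standard UTF-8 bit layout (1 to 4 bytes depending on the size of the code point).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the UTF-8 bytes (in hex) of a code point.
# Argument: the code point as hexadecimal digits, without the "U+" prefix.
codepoint_to_utf8_hex() {
    local cp=$((0x$1))
    if (( cp < 0x80 )); then
        # 1 byte: 0xxxxxxx
        printf '%02X\n' "$cp"
    elif (( cp < 0x800 )); then
        # 2 bytes: 110xxxxx 10xxxxxx
        printf '%02X %02X\n' \
            $(( 0xC0 | (cp >> 6) )) \
            $(( 0x80 | (cp & 0x3F) ))
    elif (( cp < 0x10000 )); then
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        printf '%02X %02X %02X\n' \
            $(( 0xE0 | (cp >> 12) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) ))
    else
        # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        printf '%02X %02X %02X %02X\n' \
            $(( 0xF0 | (cp >> 18) )) \
            $(( 0x80 | ((cp >> 12) & 0x3F) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) ))
    fi
}

codepoint_to_utf8_hex 2620   # prints E2 98 A0
```

The sketch does not validate its input (surrogates and values above U+10FFFF are not rejected).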

Code with printf

  • via printf
printf '\xc3\xa0'
  • via the shell (ANSI-C quoting):
printf $'\xc3\xa0'
  • the same in ksh (88/93):
typeset -i8 a=16#c3 b=16#a0
print "\0${a#8\#}\0${b#8\#}"
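
To confirm that the printf forms above emit the same two bytes, their output can be piped through od. This is a small sketch, assuming Bash and a standard od; the helper name hex_bytes is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the bytes read from standard input as lowercase hex.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

printf '\xc3\xa0' | hex_bytes     # escape interpreted by printf, prints c3a0
printf $'\xc3\xa0' | hex_bytes    # escape expanded by the shell, prints c3a0
```

In the first form printf itself interprets the escape sequences; in the second the shell has already replaced them with raw bytes before printf runs.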

UTF-8 table

http://www.utf8-chartable.de/

Useful

\xHH       for hexadecimal byte values (1 to 2 digits)
\uNNNN     for Unicode code points (4 hexadecimal digits)
\NNN       for octal byte values (1 to 3 digits)
\b0101110  for binary codes (not understood by Bash's echo or printf, where \b is a backspace)
  • the carriage-return symbol (␍) is the character "\u240D"; the carriage-return control character itself is \x0D or \r
  • a tab is \x09 or \u0009
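
As an illustration of the hexadecimal and octal escapes, here is a short sketch assuming Bash's builtin printf (the variable names are only for the example). The \u form is left out because it additionally requires Bash 4.2 or later and, for non-ASCII characters, a UTF-8 locale.

```shell
#!/usr/bin/env bash
# The same letter written as a hexadecimal and as an octal escape.
hex_a=$(printf '\x41')      # \xHH: byte 0x41
oct_a=$(printf '\101')      # \NNN: octal 101, the same byte
printf '%s %s\n' "$hex_a" "$oct_a"    # prints A A

# \x takes at most two hex digits, so the "b" after \x09 is a literal letter.
tab_hex=$(printf 'a\x09b')
tab_named=$(printf 'a\tb')
if [ "$tab_hex" = "$tab_named" ]; then
    echo '\x09 and \t are the same character'
fi
```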

Conversion from the shell

  • Decimal to Hexadecimal
echo 'obase=16;10' | bc
A

Or

wcalc -h 10
  • Decimal to Octal
echo 'obase=8;10' | bc
12

Or

wcalc -o 10
  • Decimal to Binary
echo 'obase=2;10' | bc
1010

Or

wcalc -b 10
  • From Hexadecimal to decimal
echo 'ibase=16;A' | bc
10
  • From Octal to Decimal
echo 'ibase=8;12' | bc
10
  • From Binary to Decimal
echo 'ibase=2;1010' | bc
10
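
When neither bc nor wcalc is installed, the same conversions can be sketched with Bash builtins alone. This assumes Bash (the base#number arithmetic syntax is not POSIX); the function names to_decimal and to_binary are hypothetical.

```shell
#!/usr/bin/env bash
# Decimal to hexadecimal and octal: printf handles these directly.
printf '%X\n' 10    # prints A
printf '%o\n' 10    # prints 12

# Hypothetical helper: convert NUMBER written in base BASE to decimal.
# Usage: to_decimal NUMBER BASE
to_decimal() {
    echo $(( $2#$1 ))
}

# Hypothetical helper: convert a non-negative decimal integer to binary.
# printf has no binary format, so the digits are built by repeated division.
to_binary() {
    local n=$1 bits=
    (( n == 0 )) && bits=0
    while (( n > 0 )); do
        bits=$(( n % 2 ))$bits
        n=$(( n / 2 ))
    done
    echo "$bits"
}

to_decimal A 16     # prints 10
to_decimal 12 8     # prints 10
to_decimal 1010 2   # prints 10
to_binary 10        # prints 1010
```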