[Python] Convert string to ASCII code

Let's first explain the relationship between ASCII, Unicode, and UTF-8.

ASCII came first. It defines 128 characters: one byte can represent 256 values, but ASCII uses only 7 bits and leaves the highest bit as 0, so only 128 are available. As computing spread, 128 characters were far too few (for the world's many languages, for example), so Unicode was introduced. Unicode, however, only assigns each character a number (its code point); it does not specify how that number is stored as bytes. UTF-8 was introduced for that purpose and is one of the encodings that implement Unicode. For English letters, and every other ASCII character, the UTF-8 encoding is identical to the ASCII code.
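The relationship described above can be checked directly in Python. The following is a minimal sketch; the sample strings are chosen here for illustration and are not part of the original article.

```python
# Sample strings chosen for illustration; they are not from the original article.
english = "hello"
chinese = "中"

# For ASCII characters, the ASCII and UTF-8 encodings yield identical bytes.
ascii_bytes = english.encode("ascii")
utf8_bytes = english.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# A character outside ASCII has a single Unicode code point,
# but needs several bytes when stored as UTF-8.
code_point = ord(chinese)
utf8_length = len(chinese.encode("utf-8"))

# ASCII cannot represent it at all, so encoding raises UnicodeEncodeError.
try:
    chinese.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(same_bytes, code_point, utf8_length, ascii_ok)
```

The code point (what Unicode defines) stays one number, while the byte count depends on the encoding (what UTF-8 defines).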

In Python 3, strings are Unicode by default.
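One way to see what this means in practice is to compare a string with its encoded bytes. This is only an illustrative sketch; the sample text is an assumption and does not come from the original article.

```python
text = "中文"  # illustrative sample, not from the original article

# len() on a str counts Unicode code points, not bytes.
n_chars = len(text)

# Encoding produces a bytes object whose length depends on the encoding.
n_utf8_bytes = len(text.encode("utf-8"))

# Each element of a str is a one-character str; ord() returns its code point.
code_points = [ord(ch) for ch in text]

print(n_chars, n_utf8_bytes, code_points)
```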

For a single character use the following:

>>> ord('A')
65
>>> ord('中')
20013
>>> chr(66)
'B'
>>> chr(25991)
'文'

 

 

 


For a long string use the following:

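import numpy as np

# Avoid naming variables str or ascii: both shadow Python built-ins.
text = 'hello world'

# np.fromstring is deprecated for binary data; encode to bytes and use np.frombuffer.
codes = np.frombuffer(text.encode('ascii'), dtype=np.uint8)

print(codes)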

The output is [104 101 108 108 111  32 119 111 114 108 100]
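If NumPy is not needed, the same codes can be obtained with the standard library alone. The sketch below uses a hypothetical helper, to_ascii_codes, introduced here for illustration. It assumes the input should be strictly ASCII and lets Python raise an error otherwise.

```python
def to_ascii_codes(text):
    """Return the ASCII codes of text as a list of ints (hypothetical helper)."""
    # str.encode("ascii") raises UnicodeEncodeError if text contains
    # any character outside the 128 ASCII characters.
    return list(text.encode("ascii"))


codes = to_ascii_codes("hello world")
print(codes)

# The conversion is reversible: bytes() rebuilds the byte string from the codes.
restored = bytes(codes).decode("ascii")
print(restored)
```

Iterating over a bytes object yields integers, which is why list() is enough here. For text that may contain non-ASCII characters, [ord(ch) for ch in text] returns Unicode code points instead.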
