encoding

When a tokenizer encodes text (through its encode method), it breaks the natural-language input into tokens and then maps each token to an integer ID.

# Even notes can have code
# Assumes `tokenizer` is an already-loaded tokenizer object with an encode() method
prompt = "Sorry Dave, I can't do that"
input_token_ids_list = tokenizer.encode(prompt)
print(input_token_ids_list)

[19152, 20238, 11, 358, 646, 944, 653, 429]
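To make the two steps (split into tokens, then look up IDs) concrete, here is a minimal toy sketch. It is not how any real tokenizer works: `TOY_VOCAB`, `toy_tokenize`, and `toy_encode` are hypothetical names made up for this note, and the regex split stands in for the learned subword rules that real tokenizers use.

```python
import re

# Hypothetical toy vocabulary; real tokenizers learn a much larger subword vocabulary.
TOY_VOCAB = {
    "<unk>": 0, "Sorry": 1, "Dave": 2, ",": 3, "I": 4,
    "can": 5, "'t": 6, "do": 7, "that": 8,
}

def toy_tokenize(text):
    """Step 1: split text into contraction suffixes, words, and punctuation marks."""
    return re.findall(r"'\w+|\w+|[^\w\s]", text)

def toy_encode(text):
    """Step 2: map each token to its ID, falling back to <unk> for unknown tokens."""
    return [TOY_VOCAB.get(token, TOY_VOCAB["<unk>"]) for token in toy_tokenize(text)]

prompt = "Sorry Dave, I can't do that"
print(toy_tokenize(prompt))
print(toy_encode(prompt))
```

Tokens missing from the toy vocabulary map to the `<unk>` ID here. Real subword tokenizers usually avoid this by splitting unknown words into smaller known pieces.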

The way back from IDs to natural language is called [[decoding]].