
decoding

Dave the human
Author
Homo sapiens in the loop

When a tokenizer decodes (through its decode method), it converts a list of token IDs back into natural-language text.

# `tokenizer` is assumed to be an already loaded tokenizer object
input_token_ids_list = [19152, 20238, 11, 358, 646, 944, 653, 429]
text = tokenizer.decode(input_token_ids_list)
print(text)
# Output: Sorry Dave, I can't do that

The reverse direction, from natural language to IDs, is called encoding.
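
To make the two directions concrete, here is a minimal sketch of a toy tokenizer. Everything in it is hypothetical: the vocabulary, the IDs, and the names `TOY_VOCAB`, `toy_decode` and `toy_encode` are made up for illustration and do not match any real tokenizer. Decoding is shown as a plain ID-to-token lookup followed by concatenation. Encoding is shown as a greedy longest-match scan, which is a simplification; real tokenizers (BPE ones, for example) apply learned merge rules instead.

```python
# Toy illustration only: this vocabulary and its IDs are invented and do not
# correspond to any real tokenizer.
TOY_VOCAB = {
    "Sorry": 0,
    " Dave": 1,
    ",": 2,
    " I": 3,
    " can": 4,
    "'t": 5,
    " do": 6,
    " that": 7,
}
# Reverse mapping used for decoding: ID -> token string.
TOY_ID_TO_TOKEN = {token_id: token for token, token_id in TOY_VOCAB.items()}


def toy_decode(token_ids):
    """Map each ID to its token string and concatenate the pieces."""
    return "".join(TOY_ID_TO_TOKEN[token_id] for token_id in token_ids)


def toy_encode(text):
    """Greedy longest-match encoding against the toy vocabulary.

    This is a simplification chosen for brevity, not how BPE encodes.
    """
    token_ids = []
    position = 0
    while position < len(text):
        # Pick the longest vocabulary entry that matches at this position.
        match = max(
            (token for token in TOY_VOCAB if text.startswith(token, position)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no toy token matches at position {position}")
        token_ids.append(TOY_VOCAB[match])
        position += len(match)
    return token_ids


ids = toy_encode("Sorry Dave, I can't do that")
text = toy_decode(ids)
print(ids)
print(text)
```

Encoding the sentence and then decoding the resulting IDs returns the original text, which is the round trip the two terms describe.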


 encoding BPE 
