
tokenizer

Dave the human

A tokenizer is a critical component of the LLM text processing and generation pipeline, even though it is not part of the language model itself. It splits the input text into tokens, which are then converted into numerical IDs that the language model ingests. On the output side, the IDs the model generates are mapped back to text.
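
As a rough illustration of the text-to-IDs step, here is a minimal sketch of a toy word-level tokenizer. The vocabulary, the `<unk>` fallback token, and the function names `tokenize`, `encode`, and `decode` are all made up for this example. Production tokenizers typically learn a much larger subword vocabulary instead of splitting on words, so treat this as a picture of the idea and not of how any particular library works.

```python
import re

# Hypothetical toy vocabulary, invented for this example.
# Real tokenizers learn a far larger (usually subword) vocabulary from data.
VOCAB = {
    "<unk>": 0,  # fallback ID for anything not in the vocabulary
    "a": 1,
    "tokenizer": 2,
    "splits": 3,
    "text": 4,
    "into": 5,
    "tokens": 6,
    ".": 7,
}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def encode(text):
    """Map each token to its numerical ID, using <unk> for unknown tokens."""
    return [VOCAB.get(token, VOCAB["<unk>"]) for token in tokenize(text)]


def decode(ids):
    """Map IDs back to tokens and join them with spaces (lossy on purpose)."""
    return " ".join(ID_TO_TOKEN[token_id] for token_id in ids)


ids = encode("A tokenizer splits text into tokens.")
print(ids)          # [1, 2, 3, 4, 5, 6, 7]
print(decode(ids))  # a tokenizer splits text into tokens .
```

Note that a word outside the toy vocabulary collapses to the `<unk>` ID, so `encode("A llama")` gives `[1, 0]`. Avoiding that kind of information loss is one reason subword vocabularies are commonly used in practice.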

