jonmatum · alpha

© 2026 Jonatan Mata · alpha · v0.1.0

#preprocessing

1 article tagged #preprocessing.

  • Tokenization

    The process of splitting text into discrete units (tokens) that a language model can map to numeric IDs. It is fundamental to how LLMs read and generate text.

    seed · #tokenization #bpe #tokens #nlp #llm #preprocessing