The maximum number of tokens an LLM can process in a single interaction, determining how much information it can consider simultaneously to generate responses.
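A common consequence of this token limit is that long inputs must be trimmed to fit before they are sent to the model. The sketch below illustrates the idea under simplifying assumptions: `CONTEXT_WINDOW` is a hypothetical budget, and whitespace splitting stands in for a real subword tokenizer, which actual LLMs use instead.

```python
# Illustrative sketch of fitting text into a fixed context window.
# CONTEXT_WINDOW and the whitespace "tokenizer" are simplifications;
# real models use subword tokenizers and model-specific limits.
CONTEXT_WINDOW = 8  # hypothetical token budget


def tokenize(text: str) -> list[str]:
    # Stand-in for a real subword tokenizer.
    return text.split()


def truncate_to_window(text: str, limit: int = CONTEXT_WINDOW) -> list[str]:
    # Keep only the most recent tokens that fit in the window,
    # mirroring how chat histories are commonly trimmed.
    tokens = tokenize(text)
    return tokens[-limit:]


history = "one two three four five six seven eight nine ten"
print(truncate_to_window(history))  # the oldest tokens are dropped first
```

Anything outside the window is simply invisible to the model, which is why long conversations can "forget" their earliest turns.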