Jonatan Mata (jonmatum.com)


© 2026 Jonatan Mata. All rights reserved. v2.1.1

#inference

2 articles tagged #inference.

Inference Optimization
Techniques for reducing the cost, latency, and resource use of running language models in production, from quantization to distributed serving.
seed #inference #optimization #quantization #latency #serving #llm #performance
llms.txt
Proposed standard for publishing a Markdown file at a website's root that enables language models to efficiently understand and use the site's content at inference time.
growing #llms-txt #ai #web-standards #seo #agents #markdown #inference