Jonatan Mata (jonmatum.com)
© 2026 Jonatan Mata. All rights reserved.

#inference

2 articles tagged #inference.

  • Inference Optimization

    Techniques to reduce cost, latency, and resources needed to run language models in production, from quantization to distributed serving.

seed · #inference #optimization #quantization #latency #serving #llm #performance
  • llms.txt

    A proposed standard for publishing a Markdown file at a website's root that lets language models efficiently understand and use the site's content at inference time.

growing · #llms-txt #ai #web-standards #seo #agents #markdown #inference
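For readers unfamiliar with the second entry's format, a minimal llms.txt sketch following the proposal's structure (an H1 title, a blockquote summary, then H2 sections of links); the site name, URLs, and descriptions below are placeholders, not content from any real site.

```markdown
# Example Site

> A short plain-language summary of what this site is about,
> written for language models to read at inference time.

## Docs

- [Getting started](https://example.com/start.md): setup and first steps
- [API reference](https://example.com/api.md): endpoints and parameters

## Optional

- [Changelog](https://example.com/changelog.md): release history
```

The file lives at the site root (e.g. `/llms.txt`) so agents can fetch one compact Markdown document instead of crawling HTML.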
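The first entry mentions quantization as a cost-reduction technique. As a minimal sketch (not the article's own code), here is symmetric per-tensor int8 weight quantization; function names and the sample weights are illustrative.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # Symmetric per-tensor quantization: map floats into [-127, 127]
    # using a single scale derived from the largest absolute weight.
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate float weights from the int8 representation.
    return q.astype(np.float32) * scale

weights = np.array([-1.5, 0.2, 0.0, 0.9], dtype=np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
```

The round-trip error is bounded by half the scale, which is why int8 storage (4x smaller than float32) often costs little accuracy for inference.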