KV cache and tokenizer bugs squashed; local inference finally viable.
Latest llama-server build automatically migrates local cache without warning, disrupting established workflows.