The framework now supports aggressive KV-cache compression, which shrinks the attention key/value cache held in memory during generation and makes on-device models faster to run.
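The article does not say which compression scheme the framework uses. Purely as an illustration, one widely used way to compress a KV cache is to store cached vectors as 8-bit integers with a per-vector scale instead of 32-bit floats. The sketch below assumes that approach; the function names and the sample vector are hypothetical and are not taken from the framework.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-vector int8 quantization (illustrative, not the framework's API).

    Returns integer codes in [-127, 127] and the scale needed to restore them.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when the vector is empty or all zeros.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Approximately restore the original values from codes and scale."""
    return [c * scale for c in codes]


# Hypothetical cached key vector for a single token.
key = [0.12, -1.5, 0.75, 0.0]
codes, scale = quantize_int8(key)
restored = dequantize_int8(codes, scale)

# Each restored value differs from the original by at most about half a step.
max_error = max(abs(a - b) for a, b in zip(key, restored))
```

Each code fits in one byte rather than the four bytes of a 32-bit float, so the cached vectors take roughly a quarter of the space, plus one scale per vector. The compression is lossy: restored values are close to the originals but not identical, which is the trade-off any scheme of this kind makes.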