TurboQuant compresses AI model vectors from 32 bits down to as few as 3 bits by mapping high-dimensional data onto an efficient quantized grid. (Image: Google Research) The AI industry loves a big ...
flexiGIF shrinks GIF files by optimizing their compression scheme (LZW algorithm). No visual information is changed and the output is 100% pixel-wise identical to the original file - that's why it's ...
Beats Q8_0 perplexity at half the size -- and even beats F16. APEX outperforms Unsloth Dynamic 2.0 (UD) quantizations on perplexity, HellaSwag, and inference speed while being 2x smaller: APEX ...