QuIP#: An Alternative to GGUF

When quantizing very large models, such as those with 120B or 180B parameters, it is highly recommended to first do a single pass at a low bit-width to test performance.

QuIP# makes it much easier to quantize large models, so you can test several of them at the same quantization level.

For instance, I can compare side by side the performance of different models that are built like Goliath: big.
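A common way to run that side-by-side comparison is to compute perplexity on the same held-out text for each quantized model. As a minimal sketch (the per-token negative log-likelihoods here are hypothetical placeholders, in practice you would collect them from each model's forward pass):

```python
import math

def perplexity(nlls):
    """Perplexity from per-token negative log-likelihoods (natural log).

    Lower perplexity means the model assigns higher probability
    to the evaluation text.
    """
    return math.exp(sum(nlls) / len(nlls))

# Hypothetical per-token NLLs from two models quantized at the same level
model_a_nlls = [2.1, 1.8, 2.4, 2.0]
model_b_nlls = [2.3, 2.0, 2.6, 2.2]

print(f"model A perplexity: {perplexity(model_a_nlls):.2f}")
print(f"model B perplexity: {perplexity(model_b_nlls):.2f}")
```

Because both models are measured at the same bit-width on the same text, any perplexity gap reflects the models themselves rather than the quantization level.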

The GitHub repo includes instructions on how to install and run the software.