Description

Matt discusses running Llama 2 with GPTQ quantization, a post-training method that he describes as much more powerful than earlier ways of quantizing model weights, and demos text-generation-webui.