Matt discusses Llama 2 with GPTQ quantization, a method of quantizing model weights that is much more powerful than earlier approaches, and demos text-generation-webui.