How can GPTQ quantization be applied to reduce the memory footprint of a 65B-parameter model?

Can you tell me how GPTQ quantization can be applied to reduce the memory footprint of a 65B-parameter model?
