In a QLoRA setup, the precision of both the weights and the activations is controlled by the quantization parameters you set when loading the model: the Transformer's base weights are stored in 4-bit NF4, bnb_4bit_use_double_quant additionally compresses the quantization constants, and the activations (intermediate computations) run in the dtype given by bnb_4bit_compute_dtype.
Here is a minimal sketch you can refer to (it assumes the Hugging Face transformers + peft + bitsandbytes stack; the model id and LoRA hyperparameters are placeholders):

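```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works

# 4-bit quantization settings applied when the base model is loaded
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store base weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NF4 data type for the quantized weights
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for activations and matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters are inserted into the listed modules and trained in higher precision
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```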
In the above code, we use the following key strategies:

- Enables double quantization (bnb_4bit_use_double_quant=True), which quantizes the quantization constants themselves for additional memory savings.
- Uses NF4 (normal float 4-bit, bnb_4bit_quant_type="nf4") for the base weights, which matches their roughly normal distribution better than plain int4, while activations and matrix multiplications run in bnb_4bit_compute_dtype.
- Specifies target_modules so the LoRA adapters are inserted only into the chosen attention projections.
- Runs entirely on the Hugging Face transformers + peft + bitsandbytes quantization backend.
Hence, quantization in QLoRA is achieved by configuring these quantization-aware loading parameters: the Transformer's weights are compressed to 4-bit NF4 (with double quantization of the constants), while intermediate activations are kept in the chosen compute dtype so training remains numerically stable.
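As a quick sanity check (optional, and assuming the model object from the sketch above), you can confirm that the quantized base weights are frozen and only the LoRA adapters remain trainable:

```python
# Memory footprint reflects the 4-bit storage of the base weights
print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")

# Only the LoRA adapter parameters should require gradients
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable params: {trainable:,} / {total:,} ({100 * trainable / total:.2f}%)")
```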