The most efficient techniques for caching or pre-computing frequently generated responses are:
- Response Caching
- Memoization
- Embeddings Caching
- Indexing
- Pre-Training with Fixed Responses
 
Note that these techniques also reduce the load on the model and improve overall efficiency.
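
As a rough illustration of the first three items, here is a minimal Python sketch. It assumes a hypothetical `fake_model` function standing in for the expensive model call and a toy `embed` function standing in for a real embedding model; `ResponseCache` and `CALLS` are likewise names invented for this example, not part of any library. Response caching stores whole answers keyed by a normalized prompt, while `functools.lru_cache` memoizes the embedding function so repeated inputs are not recomputed.

```python
import hashlib
from functools import lru_cache

# Call counters, only here to show how often the expensive work really runs.
CALLS = {"model": 0, "embed": 0}


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an expensive model call."""
    CALLS["model"] += 1
    return f"answer to: {prompt}"


class ResponseCache:
    """Response caching: keep finished responses keyed by a normalized prompt."""

    def __init__(self, generate):
        self._generate = generate
        self._store = {}

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts share an entry.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> str:
        key = self._key(prompt)
        if key not in self._store:
            # Cache miss: call the model once and remember the result.
            self._store[key] = self._generate(prompt)
        return self._store[key]


@lru_cache(maxsize=1024)
def embed(text: str) -> tuple:
    """Embeddings caching via memoization (toy vector, not a real embedding)."""
    CALLS["embed"] += 1
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return tuple(float(b) for b in digest[:4])
```

With this sketch, `ResponseCache(fake_model).get(...)` calls the model only the first time a given (normalized) prompt is seen, and `embed("hello")` computes its vector once no matter how often it is requested. A production setup would likely also need an eviction policy or expiry time for the response cache, which is left out here.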