Understanding Self-Attention in Transformers
Why Data Scientists Should Test Cheap Hypotheses First
What Makes LLMs “Creative”
Top-k: LLM Lazy Sampling
Top-p: LLM Nucleus Sampling
The Dice inside ChatGPT
LLM Creativity Fuel
The Creative Shell