https://maharshi.bearblog.dev/optimizing-softmax-cuda Learning CUDA by optimizing softmax: A worklog | Maharshi Pandya | 04 Jan, 2025 “Optimizing softmax, especially in the context of GPU programming with CUDA, presents many opportunities for learning.”
What I Read: Transformers Inference Optimization
https://astralord.github.io/posts/transformer-inference-optimization-toolset Transformers Inference Optimization Toolset | Aleksandr Samarin | Oct 1, 2024 “Large Language Models are pushing the boundaries of artificial intelligence, but their immense size poses significant computational challenges. As these models grow,
What I Read: Nvidia, GPU gold rush
https://blog.johnluttig.com/p/nvidia-envy-understanding-the-gpu Nvidia Envy: understanding the GPU gold rush | John Luttig | Nov 10, 2023 “In 2023, thousands of companies and countries begged Nvidia to purchase more GPUs. Can the exponential demand endure?”