https://cnichkawde.github.io/statespacesequencemodels.html
Beyond Transformers: Structured State Space Sequence Models
Chetan Nichkawde
January 22, 2024
“A new paradigm is rapidly evolving within the realm of sequence modeling that presents a marked advancement over the
What I Read: LLM Evaluation Metrics
https://www.confident-ai.com/blog/llm-evaluation-metrics-everything-you-need-for-llm-evaluation
LLM Evaluation Metrics: Everything You Need for LLM Evaluation
Jeffrey Ip
January 22, 2024
“This article will teach you everything you need to know about LLM evaluation metrics…”
What I Read: Distributed Training, Finetuning
https://sumanthrh.com/post/distributed-and-efficient-finetuning/
Everything about Distributed Training and Efficient Finetuning
Sumanth R Hegde
Last updated on Oct 13, 2023
“practical guidelines and gotchas with multi-GPU and multi-node training”
What I Read: To Understand Transformers, Focus on Attention
https://drscotthawley.github.io/blog/posts/Transformers1-Attention.html
To Understand Transformers, Focus on Attention
Scott H. Hawley
August 21, 2023
“To Understand Transformers, Focus on Attention”