neural network – Page 3 – Andrew Fairless, Ph.D.

What I Watch: How LLMs store facts

By Andrew Fairless on December 16, 2024September 3, 2024

How might LLMs store facts | Chapter 7, Deep Learning3Blue1BrownAug 31, 2024 “Unpacking the multilayer perceptrons in a transformer, and how they may store facts”

What I Read: sparsity, PyTorch, Hadamard product

By Andrew Fairless on December 2, 2024August 26, 2024

https://alexshtf.github.io/2024/07/07/HadamardParameterization.html Alex ShtoffFun with sparsity in PyTorch via Hadamard product parametrizationJul 7, 2024 “The beauty of sparsity inducing regularization is that we let our optimizer discover the sparsity patterns, insteadContinue readingWhat I Read: sparsity, PyTorch, Hadamard product

What I Read: Classifying pdfs

By Andrew Fairless on November 21, 2024August 26, 2024

https://snats.xyz/pages/articles/classifying_a_bunch_of_pdfs.html Classifying all of the pdfs on the internetSantiago Pedroza2024-08-18 “How would you classify all the pdfs in the internet? Well, that is what I tried doing this time.”

What I Read: Kernel, Convolutional Representations

By Andrew Fairless on October 24, 2024July 22, 2024

https://logb-research.github.io/blog/2024/ckn Kernel Trick I – Deep Convolutional Representations in RKHSOussama Zekri, Ambroise OdonnatJuly 18, 2024 “…we focus on the Convolutional Kernel Network (CKN) architecture proposed in End-to-End Kernel Learning withContinue readingWhat I Read: Kernel, Convolutional Representations

What I Read: Illustrated AlphaFold

By Andrew Fairless on October 9, 2024July 14, 2024

https://elanapearl.github.io/blog/2024/the-illustrated-alphafold The Illustrated AlphaFoldElana Simon, Jake Silberg “A visual walkthrough of the AlphaFold3 architecture…”

What I Read: What can LLMs never do?

By Andrew Fairless on October 1, 2024July 14, 2024

https://www.strangeloopcanon.com/p/what-can-llms-never-do What can LLMs never do?Rohit KrishnanApr 23, 2024 “Every time over the past few years that we came up with problems LLMs can’t do, they passed them with flyingContinue readingWhat I Read: What can LLMs never do?

What I Read: neural systems understanding

By Andrew Fairless on August 26, 2024June 15, 2024

Can an emerging field called ‘neural systems understanding’ explain the brain? Can an emerging field called ‘neural systems understanding’ explain the brain?George Musser5 June 2024 “This mashup of neuroscience, artificialContinue readingWhat I Read: neural systems understanding

What I Read: KL All You Need

By Andrew Fairless on August 21, 2024June 4, 2024

https://blog.alexalemi.com/kl-is-all-you-need.html KL is All You NeedAlexander A. Alemi2024-01-08 “…the core of essentially all modern machine learning methods is a single universal objective: Kullback-Leibler (KL) divergence minimization…. Understand KL, understand theContinue readingWhat I Read: KL All You Need

What I Read: Merge Large Language Models

By Andrew Fairless on August 15, 2024June 4, 2024

https://huggingface.co/blog/mlabonne/merge-models Merge Large Language Models with mergekitJanuary 9, 2024Maxime Labonne “Model merging is a technique that combines two or more LLMs into a single model. It’s a relatively new andContinue readingWhat I Read: Merge Large Language Models

What I Read: Transformers by Hand

By Andrew Fairless on August 14, 2024May 25, 2024

https://towardsdatascience.com/deep-dive-into-transformers-by-hand-%EF%B8%8E-68b8be4bd813?gi=b2b3c1885179 Deep Dive into Transformers by HandSrijanie Dey, PhDApr 12, 2024 “…the two mechanisms that are truly the force behind the transformers are attention weighting and feed-forward networks (FFN).”

Tag: neural network