What I am reading
Papers
- 2025. SmolLM2: When Smol Goes Big - Data-Centric Training of a Small Language Model
- 2024. DeepSeek-V3 Technical Report
- 2022. Training Compute-Optimal Large Language Models
- 2021. Highly accurate protein structure prediction with AlphaFold
- 2020. Scaling Laws for Neural Language Models
- 2019. Language Models are Unsupervised Multitask Learners (accessed 2024-11-15)
Books
Blog Articles
- 2019. Deep Equilibrium Models