While I’ve always felt that I had a pretty strong intuition for solving problems, the field I want to enter is riddled with complicated jargon and fundamental techniques that can’t be learned passively. This is a record of the papers and articles I’ve read that felt really impactful, along with a few key takeaways (pending).
- Tangent: An entry is a “Paper” if the link is on arXiv. Otherwise, it’s an “Article”. This doesn’t mean one kind is more formal than the other. If anything, I find myself learning more from articles just because they’re written in a more approachable style.
The Ultra-Scale Playbook: Training LLMs on GPU Clusters
Present
Article: [https://huggingface.co/spaces/nanotron/ultrascale-playbook?section=our_journey_up_to_now]
Defeating Nondeterminism in LLM Inference
December 16th, 2025
Article: [https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/]
Neural Machine Translation By Jointly Learning to Align and Translate
October 27th, 2025
Paper: [https://arxiv.org/abs/1409.0473]
Adam: A Method For Stochastic Optimization
August 25th, 2025
Paper: [https://arxiv.org/abs/1412.6980]