While I’ve always felt that I had a pretty strong intuition for solving problems, the field I want to enter is riddled with complicated jargon and fundamental techniques that can’t be learned passively. This is a record of the papers and articles I’ve read that felt impactful, along with a few key takeaways (pending).

  • Tangent: An entry is a “Paper” if the link is on arXiv. Otherwise, it’s an “Article”. This doesn’t mean one is more formal than the other; if anything, I find myself learning more from articles because they’re written in a more approachable style.

The Ultra-Scale Playbook: Training LLMs on GPU Clusters

Present

Article: [https://huggingface.co/spaces/nanotron/ultrascale-playbook?section=our_journey_up_to_now]

A GitHub Issue Title Compromised 4,000 Developer Machines

March 6th, 2026

Article: [https://grith.ai/blog/clinejection-when-your-ai-tool-installs-another]

Mapping the Mind of a Large Language Model

February 3rd, 2026

Article: [https://www.anthropic.com/research/mapping-mind-language-model]

Defeating Nondeterminism in LLM Inference

December 16th, 2025

Article: [https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/]

Neural Machine Translation By Jointly Learning to Align and Translate

October 27th, 2025

Paper: [https://arxiv.org/abs/1409.0473]

Adam: A Method For Stochastic Optimization

August 25th, 2025

Paper: [https://arxiv.org/abs/1412.6980]