From Attention to Prediction: The Transformer Workflow Explained

The Transformer is a neural network architecture for sequence transduction that replaces recurrence and convolutions with self-attention, enabling fully parallel processing of input embeddings $X\in\ma...

Jul 16, 2025 DNN, Transformer, LLM

Sentinel-AI - Designing a Real-Time, Scalable AI Newsfeed

“What would a production-grade AI cluster look like if built from scratch for scale, resilience, and lightning-fast insights?” Sentinel-AI is my answer. Repo 👉 github.com/gsantopaolo/sentinel-AI ...

Jul 13, 2025 AI Clusters, Agentic AI, RAG

Attention Is All You Need: A Hands‑On Guide for Gen-AI Engineers

Summary: As part of the Artificial Intelligence Professional Program at Stanford, I’m studying CS224N: Natural Language Processing with Deep Learning. In this class, we studied the his...

Jul 12, 2025 Attention, Transformer, NLP, Deep Learning

Yes, You Should Understand Backprop: A Step-by-Step Walkthrough

Backpropagation—originating in Linnainmaa’s 1970 reverse-mode AD thesis and popularized for neural nets by Rumelhart et al. in 1986—is the workhorse that makes deep learning feasible (en.wikipedi...

Jul 3, 2025 DNNs, Language Model, Backpropagation

How to Supercharge Your Terminal with Gemini CLI

TL;DR Gemini CLI is Google’s free, open-source AI agent for your terminal. Powered by Gemini 2.5 Pro (with a huge 1 million-token context), it lets you scaffold apps, debug and refactor code, fe...

Jun 29, 2025 Gemini-Cli, AI Tools

Beyond the Thought Vector: The Evolution of Attention in Deep Learning

Sequence-to-sequence (seq2seq) models without attention compress an entire source sentence into a single fixed-length vector, then feed that into a decoder to produce the target sentence. While t...

Jun 28, 2025 DNNs, Language Model, Attention

Mastering Language Modeling: From N-grams to RNNs and Beyond

A modern language model must handle vast contexts, share parameters efficiently, and overcome the vanishing-gradient issues that plagued early RNNs. Count-based n-grams hit a combinatorial wall: s...

Jun 27, 2025 RNNs, Language Model

Code from Anywhere: Your SSH Guide to Remote Development in PyCharm

If you landed here, you’re probably already using a remote GPU machine for your ML tasks, and you’re sick of running git push on your laptop and then git pull on your powerful GPU machine. Ask me why I know that, a...

Jun 26, 2025 PyCharm, Remote Development

N-gram Language Models: The Classic Building Blocks of Predictive Text

Have you ever tapped on your phone’s keyboard and seen it guess your next word? That “magic” used to (and often still does!) come from n-gram language models—the statistical workhorses that predate...

Jun 25, 2025 Language Model, N-gram

Using Agentic AI to Modernize Large Scale Code

Modernizing large-scale legacy Java applications in banking is a multifaceted challenge, combining intricate domain logic with stringent compliance and security requirements. Generative AI—especia...

May 10, 2025 Agents, LLM, CrewAI