# Generative AI
Explorations into the frontier of Artificial Intelligence.
## Core Concepts
- Transformer & Attention Foundations: The architecture that started it all. A mathematical and code-level deep dive into Self-Attention, Multi-Head Attention, and Positional Encodings (a minimal self-attention sketch follows below).
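To ground this before the roadmap, here is a minimal single-head sketch of scaled dot-product self-attention in PyTorch. The weight matrices and toy shapes are illustrative; a real layer adds multi-head splitting, causal masking, and an output projection.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # project tokens into Q, K, V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # pairwise token affinities
    weights = F.softmax(scores, dim=-1)            # normalize over the keys
    return weights @ v                             # attention-weighted values

# Toy usage: 4 tokens, model dimension 8.
torch.manual_seed(0)
x = torch.randn(4, 8)
w_q, w_k, w_v = (torch.randn(8, 8) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([4, 8])
```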
## SOTA Roadmap
We will cover the following cutting-edge topics:
### 1. Large Language Models (LLMs)
- Architectures: Mixture of Experts (MoE/Mixtral), Grouped Query Attention (GQA), Rotary Embeddings (RoPE; a minimal sketch follows this list).
- Optimization: FlashAttention-2, Memory-efficient Transformers.
- State-of-the-Art Models: Analysis of Llama 3, Gemini 1.5, Claude 3.5 Sonnet.
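As a taste of the architecture track, below is a minimal sketch of Rotary Embeddings (RoPE). The half-split pairing follows the Llama-style convention (interleaved pairing is the other common variant); the function would be applied to queries and keys before the attention score is computed.

```python
import torch

def rope(x, base=10000.0):
    """RoPE sketch: rotate each feature pair of every token by an angle
    proportional to its position, so Q.K dot products encode relative offsets."""
    seq_len, dim = x.shape[-2], x.shape[-1]
    half = dim // 2
    freqs = base ** (-torch.arange(half) / half)           # per-pair frequency
    angles = torch.arange(seq_len)[:, None] * freqs[None]  # (seq_len, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]                  # half-split pairing
    return torch.cat([x1 * cos - x2 * sin,                 # 2-D rotation per pair
                      x1 * sin + x2 * cos], dim=-1)

# Toy usage: 16 tokens, head dimension 64.
print(rope(torch.randn(16, 64)).shape)  # torch.Size([16, 64])
```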
### 2. Alignment & Instruction Tuning
- RLHF Alternatives: Direct Preference Optimization (DPO; sketched after this list), Identity Preference Optimization (IPO).
- Synthetic Data: Self-Instruct, Evol-Instruct, Constitutional AI.
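For the DPO entry, a minimal sketch of the objective from Rafailov et al. (2023): the policy is pushed to prefer the chosen response over the rejected one more strongly than a frozen reference model does. The log-probability inputs (summed token log-probs per response) and the beta value here are made up for illustration.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO: -log sigmoid(beta * (chosen margin - rejected margin))."""
    chosen_margin = policy_chosen_logps - ref_chosen_logps      # implicit reward
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Toy usage: a batch of 2 preference pairs with made-up log-probs.
loss = dpo_loss(torch.tensor([-12.0, -9.0]), torch.tensor([-15.0, -11.0]),
                torch.tensor([-13.0, -9.5]), torch.tensor([-14.0, -10.5]))
print(loss.item())
```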
### 3. Image & Video Generation
- Diffusion Evolution: Latent Diffusion (SDXL), Rectified Flow (Flux.1; sketched after this list), Consistency Models.
- Video: Spacetime Patches (Sora), Masked Generative Transformers.
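The rectified-flow idea above reduces to a strikingly small training rule, sketched here: interpolate linearly between a data sample and noise, and regress the model onto that straight line's constant velocity (sampling then Euler-integrates the learned ODE). The time convention (t=0 data, t=1 noise) and the stand-in model are assumptions for illustration.

```python
import torch

def rectified_flow_loss(model, x_data, x_noise, t):
    """One rectified-flow training step: regress onto straight-line velocity."""
    t = t.view(-1, 1)                      # broadcast time over feature dims
    x_t = (1 - t) * x_data + t * x_noise   # linear interpolation at time t
    target = x_noise - x_data              # constant velocity of that line
    return ((model(x_t, t) - target) ** 2).mean()

# Toy usage: the stand-in model would be a large U-Net or DiT in practice.
dummy = lambda x_t, t: torch.zeros_like(x_t)
print(rectified_flow_loss(dummy, torch.randn(2, 4), torch.randn(2, 4),
                          torch.rand(2)).item())
```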
### 4. Reasoning & Agents
- Prompt Engineering: Chain of Thought (CoT), Tree of Thoughts (ToT), Reflection.
- Tool Use: ReAct (a minimal loop is sketched below), Toolformer, Function Calling patterns.
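To make the agent loop concrete, here is a minimal ReAct-style sketch. The `llm` callable, the tool names, and the `Thought:`/`Action:`/`Final Answer:` text protocol are illustrative stand-ins; production agents typically use structured function calling rather than parsing free text.

```python
def parse_action(step):
    """Pull 'Action: tool[argument]' out of the model's latest output."""
    line = [l for l in step.splitlines() if l.startswith("Action:")][-1]
    name, arg = line[len("Action:"):].strip().split("[", 1)
    return name.strip(), arg.rstrip("]")

def react_agent(llm, tools, question, max_steps=5):
    """ReAct loop: alternate Thought -> Action -> Observation until done."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)                 # model writes Thought/Action
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:
            name, arg = parse_action(step)
            transcript += f"Observation: {tools[name](arg)}\n"  # run the tool
    return None  # step budget exhausted

# Toy usage with a scripted "model" and one fake tool.
scripted = iter(["Thought: look it up.\nAction: search[RoPE]",
                 "Thought: done.\nFinal Answer: rotary embeddings."])
print(react_agent(lambda _: next(scripted),
                  {"search": lambda q: f"docs on {q}"}, "What is RoPE?"))
```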
## Key Resources
- Blog: Lilian Weng's blog (highly recommended for deep technical summaries).
- Visuals: The Illustrated Transformer (Jay Alammar).
- Video: Andrej Karpathy's Neural Networks: Zero to Hero.