Generative AI

Explorations at the frontier of Artificial Intelligence.

Core Concepts

  • Transformer & Attention Foundations: The architecture that started it all. A mathematical and code-level deep dive into Self-Attention, Multi-Head Attention, and Positional Encodings (a minimal attention sketch follows below).
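
As a taste of that deep dive, here is a minimal NumPy sketch of single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. Multi-head splitting, masking, and the learned Q/K/V projections are omitted for brevity, so this is an illustration rather than a full Transformer layer.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq, seq) similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

# Toy example: 4 tokens, head dimension 8
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```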

SOTA Roadmap

We will cover the following cutting-edge topics:

1. Large Language Models (LLMs)

  • Architectures: Mixture of Experts (MoE, as in Mixtral), Grouped Query Attention (GQA), Rotary Position Embeddings (RoPE); a minimal RoPE sketch follows this list.
  • Optimization: FlashAttention-2, memory-efficient Transformers.
  • State-of-the-Art Models: Analysis of Llama 3, Gemini 1.5, and Claude 3.5 Sonnet.
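
To make the RoPE item concrete, here is a minimal sketch of the "rotate-half" formulation used by Llama-family models, assuming an even head dimension; production implementations cache the cos/sin tables and apply this to queries and keys inside each attention head.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply Rotary Position Embeddings to x of shape (seq_len, dim).

    Pairs of channels are rotated by position-dependent angles, so the
    dot product of a rotated query and key depends only on their
    relative positions. Assumes dim is even."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)     # one frequency per pair
    angles = np.outer(np.arange(seq_len), freqs)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]             # "rotate half" split
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

q = np.random.default_rng(0).standard_normal((16, 64))
print(rope(q).shape)  # (16, 64)
```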

2. Alignment & Instruct Tuning

  • RLHF Alternatives: Direct Preference Optimization (DPO), Identity Preference Optimization (IPO); a minimal DPO loss sketch follows this list.
  • Synthetic Data: Self-Instruct, Evol-Instruct, Constitutional AI.
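
As a pointer to what DPO actually optimizes, here is a minimal sketch of its per-pair loss, -log sigmoid(beta * ((log pi(y_w) - log pi_ref(y_w)) - (log pi(y_l) - log pi_ref(y_l)))), operating on precomputed sequence log-probabilities. The batching and gradient plumbing of a real trainer are omitted.

```python
import numpy as np

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy and the frozen reference model."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    return -np.log(1.0 / (1.0 + np.exp(-logits)))  # -log sigmoid(logits)

# Toy numbers: the policy already prefers the chosen response slightly
print(dpo_loss(-12.0, -15.0, -13.0, -14.5, beta=0.1))  # ~0.62
```

Only beta and four scalars per pair are needed, which is what makes DPO so much simpler to implement than a full RLHF pipeline with a separate reward model.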

3. Image & Video Generation

  • Diffusion Evolution: Latent Diffusion (SDXL), Rectified Flow (FLUX.1), Consistency Models; the underlying forward noising process is sketched after this list.
  • Video: Spacetime Patches (Sora), Masked Generative Transformers.
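
Underlying these diffusion variants is the forward noising process; below is a minimal DDPM-style sketch with a linear beta schedule. Rectified flow swaps this for a straight-line interpolation with a velocity target, but the idea of training a network against a known corruption is analogous.

```python
import numpy as np

# Linear beta schedule over T steps, as in the original DDPM paper
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)  # cumulative product alpha_bar_t

def forward_diffusion(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) x0, (1 - a_bar_t) I).

    Returns the noised sample and the noise itself, which is the
    regression target for a noise-prediction (epsilon) model."""
    a_bar = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))  # stand-in for an image latent
xt, eps = forward_diffusion(x0, t=500, rng=rng)
```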

4. Reasoning & Agents

  • Prompt Engineering: Chain of Thought (CoT), Tree of Thoughts (ToT), Reflection.
  • Tool Use: ReAct, Toolformer, Function Calling patterns; a minimal ReAct loop is sketched below.
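
To illustrate the Tool Use bullet, here is a minimal ReAct-style loop. Both `llm` (any callable mapping a prompt string to a completion) and the `tools` registry are hypothetical stand-ins rather than any specific library's API, and real agents need far more robust output parsing.

```python
def react_agent(question, llm, tools, max_steps=5):
    """Minimal ReAct loop: alternate Thought / Action / Observation.

    `llm` is any callable mapping a prompt string to a completion;
    `tools` maps tool names to callables. Both are hypothetical
    stand-ins for illustration only."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript + "Thought:")  # model continues the trace
        transcript += f"Thought:{step}\n"
        if "Final Answer:" in step:          # model decided to stop
            return step.split("Final Answer:")[-1].strip()
        if "Action:" in step:                # e.g. "Action: search[query]"
            action = step.split("Action:")[-1].strip()
            name, _, arg = action.partition("[")
            observation = tools[name.strip()](arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return None  # budget exhausted without a final answer
```

Toolformer and native function calling take the same interleaving idea but move it into training time and structured API responses, respectively.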

Key Resources