Week 1 · Day 7/30

Week 1 Capstone: Full RAG Pipeline

Integrating everything from Week 1 into a production RAG API

📅 2026-03-10 ⏱️ 6-8 hours 📊 Foundations & RAG
Overall progress 23%

🎯 Goal of the Day

Build a document-grounded Q&A API with FastAPI, hybrid search, reranking, streaming, and RAGAS evaluation.

project core

📚 Study Resources

🤗

Hugging Face — Code a Simple RAG from Scratch

A complete RAG built from scratch with Python and Ollama, no frameworks.

tutorial
🦜

LangChain — Build a RAG Agent

A framework-based approach, for comparison with the from-scratch version.

docs
🔧

DataCamp — RAG System with LangChain & FastAPI

Production-oriented: LangChain for RAG plus FastAPI for the API layer.

tutorial
📝

KDnuggets — 7 Steps to Build RAG from Scratch

Clean step-by-step guide. Ingestion, splitting, embedding, storage, query, retrieval, generation.

tutorial

💡 Key Concepts

Full Pipeline — Query → Token counting → Decomposition → Hybrid retrieval → Reranking → Prompt + CoT → Streaming → Eval
API Layer — FastAPI with /ingest, /query (streaming), and /evaluate endpoints
Error Handling — Retry logic, exponential backoff, model fallback, rate limit handling
Integration Testing — End-to-end tests of the whole pipeline, including RAGAS metrics
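
The error-handling concept above (retry, exponential backoff, model fallback) could be sketched roughly as follows. This is a minimal illustration, not a provider-specific implementation: `RateLimitError`, `call_with_retry`, `generate_with_fallback`, and the `call_model` callback are all hypothetical names, and a real client library would raise its own exception type for HTTP 429.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


def call_with_retry(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted, let the caller decide
            # Delay doubles on each attempt: base, 2*base, 4*base, ...
            delay = base_delay * (2 ** attempt)
            # Up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


def generate_with_fallback(prompt, models, call_model, **retry_kwargs):
    """Try each model in order; move to the next once retries are exhausted."""
    last_error = None
    for model in models:
        try:
            return call_with_retry(lambda: call_model(model, prompt), **retry_kwargs)
        except RateLimitError as exc:
            last_error = exc
    raise RuntimeError("all models failed") from last_error
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting, which helps when the integration tests need to trigger a rate limit and check that the fallback model answers.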

🔧 Hands-on Exercise

Build a full RAG pipeline API.

  1. Ingestion pipeline: load docs, chunk (512 tokens, 50-token overlap), embed, and store in ChromaDB
  2. Retrieval: hybrid search (vector + BM25), top-10, rerank to top-3
  3. Generation: system prompt + CoT, streaming SSE response
  4. FastAPI: /ingest, /query (streaming), /evaluate endpoints
  5. Error handling: retry, fallback, rate limit handling
  6. RAGAS evaluation: 10+ test questions, faithfulness > 0.7
  7. Test streaming in a browser or with curl, trigger a rate limit, and verify the fallback
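
As a starting point for steps 1 and 2, the chunking and the hybrid merge might look like the sketch below. Two assumptions are worth flagging: `chunk_tokens` works on an already tokenized list (a real pipeline would use the embedding model's tokenizer), and the vector and BM25 result lists are merged with reciprocal rank fusion, which is one common choice rather than something this exercise prescribes. The function names and the constant `k=60` are illustrative.

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


def rrf_fuse(rankings, k=60, top_n=10):
    """Reciprocal rank fusion: score(doc) = sum over rankings of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; keep the top_n candidates for the reranker.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Usage: fuse a vector-search ranking with a BM25 ranking (document ids are made up).
vector_hits = ["a", "b", "c"]
bm25_hits = ["c", "a", "d"]
candidates = rrf_fuse([vector_hits, bm25_hits])
```

Rank fusion only needs the order of each result list, so it sidesteps the problem that cosine similarities and BM25 scores live on different scales. The fused top-10 would then go to the reranker, which picks the final top-3.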

🧠 Self-Assessment Questions

  1. Explain how self-attention computes attention weights using the Q/K/V matrices.
  2. When would you use few-shot vs. chain-of-thought vs. meta-prompting?
  3. Why is MMLU becoming less useful for comparing frontier models?
  4. What is the difference between a 429 and a 500 error, and how should you handle each?
  5. Why is a chunk size of 256-512 tokens the recommended default?
  6. What does RAGAS faithfulness measure, and why does it matter?
  7. When should you add reranking on top of hybrid search instead of using hybrid search alone?