transcription/AGENTS.md
keboss-m eee8f4c8a4 Replace LightRAG with native Python RAG engine + add deploy tooling
- New: src/rag/engine/ — in-process hybrid search (FTS5 BM25 + sqlite-vec + LLM rerank)
- New: src/rag/qmd/ — compatibility layer (qmd_query, qmd_chat, qmd_chat_stream, qmd_index_*)
- New: src/ingest/stub_writer.py — .md stubs for binary files (videos, archives)
- New: scripts/deploy.sh + scripts/pull_models.sh + Makefile + .env.example
- Removed: LightRAG, sentence-transformers embedding via separate package, rag_standalone/
- Removed: @nousresearch/qmd npm dep (package not published); Node.js from Dockerfile
- Updated: tests/ (46 passed), docker-compose, .dockerignore, config.yaml, README

Engine: in-process Python (no daemon, no npm), sentence-transformers 384-dim,
RRF fusion (k=60), BM25 + vector with numpy fallback. WebSocket API unchanged.

Deploy: 'git clone' + 'make init' + 'make pull-models MODELS_SOURCE=...' + 'make up'.
Models (5.83 GB) live outside git; pulled via rsync from dev host.
2026-06-10 14:24:01 +03:00

# Agent Guidelines
## Git Workflow
- **Commit frequently**: After completing a meaningful unit of work (feature, fix, or file update), stage changes with `git add` and create a commit with a clear, concise message in the imperative mood (e.g., "Add parser", "Fix timeout").
- **Push to remote**: Once the local commit(s) are ready, push them to the remote repository. Use `git push -u origin main` if the upstream branch is not yet tracked; otherwise use `git push`.
- **No uncommitted changes left behind**: Before finishing a task, ensure all intended changes are committed and pushed to avoid losing work.
- **No empty commits**: Avoid creating empty or placeholder commits.
## Native RAG Engine
The project uses a **native Python RAG engine** (no external daemons, no Node.js):
hybrid BM25 (SQLite FTS5) + vector (sqlite-vec with numpy fallback) + LLM rerank
through OpenCode.
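To make the hybrid step concrete: the BM25 and vector rankings have to be merged before the rerank, and the project names Reciprocal Rank Fusion with k=60 for that. The following is a minimal sketch of RRF under that assumption. The function name `rrf_fuse` and the plain lists of document ids are hypothetical and are not taken from `hybrid.py`.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. `rrf_fuse` is an illustrative name only.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # hypothetical BM25 ranking
vector_hits = ["b", "c", "d"]  # hypothetical vector ranking
print(rrf_fuse([bm25_hits, vector_hits]))  # ['b', 'c', 'a', 'd']
```

RRF uses only rank positions, so the BM25 scores and the cosine similarities never need to be normalized against each other.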
### Layout
- `src/rag/engine/` — the engine itself:
  - `db.py` — `Database` (SQLite + sqlite-vec + FTS5 schema, fallback detection).
  - `chunker.py` — markdown-aware recursive splitter (~900 chars, 15% overlap).
  - `embeddings.py` — singleton sentence-transformers model (lazy load).
  - `bm25.py` — FTS5 BM25 with `rank_bm25` fallback.
  - `vector.py` — sqlite-vec with numpy cosine fallback.
  - `hybrid.py` — RRF fusion (k=60).
  - `rerank.py` — LLM rerank through OpenCode.
  - `engine.py` — public facade: `index_file`, `index_text`, `search`, `vsearch`, `query`, `get`, `status`, `warmup`.
- `src/rag/qmd/` — compatibility layer preserving the old `qmd_*` API:
  `qmd_query`, `qmd_chat`, `qmd_chat_stream`, `qmd_index_meeting`, `qmd_index_document`.
  `main.py` / `queue.py` / `ingest_worker.py` use these.
- `src/ingest/stub_writer.py` — `.md` stubs for binary files (videos, archives).
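The chunking parameters listed for `chunker.py` (~900 chars, 15% overlap) can be illustrated with a simplified sketch. This is not the project's recursive splitter: it is a single-pass window that prefers to cut at a blank line, then a newline, then a space. The name `chunk_text` and its parameters are hypothetical.

```python
def chunk_text(text, size=900, overlap_ratio=0.15):
    """Split text into chunks of at most `size` chars with a shared overlap.

    Simplified stand-in for a markdown-aware splitter: it prefers paragraph
    boundaries but does not recurse into headings or lists.
    """
    overlap = int(size * overlap_ratio)  # 135 chars for the defaults
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        if end < len(text):
            # Look for the best separator, but only past the overlap region
            # so that every iteration makes forward progress.
            for sep in ("\n\n", "\n", " "):
                cut = text.rfind(sep, start + overlap + 1, end)
                if cut != -1:
                    end = cut + len(sep)
                    break
        chunks.append(text[start:end])
        if end == len(text):
            break
        # The next chunk re-reads the last `overlap` chars of this one.
        start = end - overlap
    return chunks


sample = "# Title\n\n" + "Some paragraph text.\n\n" * 200
parts = chunk_text(sample)
print(len(parts), max(len(p) for p in parts))
```

The overlap means a sentence that straddles a boundary appears whole in at least one chunk, at the cost of indexing roughly 15% more text.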
### Conventions
- A collection is `processed/<org>/qmd_collections/<project_slug>/` (or `_global/`); its `index.sqlite` lives inside that directory.
- Before changing `src/rag/engine/`, read `openspec/changes/native-rag-engine/design.md`.
- When adding a new retrieval mode, update `LEGACY_MODE_MAP` in `src/rag/qmd/query.py`.
- When adding a new LLM call, update `CHAT_MODES` in `src/rag/qmd/query.py`.
### Tests
- Every new module in `src/rag/engine/` must have a unit test in `tests/test_native_engine.py`.
- Real data: 35 `.md` files in a `tempfile.TemporaryDirectory()`.
- Run: `python -m pytest tests/ -q` (46 passed at the time of writing).
- E2E: `tests/test_native_engine_e2e.py` covers ingest → search → chat-stream with OpenCode mocked.
### Fallback strategies
- FTS5 unavailable → in-memory `rank_bm25`.
- sqlite-vec unavailable → in-memory numpy cosine.
- Embedding model failed to load → BM25-only mode.
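The three rules above amount to a small decision procedure run at startup. Here is a hedged sketch of how such detection might look using only the standard library. The function names, the returned dictionary, and the `embeddings_ok` flag are hypothetical, and the import name `sqlite_vec` for the sqlite-vec Python package is an assumption. The real logic lives in `db.py` and may differ.

```python
import importlib.util
import sqlite3


def fts5_available():
    """Return True if this SQLite build can create an FTS5 virtual table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


def module_available(name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backends(embeddings_ok):
    """Choose lexical and vector backends following the fallback rules.

    `embeddings_ok` is a hypothetical flag set by whoever loads the model.
    """
    lexical = "fts5" if fts5_available() else "rank_bm25"
    if not embeddings_ok:
        vector = None  # BM25-only mode
    elif module_available("sqlite_vec"):  # assumed import name
        vector = "sqlite-vec"
    else:
        vector = "numpy"
    return {"lexical": lexical, "vector": vector}


print(pick_backends(embeddings_ok=True))
```

Finding the module is a weaker check than loading the extension into a connection, so a production probe would likely attempt the load itself and fall back to numpy on failure.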