2026-05-29 07:06:55 +00:00
# Agent Guidelines
## Git Workflow
- **Commit frequently**: After completing a meaningful unit of work (feature, fix, or file update), stage changes with `git add` and create a commit with a clear, concise message in the imperative mood (e.g., "Add parser", "Fix timeout").
- **Push to remote**: Once the local commit(s) are ready, push them to the remote repository. Use `git push -u origin main` if the upstream branch is not yet tracked; otherwise use `git push`.
- **No uncommitted changes left behind**: Before finishing a task, ensure all intended changes are committed and pushed to avoid losing work.
- **No empty commits**: Avoid creating empty or placeholder commits.
Replace LightRAG with native Python RAG engine + add deploy tooling
- New: src/rag/engine/ — in-process hybrid search (FTS5 BM25 + sqlite-vec + LLM rerank)
- New: src/rag/qmd/ — compatibility layer (qmd_query, qmd_chat, qmd_chat_stream, qmd_index_*)
- New: src/ingest/stub_writer.py — .md stubs for binary files (videos, archives)
- New: scripts/deploy.sh + scripts/pull_models.sh + Makefile + .env.example
- Removed: LightRAG, sentence-transformers embedding via separate package, rag_standalone/
- Removed: @nousresearch/qmd npm dep (package not published); Node.js from Dockerfile
- Updated: tests/ (46 passed), docker-compose, .dockerignore, config.yaml, README
Engine: in-process Python (no daemon, no npm), sentence-transformers 384-dim,
RRF fusion (k=60), BM25 + vector with numpy fallback. WebSocket API unchanged.
Deploy: 'git clone' + 'make init' + 'make pull-models MODELS_SOURCE=...' + 'make up'.
Models (5.83 GB) live outside git; pulled via rsync from dev host.
2026-06-10 11:24:01 +00:00
## Native RAG Engine
The project uses a **native Python RAG engine** (no external daemons, no Node.js):
hybrid BM25 (SQLite FTS5) + vector (sqlite-vec with numpy fallback) + LLM rerank
through OpenCode.
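
The fusion step that merges the BM25 and vector rankings can be sketched as below. This is a minimal illustration of Reciprocal Rank Fusion with `k=60`, not the code in `hybrid.py`; the function name `rrf_fuse` and the document ids are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrieval legs.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Documents found by both legs ("a" and "b" here) rise above those found by only one, which is why the fused list can be passed to the rerank step without normalizing BM25 and cosine scores against each other.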
### Layout
- `src/rag/engine/` — the engine itself:
  - `db.py` — `Database` (SQLite + sqlite-vec + FTS5 schema, fallback detection).
  - `chunker.py` — markdown-aware recursive splitter (~900 chars, 15% overlap).
  - `embeddings.py` — singleton sentence-transformers model (lazy load).
  - `bm25.py` — FTS5 BM25 with `rank_bm25` fallback.
  - `vector.py` — sqlite-vec with numpy cosine fallback.
  - `hybrid.py` — RRF fusion (k=60).
  - `rerank.py` — LLM rerank through OpenCode.
  - `engine.py` — public facade: `index_file`, `index_text`, `search`, `vsearch`, `query`, `get`, `status`, `warmup`.
- `src/rag/qmd/` — compatibility layer preserving the old `qmd_*` API:
  `qmd_query`, `qmd_chat`, `qmd_chat_stream`, `qmd_index_meeting`, `qmd_index_document`.
  `main.py` / `queue.py` / `ingest_worker.py` use these.
- `src/ingest/stub_writer.py` — `.md` stubs for binary files (videos, archives).
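
To make the chunk-size figures concrete, here is a simplified sketch of the size and overlap arithmetic. It is a plain sliding window over characters, not the markdown-aware recursive splitter in `chunker.py`; the function name and the exact rounding are assumptions.

```python
def sliding_chunks(text, size=900, overlap_ratio=0.15):
    """Split text into windows of `size` characters that overlap by `overlap_ratio`."""
    overlap = round(size * overlap_ratio)  # 135 characters for the defaults
    step = size - overlap                  # each window starts 765 characters later
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


# Synthetic 2000-character input, for illustration only.
sample = "".join(str(i % 10) for i in range(2000))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
```

With the defaults, the last 135 characters of each chunk are repeated at the start of the next one, so a sentence cut at a boundary still appears whole in at least one chunk.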
### Conventions
- A collection is the directory `processed/<org>/qmd_collections/<project_slug>/` (or `_global/`); its `index.sqlite` lives inside.
- Before changing `src/rag/engine/`, read `openspec/changes/native-rag-engine/design.md`.
- When adding a new retrieval mode, update `LEGACY_MODE_MAP` in `src/rag/qmd/query.py`.
- When adding a new LLM call, update `CHAT_MODES` in `src/rag/qmd/query.py`.
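
The collection layout can be sketched as a small path helper. This is illustrative only: the function name, the `root` parameter, and the reading that `_global/` takes the place of `<project_slug>/` are assumptions, and the org and project names are made up.

```python
from pathlib import PurePosixPath


def collection_index_path(org, project_slug=None, root="processed"):
    """Return the index.sqlite path for a project collection.

    Assumption: when no project slug is given, the org-wide `_global`
    collection is used in place of `<project_slug>`.
    """
    collection = project_slug if project_slug else "_global"
    return PurePosixPath(root) / org / "qmd_collections" / collection / "index.sqlite"


# Hypothetical org and project names.
print(collection_index_path("acme", "roadmap"))
print(collection_index_path("acme"))
```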
### Tests
- Every new module in `src/rag/engine/` must have a unit test in `tests/test_native_engine.py`.
- Real data: 3–5 `.md` files in a `tempfile.TemporaryDirectory()`.
- Run: `python -m pytest tests/ -q` (46 passed at the time of writing).
- E2E: `tests/test_native_engine_e2e.py` covers ingest → search → chat-stream with OpenCode mocked.
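
A test fixture in this style might look like the sketch below. It only shows the temporary-directory pattern with a few `.md` files; the helper name, file names, and contents are hypothetical, and the engine calls are left as a comment because their signatures are not described here.

```python
import tempfile
from pathlib import Path


def write_sample_corpus(directory, count=3):
    """Write `count` small markdown files into `directory` and return their paths."""
    paths = []
    for i in range(count):
        path = Path(directory) / f"note_{i}.md"
        path.write_text(f"# Note {i}\n\nBody of sample note {i}.\n", encoding="utf-8")
        paths.append(path)
    return paths


def test_sample_corpus_is_created():
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_sample_corpus(tmp, count=3)
        assert len(paths) == 3
        assert all(p.suffix == ".md" and p.exists() for p in paths)
        # A real test would index these files with the engine here
        # and assert on the search results.
```

The directory and everything in it is removed when the `with` block exits, so tests written this way leave nothing behind under `processed/`.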
### Fallback strategies
- FTS5 unavailable → in-memory `rank_bm25`.
- sqlite-vec unavailable → in-memory numpy cosine.
- Embedding model failed to load → BM25-only mode.
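
The fallback rules above can be sketched as a capability probe plus a selection function. This is an assumption-laden illustration, not the detection logic in `db.py`: the function names and the returned dictionary are hypothetical, and checking that the `sqlite_vec` package is importable is only a rough proxy for the extension actually loading.

```python
import importlib.util
import sqlite3


def has_fts5():
    """Return True if this SQLite build can create an FTS5 virtual table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


def has_sqlite_vec():
    """Rough check: is the sqlite_vec Python package installed?"""
    return importlib.util.find_spec("sqlite_vec") is not None


def pick_backends(fts5_ok, vec_ok, embeddings_ok):
    """Map detected capabilities to the backends listed in the fallback rules."""
    bm25 = "fts5" if fts5_ok else "rank_bm25"
    if not embeddings_ok:
        vector = None  # BM25-only mode: no vector leg at all
    else:
        vector = "sqlite-vec" if vec_ok else "numpy"
    return {"bm25": bm25, "vector": vector}


# embeddings_ok=True is assumed here; a real check would try to load the model.
print(pick_backends(has_fts5(), has_sqlite_vec(), embeddings_ok=True))
```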
2026-06-10 11:27:35 +00:00
## Deploy
**Architecture: code lives in git, models are shipped separately (rsync).**
- `Makefile` — main targets: `init`, `pull-models`, `up`, `down`, `restart`, `logs`, `status`, `test`, `deploy`.
- `scripts/pull_models.sh` — rsyncs the models from `MODELS_SOURCE=user@host:path`; falls back to downloading them from the internet with `download_models.py`.
- `scripts/deploy.sh` — rsyncs the code and `.env`, then runs `make pull-models && make up` on the remote server.
- `.env.example` is committed; `.env` is not (it is listed in `.gitignore`).
### What is in git and what is not
| In git (~2 MB) | NOT in git |
|---|---|
| `src/`, `backend/`, `tests/`, `scripts/`, `Makefile` | `models/` (5.83 GB) |
| `Dockerfile*`, `docker-compose*.yml`, `config.yaml` | `processed/`, `uploads/`, `data/` (runtime) |
| `AGENTS.md`, `README.md`, `.env.example` | `migrate/*.tar.gz` (9.5 GB) |
| `openspec/` (specs) | `.env` (secrets) |
### One-command deploy (from the source machine)
```bash
git add -A && git commit -m "..." && git push origin main
./scripts/deploy.sh user@server /opt/transcription
```
### First run on a new server
```bash
git clone https://gts.meratalk.online/keboss/transcription.git /opt/transcription
cd /opt/transcription
make init                              # create .env from .env.example
nano .env                              # fill in HF_TOKEN, OPENCODE_API_KEY
make pull-models MODELS_SOURCE=user@dev-host:/opt/transcription/models/
make up
```