
44 Commits

Author SHA1 Message Date
keboss-m
eee8f4c8a4 Replace LightRAG with native Python RAG engine + add deploy tooling
- New: src/rag/engine/ — in-process hybrid search (FTS5 BM25 + sqlite-vec + LLM rerank)
- New: src/rag/qmd/ — compatibility layer (qmd_query, qmd_chat, qmd_chat_stream, qmd_index_*)
- New: src/ingest/stub_writer.py — .md stubs for binary files (videos, archives)
- New: scripts/deploy.sh + scripts/pull_models.sh + Makefile + .env.example
- Removed: LightRAG, sentence-transformers embedding via separate package, rag_standalone/
- Removed: @nousresearch/qmd npm dep (package not published); Node.js from Dockerfile
- Updated: tests/ (46 passed), docker-compose, .dockerignore, config.yaml, README

Engine: in-process Python (no daemon, no npm), sentence-transformers 384-dim embeddings,
RRF fusion (k=60), BM25 + vector with numpy fallback. WebSocket API unchanged.

Deploy: 'git clone' + 'make init' + 'make pull-models MODELS_SOURCE=...' + 'make up'.
Models (5.83 GB) live outside git; pulled via rsync from dev host.
2026-06-10 14:24:01 +03:00
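
Commit eee8f4c8a4 mentions RRF fusion (k=60) over BM25 and vector results. The repository's own code is not shown here, so the following is only a minimal sketch of Reciprocal Rank Fusion in general; the function name `rrf_fuse` and the sample document ids are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document, with rank starting at 1.
    Hypothetical helper, not taken from the repository.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative inputs: one ranking from keyword search, one from vector search.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([bm25_hits, vector_hits])
```

Documents that appear in both lists accumulate two terms, so they tend to rise above documents found by only one retriever. A larger `k` flattens the difference between top and lower ranks.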
keboss-m
36c9be48be Add document ingestion pipeline, chat analytics modes, and auth fixes
Ingest MD/PDF/DOCX/XLSX into org-scoped documents with classify and RAG indexing. Add compare/timeline chat modes and UI upload. Filter WebSocket progress by user ACL and normalize admin project slugs consistently.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 19:16:23 +03:00
keboss-m
8df14e3102 Add multi-tenant auth with org projects, roles, and personal workspaces.
JWT login, org-scoped storage and RAG, admin/director/user roles, user-owned projects, login UI, and legacy data migration.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 18:54:25 +03:00
keboss-m
2727d3cd32 Fix XSS fallback when DOMPurify is unavailable in chat markdown renderer.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 18:33:13 +03:00
keboss-m
1f2ef8f012 Render chat bot answers as markdown in the UI.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 18:29:53 +03:00
keboss-m
e9f5b80e23 Add API credentials for HF and OpenCode in config and env.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 17:41:31 +03:00
keboss-m
fee9b9acb1 Add RAG, summary pipeline, and split transcribe/postprocess queue.
Separate ASR (2 workers) from summary/RAG post-processing, add LightRAG chat API, batch upload fixes, and local model mounts for Docker deployment.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 17:40:58 +03:00
keboss-m
6206e24af0 Fix CSS: restore hover effect for all file items including MD files 2026-06-01 12:56:55 +03:00
keboss-m
37f2b00deb Fix docx download: use native <a download> links instead of JS iframe method 2026-06-01 12:53:46 +03:00
keboss-m
0c1d671c8d Fix docx download: attach click handlers directly to file items, use iframe for reliable download 2026-06-01 12:49:54 +03:00
keboss-m
fe0a2a0611 Fix docx download: make docx clickable, add forced download via Blob and ObjectURL 2026-06-01 12:34:41 +03:00
keboss-m
2f3b27ac57 Remove source files after processing: delete from uploads, don't copy to processed 2026-06-01 12:19:50 +03:00
keboss-m
b786f84e7c Add nltk_data volume to persist NLTK resources across restarts 2026-06-01 12:13:36 +03:00
keboss-m
24154665b6 Fix: run heavy pipeline in threads to unblock event loop, mount code volumes for live updates 2026-06-01 12:02:49 +03:00
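
Commit 24154665b6 moves the heavy pipeline into threads so the event loop stays free. One common way to do this in Python 3.9+ is `asyncio.to_thread`; the sketch below assumes that approach, which may differ from what the repository does. `heavy_pipeline` and `heartbeat` are hypothetical stand-ins.

```python
import asyncio
import time


def heavy_pipeline(path):
    """Stand-in for blocking work such as transcription (hypothetical)."""
    time.sleep(0.2)
    return f"done:{path}"


async def heartbeat(ticks):
    """Simulates other event-loop work, e.g. sending progress updates."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main():
    ticks = []
    # The blocking call runs in a worker thread, so heartbeat() keeps
    # running on the event loop while the pipeline is busy.
    result, _ = await asyncio.gather(
        asyncio.to_thread(heavy_pipeline, "meeting.webm"),
        heartbeat(ticks),
    )
    return result, len(ticks)


result, tick_count = asyncio.run(main())
```

Calling `heavy_pipeline` directly inside a coroutine would instead block every other task, including WebSocket progress messages, until it returned.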
keboss-m
2eee05a52f Fix: add debounce to renderTasks to prevent frontend freeze during WebSocket updates 2026-06-01 10:32:12 +03:00
keboss-m
932dc71b8a Fix task queue display: consistent task_id keys in frontend, add file field to WebSocket progress 2026-06-01 10:24:19 +03:00
keboss-m
e5a3927819 Fix: change filename to file in API responses for task queue display 2026-05-31 16:42:10 +03:00
keboss-m
ea1904f9af Fix Docker: correct entrypoint, remove hf_cache volume overlap, use existing image with models 2026-05-31 13:04:44 +03:00
keboss-m
714ac06364 Add requirements.txt and allow it in gitignore 2026-05-31 13:04:44 +03:00
Кирилл Блинов
a19cb15816 Add uploads/ and processed/ to .gitignore 2026-05-29 18:58:19 +03:00
Кирилл Блинов
ef8c86d3e7 Add explicit webm support in file picker and docs 2026-05-29 18:56:13 +03:00
Кирилл Блинов
22eb20a2db Clarify: large-v3 is max, bad audio needs preprocessing not bigger models 2026-05-29 18:42:35 +03:00
Кирилл Блинов
12fda2c231 Add .env to repo for closed-loop deployment 2026-05-29 18:12:45 +03:00
Кирилл Блинов
6e5ee64be0 Move HF_TOKEN to .env file for one-command docker compose up 2026-05-29 18:11:00 +03:00
Кирилл Блинов
0ed45cdf12 Integrate HF_TOKEN into Docker build for preloaded diarization models 2026-05-29 18:04:38 +03:00
Кирилл Блинов
0931a15d32 Add Docker support with preloaded models and docker-compose 2026-05-29 17:50:30 +03:00
Кирилл Блинов
96426a09b4 Fix delete icon visibility for long folder names with timestamp 2026-05-29 13:02:37 +03:00
Кирилл Блинов
c880891839 Add folder deletion with confirmation dialog 2026-05-29 12:52:51 +03:00
Кирилл Блинов
8bb21d0d7f Add timestamp to output folder to avoid overwrite conflicts on re-upload 2026-05-29 12:45:33 +03:00
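
Commit 8bb21d0d7f adds a timestamp to the output folder name so a re-upload does not overwrite earlier results. A minimal sketch of that idea follows; the naming pattern, the `processed` base directory, and the helper name `timestamped_output_dir` are assumptions, not the repository's actual scheme.

```python
from datetime import datetime
from pathlib import Path


def timestamped_output_dir(base, upload_name, now=None):
    """Build an output folder path such as processed/meeting_20260529_124533.

    Hypothetical helper; the real naming pattern is not shown in the log.
    """
    now = now or datetime.now()
    stem = Path(upload_name).stem  # file name without its extension
    return Path(base) / f"{stem}_{now:%Y%m%d_%H%M%S}"


# Two uploads of the same file, one second apart, get distinct folders.
first = timestamped_output_dir(
    "processed", "meeting.webm", datetime(2026, 5, 29, 12, 45, 33)
)
second = timestamped_output_dir(
    "processed", "meeting.webm", datetime(2026, 5, 29, 12, 45, 34)
)
```

With second-level resolution, two uploads within the same second would still collide, so a real implementation might add a counter or a random suffix.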
Кирилл Блинов
78e542a246 Increase batch_size to 2 for CPU optimization 2026-05-29 12:42:07 +03:00
Кирилл Блинов
6f727b1f3d Fix file dialog double-open: add stopPropagation and reset input value 2026-05-29 12:34:11 +03:00
Кирилл Блинов
5e62b3d308 Fix worker startup via FastAPI lifespan, remove manual start from server script 2026-05-29 12:30:39 +03:00
Кирилл Блинов
a10cc19e61 Fix JS loadFileTree error and add favicon 2026-05-29 12:27:40 +03:00
Кирилл Блинов
b9897555a3 Update README with web service documentation 2026-05-29 12:17:27 +03:00
Кирилл Блинов
beb411dfdc Add web service: FastAPI backend + minimal frontend with drag-drop, WebSocket progress, file tree and MD viewer 2026-05-29 12:17:08 +03:00
Кирилл Блинов
c771f83351 Add multi-format output support (docx + md simultaneously) 2026-05-29 11:39:13 +03:00
Кирилл Блинов
25f45ae7de Add video/ to .gitignore for user data 2026-05-29 11:13:06 +03:00
Кирилл Блинов
bd7eadb49f Update README with video support and first-run documentation 2026-05-29 10:53:44 +03:00
Кирилл Блинов
ddee721bea Add video input support with ffmpeg audio extraction 2026-05-29 10:53:16 +03:00
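
Commit ddee721bea adds video input by extracting audio with ffmpeg. The sketch below shows one way to build such a command from Python. The helper names are hypothetical, and the 16 kHz mono WAV target is an assumption (a common ASR input format), not something the log states.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path, wav_path, sample_rate=16000):
    """Return an ffmpeg argument list that extracts mono PCM audio.

    Hypothetical helper; the output format is an assumption.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(video_path),    # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to one channel
        "-ar", str(sample_rate),  # resample
        "-c:a", "pcm_s16le",      # 16-bit PCM WAV
        str(wav_path),
    ]


def extract_audio(video_path, wav_path):
    """Run ffmpeg (must be on PATH); raises CalledProcessError on failure."""
    subprocess.run(build_extract_command(video_path, wav_path), check=True)
    return Path(wav_path)


# Building the command does not require ffmpeg to be installed.
cmd = build_extract_command("video/talk.mp4", "talk.wav")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.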
Кирилл Блинов
09d0a74520 Add .env and secrets to .gitignore 2026-05-29 10:50:54 +03:00
Кирилл Блинов
de212f5f00 Add local/offline processing explanation to README 2026-05-29 10:29:51 +03:00
Кирилл Блинов
bdd94b860f Add README and fix .gitignore for markdown files 2026-05-29 10:28:16 +03:00
Кирилл Блинов
5a5d1fa960 Add initial project structure: pipeline, docs, config, profiles 2026-05-29 10:16:02 +03:00
Кирилл Блинов
4214d689dd Add AGENTS.md with commit and push rules 2026-05-29 10:06:55 +03:00