- Add rag_indexer.py: build LightRAG index from OCR with OpenCode API
- Add rag_query.py: query the knowledge graph
- Add vlm_describer.py: generate VLM descriptions via LM Studio
- Add test_model.py: quick check for LightRAG-compatible models
- Add run_pipeline.sh and run_pipeline.bat: full OCR → VLM → RAG pipeline
- Fix rapidocr import (rapidocr_onnxruntime)
- Fix process_any_pdf.py paths for cross-platform use
- Add .env.example, README_RAG.md, AGENTS.md
- Update .gitignore for outputs and secrets
22 lines
475 B
Plaintext
# Core project dependencies
PyMuPDF>=1.24.0
rapidocr-onnxruntime>=1.4.0
docling>=2.0.0

# RAG stack
lightrag-hku>=1.0.0
numpy>=1.24.0

# Optional: for local LLM via Ollama (install Ollama separately from https://ollama.com)
# ollama-python is pulled by lightrag-hku when needed

# Optional: for OpenAI API (set OPENAI_API_KEY env var)
openai>=1.0.0

# Local embeddings (used with LM Studio backend)
sentence-transformers>=3.0.0

# Utilities
tqdm>=4.66.0
python-dotenv>=1.0.0
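
The version floors listed in this file can be checked against the current environment before running the pipeline. The following is a minimal sketch, not part of the repository: the helper names (`parse_requirements`, `version_tuple`, `is_older`, `check_installed`) are hypothetical, the requirement list is copied inline rather than read from disk, and the version comparison is a crude numeric one that only handles `name>=version` lines. A more robust check would likely use the `packaging` library.

```python
from importlib import metadata
from itertools import takewhile

# Inline copy of the requirement lines above (assumption: only "name>=version" specs).
REQUIREMENTS = """\
PyMuPDF>=1.24.0
rapidocr-onnxruntime>=1.4.0
docling>=2.0.0
lightrag-hku>=1.0.0
numpy>=1.24.0
openai>=1.0.0
sentence-transformers>=3.0.0
tqdm>=4.66.0
python-dotenv>=1.0.0
"""


def parse_requirements(text):
    """Return (name, minimum_version) pairs, skipping comments and blank lines."""
    pairs = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, minimum = line.partition(">=")
        pairs.append((name.strip(), minimum.strip() if sep else None))
    return pairs


def version_tuple(version):
    """Turn '1.24.0' into (1, 24, 0); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_older(installed, minimum):
    """Crude numeric comparison; pads the shorter version with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def check_installed(pairs):
    """Map each distribution name to 'missing', 'too old (...)' or 'ok (...)'."""
    report = {}
    for name, minimum in pairs:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum and is_older(installed, minimum):
            report[name] = f"too old ({installed} < {minimum})"
        else:
            report[name] = f"ok ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_installed(parse_requirements(REQUIREMENTS)).items():
        print(f"{name}: {status}")
```

Running the script prints one status line per package, which gives a quick view of what is missing before starting the OCR → VLM → RAG pipeline.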