Commit Graph

6 Commits

Author SHA1 Message Date
Кирилл Блинов
f37c477a0a Add FastAPI backend with DZI viewer and feedback system
- FastAPI app with SQLite DB (projects, pages, issues, feedback)
- OpenSeadragon DZI viewer with inline SVG overlays
- Dashboard: upload, project list, tiling toggle, review mode
- Pipeline integration: tiling OCR → layout → elements → rules QC → DZI → DB
- Feedback collection: true_positive / false_positive / not_sure per issue
2026-06-01 12:29:41 +03:00
Кирилл Блинов
feeb02242b Add layout detection and multi-element extraction
- layout_detector.py: zone classification (drawing/table/title_block/notes) using line detection and text density analysis
- multi_element_extractor.py: extract dimensions, positions (П-1, X-1), GOST refs, steel grades, elevations, beam labels per zone
2026-06-01 12:29:32 +03:00
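A minimal sketch of the kind of GOST reference extraction this commit describes, assuming references appear as "ГОСТ <number>-<year>" in the OCR text. The function name `find_gost_refs` and the pattern are illustrative assumptions, not the contents of multi_element_extractor.py.

```python
import re
from typing import List

# Assumed reference shape: "ГОСТ 26020-83", "ГОСТ Р 57837-2017", "ГОСТ 21.502-2016".
# Real drawings may use other spellings; this pattern is only illustrative.
GOST_PATTERN = re.compile(r"ГОСТ\s+(?:Р\s+)?\d+(?:\.\d+)*-\d{2,4}")


def find_gost_refs(text: str) -> List[str]:
    """Return every GOST reference found in the text, in order of appearance."""
    return [match.group(0) for match in GOST_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "Двутавр 20Б1 ГОСТ 26020-83, сталь С255 ГОСТ 27772-2015"
    print(find_gost_refs(sample))
```

Running the extraction per zone, as the commit message suggests, would simply mean calling such a function on the text of each detected zone.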
Кирилл Блинов
b5f7c6327e Add tiling OCR, preprocess and visualization tools
- tiling_ocr.py: split large drawings into overlapping tiles for better small-text recognition
- preprocess_for_ocr.py: CLAHE + unsharp mask for enhancing blueprint contrast
- visualize_dimensions.py: draw bounding boxes around detected dimension numbers
- compare_ocr.py: side-by-side visualization of normal vs tiling OCR results
- dimension_extractor.py: line-based dimension detection with pixel verification
- ocr_qwen.py: Alibaba Cloud qwen-vl-ocr client with resize and regex fallback parser
- test_qwen_ocr.py: standalone test for qwen OCR
- process_any_pdf.py: add --use-tiling flag to switch between normal and tiling OCR
2026-06-01 12:29:26 +03:00
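The overlapping-tile idea behind tiling_ocr.py can be sketched as follows. This is an assumption-laden illustration, not the repository's code: the function name `tile_boxes`, the default tile size of 1024 px, and the 128 px overlap are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def _starts(length: int, tile: int, step: int) -> List[int]:
    """Start offsets along one axis so that tiles cover the full length."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile is aligned to the far edge
    return starts


def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128) -> List[Box]:
    """Split a width x height image into overlapping tile rectangles."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    step = tile - overlap
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, step)
        for x in _starts(width, tile, step)
    ]


if __name__ == "__main__":
    for box in tile_boxes(2000, 1000):
        print(box)
```

Each box could then be cropped and passed to the OCR engine separately, with detections shifted back by the tile's (left, top) offset. The overlap keeps text that straddles a tile border fully visible in at least one tile.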
Кирилл Блинов
c756a5766b Add RAG pipeline: LightRAG indexer, OpenCode API, VLM describer, and test tools
- Add rag_indexer.py: build LightRAG index from OCR with OpenCode API
- Add rag_query.py: query the knowledge graph
- Add vlm_describer.py: generate VLM descriptions via LM Studio
- Add test_model.py: quick check for LightRAG-compatible models
- Add run_pipeline.sh and run_pipeline.bat: full OCR → VLM → RAG pipeline
- Fix rapidocr import (rapidocr_onnxruntime)
- Fix process_any_pdf.py paths for cross-platform use
- Add .env.example, README_RAG.md, AGENTS.md
- Update .gitignore for outputs and secrets
2026-05-29 09:54:37 +03:00
keboss-m
851ba10d52 Add PDF source files and remove *.pdf from gitignore
2026-05-29 01:45:03 +03:00
keboss-m
b1b00656f2 Add PDF OCR pipeline and project indexes for Кронштадтский and 123
2026-05-29 01:04:01 +03:00