Docling Projects

Technology

Docling

Docling is an open-source toolkit from IBM Research that transforms complex documents such as PDFs into structured, AI-ready JSON and Markdown for RAG and model fine-tuning.

Docling serves as the document-ingestion layer of an AI stack. It converts unstructured files (PDF, DOCX, PPTX) into two structured formats: JSON and Markdown. It uses computer-vision models for layout analysis and can often skip traditional OCR, for a reported 30x speed improvement. The project integrates with major AI frameworks such as LlamaIndex and LangChain, which makes it a practical building block for Retrieval-Augmented Generation (RAG) systems. It has gained traction quickly, with over 30,000 stars on GitHub.
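
The conversion step described above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive recipe: it assumes the docling package is installed and uses its DocumentConverter class with the export_to_markdown and export_to_dict methods on the converted document. The wrapper function convert_document and its converter parameter are hypothetical names introduced here for illustration; the parameter exists only so the function can be exercised with a stand-in object when Docling is not available.

```python
import json
from typing import Any, Optional


def convert_document(source: str, converter: Optional[Any] = None) -> dict:
    """Convert one document and return its Markdown and JSON renderings.

    `convert_document` and `converter` are illustrative names, not part of
    Docling. When no converter is supplied, a Docling DocumentConverter is
    created (assumes the `docling` package is installed).
    """
    if converter is None:
        # Imported lazily so this module still loads where docling is absent.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    # `source` is a local file path or a URL to a document such as a PDF.
    result = converter.convert(source)
    doc = result.document

    return {
        "markdown": doc.export_to_markdown(),
        # export_to_dict() yields a plain dict, serialized to JSON text here.
        "json": json.dumps(doc.export_to_dict(), ensure_ascii=False),
    }
```

Calling convert_document("report.pdf") (a placeholder file name) would return a dictionary holding both renderings; the Markdown string is the form typically chunked and passed to a RAG framework, while the JSON keeps the fuller document structure.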

https://docling.ai
1 project · 3 cities

