List: PDF | Curated by Jay Greathouse

Dec 16, 2024
19 stories
Anoop Maurya
Ollama-OCR: Now Available as a Python Package!
Stuck behind a paywall? Read for Free!
Dec 2, 2024
17
In TDS Archive by Dr. Leon Eversberg
Improved RAG Document Processing With Markdown
How to read and convert PDFs to Markdown for better RAG results with LLMs
Nov 19, 2024
12
In Towards AI by Florian June
Let AI Instantly Parse Heavy Documents: The Magic of MPLUG-DOCOWL2’s Efficient Compression
Today, let’s take a look at one of the latest developments in PDF Parsing and Document Intelligence.
Nov 13, 2024
2
In Towards AI by Florian June
Demystifying PDF Parsing 05: Unifying Separate Tasks into a Small Model
Mechanics, Code, Insights on GOT, DLAFormer, and UNIT
Sep 19, 2024
3
In AI Advances by Richardson Gunde
The PDF Extraction Revolution: Why PymuPDF4llm is Your New Best Friend (and LlamaParse is Crying)
Hey there, data-loving friends! Ready for some serious AI magic? Picture this: you’re knee-deep in PDFs, trying to extract information for…
Oct 31, 2024
29
Agent Issue
Llama 3.2-Vision for High-Precision OCR with Ollama
With the new Llama 3.2 release, Meta seriously leveled up here — now you’ve got vision models (11B and 90B) that don’t just read text but…
Oct 31, 2024
1
In Python in Plain English by Anoop Maurya
Why PyMuPDF4LLM is the Best Tool for Extracting Data from PDFs (Even if You Didn’t Know You Needed…
Stuck behind a paywall? Read for Free!
Oct 18, 2024
20
Pankaj
Unlock the Power of PyMuPDF4LLM: A Game-Changer for PDF Extraction and AI Workflows
Efficiently Convert PDFs to Structured Data for Large Language Models and Retrieval-Augmented Generation Systems
Oct 15, 2024
4
Suman Sourabh
This is How I Convert PDF to Markdown
Better than any online tool out there
Sep 7, 2024
5
In AI Advances by Florian June
Demystifying PDF Parsing 06: Representative Industry Solutions
This is the sixth article in our series. In this article, we explore how PDF parsing is performed within the well-known and popular RAG…
Oct 9, 2024
6
In Towards AI by Tarun Singh
AI and LLM for Document Extraction: Simplifying Complex Formats with Ease
Sep 27, 2024
7
In AI Advances by Florian June
Kotaemon Unveiled: Innovations in RAG Framework for Document QA
PDF Parsing, GraphRAG, Agent-Based Reasoning, and Insights
Sep 13, 2024
4
In Level Up Coding by Lan Chu
Working with PDFs: The best tools for extracting text, tables and images
With PyPDF, Camelot, Tabular, Adobe API
Apr 30, 2024
8
In Generative AI by Fabio Matricardi
The Tiny JSONist — meet AI NuExtract
Looks like a bad words, but it is not. This is the best SML to take a text IN and give you a structured JSON OUT. Wanna know more?
Aug 11, 2024
1
In AI Advances by Florian June
Demystifying PDF Parsing 04: OCR-Free Large Multimodal Model-Based Method
Principles, Insights and Thoughts
Jul 1, 2024
3
Florian June
Unveiling PDF Parsing: How to extract formulas from scientific pdf papers
This article is a supplement to Advanced RAG 02: Unveiling PDF Parsing.
Feb 15, 2024
2
In Towards AI by Florian June
Advanced RAG 02: Unveiling PDF Parsing
Including key points, diagrams, and code
Feb 2, 2024
22
In TDS Archive by Noah Haglund
Designing and Deploying a Machine Learning Python Application (Part 2)
You don’t have to be Atlas to get your model into the cloud
Feb 24, 2024
In TDS Archive by Noah Haglund
Training and Deploying a Custom Detectron2 Model for Object Detection using PDF Documents (Part 1…
Making your machine learn how to see PDFs like a human
Nov 29, 2023