PDF to JSON Converter
Extract academic PDFs to structured JSON with document hierarchy and metadata.
1 credit per page. Free tier includes 30 credits/month.
Free account required. See pricing for high-volume use.
Built for your workflow
PDF to JSON conversion that actually works for academic papers.
Data Pipelines
Process PDF paper archives into structured data for analysis or ML training
RAG Systems
Build retrieval systems over PDF papers with queryable structure
Paper Analytics
Extract and analyze equations, citations, or sections across many papers
Format Conversion
Use JSON as an intermediate format to convert PDFs to any output you need
AI-powered extraction
State-of-the-art vision models extract structure, equations, and content from any PDF layout.
AI-Powered Extraction
Uses vision models to accurately extract text, equations, and structure from PDFs
Layout Understanding
Recognizes multi-column layouts, figures, tables, and sidebars
Equation Recognition
Converts mathematical equations back to LaTeX notation
Figure Extraction
Extracts figures and diagrams with captions preserved
Structured Output
Full document tree with typed elements: section, equation, figure, table, etc.
Metadata Extraction
Extracts title, authors, abstract, and publication info when available
See what you get
Real output from converting the “Deep Residual Learning for Image Recognition” paper.
{
  "by": "sciencestack.ai",
  "title": "Deep Residual Learning for Image Recognition",
  "abstract": "Deeper neural networks are more difficult to train...",
  "authors": ["Kaiming He", "Xiangyu Zhang", "Shaoqing Ren", "Jian Sun"],
  "document": [
    {
      "type": "section",
      "title": "Introduction",
      "content": [
        "Deep networks naturally integrate low/mid/high-level features...",
        {
          "type": "equation",
          "content": "\\mathcal{F}(x) := \\mathcal{H}(x) - x"
        }
      ]
    }
  ]
}
Equations, cross-references, and structure: all preserved.
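As a rough illustration of working with this output, the Python sketch below walks the document tree and collects every equation along with the section it sits in. It assumes only the keys visible in the sample above ("document", "type", "title", "content"); the helper name collect_equations is hypothetical, and real output may contain element types or nesting this sketch does not handle.

```python
import json

# Trimmed copy of the sample output shown above.
SAMPLE = r'''
{
  "title": "Deep Residual Learning for Image Recognition",
  "document": [
    {
      "type": "section",
      "title": "Introduction",
      "content": [
        "Deep networks naturally integrate low/mid/high-level features...",
        {"type": "equation", "content": "\\mathcal{F}(x) := \\mathcal{H}(x) - x"}
      ]
    }
  ]
}
'''


def collect_equations(node, path=()):
    """Yield (section_path, latex) pairs.

    Assumption: a node is a plain string, a list of nodes, or a dict
    with a "type" key and a "content" key, as in the sample.
    """
    if isinstance(node, list):
        for child in node:
            yield from collect_equations(child, path)
    elif isinstance(node, dict):
        if node.get("type") == "equation":
            yield path, node.get("content", "")
        else:
            # Track titled containers (e.g. sections) so each equation
            # can be traced back to where it appeared.
            title = node.get("title")
            sub_path = path + (title,) if title else path
            yield from collect_equations(node.get("content", []), sub_path)
    # Plain strings are body text and carry no equation nodes.


doc = json.loads(SAMPLE)
equations = list(collect_equations(doc["document"]))
for section_path, latex in equations:
    print(" > ".join(section_path), "|", latex)
```

The same traversal pattern should extend to other typed elements (figures, tables) by changing the type check, provided they follow the same "type"/"content" shape.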
How it works
Upload
Upload any academic PDF; we extract structure, equations, and metadata
Process
AI extracts text, equations, and structure from your PDF
Download
Structured JSON with document hierarchy, equations as LaTeX, and rich metadata
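Once downloaded, the JSON can serve as an intermediate format for other outputs. The Python sketch below renders a document to Markdown as one possible target. It is a minimal example under stated assumptions: the keys match the sample shown earlier ("title", "authors", "abstract", "document"), sections may nest, and any element type other than "section" and "equation" is replaced by a placeholder. The function names to_markdown and render_node are hypothetical, not part of any published API.

```python
def render_node(node, level):
    """Render one node of the document tree as a list of Markdown lines."""
    if isinstance(node, str):
        return [node, ""]  # body text becomes a paragraph
    if not isinstance(node, dict):
        return []  # unexpected shape: skip rather than guess
    kind = node.get("type")
    if kind == "equation":
        # Equations arrive as LaTeX, so wrap them in display-math fences.
        return ["$$", node.get("content", ""), "$$", ""]
    if kind == "section":
        lines = ["#" * level + " " + node.get("title", ""), ""]
        for child in node.get("content", []):
            lines += render_node(child, level + 1)
        return lines
    # Assumption: other element types are not rendered in this sketch.
    return [f"[{kind} omitted]", ""]


def to_markdown(doc):
    """Convert the whole JSON document to a Markdown string."""
    lines = ["# " + doc.get("title", "Untitled"), ""]
    if doc.get("authors"):
        lines += [", ".join(doc["authors"]), ""]
    if doc.get("abstract"):
        lines += [doc["abstract"], ""]
    for node in doc.get("document", []):
        lines += render_node(node, level=2)
    return "\n".join(lines).rstrip() + "\n"


example = {
    "title": "Deep Residual Learning for Image Recognition",
    "authors": ["Kaiming He", "Xiangyu Zhang", "Shaoqing Ren", "Jian Sun"],
    "document": [
        {
            "type": "section",
            "title": "Introduction",
            "content": [
                "Deep networks naturally integrate low/mid/high-level features...",
                {"type": "equation", "content": "\\mathcal{F}(x) := \\mathcal{H}(x) - x"},
            ],
        }
    ],
}

markdown = to_markdown(example)
print(markdown)
```

Keeping the per-node renderer separate from the top-level function means a different target (HTML, plain text) would only need a different render_node.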
Simple pricing
Per page
- AI-powered extraction
- Equation preservation
- Cross-reference resolution
- Bibliography included
- Structured document tree
Free tier includes 30 credits/month. View plans for more credits.
JSON export requires a Plus subscription.