Updates and Roadmap

Check out our latest features, improvements, and future plans

Coming Next

We are working on powerful new features to make PDF conversion even better

Expected release: between December 28 and January 3

📱

Desktop and iOS Applications

Introducing dedicated desktop and iOS applications to provide more local features and improve the mobile user experience

📚

Support for More eBook Formats

Support for several new eBook formats, including AZW3, MOBI, and more, to meet different reader and device requirements

∑

Continued Optimization of Formula Display

Further improving the rendering of mathematical formulas and chemical equations, and improving recognition accuracy for complex formulas

Update History

European Locale Expansion (fr-FR / de-DE / it-IT / es-ES)

Release Date: 2026-03-04

New Features

Added four new locales to the i18n resource set:
- fr-FR
- de-DE
- it-IT
- es-ES
Language list is now data-driven from src/i18n/common/*.json locale files.
Added base-language to regional-language resolution:
- fr / fr-CA -> fr-FR
- de / de-AT -> de-DE
- it -> it-IT
- es / es-MX -> es-ES

Improvements

Added i18n glossary for term consistency:
- docs/i18n-glossary.md
Added a reusable language acceptance checklist:
- docs/i18n-acceptance-checklist.md

Notes / Caveats

English (en) remains the fallback language.
Third-party locale packs currently support zh-CN and en; when locale-specific packs are missing, components will fall back to English.
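
The resolution and fallback rules above can be sketched in a few lines of Python. This is an illustration only, not the project's implementation: the function name `resolve_locale` and the `SUPPORTED_LOCALES` tuple are assumptions, and the real language list is loaded from the JSON locale files.

```python
# Illustrative sketch only; names and the supported set are assumptions.
SUPPORTED_LOCALES = ("en", "fr-FR", "de-DE", "it-IT", "es-ES")
FALLBACK_LOCALE = "en"


def resolve_locale(tag, supported=SUPPORTED_LOCALES, fallback=FALLBACK_LOCALE):
    """Map a requested language tag to a supported locale."""
    normalized = tag.strip().replace("_", "-").lower()
    by_lower = {locale.lower(): locale for locale in supported}

    # 1. Exact match, ignoring case ("de-de" -> "de-DE").
    if normalized in by_lower:
        return by_lower[normalized]

    # 2. Same base language ("fr" or "fr-CA" -> "fr-FR", "en-GB" -> "en").
    base = normalized.split("-", 1)[0]
    for locale in supported:
        if locale.split("-", 1)[0].lower() == base:
            return locale

    # 3. Nothing matched: fall back to English.
    return fallback


for requested in ("fr-CA", "de", "es-MX", "ja"):
    print(requested, "->", resolve_locale(requested))
```

Matching on the base language after the exact match keeps regional variants such as de-AT usable without shipping a separate resource file for each of them.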

Release v1.0.11

This release adds support for GitHub-Flavored Markdown (GFM) table rendering, enhances error handling for interruption scenarios, and includes important bug fixes and dependency updates.

What's Changed

Features

GFM Table Support: Added intelligent conversion of HTML tables to GitHub-Flavored Markdown format in https://github.com/oomol-lab/pdf-craft/pull/345
- Simple tables are automatically converted to clean GFM pipe table syntax
- Complex tables (with colspan, rowspan, or multiple tbody sections) gracefully fall back to HTML format to preserve structure
- Prevents data loss from unsupported table features in GFM format
- Added comprehensive test coverage for various table scenarios
- New dependency: markdownify library for table conversion
Enhanced InterruptedError API: Added public properties to InterruptedError for better error introspection in https://github.com/oomol-lab/pdf-craft/pull/346
- New kind property exposes the interruption type (abort or token limit exceeded)
- New metering property provides direct access to OCR token usage data
- OCRTokensMetering is now exported from the public API for convenience
- Enables users to programmatically handle different interruption scenarios and track resource consumption

Bug Fixes

Fixed Error Propagation: Corrected handling of critical error types during page extraction in https://github.com/oomol-lab/pdf-craft/pull/343
- AbortError and TokenLimitError now propagate correctly instead of being wrapped in OCRError
- Ensures interruption signals are properly received and handled by calling code
- Prevents masking of user-initiated abort operations and token limit violations
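
The propagation fix described above follows a common pattern: re-raise interruption signals untouched and wrap only genuine failures. The sketch below uses stand-in exception classes and a hypothetical `extract_page` helper; pdf-craft's real classes and internals may look different.

```python
# Stand-in exception classes for illustration; not imported from pdf-craft.
class AbortError(Exception):
    pass


class TokenLimitError(Exception):
    pass


class OCRError(Exception):
    def __init__(self, page_index, cause):
        super().__init__(f"OCR failed on page {page_index}: {cause}")
        self.page_index = page_index


def extract_page(page_index, recognise):
    """Run `recognise` for one page, wrapping only real OCR failures."""
    try:
        return recognise(page_index)
    except (AbortError, TokenLimitError):
        # Interruption signals must reach the caller unchanged.
        raise
    except Exception as error:
        # Everything else is reported as an OCR failure for this page.
        raise OCRError(page_index, error) from error
```

Listing the interruption types before the generic `except Exception` clause is what keeps them from being masked.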

Dependencies

EPUB Generator Update: Upgraded epub-generator dependency to fix MathML property declaration bug in https://github.com/oomol-lab/pdf-craft/pull/344
- Fixes https://github.com/Moskize91/epub-generator/issues/22: OPF files incorrectly declared mathml property when LaTeX-to-MathML conversion failed, causing EPUBCheck validation failures
- EPUB files now pass validation by only declaring MathML properties when actual MathML content exists
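
As a rough illustration of the fix, an OPF manifest entry should carry `properties="mathml"` only when the chapter really contains a MathML element. The helper below is a hypothetical sketch, not code from epub-generator; the name `manifest_item` and the regex-based check are assumptions.

```python
import re
from xml.sax.saxutils import quoteattr

# Matches an opening <math> tag, with or without a namespace prefix.
_MATH_TAG = re.compile(r"<(?:[A-Za-z_][\w.-]*:)?math[\s>/]")


def manifest_item(item_id, href, xhtml):
    """Build an OPF <item> element for one XHTML chapter."""
    attributes = [
        ("id", item_id),
        ("href", href),
        ("media-type", "application/xhtml+xml"),
    ]
    # Declare the property only if MathML survived the LaTeX conversion.
    if _MATH_TAG.search(xhtml):
        attributes.append(("properties", "mathml"))
    rendered = " ".join(f"{name}={quoteattr(value)}" for name, value in attributes)
    return f"<item {rendered}/>"


print(manifest_item("c1", "c1.xhtml", "<p><math><mi>x</mi></math></p>"))
print(manifest_item("c2", "c2.xhtml", "<p>$x^2$ (conversion failed)</p>"))
```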

Migration Notes

InterruptedError Changes

If you're catching InterruptedError exceptions, you can now access detailed information about the interruption:

from pdf_craft import transform_markdown, InterruptedError, InterruptedKind

try:
    transform_markdown(
        pdf_path="input.pdf",
        markdown_path="output.md",
    )
except InterruptedError as error:
    # New in v1.0.11: Access interruption details
    if error.kind == InterruptedKind.ABORT:
        print("User aborted the operation")
    elif error.kind == InterruptedKind.TOKEN_LIMIT_EXCEEDED:
        print(f"Token limit exceeded: {error.metering.input_tokens} input tokens used")

    # Access token usage statistics
    print(f"Total tokens: {error.metering.input_tokens + error.metering.output_tokens}")

Table Rendering

Tables in your PDF documents will now be converted to GFM format when possible, making them more readable in markdown viewers. Complex tables will automatically fall back to HTML to preserve their structure.
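
The decision between a pipe table and the HTML fallback can be sketched as follows. The release notes state that pdf-craft relies on the markdownify library for the conversion; this example deliberately uses only the Python standard library as a stand-in, and the names `_TableScanner` and `table_to_markdown` are assumptions made for illustration.

```python
from html.parser import HTMLParser


class _TableScanner(HTMLParser):
    """Collects cell text and notes features GFM pipe tables cannot express."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.is_complex = False
        self._tables = 0
        self._tbodies = 0
        self._cell = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "table":
            self._tables += 1
            self.is_complex |= self._tables > 1  # nested or multiple tables
        elif tag == "tbody":
            self._tbodies += 1
            self.is_complex |= self._tbodies > 1  # multiple tbody sections
        elif tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            for name in ("colspan", "rowspan"):
                if (attributes.get(name) or "1").strip() != "1":
                    self.is_complex = True
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            text = " ".join("".join(self._cell).split())
            if not self.rows:
                self.rows.append([])
            self.rows[-1].append(text.replace("|", "\\|"))
            self._cell = None


def table_to_markdown(html):
    """Return a GFM pipe table, or the original HTML for complex tables."""
    scanner = _TableScanner()
    scanner.feed(html)
    scanner.close()
    rows = [row for row in scanner.rows if row]
    ragged = len({len(row) for row in rows}) > 1
    if scanner.is_complex or not rows or ragged:
        return html  # fall back so no structure is lost
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```

Returning the untouched HTML whenever anything looks unusual keeps the sketch conservative: a table is only rewritten when every row has the same number of plain cells.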

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.10...v1.0.11

Release v1.0.10

This release simplifies the table of contents (TOC) extraction API by replacing enum-based modes with a boolean flag, while adding LLM-powered chapter title analysis capabilities for improved TOC hierarchy detection.

What's Changed

Breaking Changes

Simplified TOC API: Replaced TocExtractionMode enum with a simpler toc_assumed boolean parameter in https://github.com/oomol-lab/pdf-craft/pull/341
- Removed toc_mode parameter from transform_markdown() and transform_epub() functions
- Removed TocExtractionMode from public API exports
- Introduced toc_assumed boolean flag to control TOC detection behavior

Features

LLM-Powered Chapter Title Analysis: Added support for LLM-based analysis of chapter titles to enhance TOC extraction accuracy in https://github.com/oomol-lab/pdf-craft/pull/341
- Automatically analyzes chapter title hierarchies when toc_llm is configured
- Provides more accurate chapter level detection for complex book structures
- Intelligently falls back to standard analysis when LLM is unavailable or encounters errors

Improvements

Enhanced Error Handling: Added robust error handling for LLM-based analysis with automatic recovery mechanisms in https://github.com/oomol-lab/pdf-craft/pull/341
- Better error diagnostics for LLM analysis failures
- Graceful degradation when LLM analysis fails, ensuring conversion continues successfully
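
Graceful degradation of this kind usually amounts to a try-and-fall-back wrapper. The sketch below is a generic illustration under stated assumptions: `analyse_title_levels`, `llm_analyse`, and `standard_analyse` are hypothetical names, and the default standard analysis that assigns level 1 to every title is a placeholder, not pdf-craft's real heuristic.

```python
import logging

logger = logging.getLogger(__name__)


def analyse_title_levels(titles, llm_analyse=None, standard_analyse=None):
    """Return one hierarchy level per title, preferring the LLM when usable."""
    if standard_analyse is None:
        # Placeholder fallback: treat every title as a top-level chapter.
        standard_analyse = lambda items: [1] * len(items)

    if llm_analyse is not None:
        try:
            levels = list(llm_analyse(titles))
            if len(levels) == len(titles):
                return levels
            logger.warning(
                "LLM returned %d levels for %d titles; falling back",
                len(levels), len(titles),
            )
        except Exception as error:
            # Any LLM failure is logged and must not stop the conversion.
            logger.warning("LLM analysis failed, falling back: %s", error)

    return standard_analyse(titles)
```

Validating the length of the LLM answer before trusting it covers the case where the model responds but drops or invents entries.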

Migration Guide

If you were using toc_mode in previous versions, update your code as follows:

Previous API (v1.0.9 and earlier)

from pdf_craft import transform_epub, transform_markdown, TocExtractionMode

# For Markdown conversion
transform_markdown(
    pdf_path="input.pdf",
    markdown_path="output.md",
    toc_mode=TocExtractionMode.NO_TOC_PAGE,  # Old parameter
)

# For EPUB conversion
transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    toc_mode=TocExtractionMode.AUTO_DETECT,  # Old parameter
)

New API (v1.0.10)

from pdf_craft import transform_epub, transform_markdown

# For Markdown conversion (assumes no TOC pages by default)
transform_markdown(
    pdf_path="input.pdf",
    markdown_path="output.md",
    toc_assumed=False,  # New boolean parameter (default: False)
)

# For EPUB conversion (assumes TOC pages exist)
transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    toc_assumed=True,  # New boolean parameter
)

Migration Mapping

Old `toc_mode` Value	New `toc_assumed` Value
`TocExtractionMode.NO_TOC_PAGE`	`False`
`TocExtractionMode.AUTO_DETECT`	`True`
`TocExtractionMode.LLM_ENHANCED`	`True` (with `toc_llm` configured)
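
For codebases with many call sites, the mapping table can be captured in a small helper. This is a hypothetical convenience function, not part of pdf-craft: it takes the old enum member name as a string (because `TocExtractionMode` is no longer exported) and returns the keyword arguments for the new API.

```python
# Hypothetical migration helper; pdf-craft does not ship this function.
_TOC_MODE_TO_ASSUMED = {
    "NO_TOC_PAGE": False,
    "AUTO_DETECT": True,
    "LLM_ENHANCED": True,
}


def migrate_toc_arguments(old_mode_name, toc_llm=None):
    """Translate an old toc_mode member name into new keyword arguments."""
    if old_mode_name not in _TOC_MODE_TO_ASSUMED:
        raise ValueError(f"Unknown TocExtractionMode member: {old_mode_name!r}")
    if old_mode_name == "LLM_ENHANCED" and toc_llm is None:
        # The old mode required an LLM configuration, so keep that contract.
        raise ValueError("LLM_ENHANCED requires a toc_llm configuration")

    arguments = {"toc_assumed": _TOC_MODE_TO_ASSUMED[old_mode_name]}
    if old_mode_name == "LLM_ENHANCED":
        arguments["toc_llm"] = toc_llm
    return arguments


print(migrate_toc_arguments("NO_TOC_PAGE"))
print(migrate_toc_arguments("AUTO_DETECT"))
```

The returned dictionary can be unpacked into a call, for example `transform_epub(pdf_path=..., epub_path=..., **arguments)`.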

LLM-Enhanced TOC Extraction

To use LLM-powered chapter title analysis:

from pdf_craft import transform_epub, BookMeta, LLM

# Configure LLM for TOC enhancement
toc_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="cl100k_base",
)

transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    toc_assumed=True,  # Enable TOC detection
    toc_llm=toc_llm,   # Enable LLM-powered analysis
    book_meta=BookMeta(
        title="Book Title",
        authors=["Author"],
    ),
)

Notes

The toc_assumed parameter defaults to False for Markdown conversion and True for EPUB conversion (maintaining backward-compatible behavior)
LLM-powered chapter title analysis is optional and automatically falls back to standard analysis if not configured or if errors occur
The new API is simpler and more intuitive, reducing the cognitive load of choosing between multiple enum values

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.9...v1.0.10

Release v1.0.9

This release introduces enhanced table of contents (TOC) extraction capabilities using LLM-powered analysis, enabling more accurate chapter structure detection and hierarchy recognition.

What's Changed

Features

LLM-Powered TOC Level Extraction: Implemented LLM-based analysis to automatically extract and recognize hierarchical levels in table of contents, improving chapter structure accuracy in https://github.com/oomol-lab/pdf-craft/pull/336
- Resolves https://github.com/oomol-lab/pdf-craft/issues/268
Enhanced TOC Page Processing: Modified the TOC detection algorithm to pass all identified TOC pages to the LLM for comprehensive analysis, rather than processing them individually in https://github.com/oomol-lab/pdf-craft/pull/338
- Improves the accuracy of chapter hierarchy detection
- Provides better context for LLM analysis by including all TOC pages

Refactoring

LLM Analyzer Refactoring: Refactored llm_analyser.py to improve code maintainability and extensibility in https://github.com/oomol-lab/pdf-craft/pull/339

Background

Previously, pdf-craft used statistical analysis to detect TOC pages and extract chapter structure. While effective for basic cases, this approach had limitations in accurately determining chapter hierarchies and handling complex TOC layouts. This release introduces LLM-powered analysis to better understand TOC structure and extract hierarchical information.

How It Works

The new TOC extraction process:

1. Identify TOC Pages: Uses statistical analysis to detect which pages contain the table of contents
2. Collect All TOC Pages: Gathers all identified TOC pages for comprehensive analysis
3. LLM Analysis: Passes all TOC pages to an LLM to extract chapter titles and their hierarchical levels
4. Structure Generation: Uses the extracted hierarchy information to build an accurate EPUB navigation structure

This approach combines the efficiency of statistical detection with the semantic understanding capabilities of LLMs, resulting in more accurate chapter organization in the final output.
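
The final step, turning extracted titles and levels into a navigation structure, can be illustrated with a small stack-based builder. This is a sketch under assumptions: the input shape (a list of `(title, level)` pairs) and the function name `build_navigation` are hypothetical and do not describe pdf-craft's internal data model.

```python
def build_navigation(entries):
    """Nest (title, level) pairs, given in reading order, into a tree."""
    root = {"title": None, "children": []}
    stack = [(0, root)]  # (level, node); the root sits below every real level

    for title, level in entries:
        level = max(1, level)  # guard against levels below 1
        node = {"title": title, "children": []}
        # Close every open entry that is at the same depth or deeper.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))

    return root["children"]


toc = build_navigation([
    ("Part I", 1),
    ("Chapter 1", 2),
    ("Chapter 2", 2),
    ("Part II", 1),
    ("Chapter 3", 2),
])
for part in toc:
    print(part["title"], [child["title"] for child in part["children"]])
```

An entry that skips a level (for example a level 3 title directly under level 1) is simply attached to the nearest shallower entry, which keeps the builder tolerant of imperfect LLM output.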

Usage

The TOC extraction improvements are automatically applied when using the appropriate toc_mode:

from pdf_craft import transform_epub, BookMeta, TocExtractionMode

# Use AUTO_DETECT for statistical analysis (default for EPUB)
transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    toc_mode=TocExtractionMode.AUTO_DETECT,
    book_meta=BookMeta(
        title="Book Title",
        authors=["Author"],
    ),
)

# Use LLM_ENHANCED for LLM-powered extraction (requires toc_llm configuration)
from pdf_craft import LLM

toc_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="cl100k_base",
)

transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    toc_mode=TocExtractionMode.LLM_ENHANCED,
    toc_llm=toc_llm,
    book_meta=BookMeta(
        title="Book Title",
        authors=["Author"],
    ),
)

Notes

Important: When using TocExtractionMode.LLM_ENHANCED, the toc_llm parameter must be configured. The conversion will fail if toc_llm is not provided.
This feature is most beneficial for books with complex chapter hierarchies
The statistical TOC page detection remains as the first step, with LLM analysis enhancing the extraction quality

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.8...v1.0.9

Release v1.0.8

This release brings enhanced error handling flexibility, improved OCR text quality, and important security fixes.

What's Changed

Features

Enhanced Error Handling: The ignore_pdf_errors and ignore_ocr_errors parameters now accept custom checker functions in addition to boolean flags, enabling more granular control over error suppression in https://github.com/oomol-lab/pdf-craft/pull/323
Improved OCR Text Quality: Implemented n-gram detection to automatically filter out repetitive character sequences that indicate neural text degradation in https://github.com/oomol-lab/pdf-craft/pull/330
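
One simple way to spot this kind of degenerate repetition is to measure how much of a text is occupied by its single most frequent character n-gram. The sketch below is an assumption about how such a check could look; the function names, the n-gram size, and the thresholds are illustrative and are not the values pdf-craft uses.

```python
from collections import Counter


def repetition_ratio(text, n=4):
    """Share of all character n-grams taken by the most frequent one."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    most_common_count = Counter(grams).most_common(1)[0][1]
    return most_common_count / len(grams)


def looks_degenerate(text, n=4, threshold=0.3, min_length=40):
    """Flag long texts that are dominated by one repeated n-gram."""
    return len(text) >= min_length and repetition_ratio(text, n) >= threshold


print(looks_degenerate("abab" * 30))
print(looks_degenerate("The quick brown fox jumps over the lazy dog near the quiet river bank."))
```

Ordinary prose spreads its n-grams widely, so its ratio stays far below the threshold, while a loop such as "abababab..." concentrates roughly half of all n-grams in a single value.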

Security

Security Fix: Upgraded pypdf from ^6.4.1 to ^6.6.0 to address CVE-2026-22691 vulnerability in https://github.com/oomol-lab/pdf-craft/pull/329
- Fixes issue where malicious PDFs could cause long-running processes when processing invalid startxref entries
- Resolves https://github.com/oomol-lab/pdf-craft/issues/328

Other

Code formatting improvements in https://github.com/oomol-lab/pdf-craft/pull/331
README image link update by @alwaysmavs in https://github.com/oomol-lab/pdf-craft/pull/324

Example Usage

Custom Error Handling with Functions

from pdf_craft import transform_markdown, OCRError

def should_ignore_ocr_error(error: OCRError) -> bool:
    # Only ignore specific types of OCR errors
    return error.kind == "recognition_failed"

transform_markdown(
    pdf_path="input.pdf",
    markdown_path="output.md",
    ignore_ocr_errors=should_ignore_ocr_error,  # Pass custom function
)

Traditional Boolean Error Handling (Still Supported)

from pdf_craft import transform_markdown

transform_markdown(
    pdf_path="input.pdf",
    markdown_path="output.md",
    ignore_ocr_errors=True,  # Simple boolean flag
)

API Changes

The following parameters have been enhanced to accept both boolean values and callable functions:

ignore_pdf_errors: bool | Callable[[PDFError], bool]
ignore_ocr_errors: bool | Callable[[OCRError], bool]

This change is fully backward compatible - existing code using boolean values will continue to work without modifications.
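
A parameter typed as `bool | Callable[[Error], bool]` is typically normalized into a single predicate before use. The helper below shows that pattern in isolation; `to_checker` is a hypothetical name and this is not pdf-craft's internal code.

```python
from typing import Callable, Union

ErrorChecker = Callable[[Exception], bool]


def to_checker(option: Union[bool, ErrorChecker]) -> ErrorChecker:
    """Turn a boolean flag or a custom function into one predicate."""
    if callable(option):
        return option
    # A plain flag ignores the error and always gives the same answer.
    return lambda _error: bool(option)


ignore_all = to_checker(True)
ignore_timeouts = to_checker(lambda error: isinstance(error, TimeoutError))

print(ignore_all(ValueError("x")))
print(ignore_timeouts(ValueError("x")))
print(ignore_timeouts(TimeoutError()))
```

Because booleans are not callable, the `callable` check cleanly separates the two accepted forms, which is why existing boolean call sites keep working.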

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.7...v1.0.8

Release v1.0.7

This release adds support for including cover images in both Markdown and EPUB conversions, enhancing the output format options.

What's Changed

Features

Cover Image Support: Added includes_cover parameter to both transform_markdown and transform_epub functions, allowing you to include the PDF's cover page as an image in the output in https://github.com/oomol-lab/pdf-craft/pull/319
- For Markdown conversion: The cover image is saved to the images folder and can be referenced in your document
- For EPUB conversion: The cover image is properly embedded in the EPUB file structure
- Default value is False for Markdown (to maintain backward compatibility) and True for EPUB

Example Usage

Markdown with Cover

from pdf_craft import transform_markdown

transform_markdown(
    pdf_path="input.pdf",
    markdown_path="output.md",
    markdown_assets_path="images",
    includes_cover=True,  # Include cover image
)

EPUB with Cover

from pdf_craft import transform_epub, BookMeta

transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    includes_cover=True,  # Include cover image (default)
    book_meta=BookMeta(
        title="Book Title",
        authors=["Author"],
    ),
)

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.6...v1.0.7

Release v1.0.6

This release brings significant improvements to PDF rendering control, text quality, and error handling capabilities.

What's Changed

Features

Flexible DPI Control: Added dpi parameter to control PDF page rendering resolution (default: 300 DPI), allowing you to balance between image quality and file size in https://github.com/oomol-lab/pdf-craft/pull/315
Automatic Image Size Optimization: Introduced max_page_image_file_size parameter that automatically adjusts DPI when generated images exceed specified size limits, preventing overly large output files in https://github.com/oomol-lab/pdf-craft/pull/315
Resilient OCR Processing: Added ignore_ocr_errors parameter to continue processing when OCR recognition fails on individual pages, instead of stopping the entire conversion in https://github.com/oomol-lab/pdf-craft/pull/314
Improved Text Quality: Automatically removes Unicode surrogate characters from OCR-extracted text and PDF metadata (title, authors, publisher, etc.), ensuring cleaner output and better compatibility with downstream tools in https://github.com/oomol-lab/pdf-craft/pull/316
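
The automatic DPI adjustment in the feature list above can be pictured as a short retry loop: render, compare the size against the limit, and lower the DPI if needed. The sketch assumes that file size grows roughly with the square of the DPI; the function `render_within_limit`, the injected `render` callable, and the `min_dpi` floor are all hypothetical and not taken from pdf-craft.

```python
import math


def render_within_limit(render, dpi=300, max_file_size=None, min_dpi=72, max_attempts=5):
    """Render a page, lowering the DPI until the image fits the size limit.

    `render` is any callable that takes a DPI and returns the image bytes.
    """
    data = render(dpi)
    if max_file_size is None:
        return dpi, data

    for _ in range(max_attempts):
        if len(data) <= max_file_size or dpi <= min_dpi:
            break
        # Pixel count scales with dpi squared, so scale dpi by the square root.
        scale = math.sqrt(max_file_size / len(data))
        dpi = max(min_dpi, min(dpi - 1, int(dpi * scale)))
        data = render(dpi)

    return dpi, data


# Fake renderer whose output size is exactly dpi squared bytes.
fake_render = lambda dpi: bytes(dpi * dpi)
final_dpi, image = render_within_limit(fake_render, dpi=300, max_file_size=40_000)
print(final_dpi, len(image))
```

The `min(dpi - 1, ...)` term guarantees that every retry lowers the DPI, so the loop always makes progress even when the size estimate is off.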

Documentation

DeepWiki Integration: Added DeepWiki badge for auto-refreshing documentation by @YogeLiu in https://github.com/oomol-lab/pdf-craft/pull/285

Dependencies

Updated epub-generator to 0.1.6

Example Usage

from pdf_craft import transform_markdown

transform_markdown(
    pdf_path="input.pdf",
    markdown_path="output.md",
    dpi=300,  # Control rendering resolution
    max_page_image_file_size=5242880,  # 5MB limit per page
    ignore_ocr_errors=True,  # Continue on OCR failures
)

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.5...v1.0.6

Release v1.0.5

What's Changed

Bug Fixes

GPU memory overflow: Fix out-of-memory errors on RTX 3060 (12GB VRAM) by upgrading doc-page-extractor dependency to optimize model loading sequence (https://github.com/oomol-lab/pdf-craft/pull/309, fixes https://github.com/oomol-lab/pdf-craft/issues/305)
TOC detection: Improve table of contents detection accuracy by requiring TOC page indexes to form a consecutive sequence within the first 17% of the document and by adding a _TOC_SCORE_MIN_RATIO limit (https://github.com/oomol-lab/pdf-craft/pull/311, https://github.com/oomol-lab/pdf-craft/pull/313)
Content processing: Fix content override issue (https://github.com/oomol-lab/pdf-craft/pull/312)

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.4...v1.0.5

Release v1.0.4

What's New

🎯 Table of Contents Detection and Smart Removal

pdf-craft now automatically detects and removes table of contents pages from the final output, preventing duplicate TOC content in generated EPUB files. The system uses statistical analysis to identify TOC pages by matching chapter titles against page content, then intelligently excludes these pages while preserving the navigation structure.

Key features:

Automatic TOC page detection using Aho-Corasick substring matching
Hierarchical TOC level analysis for improved chapter organization
XML-based TOC storage for better performance and flexibility
New toc_assumed parameter to control TOC detection behavior (default: True for EPUB, False for Markdown)
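
The idea of identifying TOC pages by matching chapter titles against page content can be sketched as a per-page score. To stay dependency-free, this example uses plain substring checks as a stand-in for Aho-Corasick matching (which finds all titles in a single pass over the text); the function names and the `min_ratio` threshold are assumptions, not pdf-craft's actual values.

```python
def toc_scores(pages, titles):
    """For each page, the fraction of chapter titles found in its text."""
    wanted = [" ".join(title.split()).lower() for title in titles if title.strip()]
    scores = []
    for text in pages:
        haystack = " ".join(text.split()).lower()
        hits = sum(1 for title in wanted if title in haystack)
        scores.append(hits / len(wanted) if wanted else 0.0)
    return scores


def detect_toc_pages(pages, titles, min_ratio=0.5):
    """Indexes of pages that mention at least `min_ratio` of all titles."""
    return [
        index
        for index, score in enumerate(toc_scores(pages, titles))
        if score >= min_ratio
    ]


pages = [
    "Contents\nIntroduction 1\nMethods 15\nResults 40",
    "Introduction\nThis book begins with a short overview.",
    "Methods\nWe describe the procedure.",
]
print(detect_toc_pages(pages, ["Introduction", "Methods", "Results"]))
```

A chapter page normally mentions only its own title, while a TOC page mentions most of them, which is what makes the ratio a usable signal.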

Implementation PRs:

📝 Raw HTML Tag Support in Markdown

Full support for CommonMark-compliant raw HTML tags in Markdown output. DeepSeek OCR often generates HTML tags (like <sup> for superscripts) when processing scanned books. These are now properly preserved and rendered in both Markdown and EPUB formats.

Supported tags include:

Inline tags, including <sup> and <kbd>
Block-level tags: <div>, <center>, <details>, <summary>
Automatic safety filtering and attribute validation

Implementation PRs:

📊 Enhanced Table Rendering

Tables are now rendered in native HTML format for both Markdown and EPUB outputs, providing better structure and readability. Asset metadata now supports structured titles and captions for equations, images, and tables.

https://github.com/oomol-lab/pdf-craft/pull/306

📖 PDF Metadata Extraction

Automatically extracts book metadata (title, authors, publisher, ISBN, etc.) from PDF files and uses it to populate EPUB metadata. No need to manually specify book information when the PDF already contains it.

https://github.com/oomol-lab/pdf-craft/pull/284

📰 Multi-Column Layout Detection

Improved handling of multi-column layouts (common in academic papers and magazines) through histogram valley detection and coefficient-of-variation splitting. Layouts are now correctly grouped by column segments before processing.

https://github.com/oomol-lab/pdf-craft/pull/286
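
Histogram valley detection can be illustrated with a one-dimensional coverage histogram over the horizontal extents of the layout boxes: an empty run of bins between occupied regions marks the gutter between columns. The sketch below covers only this valley search (not the coefficient-of-variation step), and the function name, bin count, and minimum gap are assumptions rather than pdf-craft's parameters.

```python
def find_column_split(boxes, page_width, bins=100, min_gap_bins=2):
    """Return the x position of the widest empty gutter, or None.

    `boxes` is a list of (x0, x1) horizontal extents of layout blocks.
    """
    histogram = [0] * bins
    for x0, x1 in boxes:
        first = max(0, int(x0 / page_width * bins))
        last = min(bins - 1, int(x1 / page_width * bins))
        for index in range(first, last + 1):
            histogram[index] += 1

    occupied = [index for index, count in enumerate(histogram) if count > 0]
    if not occupied:
        return None

    # Search for the longest empty run between the outermost occupied bins,
    # so that page margins are never mistaken for a gutter.
    best_start, best_length, run_start = None, 0, None
    for index in range(occupied[0], occupied[-1] + 1):
        if histogram[index] == 0:
            if run_start is None:
                run_start = index
            run_length = index - run_start + 1
            if run_length > best_length:
                best_start, best_length = run_start, run_length
        else:
            run_start = None

    if best_length < min_gap_bins:
        return None
    return (best_start + best_length / 2) * page_width / bins


two_columns = [(50, 450)] * 3 + [(550, 950)] * 3
print(find_column_split(two_columns, page_width=1000))
print(find_column_split([(50, 950)], page_width=1000))
```

Once a split position is known, boxes can be grouped by which side of it their centre falls on and each column processed in reading order.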

🐛 Bug Fixes

Fixed PIL crash on invalid bounding boxes: Added validation and normalization for layout bounding boxes to prevent crashes when cropping images with invalid coordinates (https://github.com/oomol-lab/pdf-craft/pull/295)
Fixed DeepSeek OCR center tag handling: Ignored alignment tags (<center>, <left>, <right>) generated by DeepSeek OCR that aren't needed in the output (https://github.com/oomol-lab/pdf-craft/pull/307)
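
Validation and normalization of bounding boxes before cropping can look roughly like this. It is a sketch under assumptions: the function name `normalize_box` is hypothetical, and boxes are taken to be `(x0, y0, x1, y1)` tuples in pixel coordinates, the same order PIL's `Image.crop` expects.

```python
def normalize_box(box, image_width, image_height):
    """Return a valid (left, top, right, bottom) crop box, or None.

    Swapped corners are reordered, coordinates are rounded and clamped to
    the image, and boxes with no area left are rejected.
    """
    x0, y0, x1, y1 = box
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))

    left = max(0, min(image_width, int(round(left))))
    right = max(0, min(image_width, int(round(right))))
    top = max(0, min(image_height, int(round(top))))
    bottom = max(0, min(image_height, int(round(bottom))))

    if right - left < 1 or bottom - top < 1:
        return None
    return (left, top, right, bottom)


print(normalize_box((120, 80, 20, 10), 100, 100))
print(normalize_box((150, 10, 200, 50), 100, 100))
```

Callers can skip the crop entirely when `None` comes back instead of passing invalid coordinates on to the image library.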

🔧 Improvements

Refined layout joining logic: Improved paragraph merging across page boundaries with better handling of override assets and line continuation (https://github.com/oomol-lab/pdf-craft/pull/287, https://github.com/oomol-lab/pdf-craft/pull/288)
Updated dependencies:
- Upgraded doc-page-extractor from 1.0.10 to 1.0.11 (https://github.com/oomol-lab/pdf-craft/pull/289)
- Upgraded epub-generator from 0.1.2 to 0.1.5
- Added pyahocorasick 2.2.0 for efficient substring matching
CI/CD enhancements: Added merge-build workflow for automated builds on main branch pushes (https://github.com/oomol-lab/pdf-craft/pull/289)

📚 Documentation

Updated README with new toc_assumed parameter documentation (https://github.com/oomol-lab/pdf-craft/pull/304)
Refreshed documentation images with hosted assets

🔄 API Changes

New Parameters

toc_assumed parameter in transform_markdown() and transform_epub():
- When True: Attempts to locate and extract TOC from PDF to build document structure
- When False: Generates TOC based on document headings only
- Default: True for EPUB, False for Markdown

New Exports

PDFDocumentMetadata: Dataclass for PDF metadata extraction

🙏 Contributors

Thanks to everyone who contributed to this release!

📦 Installation

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install pdf-craft==1.0.4

For detailed installation instructions, see the Installation Guide.

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.3...v1.0.4

Release v1.0.3

What's Changed

License Improvements

Removed PyMuPDF (fitz) Dependency: Replaced PyMuPDF (AGPL-3.0) with Poppler for PDF parsing and rendering, maintaining pdf-craft's MIT license compatibility
- pdf-craft now uses Poppler via pdf2image (MIT) for all PDF operations
- This change ensures the entire project remains under the permissive MIT license

New Features

Custom PDF Handler Support: Added pdf_handler parameter to predownload_models(), transform_markdown(), and transform_epub() functions, allowing users to customize PDF rendering implementation
Poppler Integration: Migrated to Poppler (via pdf2image) for PDF parsing and rendering, providing better compatibility and control
New Public APIs: Exported PDFHandler, PDFDocument, DefaultPDFHandler, and DefaultPDFDocument for advanced customization
RENDERED Event: Added OCREventKind.RENDERED event to track PDF page rendering progress

Breaking Changes

⚠️ Parameter Renamed: ignore_fitz_errors → ignore_pdf_errors

Update your code: transform_markdown(..., ignore_pdf_errors=True) instead of ignore_fitz_errors=True
Update your code: transform_epub(..., ignore_pdf_errors=True) instead of ignore_fitz_errors=True

⚠️ Exception Renamed: FitzError → PDFError

Update your exception handling code accordingly

Dependencies

New Requirement: Poppler must be installed separately for PDF parsing
- Ubuntu/Debian: sudo apt-get install poppler-utils
- macOS: brew install poppler
- Windows: Download from oschwartz10612/poppler-windows
- See Installation Guide for details

Bug Fixes

Upgraded doc-page-extractor to fix bugs (#280)

Migration Guide

If you're upgrading from v1.0.2, please:

Install Poppler following the Installation Guide

Update parameter names in your code:

# Before (v1.0.2)
transform_markdown(..., ignore_fitz_errors=True)

# After (v1.0.3)
transform_markdown(..., ignore_pdf_errors=True)

Update exception handling if you catch FitzError:

# Before (v1.0.2)
from pdf_craft import FitzError

# After (v1.0.3)
from pdf_craft import PDFError

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.2...v1.0.3

Release v1.0.2

This release brings improvements to EPUB generation, inline LaTeX support, and enhanced handling of footnotes and tables.

What's Changed

New Features

Inline LaTeX Expression Support - Added support for preserving inline LaTeX mathematical expressions in both Markdown and EPUB outputs. A new inline_latex parameter (default: True) allows you to control this behavior for EPUB conversion
Assets in Footnotes - Footnotes can now contain images and other assets, which are properly preserved during conversion
- https://github.com/oomol-lab/pdf-craft/pull/272

Improvements

Enhanced Table of Contents Generation - Improved EPUB table of contents generation to analyze hierarchical structure based on font sizes, replacing the previous flat list format
- https://github.com/oomol-lab/pdf-craft/pull/266
- https://github.com/oomol-lab/pdf-craft/pull/267
Parameter Naming - Renamed parameters for better clarity:
- model → ocr_size
- Type: DeepSeekOCRModel → DeepSeekOCRSize
- https://github.com/oomol-lab/pdf-craft/pull/265
This change better reflects that the parameter controls the OCR model size rather than being a generic "model" reference.
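
The font-size-based hierarchy analysis mentioned above can be approximated by ranking the distinct heading sizes: the largest size becomes level 1, the next distinct size level 2, and so on. The sketch below is an assumption about the general technique, not pdf-craft's algorithm; `levels_from_font_sizes`, the `tolerance` for grouping near-equal sizes, and `max_level` are hypothetical.

```python
def levels_from_font_sizes(headings, tolerance=0.5, max_level=3):
    """Assign a TOC level to each (title, font_size) pair.

    Sizes within `tolerance` points of a group's largest size share a level.
    """
    sizes = sorted({size for _, size in headings}, reverse=True)
    level_of = {}
    level, group_size = 0, None
    for size in sizes:
        if group_size is None or group_size - size > tolerance:
            # Clearly smaller than the current group: open the next level.
            level, group_size = level + 1, size
        level_of[size] = min(level, max_level)
    return [(title, level_of[size]) for title, size in headings]


headings = [
    ("Part I", 24.0),
    ("Chapter 1", 18.0),
    ("Section 1.1", 14.1),
    ("Chapter 2", 18.2),
    ("Section 2.1", 14.0),
]
for title, level in levels_from_font_sizes(headings):
    print(level, title)
```

The tolerance absorbs small measurement noise in OCR-derived font sizes, so 18.0 and 18.2 are treated as the same heading style.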

Bug Fixes

Fixed table rendering issues - https://github.com/oomol-lab/pdf-craft/pull/272
Improved LaTeX escape handling - https://github.com/oomol-lab/pdf-craft/pull/270

Breaking Changes

⚠️ API Parameter Changes: The model parameter has been renamed to ocr_size in transform_markdown() and transform_epub() functions. Additionally, the type DeepSeekOCRModel has been renamed to DeepSeekOCRSize.

Migration:

# Old (v1.0.1)
transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    model="gundam"
)

# New (v1.0.2)
transform_epub(
    pdf_path="input.pdf",
    epub_path="output.epub",
    ocr_size="gundam"
)

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.1...v1.0.2

What's New in v1.0.1

Enhanced Error Handling: Added structured error types (FitzError, OCRError, InterruptedError) with detailed page and step information for better debugging
Improved Stability: Fixed crashes when encountering single-page PyMuPDF errors - now handles page-level failures gracefully
Online Demo: Try PDF Craft directly in your browser, with no installation required

What's Changed

docs(project): add online demo links by @Moskize91 in https://github.com/oomol-lab/pdf-craft/pull/260
feat: add new errors by @Moskize91 in https://github.com/oomol-lab/pdf-craft/pull/262
feat: don't crash when find just a page of fitz error by @Moskize91 in https://github.com/oomol-lab/pdf-craft/pull/263
doc(project): sync README.md by @Moskize91 in https://github.com/oomol-lab/pdf-craft/pull/264

Full Changelog: https://github.com/oomol-lab/pdf-craft/compare/v1.0.0...v1.0.1

🎉 PDF Craft v1.0.0 Official Release

PDF Craft v1.0.0 is now officially released. This version includes major architectural changes and brings significant performance improvements.

🚀 Core Changes: Fully Embracing DeepSeek OCR

The biggest change in v1.0.0 is the complete rewrite based on DeepSeek OCR, eliminating the dependency on LLM for text correction.

DeepSeek OCR is a powerful open-source OCR engine that supports complex content recognition (tables, formulas, images, footnotes, etc.) with excellent document structure understanding capabilities. Thanks to DeepSeek OCR, pdf-craft now offers:

Fully Local Processing: The entire conversion process runs completely locally without any network requests. No need to configure LLM APIs, and no risk of conversion failures due to network issues or API outages—in the old version, a single LLM request failure would halt the entire conversion process.
Faster Speed: Compared to v0.2.8 which required multiple LLM calls for text correction, the new version uses direct OCR recognition with significantly improved speed.
Higher Accuracy: DeepSeek OCR excels at document structure analysis, table recognition, and formula extraction, delivering high-quality results without secondary correction.
Simpler API: Removed complex LLM configuration and multi-step processing workflows. Now conversion can be completed with a single function call.

Additionally, v1.0.0 has fully migrated to DeepSeek OCR (MIT License), removing the previous AGPL-3.0 dependency. The entire project now uses the more permissive MIT License, making it easier for commercial use and integration!

⚠️ Important Change: CUDA Environment Required

The new version requires a CUDA environment to run. This is because DeepSeek OCR depends on CUDA acceleration for efficient document recognition. The old version (v0.2.8) could work in pure CPU environments using LLM, but the new version cannot run without a GPU.

If your environment doesn't support CUDA, do not upgrade to v1.0.0. Continue using v0.2.8:

pip install pdf-craft==0.2.8

For specific CUDA environment installation instructions, please refer to the Installation Guide.

🚫 When NOT to Upgrade

Continue using v0.2.8 in the following situations:

No GPU or CUDA Environment: The new version requires CUDA and cannot run without GPU
Need LLM Text Correction: The new version has removed LLM correction functionality. If your use case requires secondary correction of OCR results, continue using the old version or use it in combination with epub-translator

🙏 Acknowledgments

Thanks to DeepSeek OCR for being open source, and to all community members who have contributed code and feedback to pdf-craft!

If you have a CUDA environment, upgrade to v1.0.0 now and experience faster, more stable, and simpler PDF conversion! 🚀