📦 OCR Time Capsule

About OCR Time Capsule

OCR Time Capsule helps researchers and digital humanities professionals compare the original OCR text of historical documents with AI-improved versions. The tool is designed for browsing pre-processed OCR improvements stored in HuggingFace datasets.

🎯 Use Cases

  • Reviewing OCR corrections from historical newspapers
  • Assessing the quality of digitization projects
  • Validating training data for OCR models
  • Improving the accessibility of scanned texts

⚡ Key Features

  • Side-by-side text comparison
  • Character, word, and line-level diffs
  • Keyboard navigation (J/K or arrows)
  • Direct HuggingFace dataset integration
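
To make the three diff granularities listed above concrete, here is a minimal sketch using Python's standard `difflib`. It is an illustration only and not necessarily how OCR Time Capsule computes its diffs; the function name `diff_tokens` and the whitespace-based word splitting are assumptions made for this example.

```python
import difflib


def diff_tokens(original, improved, level="word"):
    """Return (tag, original_piece, improved_piece) tuples for one granularity.

    Hypothetical helper for illustration; `level` is "char", "word", or "line".
    """
    if level == "char":
        a, b, joiner = list(original), list(improved), ""
    elif level == "word":
        # Assumption: words are whitespace-separated tokens.
        a, b, joiner = original.split(), improved.split(), " "
    elif level == "line":
        a, b, joiner = original.splitlines(), improved.splitlines(), "\n"
    else:
        raise ValueError(f"unknown level: {level!r}")

    # autojunk=False keeps frequent tokens from being skipped as "junk".
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, joiner.join(a[i1:i2]), joiner.join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


if __name__ == "__main__":
    original = "Tbe quick brown fox"
    improved = "The quick brown fox"
    for level in ("char", "word", "line"):
        changes = [op for op in diff_tokens(original, improved, level) if op[0] != "equal"]
        print(level, changes)
```

Character-level diffs pinpoint single-glyph OCR errors, while word-level and line-level diffs are easier to scan on longer pages; which one is most useful depends on how heavily the text was rewritten.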

💡 Tip: For live OCR processing with vision-language models, check out OCR Time Machine. OCR Time Capsule focuses on exploring already-processed datasets for faster navigation and analysis.


OCR Quality Metrics

Similarity
Characters
Added
Removed
Words
Markdown Detected
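
As a rough sketch of how metrics like those named above could be derived from a pair of texts, the following uses `difflib.SequenceMatcher` for similarity and added/removed character counts, plus a simple regular expression for Markdown detection. The function `ocr_metrics`, the dictionary keys, and the Markdown heuristic are all assumptions for illustration; the tool's actual definitions may differ.

```python
import difflib
import re

# Assumed heuristic: headings, list markers, blockquotes, bold text, or links.
MARKDOWN_PATTERN = re.compile(
    r"^(#{1,6} |[-*+] |\d+\. |> )|\*\*[^*]+\*\*|\[[^\]]+\]\([^)]+\)",
    re.MULTILINE,
)


def ocr_metrics(original, improved):
    """Compute illustrative comparison metrics for two versions of a text."""
    matcher = difflib.SequenceMatcher(None, original, improved, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # characters present only in the original
        if tag in ("replace", "insert"):
            added += j2 - j1  # characters present only in the improved text
    return {
        "similarity": round(matcher.ratio(), 3),
        "chars_original": len(original),
        "chars_improved": len(improved),
        "added": added,
        "removed": removed,
        "words_original": len(original.split()),
        "words_improved": len(improved.split()),
        "markdown_detected": bool(MARKDOWN_PATTERN.search(improved)),
    }


if __name__ == "__main__":
    print(ocr_metrics("Tbe quick brown fox", "# The quick brown fox"))
```

Note that `SequenceMatcher.ratio()` measures sequence similarity rather than a standard OCR error rate, so it is best read as a relative indicator when comparing pages.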
