December 24, 2024
Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts
title: Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts
publish date:
2024-12-20
authors:
Muhammad Abdullah Sohail et al.
paper id:
2412.16119v1
abstract:
This study investigates the potential of Large Language Models (LLMs), particularly GPT-4o, for Optical Character Recognition (OCR) in low-resource scripts such as Urdu, Albanian, and Tajik, with English serving as a benchmark. Using a meticulously curated dataset of 2,520 images incorporating controlled variations in text length, font size, background color, and blur, the research simulates diverse real-world challenges. Results emphasize the limitations of zero-shot LLM-based OCR, particularly for linguistically complex scripts, highlighting the need for annotated datasets and fine-tuned models. This work underscores the urgency of addressing accessibility gaps in text digitization, paving the way for inclusive and robust OCR solutions for underserved languages.
Edited and compiled by: wanghaisheng. Last updated: December 24, 2024