Towards Prompt Engineering and Large Language Models for Post-OCR correction in handwritten texts
Abstract
This work explores the use of Large Language Models (LLMs) for post-OCR spelling correction of full sentences in Portuguese, French, and English. Using the outputs of a state-of-the-art handwriting recognition model on the BRESSAY, RIMES, and IAM datasets, we evaluate two different zero-shot prompts. Closed LLMs, such as the Gemini and GPT series, consistently outperform open-source models in reducing Character Error Rate (CER) and Word Error Rate (WER), while also offering faster inference. Although open models reach good accuracy, their high computational demands hinder their practical use. Code is available at https://github.com/savi8sant8s/zero-shot-spelling-corrector.
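For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how CER and WER are commonly computed: the Levenshtein edit distance between the reference and the hypothesis, divided by the reference length in characters or words. The function names and the example sentence are illustrative assumptions and are not taken from the linked repository, which may normalize text or aggregate scores differently.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits / reference length."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)


# Hypothetical example: a recognizer output with two character errors.
reference = "the quick brown fox"
recognized = "the qu1ck brown f0x"
print(round(cer(reference, recognized), 3))  # 0.105 (2 edits / 19 characters)
print(wer(reference, recognized))            # 0.5   (2 wrong words / 4 words)
```

A post-OCR corrector is useful when the corrected sentence scores lower on both metrics than the raw recognizer output; a perfect correction of the example above would bring both to 0.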
