Month: February 2026

  • Benchmarking Visual Language models for OCR

    This is the full write-up on our experiment with Gábor Madarász. Digitizing books is a longstanding problem in the NLP field. It is an even harder problem in low-resource language such as Hungarian. We have tested in a short experiment how far the latest generation of Qwen models (3rd generation) has come. We will first…