LAVA-Workshop


Accepted Papers

(Summary paper) Daichi Sato, Duc Minh Vo, Khan Md. Anwarus Salam, Hidenori Shoji, Yuma Matsuoka, Takara Taniguchi, Kaito Baba, Hideki Nakayama: LAVA Challenge 2025: Advancing Japanese Document Understanding with a Challenging PDF-VQA Benchmark (Challenge Summary Paper)
(1st Prize, SYSUpporter team) Longfeng Chen, Zheng Xiao, Juyuan Wang, Zeyu Huang, Yawen Zeng, Jin Xu: HEAR: A Holistic Extraction and Agentic Reasoning Framework for Document Understanding
(2nd Prize, Woof team) Haoxuan Li, Wei Song, Aofan Liu, Peiwu Qin: AdaDocVQA: Adaptive Framework for Long Document Visual Question Answering in Low-Resource Settings
(3rd Prize, nsbsk team) Ao Zhou, Zebo Gu, Tenghao Sun, Jiawen Chen, Mingsheng Tu, Zifeng Cheng, Yafeng Yin, Zhiwei Jiang, Qing Gu: Hierarchical Vision-Language Reasoning for Multimodal Multiple-Choice Question Answering
(4th Prize, char team) Mizuki Yamano, Keito Fukuoka, Hisashi Miyamori: Two-Stage Approach Using a Pretrained Language Model for Question Answering on Japanese Document Images

Call for Challenge Participants

Challenge Overview: The primary goal of this challenge is to advance the capability of Large Vision-Language Models to accurately interpret and understand complex visual data such as Data Flow Diagrams (DFDs), Class Diagrams, Gantt Charts, and Building Design Drawings. We invite AI researchers, data scientists, and practitioners with interest and experience in natural language processing, computer vision, and multimodal learning to join this grand challenge. Participants can register as individuals or as teams. They are required to develop a model that can answer questions about the input data. This year, the task focuses on Japanese government-related and business documents. Each document is provided in PDF format and is associated with a 10-choice question. We show an example below.

Datasets and Submission guideline:

Please visit our Kaggle page for detailed information on the dataset and submission format.
To be eligible for the final evaluation, participants must submit a report along with their solution by the deadline below. Submission of a paper is optional and is intended for publication in the proceedings. Guidelines for report and paper submissions will be provided at a later date.
We kindly request that the winning teams attend the event in person, as required by the ACMMM organizers.

Please note that all data are provided for research purposes only under the CC-BY-NC license (no commercial use). We appreciate your understanding and cooperation.

Prizes: We will cover the ACMMM 2025 registration fees for the top 3 winning teams.
Computational Resources: Participants from the University of Tokyo may use SoftBank Beyond AI SANDBOX GPUs.

Important Dates

Registration opened: 2025/3/15
Data released: 2025/4/17 (originally 2025/4/15)
Registration closed: 2025/5/31
Private data released: 2025/6/23 (to be updated). We decided to use the test data for both the public and private leaderboards.
Results, Report, Paper submission deadline: 2025/6/30
Code and short report submission: 2025/7/1
Paper submission (on invitation): 2025/7/30
Notification: 2025/7/24
Camera-ready submission: 2025/8/26
Grand challenge date: 2025/10/27 - 31