myFoodQA: A Multimodal Dataset for Evaluating Cultural and Visual Reasoning in Myanmar Gastronomy
| dc.contributor.author | Shin Thant Phyo | |
| dc.contributor.author | Pyae Linn | |
| dc.contributor.author | Lynn Myat Bhone | |
| dc.contributor.author | Thet Hmue Khin | |
| dc.contributor.author | Eaint Kay Khaing Kyaw | |
| dc.contributor.author | Ye Kyaw Thu | |
| dc.date.accessioned | 2026-05-08T19:26:05Z | |
| dc.date.issued | 2025-11-12 | |
| dc.description.abstract | This paper introduces myFoodQA (Myanmar (Burmese) Food Question Answering), the first multimodal benchmark designed to evaluate AI models on Myanmar's rich gastronomic culture. A core contribution of this work is the thorough construction of the benchmark itself, which involved curating a diverse set of food images for 20 distinct dishes and, crucially, generating a novel corpus of 2,485 question-answer pairs. The benchmark features tasks for single-image, multi-image, and text-only reasoning, specifically designed to evaluate models on ingredient recognition, cultural context, preparation methods, and comparative logic. To ensure authenticity, data was collected through personal photography and web crawling, with all annotations, prompts, and questions validated by native Burmese speakers. Leading vision-language models were evaluated in a zero-shot setting, revealing a large performance disparity. While models perform well on text-based tasks, their performance drops significantly on image-based reasoning, which requires both specific image understanding and extensive cultural knowledge. These findings reveal the limitations of current large language models (LLMs) and vision-language models (VLMs) in the Myanmar gastronomic domain. Consequently, this work establishes myFoodQA as a foundational resource for advancing multimodal AI in culturally relevant and low-resource settings. | |
| dc.identifier.doi | 10.1109/isai-nlp66160.2025.11320506 | |
| dc.identifier.uri | https://dspace.kmitl.ac.th/handle/123456789/20409 | |
| dc.subject | Nutritional Studies and Diet | |
| dc.subject | Culinary Culture and Tourism | |
| dc.subject | Multimodal Machine Learning Applications | |
| dc.title | myFoodQA: A Multimodal Dataset for Evaluating Cultural and Visual Reasoning in Myanmar Gastronomy | |
| dc.type | Article | |