CV
⚒️ A condensed curriculum vitae is coming soon
⭐️ The detailed curriculum vitae is here
📄 Publications
International Conference
- Shintaro Ozaki, Kazuki Hayashi, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe. “BQA: Body Language Question Answering Dataset for Video Large Language Models” Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL-2025, Main, short), 2025/07. [paper|arXiv]
- Yusuke Ide, Yuto Nishida, Miyu Oba, Yusuke Sakai, Justin Vasselli, Hidetaka Kamigaito, Taro Watanabe. “How to Make the Most of LLMs’ Grammatical Knowledge for Acceptability Judgments” Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-2025, Main, long), 2025/05. [paper|arXiv]
- Akari Haga, Akiyo Fukatsu, Miyu Oba, Arianna Bisazza, Yohei Oseki. “BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency” The BabyLM Challenge at the 28th Conference on Computational Natural Language Learning, 2024/11. [Outstanding paper award] [paper|arXiv]
- Miyu Oba, Yohei Oseki, Akiyo Fukatsu, Akari Haga, Hiroki Ouchi, Taro Watanabe, Saku Sugawara. “Can Language Models Induce Grammatical Knowledge from Indirect Evidence?” Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-2024, Main, long), 2024/11. [paper|arXiv]
- Akari Haga, Saku Sugawara, Akiyo Fukatsu, Miyu Oba, Hiroki Ouchi, Taro Watanabe, Yohei Oseki. “Modeling Overregularization in Children with Small Language Models.” Findings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL-2024, Findings, long), 2024/08. [paper|arXiv]
- Miyu Oba, Akari Haga, Akiyo Fukatsu, Yohei Oseki. “BabyLM Challenge: Curriculum learning based on sentence complexity approximating language acquisition” The BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, 2023/12. [paper|arXiv]
- Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe. “Second Language Acquisition of Neural Language Models.” Findings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL-2023, Findings, long), 2023/07. [paper|arXiv]
Journal
- Miyu Oba, Yohei Oseki, Akiyo Fukatsu, Akari Haga, Hiroki Ouchi, Taro Watanabe, Saku Sugawara. Inducing Grammatical Knowledge from Indirect Evidence in Language Models. Journal of Natural Language Processing, Volume 33, Issue 1, 2026/03. (to appear) [paper]
- Yusuke Ide, Yuto Nishida, Justin Vasselli, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe. Rethinking the Evaluation Methods of LLMs’ Grammatical Knowledge. Journal of Natural Language Processing, Volume 33, Issue 1, 2026/03. (to appear) [paper]
- Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe. Second Language Acquisition of Neural Language Models. Journal of Natural Language Processing (JNLP), Volume 31, Issue 2, 2024/06. [Paper award] [paper]
Domestic Conference
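- 帖佐 宗浩, Yuto Nishida, Miyu Oba, Taro Watanabe. Word Segmentation in the Early Stages of Neural Language Model Training (ニューラル言語モデルの学習初期における単語の分節化). The 265th IPSJ NLP Conference, 6 pages, 2025/09. [paper]
- Akari Haga, Akiyo Fukatsu, Miyu Oba, Arianna Bisazza, Yohei Oseki. The Effect of Variation Sets in Language Model Pretraining (言語モデルの事前学習におけるバリエーションセットの効果). The 31st Annual Meeting of the Association for Natural Language Processing: NLP 2025, 4 pages, 2025/03. [paper]
- Shintaro Ozaki, Kazuki Hayashi, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe. Do Multimodal Large Language Models Understand Nonverbal Communication? (マルチモーダル大規模言語モデルは非言語コミュニケーションを理解しているか?). The 19th Symposium of Young Researcher Association for NLP Studies: YANS 2024, 2024/09. [Encouragement Award (奨励賞) to the first author]
- Yusuke Ide, Yuto Nishida, Miyu Oba, Yusuke Sakai, Justin Vasselli, Taro Watanabe, Hidetaka Kamigaito. Investigating Acceptability Judgment Methods Suited to Large Language Models (大規模言語モデルに適した容認性判断手法の検討). The 260th IPSJ NLP Conference, 8 pages, 2024/06. [Young Researcher Award (若手奨励賞) to the first author] [paper]
- Miyu Oba, Yohei Oseki, Akiyo Fukatsu, Akari Haga, Hiroki Ouchi, Taro Watanabe, Saku Sugawara. An Analysis of Indirect Positive Evidence in Evaluating the Grammatical Knowledge of Language Models (言語モデルの文法知識評価における間接肯定証拠の分析). The 30th Annual Meeting of the Association for Natural Language Processing: NLP 2024, 4 pages, 2024/03. [paper]
- Akari Haga, Saku Sugawara, Akiyo Fukatsu, Miyu Oba, Hiroki Ouchi, Taro Watanabe, Yohei Oseki. Modeling Children's Overgeneralization with Small Language Models (小規模言語モデルによる子供の過剰一般化のモデリング). The 30th Annual Meeting of the Association for Natural Language Processing: NLP 2024, 4 pages, 2024/03. [paper]
- Miyu Oba, Akari Haga, Akiyo Fukatsu, Yohei Oseki. Curriculum Learning Based on Sentence Complexity That Mimics the Language Acquisition Process (言語獲得過程を模倣した文の複雑さに基づくカリキュラム学習). The 18th Symposium of Young Researcher Association for NLP Studies: YANS 2023, 2023/08.
- Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe. Second Language Acquisition of Language Models (言語モデルの第二言語獲得). The 29th Annual Meeting of the Association for Natural Language Processing: NLP 2023, 4 pages, 2023/03. [Young Researcher Award (若手奨励賞)] [paper]
- Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe. Efficiency of Second Language Acquisition in Language Models (言語モデルの第二言語獲得効率). The 254th IPSJ NLP Conference, 6 pages, 2022/11. [IPSJ Yamashita SIG Research Award (山下記念研究賞), Excellent Research Award (優秀研究賞)] [paper]
- Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe. Efficiency of Second Language Acquisition in Language Models (言語モデルの第二言語獲得効率). The 17th Symposium of Young Researcher Association for NLP Studies: YANS 2022, 2022/08. [Encouragement Award (奨励賞)]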
🎓 Education
- 2024/04-present: Ph.D. in Engineering
  - Division of Information Science, Nara Institute of Science and Technology
  - Supervisor: Taro Watanabe
  - Natural Language Processing, Computational Linguistics
  - Computer Science
- 2023/04-2024/03: Master of Engineering
  - Division of Information Science, Nara Institute of Science and Technology
  - Supervisor: Taro Watanabe
  - Natural Language Processing, Computational Linguistics
  - Computer Science
- 2018/04-2022/03: Bachelor of Foreign Studies
  - Faculty of Foreign Studies, Department of French Studies, Nanzan University, Japan
  - Supervisor: Ryoji Mogi
  - Linguistics
  - Cultural Studies
- 2015/04-2018/03: High School Diploma, Meiwa High School
💼 Experiences
- 2025/09-present: Machine Learning Engineer (Full-time Internship)
  - VLA Team at Turing Inc.
  - Mentor: Hiroki Teranishi
  - Developing vision-language-action models for autonomous driving.
- 2024/10-2025/03: Guest Researcher
  - Human-Centered Data Science group at University of Göttingen, Germany
  - Supervisor: Lisa Beinborn
  - Cross-lingual cognitive processing in multilingual and bilingual language models.
- 2023/10-2024/03: Teaching Assistant
  - Nara Institute of Science and Technology
  - Mentored high school students in research projects using large language models and NLP techniques.
- 2023/08-2023/10: Research Assistant
  - Digitized scanned industry yearbooks into structured tabular data using NLP-based processing and LLMs.
- 2023/04-present: Research Assistant
  - National Institute of Informatics
  - Advisor: Saku Sugawara
  - Investigating language acquisition in language models from the perspective of linguistics and cognitive science.
- 2022/08-2023/01: NLP R&D Engineer
  - Trustworthy AI team at LINE Corporation
  - Ethics and trustworthiness in NLP.
  - Developed evaluation methods for fairness in language models and a stop-word detection system.
- 2020/06-2022/03: Data Scientist
  - ROX Inc.
  - Demand forecasting, data analysis, and application development across logistics, retail, and tourism domains.
  - Built forecasting software for customer demand, including a feature to predict customer volumes and export calendar-based forecasts as PDFs.
💰 Grants
- 2025/04-present: Research Fellowship for Young Scientists by Japan Society for the Promotion of Science (PhD Fellowship; DC2)
- 2024/11: Scholarship for Study Abroad by the Association for Natural Language Processing
- 2024/04-2025/03: NAIST Granite Program (PhD Fellowship; JST SPRING)
- 2022/04-2024/03: JASSO Scholarship: Full repayment exemption due to outstanding achievements
- 2023/07: ACL SRW Travel Grants
🗣️ Invited Talks
- 2025/09: The Option of a Research Stay During a PhD Program. The 20th Symposium of Young Researcher Association for NLP Studies: YANS 2025 [poster]
- 2025/03: Second Language Acquisition of Language Models. The 31st Annual Meeting of the Association for Natural Language Processing: NLP 2025
- 2023/03: Second Language Acquisition of Language Models. Workshop at the 29th Annual Meeting of the Association for Natural Language Processing: NLP 2023. [slides]
🔍 Reviewer
- ACL Rolling Review: 2024 (Emergency), 2025
- EMNLP 2025 BabyLM Workshop
🏆 Awards
- Paper Award
  - The Association for Natural Language Processing (2025)
- Outstanding Paper Award
  - The BabyLM Challenge at the 28th Conference on Computational Natural Language Learning: CoNLL 2024
- Young Researcher Award
  - The 29th Annual Meeting of the Association for Natural Language Processing: NLP 2023
- IPSJ Yamashita SIG Research Award
  - The 254th IPSJ NLP Conference
- Excellent Research Award
  - The 254th IPSJ NLP Conference
- Encouragement Award
  - The 17th Symposium of Young Researcher Association for NLP Studies: YANS 2022
- NTT Resonant Award, Studio Arcana Award
  - JPHACKS 2020
💪 Skills
- Languages: Japanese (native), English (research and professional use)
- Programming Languages: Python, R
- Frameworks / Tools: PyTorch, Hugging Face Transformers, ONNX
Misc.
- 2021/12: Fundamental Information Technology Engineer Examination (FE; 基本情報技術者試験)
- 2020/12: JPHACKS 2020 [Finalists, NTT Resonant Award & Studio Arcana Award]
- 2020/11: Call for Code 2020 [Regional Finalists]
- 2020/09: TOEIC 855
- 2020/06: Build@Mercari (Software Engineer Training Program)
