Daichi Azuma

Tokyo, Japan

daichi.azuma@weblab.t.u-tokyo.ac.jp

I am a Ph.D. student at the Matsuo-Iwasawa Laboratory, The University of Tokyo.

My research focuses on Embodied AI, at the intersection of 3D Computer Vision and Natural Language Processing. I aim to develop intelligent agents that can understand, navigate, and interact with the physical world through language and visual perception.

I am particularly interested in how multimodal learning and 3D scene understanding can enable such agents to reason and act effectively in complex environments.

news

Jun 29, 2025 Two papers have been accepted to ICCV2025!
  • GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields
  • CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information
For more details, see the publications page.
Oct 08, 2024 I will enroll in the Matsuo Lab at the University of Tokyo as a working doctoral student from April 2025.
Oct 07, 2024 We will be presenting at IROS2024, held in Abu Dhabi, UAE, on October 14-18, 2024.
Aug 08, 2024 We will be presenting at the Annual Conference of the Robotics Society of Japan (RSJ2024), held in Osaka on September 6, 2024.
  • 3D1-04: Realizing Zero-shot Robot Question Answering Using Foundation Models and a Map Module

publications

  1. GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields
    Shunsuke Yasuki, Taiki Miyanishi, Nakamasa Inoue, Shuhei Kurita, Koya Sakamoto, Daichi Azuma, Jungdae Lee, Masato Taki, and Yutaka Matsuo
    In IEEE/CVF International Conference on Computer Vision (ICCV), 2025
  2. CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information
    Jungdae Lee, Taiki Miyanishi, Shuhei Kurita, Koya Sakamoto, Daichi Azuma, Yutaka Matsuo, and Nakamasa Inoue
    In IEEE/CVF International Conference on Computer Vision (ICCV), 2025
  3. Answerability Fields: Answerable Location Estimation via Diffusion Models
    Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Koya Sakamoto, and Motoaki Kawanabe
    In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
  4. Map-based Modular Approach for Zero-shot Embodied Question Answering
    Koya Sakamoto, Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, and Motoaki Kawanabe
    In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
  5. Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans
    Taiki Miyanishi, Daichi Azuma, Shuhei Kurita, and Motoaki Kawanabe
    In The 10th International Conference on 3D Vision (3DV), 2024
  6. ScanQA: 3D Question Answering for Spatial Scene Understanding
    Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, and Motoaki Kawanabe
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

achievements

International Conference

  • Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Koya Sakamoto and Motoaki Kawanabe, “Answerability Fields: Answerable Location Estimation via Diffusion Models”, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2024), 2024.
  • Koya Sakamoto, Daichi Azuma, Taiki Miyanishi, Shuhei Kurita and Motoaki Kawanabe, “Map-based Modular Approach for Zero-shot Embodied Question Answering”, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2024), 2024.
  • Taiki Miyanishi, Daichi Azuma, Shuhei Kurita and Motoaki Kawanabe, “Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans”, International Conference on 3D Vision 2024 (3DV2024), 2024.
  • Daichi Azuma*, Taiki Miyanishi*, Shuhei Kurita* and Motoaki Kawanabe, “ScanQA: 3D Question Answering for Spatial Scene Understanding”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2022), pages 19129-19139, New Orleans, 2022. *: Equally contributed.

Local Conference

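  • 3D City Visual Programming with Geographic Information, The 39th Annual Conference of the Japanese Society for Artificial Intelligence, Osaka, June 2025. Shunsuke Yasuki, Taiki Miyanishi, Nakamasa Inoue, Shuhei Kurita, Koya Sakamoto, Daichi Azuma, Jungdae Lee, Masato Taki, Yutaka Matsuo
  • Realizing Zero-shot Robot Question Answering Using Foundation Models and a Map Module, The 42nd Annual Conference of the Robotics Society of Japan (RSJ2024), Osaka, September 2024. Koya Sakamoto, Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe
  • Predicting Answerable Locations Using Diffusion Models for Real-World Question Answering, The 27th Meeting on Image Recognition and Understanding (MIRU2024), Kumamoto, August 2024. Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Koya Sakamoto, Motoaki Kawanabe
  • Cross-Dataset 3D Language Grounding Using Different RGB-D Scans, The 37th Annual Conference of the Japanese Society for Artificial Intelligence, Kumamoto, June 2023. Taiki Miyanishi, Daichi Azuma, Shuhei Kurita, Motoaki Kawanabe
  • 3D Question Answering Toward Semantic Understanding of Indoor Environments, The 25th Meeting on Image Recognition and Understanding (MIRU2022), Hyogo, July 2022. Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe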

Invited Talks

  • ScanQA: 3D Question Answering for Spatial Scene Understanding. MIRU2022. Daichi Azuma, Taiki Miyanishi, Shuhei Kurita and Motoaki Kawanabe

Academic Services

  • IROS2024 Reviewer
  • ACL ARR Reviewer
  • ICCV Reviewer