Abstract
Advances in large multimodal models (LMMs) have given rise to autonomous bots that perform complex tasks on their own using human-like reasoning. The ability of large models to understand spatial relations and perform spatial operations, however, is known to be limited. This gap hinders the development of autonomous GIS analysts, travel planning assistants, and other possible spatial bots. In this paper, we explore the impact of modality on the performance of LMMs in spatial planning tasks, specifically retrieving a target brick by first removing all other bricks on top of it. Experiments demonstrate that what matters is not only the modality of the prompts (text or image), but also how informative the spatial descriptions are for the LMMs to complete the task. We propose the novel concepts of task-implicit and task-explicit spatial descriptions to qualitatively characterize the task-specific informativity of prompts. Furthermore, we develop simple techniques to increase the spatial task-explicity of image prompts, and the accuracy of spatial planning increases from 26% to 100% accordingly.
| Original language | English |
|---|---|
| Title | GeoAI 2024 - Proceedings of the 7th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery |
| Editors | Song Gao, Gengchen Mai, Shawn Newsam, Lexie Yang, Dalton Lunga, Di Zhu, Bruno Martins, Samantha Arundel |
| Pages | 99-105 |
| Number of pages | 7 |
| ISBN (electronic) | 9798400711763 |
| DOIs | |
| Publication status | Published - 18 Nov 2024 |
| Event | 7th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery (GeoAI'24) - Atlanta, United States. Duration: 29 Oct 2024 → … https://geoai.ornl.gov/acmsigspatial-geoai/ |
Conference
| Conference | 7th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery (GeoAI'24) |
|---|---|
| Short title | GeoAI'24 |
| Country/Territory | United States |
| City | Atlanta |
| Period | 29 Oct 2024 → … |
| Internet address | |
ÖFOS 2012
- 507003 Geoinformatics
- 102001 Artificial Intelligence
- 102035 Data Science
Activities
- 1 Talk
- Task Explicity Matters in Prompting Large Multimodal Models for Spatial Planning Tasks
  Majic, I. (Speaker)
  29 Oct 2024
  Activity: Talks › Talk › Science to Science