Operationalizing Geographic Diversity for the Evaluation of AI-Generated Content

Publications: Contribution to journal, Article, Peer Reviewed

Abstract

The introduction and widespread use of foundation models have made it increasingly urgent to identify geographic bias in AI-generated content. In response, we operationalize geographic diversity as a countermeasure. We refine the notion of geographic diversity as the quality of including data from various places and maintaining a balance across these places in both learning and generation processes. Drawing from information theory, ecology, and prior work in AI evaluation, we provide an entropy-based definition of geographic diversity and propose to measure geographic diversity as effective numbers of places. We apply our measurement to content generated by six large language models: GPT-3.5, GPT-4o, Mistral 7B, Mistral Large, Claude 3 Haiku, and Claude 3.5 Sonnet. Our case study reveals that prompt variations, such as modifying concept mentions or scale mentions in a user prompt, can result in more geographic diversity in the generated content. In addition, we observe that less advanced models can generate more geographically diverse content than state-of-the-art ones. Furthermore, certain places dominate the generated content of these models, yet their prominence in that content does not reflect their real-world prominence. Our work stresses the importance of quantifying geographic information in AI-generated content to support GeoAI and the broader AI evaluation in the age of foundation models.
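As a rough illustration of the entropy-based measure described in the abstract, the sketch below computes an effective number of places from a list of place mentions. It assumes Shannon entropy with the natural logarithm and takes the effective number to be exp(H), the usual convention in ecology; the abstract does not spell out the exact formula, so this choice, the function name `effective_number_of_places`, and the sample place lists are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def effective_number_of_places(places):
    """Return exp(Shannon entropy) of the place-mention distribution.

    Assumption: this is the order-1 effective number commonly used in
    ecology; the paper may define its measure differently.
    """
    counts = Counter(places)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no place mentions, so no diversity to report
    # Shannon entropy over the relative frequency of each place.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)


# Hypothetical place mentions extracted from generated text.
balanced = ["Vienna", "Nairobi", "Lima", "Osaka"]
skewed = ["Paris"] * 97 + ["Nairobi", "Lima", "Osaka"]

print(effective_number_of_places(balanced))
print(effective_number_of_places(skewed))
```

Both hypothetical lists mention four distinct places, but the balanced list scores close to four effective places while the skewed list scores only slightly above one, which is the sense in which the measure captures balance as well as variety.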
Original language: English
Article number: e70057
Journal: Transactions in GIS
Volume: 29
Issue number: 3
Publication status: Published - May 2025

Austrian Fields of Science 2012

  • 507003 Geoinformatics

Keywords

  • GeoAI
  • geographic diversity
  • generative AI
  • geographic bias
  • data diversity
  • information theory
