Abstract
The increasing availability of granular and big data on various objects of interest has made it necessary to develop methods for condensing this information into a representative and intelligible map. Financial regulation is a field that exemplifies this need, as regulators require diverse and often highly granular data from financial institutions to monitor and assess their activities. However, processing and analyzing such data can be a daunting task, especially given the challenges of dealing with missing values and identifying clusters based on specific features.
To address these challenges, we propose a variant of Lloyd’s algorithm that applies to probability distributions and uses generalized Wasserstein barycenters to construct a metric space which represents given data on various objects in condensed form. By applying our method to the financial regulation context, we demonstrate its usefulness in dealing with the specific challenges faced by regulators in this domain. We believe that our approach can also be applied more generally to other fields where large and complex data sets need to be represented in concise form.
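
As a rough illustration of the idea of running Lloyd's algorithm on probability distributions, the sketch below clusters one-dimensional empirical distributions. It is not the method proposed in the paper: it assumes the one-dimensional case, where the 2-Wasserstein barycenter is obtained by averaging quantile functions, and it does not implement generalized Wasserstein barycenters or any handling of missing values. The function names `quantile_embedding` and `wasserstein_lloyd` and all parameters are hypothetical.

```python
import numpy as np


def quantile_embedding(samples, n_quantiles=50):
    """Represent a 1D empirical distribution by its quantile function on a fixed grid."""
    grid = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return np.quantile(np.asarray(samples, dtype=float), grid)


def assign(Q, centers):
    """Assign each distribution to the nearest center.

    In 1D, the squared 2-Wasserstein distance is the integrated squared
    difference of quantile functions, approximated here by a mean over the grid.
    """
    d2 = ((Q[:, None, :] - centers[None, :, :]) ** 2).mean(axis=2)
    return d2.argmin(axis=1)


def wasserstein_lloyd(distributions, k, n_iter=20, n_quantiles=50, seed=0):
    """Lloyd-style clustering of 1D sample sets (illustrative assumption only)."""
    rng = np.random.default_rng(seed)
    Q = np.stack([quantile_embedding(s, n_quantiles) for s in distributions])
    # Initialise the centers with k distinct input distributions.
    centers = Q[rng.choice(len(Q), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(Q, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = Q[labels == j]
            if len(members) > 0:
                # 1D W2 barycenter with equal weights: average the quantile functions.
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(Q, centers), centers


# Usage: six sample sets drawn from two well-separated normal distributions.
rng = np.random.default_rng(1)
data = [rng.normal(0.0, 1.0, 500) for _ in range(3)]
data += [rng.normal(10.0, 1.0, 500) for _ in range(3)]
labels, centers = wasserstein_lloyd(data, k=2)
```

Each cluster center is itself a distribution (stored as a quantile vector), which is what lets the centers serve as a condensed representation of the input objects in this simplified setting.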
Original language | English |
---|---|
Title | AISTATS |
DOIs | |
Publication status | Submitted - 2023 |
Publication series
Series | Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:11650-11670, 2023 |
---|---|
ISSN | 2640-3498 |
ÖFOS 2012
- 102019 Machine Learning