Remote sensing image (RSI) interpretation is constrained by severe class imbalance and a shortage of high-quality labeled data, both of which impede the development of robust models for downstream tasks. Labeling RSIs typically requires domain-specific expertise and substantial manual effort, which makes large-scale annotation slow and costly. An important research objective is therefore to exploit existing labeled Earth observation (EO) datasets more effectively by uncovering latent relationships among samples to improve data efficiency.
🌍 EarthSynth and EarthSynth-180K
🛰️ Counterfactual Composition (CF-Comp) and R-Filter
EarthSynth-180K is derived from the OEM, LoveDA, DeepGlobe, SAMRS, and LAE-1M datasets and is constructed using the Random Cropping and Category Augmentation strategies. Each sample is further paired with mask and text prompt conditions, making the dataset suitable for training a diffusion-based generative foundation model.
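The two construction strategies named above can be pictured with a small NumPy sketch. This is only an illustration under stated assumptions, not the released data pipeline: it assumes Random Cropping takes the same window from an image and its mask, and that Category Augmentation splits a multi-class mask into one binary mask per class with a matching text prompt. The `CLASS_NAMES` table, the prompt template, and both function names are hypothetical.

```python
import numpy as np

# Hypothetical class-id -> name table; real label sets come from the source datasets.
CLASS_NAMES = {1: "building", 2: "road"}


def random_crop(image, mask, size, rng):
    """Cut the same random size x size window out of an image and its mask."""
    height, width = mask.shape
    top = int(rng.integers(0, height - size + 1))
    left = int(rng.integers(0, width - size + 1))
    window = (slice(top, top + size), slice(left, left + size))
    return image[window], mask[window]


def category_augment(image, mask):
    """Return one (image, binary mask, text prompt) triple per known class present."""
    samples = []
    for class_id in np.unique(mask):
        name = CLASS_NAMES.get(int(class_id))
        if name is None:  # skip background or unknown ids
            continue
        binary = (mask == class_id).astype(np.uint8)
        prompt = f"A satellite image of {name}"  # assumed prompt template
        samples.append((image, binary, prompt))
    return samples


# Toy data: a random RGB tile whose mask is fully covered by two classes.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
mask = np.zeros((64, 64), dtype=np.uint8)
mask[:32, :] = 1
mask[32:, :] = 2

crop_img, crop_mask = random_crop(image, mask, 32, rng)
samples = category_augment(crop_img, crop_mask)
```

Each triple in `samples` carries an image, a single-category mask, and a text prompt, which matches the mask and text conditions described above.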
EarthSynth is trained with the CF-Comp strategy on a mixed data distribution that combines real samples with logically composed but unrealistic ones. It learns pixel-level properties of remote sensing imagery along multiple dimensions and builds a unified process for conditional diffusion training and synthesis.
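To give a rough sense of what mixing real and composed samples could look like, the sketch below pastes the masked region of one sample onto another and merges their masks and prompts. This is an assumption made for illustration, in the spirit of copy-paste augmentation, and should not be read as the authors' CF-Comp implementation. The names `compose` and `mix_batch`, the `p_compose` parameter, and the prompt-joining rule are all hypothetical.

```python
import numpy as np


def compose(target, source):
    """Paste the labeled region of `source` onto `target` (copy-paste style)."""
    t_img, t_mask, t_text = target
    s_img, s_mask, s_text = source
    region = s_mask.astype(bool)  # pixels that carry a source label
    image = np.where(region[..., None], s_img, t_img)
    mask = np.where(region, s_mask, t_mask)
    text = f"{t_text} and {s_text}"  # assumed way of merging prompts
    return image, mask, text


def mix_batch(batch, p_compose, rng):
    """Keep each sample as-is, or with probability p_compose compose it with another."""
    mixed = []
    for i, sample in enumerate(batch):
        if len(batch) > 1 and rng.random() < p_compose:
            j = int(rng.integers(0, len(batch) - 1))
            j = j + 1 if j >= i else j  # choose a partner other than i
            mixed.append(compose(sample, batch[j]))
        else:
            mixed.append(sample)
    return mixed


# Two toy 4x4 samples: (image, class mask, text prompt).
road_mask = np.zeros((4, 4), dtype=np.uint8)
road_mask[:, :2] = 1
building_mask = np.zeros((4, 4), dtype=np.uint8)
building_mask[:2, :2] = 2
road = (np.full((4, 4, 3), 10, dtype=np.uint8), road_mask, "road")
building = (np.full((4, 4, 3), 200, dtype=np.uint8), building_mask, "building")

composed_image, composed_mask, composed_text = compose(road, building)
real_only = mix_batch([road, building], p_compose=0.0, rng=np.random.default_rng(0))
all_composed = mix_batch([road, building], p_compose=1.0, rng=np.random.default_rng(0))
```

In this sketch `p_compose` sets the share of composed samples in a batch, so `0.0` reproduces the real data distribution and larger values shift it toward composed scenes.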
@misc{pan2025earthsynthgeneratinginformativeearth,
title={EarthSynth: Generating Informative Earth Observation with Diffusion Models},
author={Jiancheng Pan and Shiye Lei and Yuqian Fu and Jiahao Li and Yanxing Liu and Yuze Sun and Xiao He and Long Peng and Xiaomeng Huang and Bo Zhao},
year={2025},
eprint={2505.12108},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2505.12108},
}