EarthSynth: Generating Informative Earth Observation with Diffusion Models
Remote sensing image (RSI) interpretation is fundamentally constrained by severe class imbalance and the limited availability of high-quality labeled data, which significantly impedes the development of robust models for downstream tasks. Moreover, labeling RSIs typically requires domain-specific expertise and substantial manual effort, making large-scale annotation time-consuming and costly. An important research objective is therefore to exploit existing labeled Earth observation (EO) datasets effectively by uncovering latent relationships among samples to improve data efficiency.
- EarthSynth and EarthSynth-180K
- 🛰️ Counterfactual Composition (CF-Comp) and R-Filter
EarthSynth-180K is derived from the OEM, LoveDA, DeepGlobe, SAMRS, and LAE-1M datasets and is further enhanced with mask and text-prompt conditions, making it suitable for training a foundation diffusion-based generative model. The dataset is constructed using Random Cropping and Category Augmentation strategies.
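A minimal sketch of what random cropping with category-aware text prompting could look like for aligned image/mask pairs (the helper names and prompt template here are illustrative assumptions, not the actual EarthSynth-180K implementation):

```python
import numpy as np

def random_crop(image, mask, size, rng):
    """Crop an aligned (image, mask) pair at a random location."""
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return (image[top:top + size, left:left + size],
            mask[top:top + size, left:left + size])

def category_text_prompt(mask, class_names):
    """Build a text prompt from the category labels present in a mask."""
    present = sorted(int(c) for c in np.unique(mask) if int(c) in class_names)
    return "a remote sensing image of " + ", ".join(class_names[c] for c in present)

rng = np.random.default_rng(0)
image = rng.integers(0, 255, (512, 512, 3), dtype=np.uint8)
mask = np.zeros((512, 512), dtype=np.uint8)
mask[100:300, 100:300] = 1  # hypothetical class id 1, e.g. "building"

crop_img, crop_mask = random_crop(image, mask, 256, rng)
prompt = category_text_prompt(crop_mask, {0: "background", 1: "building"})
```

Pairing each crop with a prompt derived from its own mask is one simple way to produce the mask-plus-text conditions described above.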
EarthSynth is trained with the CF-Comp strategy on a mixed distribution of real and logically "unrealistic" (counterfactually composed) data; it learns pixel-level remote sensing properties along multiple dimensions and unifies conditional diffusion training and synthesis in a single pipeline.
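One way to picture counterfactual composition is transplanting labeled object pixels from one scene into another, producing samples that may be semantically implausible yet keep a pixel-accurate mask. The following is a hedged sketch under that assumption; function and class names are hypothetical and do not reproduce the paper's exact CF-Comp procedure:

```python
import numpy as np

def counterfactual_compose(scene_img, scene_mask, obj_img, obj_mask, top, left):
    """Paste the labeled pixels of one sample into another scene.

    The composite can be logically unrealistic (e.g. a ship on farmland),
    but image and mask stay aligned, which is what a mixed
    real/counterfactual training distribution relies on.
    """
    out_img, out_mask = scene_img.copy(), scene_mask.copy()
    h, w = obj_mask.shape
    region = obj_mask > 0  # transplant only the labeled object pixels
    out_img[top:top + h, left:left + w][region] = obj_img[region]
    out_mask[top:top + h, left:left + w][region] = obj_mask[region]
    return out_img, out_mask

scene_img = np.zeros((64, 64, 3), dtype=np.uint8)   # empty background scene
scene_mask = np.zeros((64, 64), dtype=np.uint8)
obj_img = np.full((16, 16, 3), 200, dtype=np.uint8)  # object patch
obj_mask = np.full((16, 16), 3, dtype=np.uint8)      # hypothetical class id 3

img, mask = counterfactual_compose(scene_img, scene_mask, obj_img, obj_mask, 10, 10)
```

A filtering step such as R-Filter could then score these composites and keep only the informative ones for training.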
This project references and builds upon several open-source projects, including Diffusers, ControlNet, MM-Grounding-DINO, CLIP, and GSNet, and uses data from the OpenEarthMap, LoveDA, DeepGlobe, SAMRS, and LAE-1M datasets. We sincerely thank the authors and maintainers of these resources for supporting this work.
```bibtex
@article{pan2025earthsynth,
  title={EarthSynth: Generating Informative Earth Observation with Diffusion Models},
  author={Pan, Jiancheng and Lei, Shiye and Fu, Yuqian and Li, Jiahao and Liu, Yanxing and Sun, Yuze and He, Xiao and Peng, Long and Huang, Xiaomeng and Zhao, Bo},
  journal={arXiv preprint arXiv:2505.12108},
  year={2025}
}
```