EarthSynth: Generating Informative Earth Observation with Diffusion Models

ArXiv 2025

Jiancheng Pan*¹,³, Shiye Lei*², Yuqian Fu†³, Jiahao Li¹, Yanxing Liu⁴, Xiao He⁵, Yuze Sun¹, Long Peng⁶, Xiaomeng Huang†¹, Bo Zhao†⁷

Motivations

Remote sensing image (RSI) interpretation is fundamentally constrained by severe class imbalance and the limited availability of high-quality labeled data, both of which significantly impede the development of robust models for downstream tasks. Labeling RSIs typically requires domain-specific expertise and substantial manual effort, so large-scale annotation is time-consuming and costly. An important research objective is therefore to exploit existing labeled Earth observation (EO) datasets more effectively by uncovering latent relationships among samples, thereby improving data efficiency.

Contributions

🌍 EarthSynth and EarthSynth-180K

We propose EarthSynth, a diffusion-based generative foundation model trained on EarthSynth-180K, a dataset of 180K samples in which each image is aligned with a semantic mask and a text description. EarthSynth provides a unified solution for multi-task generation.

🛰️ Counterfactual Composition (CF-Comp) and R-Filter

EarthSynth employs the CF-Comp strategy to balance layout controllability and category diversity during training, enabling fine-grained layout control for RSI generation. It further incorporates the R-Filter to extract the more informative, higher-quality synthesized samples.

EarthSynth-180K Dataset

EarthSynth-180K is derived from the OEM, LoveDA, DeepGlobe, SAMRS, and LAE-1M datasets. It is further enhanced with mask and text prompt conditions, making it suitable for training diffusion-based generative foundation models. The dataset is constructed using the Random Cropping and Category Augmentation strategies.
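The two construction strategies can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that Random Cropping takes the same window from an image and its mask, and that Category Augmentation derives one single-category mask per class present in a sample. The function names `random_crop` and `split_by_category` are hypothetical, and images are plain nested lists to keep the example dependency-free.

```python
import random


def random_crop(image, mask, size, rng):
    """Cut the same size x size window out of an image and its mask."""
    height, width = len(mask), len(mask[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image dimensions")
    top = rng.randint(0, height - size)   # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    window = slice(left, left + size)
    image_crop = [row[window] for row in image[top:top + size]]
    mask_crop = [row[window] for row in mask[top:top + size]]
    return image_crop, mask_crop


def split_by_category(mask, background=0):
    """Return {category_id: binary mask} for each non-background category."""
    categories = {value for row in mask for value in row} - {background}
    return {
        category: [[1 if value == category else 0 for value in row] for row in mask]
        for category in sorted(categories)
    }
```

Sharing one window between image and mask keeps the pixel-level alignment that mask-conditioned training depends on; a real pipeline would more likely operate on array or tensor types.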

EarthSynth Model

EarthSynth is trained with the CF-Comp strategy on a mixed data distribution that combines real samples with logically unrealistic ones. It learns pixel-level properties of remote sensing imagery across multiple dimensions and builds a unified pipeline for conditional diffusion training and synthesis.
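One way to picture how a logically unrealistic sample might arise is sketched below. This is an assumption rather than the published method: it supposes that CF-Comp overlays masked regions taken from different samples onto a single canvas, so that categories that rarely co-occur can share one layout. The function `compose` and its overwrite rule are hypothetical.

```python
def compose(samples, height, width, background=0):
    """Overlay the masked regions of several (image, mask) pairs on one canvas.

    Where masks overlap, later samples overwrite earlier ones.
    """
    image = [[background] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    for src_image, src_mask in samples:
        for y in range(height):
            for x in range(width):
                label = src_mask[y][x]
                if label != 0:  # copy only pixels that belong to an object
                    image[y][x] = src_image[y][x]
                    mask[y][x] = label
    return image, mask


# Two toy 2x2 samples, each holding one labeled pixel.
sample_a = ([[10, 10], [10, 10]], [[1, 0], [0, 0]])
sample_b = ([[20, 20], [20, 20]], [[0, 0], [0, 2]])
composite_image, composite_mask = compose([sample_a, sample_b], 2, 2)
```

The composite mask carries both category labels, so it can act as a layout condition even though no real scene contains this exact arrangement.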

Citation

Please consider citing us if you find our dataset or model useful.

@misc{pan2025earthsynthgeneratinginformativeearth,
  title={EarthSynth: Generating Informative Earth Observation with Diffusion Models},
  author={Jiancheng Pan and Shiye Lei and Yuqian Fu and Jiahao Li and Yanxing Liu and Yuze Sun and Xiao He and Long Peng and Xiaomeng Huang and Bo Zhao},
  year={2025},
  eprint={2505.12108},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2505.12108},
}