Object detection, particularly open-vocabulary object detection, plays a crucial role in the Earth sciences, supporting tasks such as environmental monitoring, natural disaster assessment, and land-use planning. However, existing open-vocabulary detectors, trained primarily on natural-world images, struggle to generalize to remote sensing images due to a significant domain gap. This paper therefore aims to advance open-vocabulary object detection for the remote sensing community.
🌍 LAE-1M Dataset powered by LAE-Label Engine
🛰️ LAE-DINO Open-Vocabulary Detector
In addition to the visual examples shown in the benchmark figure, we provide more information here. All the target datasets can be found in our GitHub repo.
LAE-COD dataset examples: raw data labelled by the LAE-Label engine, before rule-based filtering.
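The rule-based filtering step mentioned above can be sketched as a simple box sanity check. This is an illustrative sketch only: the function name and all thresholds (`min_rel_area`, `max_rel_area`) are assumptions, not the engine's actual rules.

```python
def passes_rules(box, image_w, image_h, min_rel_area=1e-4, max_rel_area=0.8):
    """Reject obviously bad pseudo-label boxes: degenerate coordinates,
    out-of-image extents, or implausibly small/large boxes relative to
    the image. Thresholds are illustrative, not the engine's settings."""
    x1, y1, x2, y2 = box
    # Degenerate box: zero or negative width/height.
    if x2 <= x1 or y2 <= y1:
        return False
    # Box must lie fully inside the image.
    if x1 < 0 or y1 < 0 or x2 > image_w or y2 > image_h:
        return False
    # Relative area must fall in a plausible range.
    rel_area = ((x2 - x1) * (y2 - y1)) / (image_w * image_h)
    return min_rel_area <= rel_area <= max_rel_area
```

A real pipeline would combine several such predicates (e.g. on label confidence or category names) and keep only annotations passing all of them.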
We propose a novel LAE-DINO detector for LAE, featuring two new modules: Dynamic Vocabulary Construction (DVC) and Visual-Guided Text Prompt Learning (VisGT). The overall framework of our LAE-DINO:
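To make the DVC idea concrete, the sketch below builds a per-batch vocabulary from the positive categories present in the batch, padded with negatives sampled from the full label space. The function name, signature, and the `vocab_size` default are hypothetical illustrations, not the paper's exact algorithm or settings.

```python
import random

def build_dynamic_vocabulary(batch_categories, all_categories,
                             vocab_size=60, seed=None):
    """Sketch of dynamic vocabulary construction (DVC): keep the
    positive categories appearing in the current batch and pad with
    randomly sampled negative categories up to vocab_size. All names
    and defaults here are illustrative assumptions."""
    rng = random.Random(seed)
    # De-duplicate positives while preserving order.
    positives = list(dict.fromkeys(batch_categories))
    # Negatives are categories from the full label space not in the batch.
    negatives_pool = [c for c in all_categories if c not in positives]
    n_neg = max(0, vocab_size - len(positives))
    negatives = rng.sample(negatives_pool, min(n_neg, len(negatives_pool)))
    return positives + negatives
```

Keeping the per-batch vocabulary small avoids encoding the full (very large) remote sensing label space in every forward pass.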
@misc{pan2024locateearthadvancingopenvocabulary,
  title={Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community},
  author={Jiancheng Pan and Yanxing Liu and Yuqian Fu and Muyuan Ma and Jiaohao Li and Danda Pani Paudel and Luc Van Gool and Xiaomeng Huang},
  year={2024},
  eprint={2408.09110},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2408.09110},
}