SwinTransformer + OBBDet

The sixth-place solution (6/220) in the Fine-grained Object Recognition in High-Resolution Optical Images track of the 2021 Gaofen Challenge on Automated High-Resolution Earth Observation Image Interpretation.

Members

Qi Ming, Junjie Song, Yunpeng Dong.

Solution

Off-line data augmentation
We use random combinations of affine transformation, flipping, scaling, and optical distortion for data augmentation.
Multi-scale training and testing
Images are resized to 600, 800, and 1024 for both training and testing.
Strong backbone
Swin Transformer is adopted as the backbone of ORCNN and RoI Transformer for better performance.
Model ensemble
We merge the results from RoI Transformer, ORCNN, S2ANet, and ReDet.
Lower confidence
Set the output score threshold to 0.005.
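
The ensemble and low-threshold steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the write-up does not say how the results were merged, so pooling followed by per-class greedy NMS is an assumption, as are the names `merge_detections` and `aabb_iou` and the `iou_thr=0.1` default. Only the 0.005 score threshold comes from the text. The axis-aligned IoU helper is a stand-in; a real pipeline for oriented boxes would pass a rotated-box IoU instead.

```python
from collections import defaultdict

SCORE_THR = 0.005  # output score threshold stated in the write-up


def aabb_iou(a, b):
    """Axis-aligned IoU for (x1, y1, x2, y2) boxes; a stand-in for rotated IoU."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def merge_detections(per_model_dets, iou_fn, iou_thr=0.1, score_thr=SCORE_THR):
    """Pool detections from several models and de-duplicate them per class.

    per_model_dets: one list per model of (class_name, score, box) tuples.
    iou_fn: callable(box_a, box_b) -> float; the box format is up to iou_fn.
    iou_thr: hypothetical NMS threshold, not taken from the write-up.
    """
    # Pool all models' detections by class, dropping very low scores.
    by_class = defaultdict(list)
    for dets in per_model_dets:
        for cls, score, box in dets:
            if score >= score_thr:
                by_class[cls].append((score, box))

    merged = []
    for cls, dets in by_class.items():
        # Greedy NMS: visit boxes from highest to lowest score and keep a box
        # only if it does not overlap an already kept box of the same class.
        dets.sort(key=lambda d: d[0], reverse=True)
        kept = []
        for score, box in dets:
            if all(iou_fn(box, kept_box) <= iou_thr for _, kept_box in kept):
                kept.append((score, box))
        merged.extend((cls, s, b) for s, b in kept)
    return merged
```

Called as `merge_detections([dets_model_a, dets_model_b], aabb_iou)`, the function returns a single list of `(class_name, score, box)` tuples. Keeping the threshold as low as 0.005 lets many low-scoring boxes through, which tends to help mAP because the metric integrates over the whole precision-recall curve.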

Tried but didn't work

Soft-NMS.
Adjusting the NMS threshold.
Class-agnostic NMS.
Mosaic and MixUp for data augmentation.
Oversampling the categories with fewer instances.
Training separate detectors for specific classes with low AP.
Multi-scale training and testing on SwinTransformer-based detectors (mAP even dropped by about 1%).
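
For readers unfamiliar with the first item in this list, Gaussian Soft-NMS decays the scores of overlapping boxes instead of discarding them. The sketch below is a generic illustration rather than the code used in this solution: the function name `soft_nms`, the `sigma` and `min_score` defaults, and the pluggable `iou_fn` argument are all assumptions.

```python
import math


def soft_nms(dets, iou_fn, sigma=0.5, min_score=0.001):
    """Gaussian Soft-NMS over a list of (score, box) pairs.

    iou_fn: callable(box_a, box_b) -> float (e.g. a rotated-box IoU).
    sigma, min_score: hypothetical defaults, not taken from the write-up.
    """
    remaining = list(dets)
    kept = []
    while remaining:
        # Take the highest-scoring remaining box.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][0])
        best_score, best_box = remaining.pop(best_idx)
        kept.append((best_score, best_box))

        # Decay the scores of the others according to their overlap with it.
        rescored = []
        for score, box in remaining:
            iou = iou_fn(best_box, box)
            new_score = score * math.exp(-(iou * iou) / sigma)
            if new_score >= min_score:
                rescored.append((new_score, box))
        remaining = rescored
    return kept
```

A box that does not overlap the selected one keeps its score unchanged, while a heavily overlapping box is pushed down the ranking rather than removed outright.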


Owner

ming71
