NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB (INT8) / 1.8 MB (FP16), and runs at 97 FPS on a cellphone🔥

Overview

NanoDet-Plus

Super fast and high accuracy lightweight anchor-free object detection model. Real-time on mobile devices.

  • Super lightweight: Model file is only 980KB(INT8) or 1.8MB(FP16).
  • Super fast: 97fps(10.23ms) on mobile ARM CPU.
  • 👍 High accuracy: Up to 34.3 mAPval@0.5:0.95 and still realtime on CPU.
  • 🤗 Training friendly: Much lower GPU memory cost than other models. Batch-size=80 is available on GTX1060 6G.
  • 😎 Easy to deploy: Supports various backends including ncnn, MNN and OpenVINO. Also provides an Android demo based on the ncnn inference framework.

Introduction

NanoDet is an FCOS-style one-stage anchor-free object detection model that uses Generalized Focal Loss as its classification and regression loss.

In NanoDet-Plus, we propose a novel label assignment strategy with a simple assign guidance module (AGM) and a dynamic soft label assigner (DSLA) to solve the optimal label assignment problem in lightweight model training. We also introduce a light feature pyramid called Ghost-PAN to enhance multi-layer feature fusion. These improvements boost previous NanoDet's detection accuracy by 7 mAP on COCO dataset.

NanoDet-Plus introduction on Zhihu (in Chinese)

NanoDet introduction on Zhihu (in Chinese)

QQ discussion group: 908606542 (answer: 炼丹)


Benchmarks

Model Resolution mAPval(0.5:0.95) CPU Latency(i7-8700) ARM Latency(4xA76) FLOPS Params Model Size
NanoDet-m 320*320 20.6 4.98ms 10.23ms 0.72G 0.95M 1.8MB(FP16) | 980KB(INT8)
NanoDet-Plus-m 320*320 27.0 5.25ms 11.97ms 0.9G 1.17M 2.3MB(FP16) | 1.2MB(INT8)
NanoDet-Plus-m 416*416 30.4 8.32ms 19.77ms 1.52G 1.17M 2.3MB(FP16) | 1.2MB(INT8)
NanoDet-Plus-m-1.5x 320*320 29.9 7.21ms 15.90ms 1.75G 2.44M 4.7MB(FP16) | 2.3MB(INT8)
NanoDet-Plus-m-1.5x 416*416 34.1 11.50ms 25.49ms 2.97G 2.44M 4.7MB(FP16) | 2.3MB(INT8)
YOLOv3-Tiny 416*416 16.6 - 37.6ms 5.62G 8.86M 33.7MB
YOLOv4-Tiny 416*416 21.7 - 32.81ms 6.96G 6.06M 23.0MB
YOLOX-Nano 416*416 25.8 - 23.08ms 1.08G 0.91M 1.8MB(FP16)
YOLOv5-n 640*640 28.4 - 44.39ms 4.5G 1.9M 3.8MB(FP16)
FBNetV5 320*640 30.4 - - 1.8G - -
MobileDet 320*320 25.6 - - 0.9G - -

Download pre-trained models and find more models in Model Zoo or in Release Files

Notes
  • ARM Performance is measured on Kirin 980(4xA76+4xA55) ARM CPU based on ncnn. You can test latency on your phone with ncnn_android_benchmark.

  • Intel CPU performance is measured on an Intel Core i7-8700 using OpenVINO.

  • NanoDet mAP(0.5:0.95) is validated on the COCO val2017 dataset with no test-time augmentation.

  • YOLOv3 and YOLOv4 mAP values are taken from Scaled-YOLOv4: Scaling Cross Stage Partial Network.
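
The latency and FPS figures quoted in this README are two views of the same measurement: FPS is 1000 divided by the per-image latency in milliseconds. A minimal sketch of that arithmetic, using the NanoDet-m ARM latency from the table (the function name `latency_ms_to_fps` is illustrative and not part of NanoDet):

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# NanoDet-m at 320*320 on the 4xA76 ARM CPU: 10.23 ms per image.
fps = latency_ms_to_fps(10.23)
print(int(fps))  # 97, the FPS figure quoted in the feature list
```

The same function can be applied to any other latency column entry to compare models in FPS terms.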


NEWS!!!

  • [2021.12.25] NanoDet-Plus release! Adds AGM (Assign Guidance Module) and DSLA (Dynamic Soft Label Assigner), improving accuracy by 7 mAP at only a small cost.

Find more update notes in Update notes.

Demo

Android demo

Android demo project is in demo_android_ncnn folder. Please refer to Android demo guide.

Here is a better implementation 👉 ncnn-android-nanodet

NCNN C++ demo

C++ demo based on ncnn is in demo_ncnn folder. Please refer to Cpp demo guide.

MNN demo

Inference using Alibaba's MNN framework is in demo_mnn folder. Please refer to MNN demo guide.

OpenVINO demo

Inference using OpenVINO is in demo_openvino folder. Please refer to OpenVINO demo guide.

Web browser demo

https://nihui.github.io/ncnn-webassembly-nanodet/

Pytorch demo

First, install requirements and setup NanoDet following installation guide. Then download COCO pretrain weight from here

👉 COCO pretrain checkpoint

The pre-trained weight was trained with the config config/nanodet-plus-m_416.yml.

  • Inference images
python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH
  • Inference video
python demo/demo.py video --config CONFIG_PATH --model MODEL_PATH --path VIDEO_PATH
  • Inference webcam
python demo/demo.py webcam --config CONFIG_PATH --model MODEL_PATH --camid YOUR_CAMERA_ID
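
The three invocations above differ only in the mode and in whether a path or a camera id is passed. A small sketch of assembling them programmatically, assuming it is run from the repository root (the helper name `build_demo_command` and the weight and image file names are hypothetical; only the flags shown above are used):

```python
import sys
from pathlib import Path


def build_demo_command(mode, config, model, source):
    """Return the argv list for demo/demo.py in the given mode."""
    if mode not in ("image", "video", "webcam"):
        raise ValueError(f"unknown mode: {mode}")
    # webcam mode takes a camera id; image and video modes take a path
    source_flag = "--camid" if mode == "webcam" else "--path"
    return [
        sys.executable, "demo/demo.py", mode,
        "--config", str(config),
        "--model", str(model),
        source_flag, str(source),
    ]


if __name__ == "__main__":
    # Hypothetical file names; replace with your own weight and image.
    cmd = build_demo_command(
        "image",
        Path("config/nanodet-plus-m_416.yml"),
        Path("nanodet-plus-m_416.pth"),
        Path("demo.jpg"),
    )
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch the demo from a script.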

We also provide a notebook here to demonstrate how to make it work with PyTorch.


Install

Requirements

  • Linux or MacOS
  • CUDA >= 10.0
  • Python >= 3.6
  • Pytorch >= 1.7
  • Experimental support for Windows (note: distributed training on Windows is not supported before PyTorch 1.7)

Steps

  1. Create a conda virtual environment and then activate it.
conda create -n nanodet python=3.8 -y
conda activate nanodet
  2. Install PyTorch
conda install pytorch torchvision cudatoolkit=11.1 -c pytorch -c conda-forge
  3. Install requirements
pip install Cython termcolor numpy tensorboard pycocotools matplotlib pyaml opencv-python tqdm pytorch-lightning torchmetrics
  4. Set up NanoDet
git clone https://github.com/RangiLyu/nanodet.git
cd nanodet
python setup.py develop
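
After installation, it can be useful to confirm that the environment meets the minimum versions listed under Requirements. The following is a minimal sketch, not part of NanoDet; the helper names `parse_version` and `meets_minimum` are illustrative, and the comparison only looks at the first three numeric components of a version string:

```python
import re
import sys


def parse_version(text):
    """Turn '1.7.1+cu110' into (1, 7, 1), ignoring any local suffix."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    python_version = "{}.{}".format(*sys.version_info[:2])
    print("Python >= 3.6:", meets_minimum(python_version, "3.6"))
    try:
        import torch  # only available after the PyTorch install step
        print("PyTorch >= 1.7:", meets_minimum(torch.__version__, "1.7"))
        print("CUDA available:", torch.cuda.is_available())
    except ImportError:
        print("PyTorch is not installed yet")
```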

Model Zoo

NanoDet supports a variety of backbones. Go to the config folder to see the sample training config files.

Model Backbone Resolution COCO mAP FLOPS Params Pre-train weight
NanoDet-m ShuffleNetV2 1.0x 320*320 20.6 0.72G 0.95M Download
NanoDet-Plus-m-320 (NEW) ShuffleNetV2 1.0x 320*320 27.0 0.9G 1.17M Weight | Checkpoint
NanoDet-Plus-m-416 (NEW) ShuffleNetV2 1.0x 416*416 30.4 1.52G 1.17M Weight | Checkpoint
NanoDet-Plus-m-1.5x-320 (NEW) ShuffleNetV2 1.5x 320*320 29.9 1.75G 2.44M Weight | Checkpoint
NanoDet-Plus-m-1.5x-416 (NEW) ShuffleNetV2 1.5x 416*416 34.1 2.97G 2.44M Weight | Checkpoint

Notice: A Weight file contains only the parameters needed at inference time, while a Checkpoint also contains the training-time parameters.

Legacy Model Zoo

Model Backbone Resolution COCO mAP FLOPS Params Pre-train weight
NanoDet-m-416 ShuffleNetV2 1.0x 416*416 23.5 1.2G 0.95M Download
NanoDet-m-1.5x ShuffleNetV2 1.5x 320*320 23.5 1.44G 2.08M Download
NanoDet-m-1.5x-416 ShuffleNetV2 1.5x 416*416 26.8 2.42G 2.08M Download
NanoDet-m-0.5x ShuffleNetV2 0.5x 320*320 13.5 0.3G 0.28M Download
NanoDet-t ShuffleNetV2 1.0x 320*320 21.7 0.96G 1.36M Download
NanoDet-g Custom CSP Net 416*416 22.9 4.2G 3.81M Download
NanoDet-EfficientLite EfficientNet-Lite0 320*320 24.7 1.72G 3.11M Download
NanoDet-EfficientLite EfficientNet-Lite1 416*416 30.3 4.06G 4.01M Download
NanoDet-EfficientLite EfficientNet-Lite2 512*512 32.6 7.12G 4.71M Download
NanoDet-RepVGG RepVGG-A0 416*416 27.8 11.3G 6.75M Download

How to Train

  1. Prepare dataset

    If your dataset annotations are in Pascal VOC XML format, refer to config/nanodet_custom_xml_dataset.yml

    Otherwise, convert your dataset annotations to MS COCO format (COCO annotation format details).

  2. Prepare config file

    Copy and modify an example yml config file in config/ folder.

    Change save_dir to where you want to save the model.

    Change num_classes in model->arch->head.

    Change image path and annotation path in both data->train and data->val.

    Set gpu ids, num workers and batch size in device to fit your device.

    Set total_epochs, lr and lr_schedule according to your dataset and batchsize.

    If you want to modify network, data augmentation or other things, please refer to Config File Detail

  3. Start training

    NanoDet now uses PyTorch Lightning for training.

    For both single-GPU and multi-GPU training, run:

    python tools/train.py CONFIG_FILE_PATH
  4. Visualize Logs

    TensorBoard logs are saved in the save_dir that you set in the config file.

    To visualize tensorboard logs, run:

    cd <YOUR_SAVE_DIR>
    tensorboard --logdir ./
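
The config changes described in step 2 can also be applied programmatically. The sketch below works on a plain nested dict whose keys mirror the layout of the yml config files; the function name `customize_config`, the default values and the example paths are hypothetical, and loading or saving the YAML file is left out:

```python
import copy


def customize_config(cfg, save_dir, class_names, train_data, val_data,
                     gpu_ids=(0,), batch_size=16, total_epochs=100):
    """Return a copy of a NanoDet-style config dict adapted to a new dataset."""
    cfg = copy.deepcopy(cfg)
    cfg["save_dir"] = save_dir
    cfg["class_names"] = list(class_names)

    arch = cfg["model"]["arch"]
    arch["head"]["num_classes"] = len(class_names)
    # Assumption: configs with a training-only aux_head need the same count.
    if "aux_head" in arch:
        arch["aux_head"]["num_classes"] = len(class_names)

    splits = (("train", train_data), ("val", val_data))
    for split, (img_path, ann_path) in splits:
        cfg["data"][split]["img_path"] = img_path
        cfg["data"][split]["ann_path"] = ann_path

    cfg["device"]["gpu_ids"] = list(gpu_ids)
    cfg["device"]["batchsize_per_gpu"] = batch_size
    cfg["schedule"]["total_epochs"] = total_epochs
    return cfg


# A stripped-down stand-in for a real config file.
base = {
    "save_dir": "workspace/nanodet-plus-m_416",
    "model": {"arch": {"head": {"num_classes": 80},
                       "aux_head": {"num_classes": 80}}},
    "data": {"train": {"img_path": "", "ann_path": ""},
             "val": {"img_path": "", "ann_path": ""}},
    "device": {"gpu_ids": [0], "workers_per_gpu": 6, "batchsize_per_gpu": 96},
    "schedule": {"total_epochs": 300},
    "class_names": [],
}
custom = customize_config(
    base, "workspace/person", ["person"],
    train_data=("data/train/images", "data/train/annotations.json"),
    val_data=("data/val/images", "data/val/annotations.json"),
)
print(custom["model"]["arch"]["head"]["num_classes"])  # 1
```

In practice the dict would typically come from `yaml.safe_load` and be written back with `yaml.safe_dump`. Learning-rate settings are deliberately left untouched here because they depend on the dataset and batch size.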

How to Deploy

NanoDet provides multi-backend C++ demos for ncnn, OpenVINO and MNN. There is also an Android demo based on the ncnn library.

Export model to ONNX

To convert a NanoDet PyTorch model to ncnn, you can go through ONNX: pytorch->onnx->ncnn

To export onnx model, run tools/export_onnx.py.

python tools/export_onnx.py --cfg_path ${CONFIG_PATH} --model_path ${PYTORCH_MODEL_PATH}

Run NanoDet in C++ with inference libraries

ncnn

Please refer to demo_ncnn.

OpenVINO

Please refer to demo_openvino.

MNN

Please refer to demo_mnn.

Run NanoDet on Android

Please refer to android_demo.


Citation

If you find this project useful in your research, please consider citing:

@misc{nanodet,
    title={NanoDet-Plus: Super fast and high accuracy lightweight anchor-free object detection model.},
    author={RangiLyu},
    howpublished = {\url{https://github.com/RangiLyu/nanodet}},
    year={2021}
}

Thanks

https://github.com/Tencent/ncnn

https://github.com/open-mmlab/mmdetection

https://github.com/implus/GFocal

https://github.com/cmdbug/YOLOv5_NCNN

https://github.com/rbgirshick/yacs

Comments
  • Error when testing starts after 10 training epochs: list object has no attribute cpu

    File "nanodet-main/nanodet/trainer/trainer.py", line 89, in run_epoch
        results[meta['img_info']['id'].cpu().numpy()[0]] = dets
    AttributeError: 'list' object has no attribute 'cpu'

    opened by DL-Practise 16
  • Training nanodet from scratch

    Hi, I'm training NanoDet-m model (ShuffleNetV2 1.0x | 320*320) from scratch with Coco dataset and 4 GeForce RTX 2080 Ti. Convergence seems pretty slow, it could take 1-2 weeks.

    May I ask how long it took you to reach 20.6 mAP, and which setup you used?

    Thank you.

    bug help wanted 
    opened by Cloudz333 10
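  • Questions about deploying the project

    Hello, I have two questions:

    1. In the function NanoDet::detect(cv::Mat image, float score_threshold, float nms_threshold) in nanodet.cpp, the input is fed to the model with ex.input("input.1", input);. What does input.1 mean here? Is it the name of the input layer? How can I find this name from PyTorch? print(model) does not show layer names, and the examples under Tencent/ncnn/tree/master/examples mostly use ex.input("input", input);. If I load a model that I trained myself, how should I match this name?
    2. In nanodet.h there is a std::vector heads_info. What exactly do the values in it mean? Are they related to the network outputs?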
        std::vector<HeadInfo> heads_info{
            // cls_pred|dis_pred|stride
                {"792", "795",    8},
                {"814", "817",   16},
                {"836", "839",   32},
        };
    

    I am not very familiar with PyTorch or the NanoDet network, so please bear with me.

    opened by busyyang 8
  • A small problem when running demo.py

    My environment: cuda==10.1 pytorch==1.7 torchvision==0.8.0. When I run "python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH" to run inference on an image, I get this error: RuntimeError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. 'torchvision::nms' is only available for these backends: [CPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, Tracer, Autocast, Batched, VmapMode].

    CPU: registered at /root/project/torchvision/csrc/vision.cpp:59 [kernel]
    BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
    Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
    AutogradOther: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback]
    AutogradCPU: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback]
    AutogradCUDA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback]
    AutogradXLA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback]
    Tracer: fallthrough registered at /pytorch/torch/csrc/jit/frontend/tracer.cpp:967 [backend fallback]
    Autocast: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:254 [backend fallback]
    Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:511 [backend fallback]
    VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]

    However, after I changed the function batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False) in /nanodet/nanodet/model/module/nms.py as follows:

    boxes_for_nms = boxes_for_nms.cpu()
    scores = scores.cpu()
    boxes = boxes.cpu()
    split_thr = nms_cfg_.pop('split_thr', 10000)
    if len(boxes_for_nms) < split_thr:
        # dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
        keep = nms(boxes_for_nms, scores, **nms_cfg_)
        boxes = boxes[keep]
        # scores = dets[:, -1]
        scores = scores[keep]
    

    demo.py runs normally.

    opened by lidongliang666 8
  • Results got worse after adding mosaic augmentation. What is the cause?

    coco.py

    if self.load_mosaic and not isval:
        img4, labels4, bbox4 = load_mosaic(self, idx)
        meta['img_info']['height'] = img4.shape[0]
        meta['img_info']['width'] = img4.shape[1]
        meta['img'] = img4
        meta['gt_labels'] = labels4
        meta['gt_bboxes'] = bbox4

    meta = self.pipeline(self, meta, input_size)

    meta["img"] = torch.from_numpy(meta["img"].transpose(2, 0, 1))
    return meta
    

    The bboxes drawn in a test inside ShapeTransform look correct:

    meta_data["img"] = img
    meta_data["warp_matrix"] = M
    if "gt_bboxes" in meta_data:
        boxes = meta_data["gt_bboxes"]
        meta_data["gt_bboxes"] = warp_boxes(boxes, M, dst_shape[0], dst_shape[1])
    if "gt_masks" in meta_data:
        for i, mask in enumerate(meta_data["gt_masks"]):
            meta_data["gt_masks"][i] = cv2.warpPerspective(
                mask, M, dsize=tuple(dst_shape)
            )
    for i in range(meta_data["gt_bboxes"].shape[0]):
        cv2.rectangle(img, (int(meta_data["gt_bboxes"][i][0]), int(meta_data["gt_bboxes"][i][1])), (int(meta_data["gt_bboxes"][i][2]), int(meta_data["gt_bboxes"][i][3])), (255,0,0), 2)
    cv2.imwrite('./%d.jpg' % int(meta_data["gt_bboxes"][0][0]), img)
    

    What could be causing this?

    opened by Rokuki 6
  • Cannot find blob with name: dis_pred_stride_8

    I converted the pre-trained model and tested it with demo_ncnn and demo_openvino. Conversion succeeded in both cases, but inference fails. How can I fix this?

    # demo_ncnn
    find_blob_index_by_name input.1 failed
    Try
    find_blob_index_by_name dis_pred_stride_8 failed
    Try
    find_blob_index_by_name cls_pred_stride_8 failed
    
    # demo_openvino
    start init model
    success
    terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
    what(): Cannot find blob with name: dis_pred_stride_8
    

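    I found that the ONNX model contains nodes such as dis_pred_stride_8, but these nodes are missing from the converted ncnn model. ONNX network structure: onnx. ncnn network structure: ncnn.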

    opened by TTMRonald 6
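
One way to check which blob names a converted model actually exposes is to read the text .param file that sits next to the .bin file. The sketch below assumes the plain-text ncnn layout: a magic number line (7767517), a line with the layer and blob counts, then one line per layer of the form "type name input_count output_count input_blobs output_blobs params". The helper names and the toy param text are made up for illustration:

```python
def list_ncnn_layers(param_text):
    """Return (layer_type, layer_name, inputs, outputs) for each layer line."""
    lines = [ln.split() for ln in param_text.splitlines() if ln.strip()]
    if not lines or lines[0] != ["7767517"]:
        raise ValueError("not a text ncnn .param file")
    layers = []
    for fields in lines[2:]:  # the second line holds layer and blob counts
        layer_type, layer_name = fields[0], fields[1]
        n_in, n_out = int(fields[2]), int(fields[3])
        inputs = fields[4:4 + n_in]
        outputs = fields[4 + n_in:4 + n_in + n_out]
        layers.append((layer_type, layer_name, inputs, outputs))
    return layers


def has_blob(param_text, blob_name):
    """True if any layer reads from or writes to the named blob."""
    for _, _, inputs, outputs in list_ncnn_layers(param_text):
        if blob_name in inputs or blob_name in outputs:
            return True
    return False


# Toy example, not a real NanoDet export.
example = """7767517
3 3
Input input.1 0 1 input.1
Convolution conv_0 1 1 input.1 conv_out 0=24 1=3
Sigmoid sig_0 1 1 conv_out 792
"""
print(has_blob(example, "input.1"))            # True
print(has_blob(example, "dis_pred_stride_8"))  # False
```

For a real model, pass the contents of the .param file, for example `open("model.param").read()`, and compare the reported names with the ones the C++ demo asks for.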
  • Cannot find blob with name: 795

    I converted the NanoDet-EfficientLite 512x512 model with OpenVINO 2021.3.394. Conversion works and the model loads successfully in the program, but inference fails with the following log:

    start init model
    success
    terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
    what(): Cannot find blob with name: 795

    Has anyone else run into this?

    opened by deep-practice 6
  • CoreML export failure: 'ConvModule' object has no attribute 'norm'

    Hi, I tried to convert nanodet-m.pth to CoreML for iOS. I followed the coremltools guide and got the error "CoreML export failure: 'ConvModule' object has no attribute 'norm'". Reading the NanoDet source code, I found that the norm in the head is BN, which should be supported by CoreML, so I do not know why the error happens. Has anyone tried CoreML? Thanks!

    opened by ghoshaw 6
  • No result while using single-class nano model in ncnn

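    Hi, I trained a NanoDet model with a single person class, converted it to ONNX with tool/export.py, and then converted that to an ncnn model, but the ncnn model produces no output. I changed the classes and the image size in the cpp code. I am not sure whether the error comes from the ONNX export or from the onnx->ncnn conversion. Below is the cfg I used for training.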

    #Config File example
    save_dir: workspace/nanodet_m
    model:
      arch:
        name: GFL
        backbone:
          name: ShuffleNetV2
          model_size: 1.0x
          out_stages: [2,3,4]
          activation: LeakyReLU
        fpn:
          name: PAN
          in_channels: [116, 232, 464]
          out_channels: 96
          start_level: 0
          num_outs: 3
        head:
          name: NanoDetHead
          num_classes: 1
          input_channel: 96
          feat_channels: 96
          stacked_convs: 2
          share_cls_reg: True
          octave_base_scale: 5
          scales_per_octave: 1
          strides: [8, 16, 32]
          reg_max: 7
          norm_cfg:
            type: BN
          loss:
            loss_qfl:
              name: QualityFocalLoss
              use_sigmoid: True
              beta: 2.0
              loss_weight: 1.0
            loss_dfl:
              name: DistributionFocalLoss
              loss_weight: 0.25
            loss_bbox:
              name: GIoULoss
              loss_weight: 2.0
    data:
      train:
        name: coco
        img_path: ../data/yoga_coco/images/train2017
        ann_path: ../data/yoga_coco/annotations/instances_train2017.json
        input_size: [416,416] #[w,h]
        keep_ratio: True
        pipeline:
          perspective: 0.0
          scale: [0.6, 1.4]
          stretch: [[1, 1], [1, 1]]
          rotation: 0
          shear: 0
          translate: 0.2
          flip: 0.5
          brightness: 0.2
          contrast: [0.8, 1.2]
          saturation: [0.8, 1.2]
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
      val:
        name: coco
        img_path: ../data/yoga_coco/images/val2017
        ann_path: ../data/yoga_coco/annotations/instances_val2017.json
        input_size: [416,416] #[w,h]
        keep_ratio: True
        pipeline:
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
    device:
      gpu_ids: [0]
      workers_per_gpu: 6
      batchsize_per_gpu: 40
    schedule:
    #  resume:
    #  load_model: YOUR_MODEL_PATH
      optimizer:
        name: SGD
        lr: 0.14
        momentum: 0.9
        weight_decay: 0.0001
      warmup:
        name: linear
        steps: 300
        ratio: 0.1
      total_epochs: 50
      lr_schedule:
        name: MultiStepLR
        milestones: [130,160,175,185]
        gamma: 0.1
      val_intervals: 10
    evaluator:
      name: CocoDetectionEvaluator
      save_key: mAP
    
    log:
      interval: 10
    
    class_names: ['person',]
    

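    When I use the 80-class model, the converted ncnn model produces results, so I would like to ask which settings need to be changed when converting a single-class model.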

    opened by Sean-hku 6
  • Problem converting pth to onnx to ncnn

    Hello, I converted a PyTorch model to ONNX and then to ncnn, but the detection results from the ncnn model are wrong. I made a few changes: I set the val input size in the config to 64x64 and the input size in tools/export.py to 64x64. The commands I ran were:

    python tools/export.py
    python -m onnxsim output.onnx output-sim.onnx
    build/tools/onnx/onnx2ncnn output-sim.onnx output-sim.param output-sim.bin
    build/tools/ncnnoptimize output-sim.param output-sim.bin new-output-sim.param new-output-sim.bin 0

    Versions: pytorch 1.7.1, onnx 1.8.0, onnx-simplifier 0.2.19, onnxoptimizer 0.1.1, onnxruntime 1.6.0

    Is there something wrong with these steps?

    opened by yhl41001 6
  • original pytorch or onnx model

    Could you please provide pretrained pytorch or onnx model weights also? I noticed you only shared converted ncnn models, but I would like to see the speed of inference on gpu/npu accelerated systems

    opened by kadirbeytorun 6
  •  python tools/train.py  config/nanodet-plus-m_320.yml

    Tried to run: python tools/train.py config/nanodet-plus-m_320.yml

    Error:

    pytorch_lightning.utilities.cloud_io.get_filesystem has been deprecated in v1.8.0 and will be"
    [NanoDet][01-04 10:28:00]INFO:Setting up data...
    loading annotations into memory...
    Done (t=18.55s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.56s)
    creating index...
    index created!
    [NanoDet][01-04 10:28:21]INFO:Creating model...
    model size is 1.0x
    init weights...
    => loading pretrained model https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
    Finish initialize NanoDet-Plus Head.
    GPU available: True (cuda), used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    /root/anaconda3/envs/nanodet/lib/python3.7/site-packages/torch/cuda/__init__.py:143: UserWarning: NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70. If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
      warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

      | Name      | Type        | Params
    0 | model     | NanoDetPlus | 4.3 M
    1 | avg_model | NanoDetPlus | 4.3 M

    8.7 M     Trainable params
    0         Non-trainable params
    8.7 M     Total params
    34.647    Total estimated model params size (MB)
    [NanoDet][01-04 10:28:21]INFO:Weight Averaging is enabled
    /root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:229: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument(try 40 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
      category=PossibleUserWarning,
    Traceback (most recent call last):
      File "tools/train.py", line 146, in <module>
        main(args)
      File "tools/train.py", line 141, in main
        trainer.fit(task, train_dataloader, val_dataloader)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 604, in fit
        self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
        self._run(model, ckpt_path=self.ckpt_path)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
        results = self._run_stage()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
        self._run_train()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
        self.fit_loop.run()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/loops/loop.py", line 194, in run
        self.on_run_start(*args, **kwargs)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/loops/fit_loop.py", line 206, in on_run_start
        self.trainer.reset_train_dataloader(self.trainer.lightning_module)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1552, in reset_train_dataloader
        if has_len_all_ranks(self.train_dataloader, self.strategy, module)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/utilities/data.py", line 110, in has_len_all_ranks
        if total_length == 0:
    RuntimeError: CUDA error: no kernel image is available for execution on the device
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    python3.7 cuda==10.2 gpu==RTX3090 Ubuntu 20.04

    Thanks

    opened by molyswu 0
  • Fails to train a model on a dataset with single class.

    I used a converted COCO 2017 dataset with only persons labeled. Here is my config:

    save_dir: workspace/nanodet-plus-m_416
    model:
      weight_averager:
        name: ExpMovingAverager
        decay: 0.9998
      arch:
        name: NanoDetPlus
        detach_epoch: 10
        backbone:
          name: ShuffleNetV2
          model_size: 1.0x
          out_stages: [2,3,4]
          activation: LeakyReLU
        fpn:
          name: GhostPAN
          in_channels: [116, 232, 464]
          out_channels: 96
          kernel_size: 5
          num_extra_level: 1
          use_depthwise: True
          activation: LeakyReLU
        head:
          name: NanoDetPlusHead
          num_classes: 1
          input_channel: 96
          feat_channels: 96
          stacked_convs: 2
          kernel_size: 5
          strides: [8, 16, 32, 64]
          activation: LeakyReLU
          reg_max: 1
          norm_cfg:
            type: BN
          loss:
            loss_qfl:
              name: QualityFocalLoss
              use_sigmoid: True
              beta: 2.0
              loss_weight: 1.0
            loss_dfl:
              name: DistributionFocalLoss
              loss_weight: 0.25
            loss_bbox:
              name: GIoULoss
              loss_weight: 2.0
        # Auxiliary head, only use in training time.
        aux_head:
          name: SimpleConvHead
          num_classes: 1
          input_channel: 192
          feat_channels: 192
          stacked_convs: 4
          strides: [8, 16, 32, 64]
          activation: LeakyReLU
          reg_max: 1
    data:
      train:
        name: CocoDataset
        img_path: /home/mosminin/fiftyone/coco_person/train/data
        ann_path: /home/mosminin/fiftyone/coco_person/train/labels.json
        input_size: [416,416] #[w,h]
        keep_ratio: False
        pipeline:
          perspective: 0.0
          scale: [0.6, 1.4]
          stretch: [[0.8, 1.2], [0.8, 1.2]]
          rotation: 0
          shear: 0
          translate: 0.2
          flip: 0.5
          brightness: 0.2
          contrast: [0.6, 1.4]
          saturation: [0.5, 1.2]
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
      val:
        name: CocoDataset
        img_path: /home/mosminin/fiftyone/coco_person/validation/data
        ann_path: /home/mosminin/fiftyone/coco_person/validation/labels.json
        input_size: [416,416] #[w,h]
        keep_ratio: False
        pipeline:
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
    device:
      gpu_ids: [0]
      workers_per_gpu: 6
      batchsize_per_gpu: 16
    schedule:
    #  resume:
    #  load_model:
      optimizer:
        name: AdamW
        lr: 0.001
        weight_decay: 0.05
      warmup:
        name: linear
        steps: 500
        ratio: 0.0001
      total_epochs: 300
      lr_schedule:
        name: CosineAnnealingLR
        T_max: 300
        eta_min: 0.00005
      val_intervals: 10
    grad_clip: 35
    evaluator:
      name: CocoDetectionEvaluator
      save_key: mAP
    log:
      interval: 50
    
    class_names: ['person']
    

    I also changed train.py to use the CPU instead of the GPU so that the errors were more understandable.

        # if cfg.device.gpu_ids == -1:
        #     logger.info("Using CPU training")
        #     accelerator, devices, strategy = "cpu", None, None
        # else:
        #     accelerator, devices, strategy = "gpu", cfg.device.gpu_ids, None
    
        accelerator, devices, strategy = "cpu", None, None # CPU training
    
    

    After running it, I get the following errors.

    (.venv) [email protected]:~/dev/nanodet$ python tools/train.py /home/mosminin/dev/nanodet/config/nanodet-plus-m_416_person.yml
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v1.10.0. Please use `lightning_lite.utilities.cloud_io.get_filesystem` instead.
      rank_zero_deprecation(
    [NanoDet][12-18 14:05:30]INFO:Setting up data...
    loading annotations into memory...
    Done (t=4.35s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.16s)
    creating index...
    index created!
    [NanoDet][12-18 14:05:35]INFO:Creating model...
    model size is  1.0x
    init weights...
    => loading pretrained model https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
    Finish initialize NanoDet-Plus Head.
    GPU available: True (cuda), used: False
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/setup.py:175: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.
      rank_zero_warn(
    
      | Name      | Type        | Params
    ------------------------------------------
    0 | model     | NanoDetPlus | 4.1 M 
    1 | avg_model | NanoDetPlus | 4.1 M 
    ------------------------------------------
    8.2 M     Trainable params
    0         Non-trainable params
    8.2 M     Total params
    32.903    Total estimated model params size (MB)
    [NanoDet][12-18 14:05:35]INFO:Weight Averaging is enabled
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    Traceback (most recent call last):
      File "/home/mosminin/dev/nanodet/tools/train.py", line 147, in <module>
        main(args)
      File "/home/mosminin/dev/nanodet/tools/train.py", line 142, in main
        trainer.fit(task, train_dataloader, val_dataloader)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
        call._call_and_handle_interrupt(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
        self._run(model, ckpt_path=self.ckpt_path)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
        results = self._run_stage()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
        self._run_train()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
        self.fit_loop.run()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
        self._outputs = self.epoch_loop.run(self._data_fetcher)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 214, in advance
        batch_output = self.batch_loop.run(kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
        outputs = self.optimizer_loop.run(optimizers, kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 200, in advance
        result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 247, in _run_optimization
        self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 357, in _optimizer_step
        self.trainer._call_lightning_module_hook(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1342, in _call_lightning_module_hook
        output = fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/trainer/task.py", line 281, in optimizer_step
        optimizer.step(closure=optimizer_closure)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 169, in step
        step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 234, in optimizer_step
        return self.precision_plugin.optimizer_step(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 121, in optimizer_step
        return optimizer.step(closure=closure, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
        return wrapped(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/optimizer.py", line 140, in wrapper
        out = func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/adamw.py", line 120, in step
        loss = closure()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 107, in _wrap_closure
        closure_result = closure()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 147, in __call__
        self._result = self.closure(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 133, in closure
        step_output = self._step_fn()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 406, in _training_step
        training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1480, in _call_strategy_hook
        output = fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 378, in training_step
        return self.model.training_step(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/trainer/task.py", line 78, in training_step
        preds, loss, loss_states = self.model.forward_train(batch)
      File "/home/mosminin/dev/nanodet/nanodet/model/arch/nanodet_plus.py", line 56, in forward_train
        loss, loss_states = self.head.loss(head_out, gt_meta, aux_preds=aux_head_out)
      File "/home/mosminin/dev/nanodet/nanodet/model/head/nanodet_plus_head.py", line 198, in loss
        batch_assign_res = multi_apply(
      File "/home/mosminin/dev/nanodet/nanodet/util/misc.py", line 24, in multi_apply
        return tuple(map(list, zip(*map_results)))
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/model/head/nanodet_plus_head.py", line 314, in target_assign_single_img
        assign_result = self.assigner.assign(
      File "/home/mosminin/dev/nanodet/nanodet/model/head/assigner/dsl_assigner.py", line 86, in assign
        F.one_hot(gt_labels.to(torch.int64), pred_scores.shape[-1])
    RuntimeError: Class values must be smaller than num_classes.
    
    

    What am I doing wrong?

    opened by Octopusmode 0
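
The `RuntimeError` at the end of the traceback above is raised by `F.one_hot` when a ground-truth label is greater than or equal to the number of classes it is asked to encode. One plausible cause (an assumption, since the issue shows neither the config nor the annotations) is a `num_classes` setting smaller than the number of categories in the dataset. A minimal sketch of a sanity check to run before training; `find_out_of_range_labels` is a hypothetical helper, not part of NanoDet:

```python
from typing import Iterable, List


def find_out_of_range_labels(labels: Iterable[int], num_classes: int) -> List[int]:
    """Return the sorted, de-duplicated labels that one-hot encoding would reject.

    Valid zero-based labels lie in the range [0, num_classes - 1].
    """
    return sorted({label for label in labels if label < 0 or label >= num_classes})


# Hypothetical data: the config declares 2 classes, but label 2 appears.
bad_labels = find_out_of_range_labels([0, 1, 2, 1, 0], num_classes=2)
print(bad_labels)  # [2]
```

An empty result means every label fits; anything else points at the class ids that need either a larger `num_classes` or a corrected annotation file.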
  • Adapting the code to output a center x, y instead of bounding boxes (x1, y1, x2, y2)

    Hey, I'm not very familiar with machine learning, and I'm not ready (yet) to spend the next two months learning how TensorFlow and similar frameworks work, so I'm hoping someone can help me with this.

    So far, my experience with NanoDet has been great, but manually annotating images takes a lot of time that I don't have. Since I don't really need the bounding box information anyway, I'm looking for a way to provide only the center of each object rather than its top-left and bottom-right corners.

    Help would be highly appreciated 😄

    opened by icecreamnotallowed 0
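
One possible workaround for the request above, sketched under assumptions: keep the box-based training pipeline unchanged, turn each annotated center into a fixed-size pseudo box, and reduce predicted boxes back to centers at inference time. Both helpers and the `box_size` value are hypothetical and not part of NanoDet, and a fixed size is only reasonable when the objects have roughly similar scale.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def center_to_box(cx: float, cy: float, box_size: float,
                  img_w: float, img_h: float) -> Box:
    """Build a square pseudo box around a center point, clipped to the image."""
    half = box_size / 2.0
    x1 = max(0.0, cx - half)
    y1 = max(0.0, cy - half)
    x2 = min(float(img_w), cx + half)
    y2 = min(float(img_h), cy + half)
    return (x1, y1, x2, y2)


def box_to_center(box: Box) -> Tuple[float, float]:
    """Reduce a predicted (x1, y1, x2, y2) box to its center point."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


pseudo_box = center_to_box(50, 40, box_size=20, img_w=100, img_h=100)
print(pseudo_box)                 # (40.0, 30.0, 60.0, 50.0)
print(box_to_center(pseudo_box))  # (50.0, 40.0)
```

Note that clipping at the image border shifts the center of the pseudo box, so points close to an edge will not round-trip exactly.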
  • The ONNX model (exported by export_onnx.py) output differs from the PyTorch model

    import cv2
    import numpy as np
    import onnxruntime as rt


    def image_preprocess(img_path):
        # Load the image (BGR, HWC) and scale pixel values to [0, 1].
        img = cv2.imread(img_path).astype("float32") / 255
        # mean = [103.53, 116.28, 123.675]  # ImageNet values
        # std = [57.375, 57.12, 58.395]
        mean = [113.533554, 118.14172, 123.63607]
        std = [21.405144, 21.405144, 21.405144]
        # Scale mean and std by 255 as well so they match the scaled image.
        mean = np.array(mean, dtype=np.float32).reshape(1, 1, 3) / 255
        std = np.array(std, dtype=np.float32).reshape(1, 1, 3) / 255
        img = (img - mean) / std
        img = np.transpose(img, (2, 0, 1))  # HWC -> CHW
        img = np.expand_dims(img, axis=0)  # add the batch dimension
        return img


    def test_onnx_model(onnx_model, img_path=None):
        if img_path is None:
            img_path = "path for img"
        imgdata = image_preprocess(img_path)
        sess = rt.InferenceSession(onnx_model)
        input_name = sess.get_inputs()[0].name
        output_detect_name = sess.get_outputs()[0].name
        pred_onnx0 = sess.run([output_detect_name], {input_name: imgdata})
        print("outputs:")
        print(np.array(pred_onnx0))

    opened by Genlk 0
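
When comparing the exported model against the original, it helps to put a number on how far the two outputs diverge instead of inspecting printed arrays. A minimal pure-Python sketch; `max_abs_diff` and `outputs_match` are hypothetical helpers, and the `1e-4` tolerance is an arbitrary assumption rather than a value from NanoDet. Both outputs are assumed to be flattened to plain lists first (for a NumPy array, `array.ravel().tolist()`).

```python
from typing import Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat outputs."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(a: Sequence[float], b: Sequence[float], atol: float = 1e-4) -> bool:
    """True when every pair of elements differs by at most `atol`."""
    return max_abs_diff(a, b) <= atol


# Hypothetical flattened outputs from the two runtimes.
pytorch_out = [0.10, 0.20, 0.30]
onnx_out = [0.10, 0.20, 0.35]
print(outputs_match(pytorch_out, onnx_out))  # False
```

A large difference on identical input usually points at preprocessing (mean, std, channel order, input size) rather than at the export itself, so both pipelines should be fed the same preprocessed tensor before drawing conclusions.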
  • Fixes a couple of issues to add fp16 training support

    There were a couple of issues when trying to use fp16 training. One was that fp16 precision was not exposed through the configuration system. The other was that the DynamicSoftLabelAssigner used binary_cross_entropy instead of binary_cross_entropy_with_logits. This PR changes where sigmoid is called on the predictions so that the more numerically stable binary_cross_entropy_with_logits can be used, and lets the Trainer be configured to use fp16 precision.

    opened by crisp-snakey 0
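
The stability argument in the fp16 pull request above can be illustrated without PyTorch. The sketch below is a scalar, pure-Python illustration and not the code from the PR: `naive_bce` applies sigmoid first and then takes logarithms, while `bce_with_logits` uses the standard rearrangement `max(x, 0) - x*z + log(1 + exp(-|x|))` that works directly on the logit. Both function names are hypothetical. The example uses float64, where the naive form already breaks at a logit of 100; with fp16 the sigmoid saturates at much smaller values.

```python
import math


def naive_bce(logit: float, target: float) -> float:
    """Sigmoid followed by log: fails once the sigmoid saturates to 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def bce_with_logits(logit: float, target: float) -> float:
    """Stable binary cross-entropy computed directly from the logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


print(bce_with_logits(100.0, 0.0))  # 100.0

try:
    naive_bce(100.0, 0.0)
except ValueError as err:
    # sigmoid(100) rounds to exactly 1.0, so log(1 - p) becomes log(0).
    print("naive form failed:", err)
```

For moderate logits the two functions agree, which is why the problem tends to show up only once training is switched to reduced precision.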
Releases(v1.0.0-alpha-1)
  • v1.0.0-alpha-1(Dec 26, 2021)

    NanoDet-Plus v1.0.0-alpha

    In NanoDet-Plus, we propose a novel label assignment strategy with a simple assign guidance module (AGM) and a dynamic soft label assigner (DSLA) to solve the optimal label assignment problem in lightweight model training. We also introduce a light feature pyramid called Ghost-PAN to enhance multi-layer feature fusion. These improvements boost the previous NanoDet's detection accuracy by 7 mAP on the COCO dataset.

    | Model | Resolution | mAPval 0.5:0.95 | CPU Latency (i7-8700) | ARM Latency (4xA76) | FLOPS | Params | Model Size |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | 320*320 | 20.6 | 4.98ms | 10.23ms | 0.72G | 0.95M | 1.8MB(FP16) / 980KB(INT8) |
    | NanoDet-Plus-m | 320*320 | 27.0 | 5.25ms | 11.97ms | 0.9G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
    | NanoDet-Plus-m | 416*416 | 30.4 | 8.32ms | 19.77ms | 1.52G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
    | NanoDet-Plus-m-1.5x | 320*320 | 29.9 | 7.21ms | 15.90ms | 1.75G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
    | NanoDet-Plus-m-1.5x | 416*416 | 34.1 | 11.50ms | 25.49ms | 2.97G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
    | YOLOv3-Tiny | 416*416 | 16.6 | - | 37.6ms | 5.62G | 8.86M | 33.7MB |
    | YOLOv4-Tiny | 416*416 | 21.7 | - | 32.81ms | 6.96G | 6.06M | 23.0MB |
    | YOLOX-Nano | 416*416 | 25.8 | - | 23.08ms | 1.08G | 0.91M | 1.8MB(FP16) |
    | YOLOv5-n | 640*640 | 28.4 | - | 44.39ms | 4.5G | 1.9M | 3.8MB(FP16) |
    | FBNetV5 | 320*640 | 30.4 | - | - | 1.8G | - | - |
    | MobileDet | 320*320 | 25.6 | - | - | 0.9G | - | - |

    Model checkpoints and weights

    Download them from the release files below.

    Source code(tar.gz)
    Source code(zip)
    nanodet-plus-m-1.5x_320.onnx(9.43 MB)
    nanodet-plus-m-1.5x_320_checkpoint.ckpt(61.63 MB)
    nanodet-plus-m-1.5x_416.onnx(9.43 MB)
    nanodet-plus-m-1.5x_416_checkpoint.ckpt(61.63 MB)
    nanodet-plus-m-1.5x_416_ncnn.zip(4.40 MB)
    nanodet-plus-m-1.5x_416_openvino.zip(4.39 MB)
    nanodet-plus-m_320.onnx(4.57 MB)
    nanodet-plus-m_320_checkpoint.ckpt(33.82 MB)
    nanodet-plus-m_416.onnx(4.57 MB)
    nanodet-plus-m_416_checkpoint.ckpt(33.82 MB)
    nanodet-plus-m_416_mnn.mnn(4.59 MB)
    nanodet-plus-m_416_ncnn.zip(2.11 MB)
    nanodet-plus-m_416_openvino.zip(2.11 MB)
  • v0.4.2(Aug 22, 2021)

    v0.4.2

    Fix some compatibility issues in NanoDet v0.4.

    1. Fix pytorch-lightning compatibility (#304, #309).
    2. Fix PyTorch 1.9 compatibility (#308).
    3. Support not raising an error when evaluating with empty results (#310).

    I'm doing a lot of refactoring. NanoDet v1.x is coming soon.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight | ncnn model | ncnn-int8 |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download | Download | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download | Download | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download | Download | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download | Download | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download | - | - |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download | - | - |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download | - | - |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download | - | - |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download | - | - |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download | - | - |

    Source code(tar.gz)
    Source code(zip)
  • v0.4.1(Jul 17, 2021)

    v0.4.1

    This is the final release of NanoDet v0.x.

    I'm doing a lot of refactoring. NanoDet v1.x is coming soon.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight | ncnn model | ncnn-int8 |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download | Download | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download | Download | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download | Download | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download | Download | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download | - | - |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download | - | - |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download | - | - |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download | - | - |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download | - | - |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download | - | - |

    Source code(tar.gz)
    Source code(zip)
  • v0.4.0(Jun 8, 2021)

    What's new in v0.4.0

    1. Fix a minor bug in demo.py by BlainWu (#210)
    2. Add script to export TorchScript model by strawberrypie (#211)
    3. Use fixed output names when exporting ONNX (#218)
    4. Use scale_factor instead of fixed size in resize to support dynamic shape inference (#218)
    5. Ensure num_classes equals len(class_names) by ZHEQIUSHUI (#221)
    6. Fix a bug in mnn demo while using GPU device by AcherStyx (#234)
    7. Fix with_last_conv bug in shufflenet (#239)
    8. Support batch eval (#241)
    9. Add nanodet-m-1.5x models (#242)
    10. Update model benchmark (#246)
    11. Prevent lightning Trainer from disabling cudnn.benchmark (#249)
    12. Fix multi-GPU evaluation bug with pytorch-lightning (#254)

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Download ncnn models below

    Source code(tar.gz)
    Source code(zip)
    ncnn-nanodet-m-1.5x-416-int8.zip(1.82 MB)
    ncnn-nanodet-m-1.5x-416.zip(3.67 MB)
    ncnn-nanodet-m-1.5x-int8.zip(1.82 MB)
    ncnn-nanodet-m-1.5x.zip(3.66 MB)
    ncnn-nanodet-m-416-int8.zip(882.58 KB)
    ncnn-nanodet-m-416.zip(1.64 MB)
    ncnn-nanodet-m-int8.zip(888.76 KB)
    ncnn-nanodet-m.zip(1.64 MB)
  • v0.3.0(Apr 11, 2021)

    What's new in v0.3.0

    1. Refactor training and testing code with pytorch-lightning.
    2. Fix ONNX inference AxisError by zshn25 (#198).

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-t (NEW) | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
    nanodet_m_ncnn_model.zip(1.64 MB)
  • v0.2.0(Mar 29, 2021)

    What's new in v0.2.0

    1. Add pyncnn demo by caishanli (#167).
    2. Fix ncnn demo build failure without vulkan by nihui (#168).
    3. Add NanoDet-t with Transformer Attention Network (#183).
    4. Add Notebook demo by zhiqwang (#188).
    5. Add feature of saving demo inference result by wwdok (#191).
    6. Fix utf-8 decode bug (#184).
    7. Fix test bug.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-t (NEW) | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Mar 7, 2021)

    What's new in v0.1.0

    1. Support MNN Python and C++ inference (#83).
    2. Support OpenVINO inference.
    3. Support libtorch inference experimentally.
    4. Add NanoDet-g.
    5. Add EfficientNet-Lite and RepVGG backbones.
    6. Add Model Zoo and provide more pre-trained models.
    7. Refactor GFL head (#154).

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Nov 22, 2020)

Owner
Away From Keyboard