Object detection and instance segmentation toolkit based on PaddlePaddle.

Overview


PaddleDetection

PaddleDetection 2.0 is a full upgrade! The dynamic-graph version is now the default; the static-graph version lives in static.

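Introduction

PaddleDetection is PaddlePaddle's object detection development toolkit. It aims to help developers complete the entire workflow of building, training, optimizing, and deploying detection models faster and better.

PaddleDetection implements many mainstream object detection algorithms in a modular way, provides a rich set of data augmentation strategies, network components (such as backbones), and loss functions, and integrates model compression and cross-platform high-performance deployment.

After long-term refinement in industrial practice, PaddleDetection offers a smooth, polished user experience and is widely used by developers in more than ten fields, including industrial quality inspection, remote-sensing image detection, unmanned inspection, new retail, internet, and scientific research.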

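News

  • 2021.04.14: Released release/2.0. PaddleDetection now fully supports dynamic graph, covers the static-graph model algorithms, and upgrades model accuracy across the board. Also released the PP-YOLO v2 model and added the S2ANet rotated-box detection model. See PaddleDetection for details.
  • 2021.02.07: Released release/2.0-rc, the dynamic-graph trial version of PaddleDetection. See PaddleDetection dynamic graph for details.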

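Features

  • Rich model zoo: more than 100 pre-trained models for object detection, instance segmentation, and face detection, including several winning solutions from international competitions.
  • Easy to use: the modular design decouples the network components, so developers can readily build and try out detection models and optimization strategies and quickly obtain high-performance, customized algorithms.
  • End to end: data augmentation, network construction, training, compression, and deployment are connected end to end, with full support for multi-architecture, multi-device deployment on cloud and edge.
  • High performance: built on PaddlePaddle's high-performance kernel, with clear advantages in training speed and GPU memory usage. Supports FP16 training and multi-machine training.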

Toolkit structure overview

Architectures
  • Two-Stage Detection
    • Faster RCNN
    • FPN
    • Cascade-RCNN
    • Libra RCNN
    • Hybrid Task RCNN
    • PSS-Det
  • One-Stage Detection
    • RetinaNet
    • YOLOv3
    • YOLOv4
    • PP-YOLO
    • SSD
  • Anchor Free
    • CornerNet-Squeeze
    • FCOS
    • TTFNet
  • Instance Segmentation
    • Mask RCNN
    • SOLOv2
  • Face-Detection
    • FaceBoxes
    • BlazeFace
    • BlazeFace-NAS

Backbones
  • ResNet(&vd)
  • ResNeXt(&vd)
  • SENet
  • Res2Net
  • HRNet
  • Hourglass
  • CBNet
  • GCNet
  • DarkNet
  • CSPDarkNet
  • VGG
  • MobileNetv1/v3
  • GhostNet
  • EfficientNet

Components
  • Common
    • Sync-BN
    • Group Norm
    • DCNv2
    • Non-local
  • FPN
    • BiFPN
    • BFP
    • HRFPN
    • ACFPN
  • Loss
    • Smooth-L1
    • GIoU/DIoU/CIoU
    • IoUAware
  • Post-processing
    • SoftNMS
    • MatrixNMS
  • Speed
    • FP16 training
    • Multi-machine training

Data Augmentation
  • Resize
  • Flipping
  • Expand
  • Crop
  • Color Distort
  • Random Erasing
  • Mixup
  • Cutmix
  • Grid Mask
  • Auto Augment

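Model performance overview

The figure compares COCO mAP and single-GPU Tesla V100 inference speed (FPS) for representative models of each architecture and backbone.

Notes:

  • CBResNet denotes the Cascade-Faster-RCNN-CBResNet200vd-FPN model, which reaches 53.3% mAP on COCO
  • Cascade-Faster-RCNN denotes Cascade-Faster-RCNN-ResNet50vd-DCN, which PaddleDetection has optimized to run at 20 FPS with 47.8% mAP on COCO
  • PP-YOLO reaches 45.9% mAP on COCO at 72.9 FPS on Tesla V100, better than YOLOv4 in both accuracy and speed
  • PP-YOLO v2 is a further optimization of PP-YOLO, reaching 49.5% mAP on COCO at 68.9 FPS on Tesla V100
  • All models in the figure are available in the model zoo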

Documentation and tutorials

Getting started

Advanced tutorials

Model zoo

Application examples

Recommended third-party tutorials

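Release notes

v2.0 was released in 04/2021 with full dynamic-graph support. It adds the BlazeFace and PSSDet model families and a large number of backbones, and releases the PP-YOLO v2, PP-YOLO tiny, and S2ANet rotated-box detection models. It also supports model distillation and VisualDL, and adds a dynamic-graph inference deployment benchmark. See the release notes document for details.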

License

This project is released under the Apache 2.0 license.

Contributing

Contributions to PaddleDetection are very welcome, and we appreciate your feedback.

Citation

@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}
Comments
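  • 🌟 PP-PicoDet has been released. Try it out and join the discussion

    PP-PicoDet is a lightweight real-time object detection model for mobile devices. We propose a series of models from small to large, including S, M, and L, that surpass existing SOTA models.

    Model highlights:

    • 🌟 High accuracy: mAP(0.5:0.95) reaches 30.6 with under 1M parameters, and 40.9 with 3.3M parameters.
    • 🚀 Fast: reaches 150 FPS on SD865.
    • 😊 Deployment friendly: we support PaddleInference/PaddleLite/MNN/NCNN/OpenVINO and provide C++/Python/Android demos.

    Links:

    Feel free to try it out; questions and discussion are welcome in this thread.

    Comparison with other models: (image: picodet_map2)

    FAQ summary (continuously updated):

    • Version requirements: training and model export must use the same Paddle version, and PaddlePaddle >= 2.1.2 is required.
    • Relationship between learning rate, GPU count, and batch size: follow the linear scaling rule. The released configs are mostly trained on 4 GPUs. For example, when switching to a single GPU, divide the learning rate by 4; if the batch size also changes from 80 to 40, divide the learning rate by 2 again.
    • Config priority: settings in picodet_x_coco.yml generally take precedence over those in __base__. Every setting in picodet_x_coco.yml overrides the corresponding setting in __base__, so editing picodet_x_coco.yml is sufficient.
    • Training on your own dataset: both COCO and VOC data formats are supported. Transfer learning is recommended to speed up convergence: copy the link to the COCO-trained pretrain weights from the PicoDet README and set the pretrain_weights parameter in the config file to those weights.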
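
The linear scaling rule from the FAQ can be written as a small helper. This is a minimal sketch, not code from PaddleDetection: the function name, the reading of batch size as a per-GPU value, and the base learning rate of 0.4 are all assumptions made for illustration.

```python
def scale_learning_rate(base_lr, base_gpus, base_batch_size, gpus, batch_size):
    """Linear scaling rule: the learning rate follows the total batch size.

    base_* describe the released config; gpus and batch_size describe your run.
    batch_size is taken as the per-GPU batch size (an assumption of this sketch).
    """
    base_total = base_gpus * base_batch_size
    new_total = gpus * batch_size
    return base_lr * new_total / base_total


# Hypothetical base learning rate of 0.4 for a 4-GPU config with batch size 80.
single_gpu_lr = scale_learning_rate(0.4, base_gpus=4, base_batch_size=80,
                                    gpus=1, batch_size=80)
halved_batch_lr = scale_learning_rate(0.4, base_gpus=4, base_batch_size=80,
                                      gpus=1, batch_size=40)
```

Going from 4 GPUs to 1 divides the rate by 4, and halving the batch size divides it by 2 again, matching the example in the FAQ.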

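    To make discussion easier, scan the QR code to join the WeChat group and continue the conversation about using PP-PicoDet and suggestions for it.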

    status/close 
    opened by yghstill 124
  • C++ deployment of PaddleDetection: the instructions do not match the latest version.

    I followed the instructions at https://paddledetection.readthedocs.io/advanced_tutorials/inference/docs/windows_vs2019_build.html to deploy and got an error: CMakeCache.txt points to D:\1.6.1\paddle, and that directory does not exist. I then opened https://paddledetection.readthedocs.io/advanced_tutorials/inference/docs/windows_vs2015_build.html. That deployment guide does not match my PaddleDetection version either and still requires version 1.6. Should I deploy version 1.6? My training files are from PaddleDetection 0.2. How should I resolve this?

    opened by wyc880622 61
  • Bug of quant-aware training of tinypose!

    Search before asking

    • [X] I have searched the issues and found no similar bug report.

    Describe the Bug

    @yghstill wrote: "The auto-compression code for the TinyPose model is ready: https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/auto_compression/detection The config file is configs/tinypose_qat_dis.yaml, and it is launched exactly as described in the readme. For full quantization you can start with this." That is what you said, but I ran into several problems.

    1. The link you gave returns 404, so I located the yml you mentioned elsewhere and ran the program.
    2. I tried both of the following:
    python3 -m paddle.distributed.launch --log_dir=log0705 --gpus 0,1 run.py --config_path=./configs/tinypose_qat_dis.yaml --save_dir='./output0705/'
    python3 run.py --config_path=./configs/tinypose_qat_dis.yaml --save_dir='./output0705/'
    Both fail with the errors shown in the screenshots (image, image). Could you please take a careful look?
    3. Is there a genuinely reproducible, bug-free version that could be released? Could it be self-tested before release? Thanks.

    BR

    Environment

           paddle-gpu 2.23 or later
           paddledet release/2.4
           paddleslim develop
           cuda 11.3
           pytorch matching the cuda version

    Are you willing to submit a PR?

    • [x] Yes I'd like to help by submitting a PR!
    status/close 
    opened by 2050airobert 33
  • [BUG] PPYOLOE training problem

    Training yoloe_s fails with:

    ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of 
    X = [1, 3024, 2] and the shape of Y = [8400, 2]. Received [3024] in X is not equal to [8400] in Y at i:1.
      [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, 
    but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] 
    (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:240)
      [operator < elementwise_add > error]
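
The two sizes in the error, 8400 and 3024, look like anchor-point counts computed for two different input resolutions. As a rough check, assuming the head predicts on feature maps with strides 8, 16, and 32 (an assumption; the issue does not show the config), the count for a square input can be computed as follows. The helper name is hypothetical.

```python
def num_anchor_points(input_size, strides=(8, 16, 32)):
    """Count the grid cells across all feature levels for a square input."""
    return sum((input_size // s) ** 2 for s in strides)


points_640 = num_anchor_points(640)  # 80*80 + 40*40 + 20*20
points_384 = num_anchor_points(384)  # 48*48 + 24*24 + 12*12
```

Under that assumption, 8400 matches a 640x640 input and 3024 matches 384x384. That would suggest anchor points prepared for one resolution are being added to predictions made at another, for example when the training and evaluation input sizes differ.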
    
    opened by m00nLi 30
  • Training error: has no im_shape field

    My config file is modified from mask_rcnn_r50_2x.yml. After running the following command:

    !python tools/train.py -c configs/myconfig/mask_rcnn_r50_2x.yml --eval -o use_gpu=true --use_vdl=True --vdl_log_dir=vdl_dir/scalar

    I get this error:

    Traceback (most recent call last):
      File "tools/train.py", line 377, in <module>
        main()
      File "tools/train.py", line 146, in main
        fetches = model.eval(feed_vars)
      File "/home/aistudio/work/PaddleDetection/ppdet/modeling/architectures/mask_rcnn.py", line 338, in eval
        return self.build(feed_vars, 'test')
      File "/home/aistudio/work/PaddleDetection/ppdet/modeling/architectures/mask_rcnn.py", line 81, in build
        self._input_check(required_fields, feed_vars)
      File "/home/aistudio/work/PaddleDetection/ppdet/modeling/architectures/mask_rcnn.py", line 271, in _input_check
        "{} has no {} field".format(feed_vars, var)
    AssertionError: OrderedDict([('image', name: "image"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: FP32
          dims: -1
          dims: 3
          dims: -1
          dims: -1
        }
        lod_level: 0
      }
    }
    persistable: false
    need_check_feed: true
    ), ('im_info', name: "im_info"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: FP32
          dims: -1
          dims: 3
        }
        lod_level: 0
      }
    }
    persistable: false
    need_check_feed: true
    ), ('im_id', name: "im_id"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: INT64
          dims: -1
          dims: 1
        }
        lod_level: 0
      }
    }
    persistable: false
    need_check_feed: true
    ), ('gt_bbox', name: "gt_bbox"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: FP32
          dims: -1
          dims: 4
        }
        lod_level: 1
      }
    }
    persistable: false
    need_check_feed: true
    ), ('gt_class', name: "gt_class"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: INT32
          dims: -1
          dims: 1
        }
        lod_level: 1
      }
    }
    persistable: false
    need_check_feed: true
    ), ('is_crowd', name: "is_crowd"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: INT32
          dims: -1
          dims: 1
        }
        lod_level: 1
      }
    }
    persistable: false
    need_check_feed: true
    ), ('gt_mask', name: "gt_mask"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: FP32
          dims: -1
          dims: 2
        }
        lod_level: 3
      }
    }
    persistable: false
    need_check_feed: true
    )]) has no im_shape field
    

    I looked at line 81 of mask_rcnn.py. Presumably, because mode != 'train', self._input_check(required_fields, feed_vars) checks for im_shape. Is something wrong with my training launch command?
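
The traceback shows the failure in `model.eval(feed_vars)`, which builds the model in 'test' mode, while the printed feed dict holds only training fields (image, im_info, im_id, gt_bbox, gt_class, is_crowd, gt_mask). The sketch below reproduces that kind of check in isolation. It is not the PaddleDetection source: the required-field list for evaluation is a guess made for illustration, and only the assertion message follows the traceback.

```python
from collections import OrderedDict

# Field names taken from the feed dict printed in the error message.
TRAIN_FIELDS = ["image", "im_info", "im_id", "gt_bbox", "gt_class",
                "is_crowd", "gt_mask"]
# Hypothetical list of fields needed in 'test' mode (assumed, not from the source).
EVAL_REQUIRED = ["image", "im_info", "im_shape"]


def input_check(required_fields, feed_vars):
    """Assert that every required field is present in the feed dict."""
    for var in required_fields:
        assert var in feed_vars, "{} has no {} field".format(list(feed_vars), var)


train_feed = OrderedDict((name, None) for name in TRAIN_FIELDS)

try:
    input_check(EVAL_REQUIRED, train_feed)
    missing_field_error = None
except AssertionError as exc:
    missing_field_error = str(exc)

# A feed dict that includes im_shape passes the same check.
eval_feed = OrderedDict(
    (name, None) for name in ["image", "im_info", "im_id", "im_shape"])
input_check(EVAL_REQUIRED, eval_feed)
```

If this reading is right, the launch command itself may be fine, and the thing to inspect would be whether the evaluation reader section of the modified config still lists im_shape among its fields.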

    opened by iceriver97 29
  • Local deployment of S2ANet

    Search before asking

    • [X] I have searched the question and found no related answer.

    Please ask your question

    https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/dota Is the image at this link no longer available? I have a 3070 GPU with cuda 11.1, paddle 2.2, and paddledetection 2.2. Following the tutorial, training reports a cudnn error. Could it be that this currently supports only cuda 10 and cudnn 7? The 30-series cards do not seem to work well with cuda 10. Is there any way to install this model locally?

    status/close 
    opened by jianxin123 28
  • Evaluating the keypoint detection model raises "DataLoader reader thread raised an exception"

    ERROR - DataLoader reader thread raised an exception!                                 
    Exception in thread Thread-1:                                                                                   
    Traceback (most recent call last):                                                                              
      File "/usr/local/python3/lib/python3.6/threading.py", line 916, in _bootstrap_inner                           
        self.run()                                                                                                  
      File "/usr/local/python3/lib/python3.6/threading.py", line 864, in run                                        
        self._target(*self._args, **self._kwargs)                                                                   
      File "/home/.local/lib/python3.6/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 662, 
    in _thread_loop                                                                                                 
        six.reraise(*sys.exc_info())                                                                                
      File "/home/.local/lib/python3.6/site-packages/six.py", line 719, in reraise                          
        raise value                                                                                                 
      File "/home/.local/lib/python3.6/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 650, 
    in _thread_loop                                                                                                 
        tmp.set(slot, core.CPUPlace())                                                                              
    ValueError: (InvalidArgument) Input object type error or incompatible array data type. tensor.set() supports arr
    ay with bool, float16, float32, float64, int8, int16, int32, int64, uint8 or uint16, please check your input or 
    input array data type. (at /paddle/paddle/fluid/pybind/tensor_py.h:355)  
    
    

    What data is supposed to be passed in here?

    help wanted 
    opened by huangdf97 26
  • ValueError: Target 460 is out of upper bound.

    Search before asking

    • [X] I have searched the question and found no related answer.

    Please ask your question

    ppyoloe_crn_s_300e_coco with a VOC dataset

    python tools/train.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml

    
    W0823 14:30:26.446256  4452 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
    W0823 14:30:26.461884  4452 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
    [08/23 14:30:27] ppdet.utils.checkpoint INFO: Finish loading model weights: C:\Users\fujunnnn/.cache/paddle/weights\CSPResNetb_s_pretrained.pdparams
    [08/23 14:30:30] ppdet.engine INFO: Epoch: [0] [  0/339] learning_rate: 0.000000 loss: 1931307253760.000000 loss_cls: 0.594841 loss_iou: 772522901504.000000 loss_dfl: 5885.125977 loss_l1: 0.105123 eta: 4 days, 9:30:32 batch_cost: 3.7348 data_cost: 0.2500 ips: 2.6775 images/s
    Traceback (most recent call last):
      File "tools/train.py", line 177, in <module>
        main()
      File "tools/train.py", line 173, in main
        run(FLAGS, cfg)
      File "tools/train.py", line 127, in run
        trainer.train(FLAGS.eval)
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\engine\trainer.py", line 454, in train
        outputs = model(data)
      File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
        return self._dygraph_call_func(*inputs, **kwargs)
      File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
        outputs = self.forward(*inputs, **kwargs)
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\architectures\meta_arch.py", line 59, in forward
        out = self.get_loss()
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 125, in get_loss
        return self._forward()
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 88, in _forward
        yolo_losses = self.yolo_head(neck_feats, self.inputs)
      File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
        return self._dygraph_call_func(*inputs, **kwargs)
      File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
        outputs = self.forward(*inputs, **kwargs)
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 217, in forward
        return self.forward_train(feats, targets)
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 160, in forward_train
        ], targets)
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 355, in get_loss
        assigned_scores_sum)
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 291, in _bbox_loss
        assigned_ltrb_pos) * bbox_weight
      File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 256, in _df_loss
        pred_dist, target_left, reduction='none') * weight_left
      File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\nn\functional\loss.py", line 1723, in cross_entropy
        label_max.item()))
    ValueError: Target 25479 is out of upper bound.
    

    python tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml

    
    W0823 21:31:38.730271 10200 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
    W0823 21:31:38.750262 10200 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
    [08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365] in pretrained weight yolo_head.pred_cls.0.bias is unmatched with the shape [4] in model yolo_head.pred_cls.0.bias. And the weight yolo_head.pred_cls.0.bias will not be loaded
    [08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365, 384, 3, 3] in pretrained weight yolo_head.pred_cls.0.weight is unmatched with the shape [4, 384, 3, 3] in model yolo_head.pred_cls.0.weight. And the weight yolo_head.pred_cls.0.weight will not be loaded
    [08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365] in pretrained weight yolo_head.pred_cls.1.bias is unmatched with the shape [4] in model yolo_head.pred_cls.1.bias. And the weight yolo_head.pred_cls.1.bias will not be loaded
    [08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365, 192, 3, 3] in pretrained weight yolo_head.pred_cls.1.weight is unmatched with the shape [4, 192, 3, 3] in model yolo_head.pred_cls.1.weight. And the weight yolo_head.pred_cls.1.weight will not be loaded
    [08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365] in pretrained weight yolo_head.pred_cls.2.bias is unmatched with the shape [4] in model yolo_head.pred_cls.2.bias. And the weight yolo_head.pred_cls.2.bias will not be loaded
    [08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365, 96, 3, 3] in pretrained weight yolo_head.pred_cls.2.weight is unmatched with the shape [4, 96, 3, 3] in model yolo_head.pred_cls.2.weight. And the weight yolo_head.pred_cls.2.weight will not be loaded
    [08/23 21:31:40] ppdet.utils.checkpoint INFO: Finish loading model weights: C:\Users\MM/.cache/paddle/weights\ppyoloe_crn_s_obj365_pretrained.pdparams
    Traceback (most recent call last):
      File "tools/train.py", line 172, in <module>
        main()
      File "tools/train.py", line 168, in main
        run(FLAGS, cfg)
      File "tools/train.py", line 132, in run
        trainer.train(FLAGS.eval)
      File "D:\0SDXX\PaddleDetection\ppdet\engine\trainer.py", line 504, in train
        outputs = model(data)
      File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
        return self._dygraph_call_func(*inputs, **kwargs)
      File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
        outputs = self.forward(*inputs, **kwargs)
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\architectures\meta_arch.py", line 59, in forward
        out = self.get_loss()
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 124, in get_loss
        return self._forward()
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 88, in _forward
        yolo_losses = self.yolo_head(neck_feats, self.inputs)
      File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
        return self._dygraph_call_func(*inputs, **kwargs)
      File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
        outputs = self.forward(*inputs, **kwargs)
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 216, in forward
        return self.forward_train(feats, targets)
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 161, in forward_train
        ], targets)
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 354, in get_loss
        assigned_scores_sum)
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 290, in _bbox_loss
        assigned_ltrb_pos) * bbox_weight
      File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 255, in _df_loss
        pred_dist, target_left, reduction='none') * weight_left
      File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\nn\functional\loss.py", line 1723, in cross_entropy
        label_max.item()))
    ValueError: Target 28 is out of upper bound.
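
The first log already shows loss_iou at about 7.7e11 on iteration 0, so the box targets themselves may be off before the cross-entropy check fails. One thing worth ruling out is malformed boxes in the converted dataset. This is only a possible cause; nothing in the logs above confirms it. The sketch below is a standalone sanity check over COCO-style annotations; the function name and the sample data are hypothetical.

```python
def find_bad_boxes(annotations, images):
    """Return (annotation id, reason) pairs for boxes that look invalid.

    annotations: COCO-style dicts with "id", "image_id", "bbox" = [x, y, w, h].
    images: dict mapping image id -> (width, height).
    """
    problems = []
    for ann in annotations:
        x, y, w, h = ann["bbox"]
        img_w, img_h = images[ann["image_id"]]
        if w <= 0 or h <= 0:
            problems.append((ann["id"], "non-positive width or height"))
        elif x < 0 or y < 0 or x + w > img_w or y + h > img_h:
            problems.append((ann["id"], "box extends outside the image"))
    return problems


# Made-up sample data for illustration only.
images = {1: (640, 480)}
annotations = [
    {"id": 10, "image_id": 1, "bbox": [10, 20, 100, 50]},    # valid
    {"id": 11, "image_id": 1, "bbox": [600, 400, 100, 50]},  # crosses the right edge
    {"id": 12, "image_id": 1, "bbox": [50, 50, 0, 30]},      # zero width
]
bad = find_bad_boxes(annotations, images)
```

Because the issue pairs a `_coco.yml` config with a VOC dataset, it may also be worth checking that the dataset section of the config matches the actual annotation format.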
    
    
    windows status/close 
    opened by monkeycc 25
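  • Training ends after only one epoch

    Following the tutorial, I ran python tools/train.py -c configs/yolov3/yolov3_mobilenet_v1_roadsign.yml and training stopped after a single epoch with no error. Environment: python 3.7, PaddleDetection cloned from the 2.1 GPU version, epoch left at the default of 12.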

    opened by lizhenhanabc 25
  • Modifying the PPYOLOFPN structure

    I modified PaddleDetection/ppdet/modeling/necks/yolo_fpn.py as follows, inserting ChannelAttention and SpatialAttention into forward:

    class PPYOLOFPN(nn.Layer):
        __shared__ = ['norm_type', 'data_format']
    
        def __init__(self,
                     in_channels=[512, 1024, 2048],
                     norm_type='bn',
                     data_format='NCHW',
                     coord_conv=False,
                     conv_block_num=2,
                     drop_block=False,
                     block_size=3,
                     keep_prob=0.9,
                     spp=False):
            """
            PPYOLOFPN layer
    
            Args:
                in_channels (list): input channels for fpn
                norm_type (str): batch norm type, default bn
                data_format (str): data format, NCHW or NHWC
                coord_conv (bool): whether use CoordConv or not
                conv_block_num (int): conv block num of each pan block
                drop_block (bool): whether use DropBlock or not
                block_size (int): block size of DropBlock
                keep_prob (float): keep probability of DropBlock
                spp (bool): whether use spp or not
    
            """
            super(PPYOLOFPN, self).__init__()
            assert len(in_channels) > 0, "in_channels length should > 0"
            self.in_channels = in_channels
            self.num_blocks = len(in_channels)
            # parse kwargs
            self.coord_conv = coord_conv
            self.drop_block = drop_block
            self.block_size = block_size
            self.keep_prob = keep_prob
            self.spp = spp
            self.conv_block_num = conv_block_num
            self.data_format = data_format
            if self.coord_conv:
                ConvLayer = CoordConv
            else:
                ConvLayer = ConvBNLayer
    
            if self.drop_block:
                dropblock_cfg = [[
                    'dropblock', DropBlock, [self.block_size, self.keep_prob],
                    dict()
                ]]
            else:
                dropblock_cfg = []
    
            self._out_channels = []
            self.yolo_blocks = []
            self.routes = []
            for i, ch_in in enumerate(self.in_channels[::-1]):
                if i > 0:
                    ch_in += 512 // (2**i)
                channel = 64 * (2**self.num_blocks) // (2**i)
                base_cfg = []
                c_in, c_out = ch_in, channel
                for j in range(self.conv_block_num):
                    base_cfg += [
                        [
                            'conv{}'.format(2 * j), ConvLayer, [c_in, c_out, 1],
                            dict(
                                padding=0, norm_type=norm_type)
                        ],
                        [
                            'conv{}'.format(2 * j + 1), ConvBNLayer,
                            [c_out, c_out * 2, 3], dict(
                                padding=1, norm_type=norm_type)
                        ],
                    ]
                    c_in, c_out = c_out * 2, c_out
    
                base_cfg += [[
                    'route', ConvLayer, [c_in, c_out, 1], dict(
                        padding=0, norm_type=norm_type)
                ], [
                    'tip', ConvLayer, [c_out, c_out * 2, 3], dict(
                        padding=1, norm_type=norm_type)
                ]]
    
                if self.conv_block_num == 2:
                    if i == 0:
                        if self.spp:
                            spp_cfg = [[
                                'spp', SPP, [channel * 4, channel, 1], dict(
                                    pool_size=[5, 9, 13], norm_type=norm_type)
                            ]]
                        else:
                            spp_cfg = []
                        cfg = base_cfg[0:3] + spp_cfg + base_cfg[
                            3:4] + dropblock_cfg + base_cfg[4:6]
                    else:
                        cfg = base_cfg[0:2] + dropblock_cfg + base_cfg[2:6]
                elif self.conv_block_num == 0:
                    if self.spp and i == 0:
                        spp_cfg = [[
                            'spp', SPP, [c_in * 4, c_in, 1], dict(
                                pool_size=[5, 9, 13], norm_type=norm_type)
                        ]]
                    else:
                        spp_cfg = []
                    cfg = spp_cfg + dropblock_cfg + base_cfg
                name = 'yolo_block.{}'.format(i)
                yolo_block = self.add_sublayer(name, PPYOLODetBlock(cfg, name))
                self.yolo_blocks.append(yolo_block)
                self._out_channels.append(channel * 2)
                if i < self.num_blocks - 1:
                    name = 'yolo_transition.{}'.format(i)
                    route = self.add_sublayer(
                        name,
                        ConvBNLayer(
                            ch_in=channel,
                            ch_out=256 // (2**i),
                            filter_size=1,
                            stride=1,
                            padding=0,
                            norm_type=norm_type,
                            data_format=data_format,
                            name=name))
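                    self.routes.append(route)

            # Build the attention layers once here so their weights are registered
            # and trained. Levels 1..N-1 are the ones concatenated with a route.
            # Note: ChannelAttention and SpatialAttention assume NCHW input.
            att_channels = self.in_channels[::-1][1:]
            self.channel_atts = nn.LayerList(
                [ChannelAttention(ch) for ch in att_channels])
            self.spatial_atts = nn.LayerList(
                [SpatialAttention() for _ in att_channels])

        def forward(self, blocks):
            assert len(blocks) == self.num_blocks
            blocks = blocks[::-1]
            yolo_feats = []
            for i, block in enumerate(blocks):
                if i > 0:
                    logger.info("entering ChannelAttention")
                    # Apply the attention maps multiplicatively (CBAM style)
                    # before concatenating with the upsampled route.
                    block = self.channel_atts[i - 1](block) * block
                    block = self.spatial_atts[i - 1](block) * block
                    axis = 1 if self.data_format == 'NCHW' else -1
                    block = paddle.concat([route, block], axis=axis)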
                route, tip = self.yolo_blocks[i](block)
                yolo_feats.append(tip)
    
                if i < self.num_blocks - 1:
                    route = self.routes[i](route)
                    route = F.interpolate(
                        route, scale_factor=2., data_format=self.data_format)
    
            return yolo_feats
    
        @classmethod
        def from_config(cls, cfg, input_shape):
            return {'in_channels': [i.channels for i in input_shape], }
    
        @property
        def out_shape(self):
            return [ShapeSpec(channels=c) for c in self._out_channels]
    
    class ChannelAttention(nn.Layer):
        def __init__(self, in_planes, ratio=16):
            super(ChannelAttention, self).__init__()
            self.avg_pool = nn.AdaptiveAvgPool2D(1)
            # The max branch needs max pooling (the original reused average pooling).
            self.max_pool = nn.AdaptiveMaxPool2D(1)

            # Paddle disables the conv bias with bias_attr=False, not bias=False.
            self.fc1 = nn.Conv2D(in_planes, in_planes // ratio, 1, bias_attr=False)
            # Use activation layers: F.relu() and F.sigmoid() cannot be called without input.
            self.relu1 = nn.ReLU()
            self.fc2 = nn.Conv2D(in_planes // ratio, in_planes, 1, bias_attr=False)

            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            logger.info("entering ChannelAttention")

            # Shared MLP over the average-pooled and max-pooled descriptors.
            avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x))))
            max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x))))
            out = avg_out + max_out
            return self.sigmoid(out)

    class SpatialAttention(nn.Layer):
        def __init__(self, kernel_size=7):
            super(SpatialAttention, self).__init__()
            logger.info("entering SpatialAttention")

            assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
            padding = 3 if kernel_size == 7 else 1

            self.conv1 = nn.Conv2D(2, 1, kernel_size, padding=padding, bias_attr=False)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            # Paddle reductions take axis= (not dim=), and paddle.max returns only the values.
            avg_out = paddle.mean(x, axis=1, keepdim=True)
            max_out = paddle.max(x, axis=1, keepdim=True)
            x = paddle.concat([avg_out, max_out], axis=1)
            x = self.conv1(x)
            return self.sigmoid(x)
    
    enhancement 
    opened by zsbjmy 25
  • Deploying the road sign detection model on Jetson Nano: GPU shows no targets, CPU shows them correctly.

    paddlepaddle-gpu version 2.0.0, PaddleDetection release 2.0 rc. Hardware: Jetson Nano, JetPack 4.3. Steps:
    1. Trained the road sign detection model (yolov3_mobilenet_v1_roadsign.yml) with tools/train.py and obtained best_model.
    2. Exported the model with tools/export_model.py:
    python tools/export_model.py -c configs/yolov3_mobilenet_v1_roadsign.yml
    --output_dir=./export_model
    -o weights=./output/yolov3_mobilenet_v1_roadsign/best_model TestReader,input_def.image_shape=[3,320,320]
    3. Deployed on the Jetson Nano with deploy/python/infer.py.
    (1) GPU inference: (image: gpu)
    (2) CPU inference displays the targets correctly: (image: cpu)

    Please follow up on this, and let me know if you need any further information. Thanks!

    deploy 
    opened by GZHUZhao 24
  • The category description for the VisDrone-DET detection models is incorrect

    Document Links & Description

    Link: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/visdrone
    The first paragraph is problematic: "The PaddleDetection team provides PP-YOLOE-based detection models for the VisDrone-DET small-object aerial scenario, which users can download and use. Download link for the VisDrone-DET dataset converted to COCO format; 10 of its classes are detected, including pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10); download link for the original dataset."
    Problem: in the passage "10 of its classes are detected, including pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10)", the class names and indices appear to be wrong.
    Steps to reproduce:
    1. Run the following command: python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --infer_img=demo/000000014439.jpg -o use_gpu=False weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_img=demo/000000014439.jpg --draw_threshold=0.1
    2. The file generated in the output directory includes traffic_light among the printed classes. (image: 000000014439)

    Please give your suggestion

    Judging from the result, something is wrong. I have not found the correct classes yet either.

    opened by tianmaxingkong168 0
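  • Why is it slower after switching to multiple processes?

    Search before asking

    • [X] I have searched the question and found no related answer.

    Please ask your question

    Why is it slower after switching to multiple processes? Pedestrian detection originally took only about 6 ms, but after enabling multiprocessing as shown below it takes more than 48 ms. Why? (image) The read_frame function reads frames and runs pedestrian detection on them.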

    opened by qq-tt 2
  • License plate recognition in PaddleDetection fails when other text is present

    Search before asking

    • [X] I have searched the question and found no related answer.

    Please ask your question

    When using license plate detection in PaddleDetection, if there is text on the front of the vehicle, the plate is not detected and the text is detected instead. How can this be solved?

    OCRv3
    opened by PengKunGit 2
  • Multiprocessing error

    Search before asking

    • [X] I have searched the issues and found no similar bug report.

    Bug Component

    No response

    Describe the Bug

    I wrote the multiprocessing code following examples found online. It runs fine on its own but raises an error when used with PaddlePaddle. (image) The error is as follows: (image)

    Environment

    linux: PaddleDetection

    Bug description confirmation

    • [X] I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.

    Are you willing to submit a PR?

    • [X] I'd like to help by submitting a PR!
    opened by qq-tt 2
Releases(v2.5.0)
  • v2.5.0(Sep 13, 2022)

    2.5(08.26/2022)

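    • Featured models

      • PP-YOLOE+:
        • Released the PP-YOLOE+ models: accuracy on COCO test2017 improves by 0.7%-2.4% mAP, training converges 3.75x faster, and end-to-end inference is 1.73-2.3x faster
        • Released pretrained models for smart agriculture, night-time security detection, and industrial quality inspection, improving accuracy by 1.3%-8.1% mAP
        • Supports 10 high-performance training and deployment capabilities, including distributed training, online quantization, and serving deployment; added 5+ deployment demo tutorials, including C++/Python Serving, native TRT inference, and ONNX Runtime
      • PP-PicoDet:
        • Released the PicoDet-NPU model, which supports fully quantized deployment
        • Added a PicoDet layout analysis model; the FGD distillation algorithm improves its accuracy by 0.5% mAP
      • PP-TinyPose
        • Released an enhanced PP-TinyPose; end-to-end accuracy on business datasets for fitness, dance, and similar scenarios improves by 9.1% AP
        • Covers unconventional poses such as turning sideways, lying down, jumping, and high knee raises
        • Added a filtering-based stabilization module that significantly improves keypoint stability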
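    • Scenario capabilities

      • PP-Human v2
        • Released PP-Human v2 with four industry-oriented features: a multi-solution action recognition case library, human attribute recognition, pedestrian flow counting with trajectory retention, and high-accuracy cross-camera tracking
        • Upgraded the underlying algorithms: pedestrian detection accuracy improves by 1.5% mAP; pedestrian tracking accuracy improves by 10.2% MOTA, with the lightweight model 34% faster; attribute recognition accuracy improves by 0.6% mA, with the lightweight model 62.5% faster
        • Provides end-to-end tutorials covering data collection and annotation, model training and optimization, inference deployment, and modifying the post-processing code in the pipeline
        • Added support for online video stream input
        • Improved usability: features run with a single command, and execution flow selection and model downloads are handled automatically behind the scenes.
      • PP-Vehicle
        • Newly released PP-Vehicle, supporting four core traffic-scenario features: license plate recognition, attribute recognition, traffic flow counting, and traffic violation detection
        • License plate recognition uses a lightweight model based on PP-OCR v3
        • Vehicle attribute recognition uses a multi-label classification model based on PP-LCNet
        • Compatible with image, video, online video stream, and other input formats
        • Improved usability: features run with a single command, and execution flow selection and model downloads are handled automatically behind the scenes.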
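    • Frontier algorithms

      • Full YOLO family of models
        • Released the full YOLO family, covering the recent detection algorithms YOLOv5, MT-YOLOv6, and YOLOv7
        • With the ConvNeXt backbone, training schedules for the YOLO algorithms are 5-8x shorter and accuracy generally improves by 1%-5% mAP; model compression gives a speedup of more than 30% with no accuracy loss
      • Added a high-accuracy detection model based on a ViT backbone, reaching 55.7% mAP on COCO
      • Added the OC-SORT multi-object tracking model
      • Added the ConvNeXt backbone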
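    • Industry practice example tutorials

      • Intelligent fitness action recognition based on the enhanced PP-TinyPose
      • Fight recognition based on PP-Human
      • Service hall visitor analysis based on PP-Human
      • Structured vehicle analysis based on PP-Vehicle
      • PCB defect detection based on PP-YOLOE+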
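    • Framework capabilities

      • New features
        • Added support for the automatic compression tool, with a demo; the PP-YOLOE l version loses 0.3% mAP while running 13% faster on V100
        • Added PaddleServing Python/C++ and ONNX Runtime deployment demos
        • Added a PP-YOLOE end-to-end TensorRT deployment demo
        • Added the FGD distillation algorithm, improving RetinaNet accuracy by 3.3%
        • Added distributed training documentation
      • Improvements and bug fixes
        • Fixed build issues in Windows C++ deployment
        • Fixed an issue saving results when predicting on VOC-format data
        • Fixed detection box output in FairMOT C++ deployment
        • The rotated box detection model S2ANet supports deployment with batch size > 1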
    Source code(tar.gz)
    Source code(zip)
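The tracking gain in these notes is reported in MOTA (Multiple Object Tracking Accuracy). As a reminder of what that number measures, the sketch below computes MOTA from its standard definition, 1 - (FN + FP + IDSW) / GT. The function name and the sample counts are illustrative assumptions, not values from PaddleDetection's evaluation code or benchmarks.

```python
def mota(false_negatives: int, false_positives: int, id_switches: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over every frame."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Illustrative counts only (hypothetical, not taken from any benchmark).
example = mota(false_negatives=120, false_positives=60, id_switches=20, num_gt=1000)
print(f"MOTA = {example:.3f}")
```

Because errors are divided by the number of ground-truth objects, MOTA can go negative when a tracker makes more errors than there are objects.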
  • v2.4.0(Apr 24, 2022)

    2.4(03.24/2022)

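    • PP-YOLOE:

      • Released the featured model PP-YOLOE; the l version reaches 51.6% accuracy on COCO test2017 at 78.1 FPS on V100, state of the art in accuracy and speed on the server side
      • Released the s/m/l/x model series, with TensorRT and ONNX deployment supported
      • Supports mixed-precision training, which trains 33% faster than PP-YOLOv2
    • PP-PicoDet:

      • Released optimized PP-PicoDet models: accuracy improves by about 2% and CPU inference is 63% faster.
      • Added the PicoDet-XS model with 0.7M parameters
      • Integrated post-processing into the network, reducing end-to-end deployment cost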
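    • Pedestrian analysis pipeline:

      • Released the PP-Human pedestrian analysis pipeline, covering pedestrian detection, attribute recognition, pedestrian tracking, cross-camera tracking, pedestrian flow counting, and action recognition, with TensorRT deployment supported
      • Attribute recognition supports the StrongBaseline model
      • ReID supports the Centroid model
      • Action recognition supports ST-GCN fall detection
    • Model coverage:

      • Released YOLOX in nano/tiny/s/m/l/x versions; the x version reaches 51.8% accuracy on COCO val2017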
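    • Framework improvements:

      • EMA training is 20% faster; improved how EMA-trained models are saved
      • Inference results can be saved in COCO format
    • Deployment improvements:

      • All RCNN-series models support ONNX export through Paddle2ONNX
      • SSD models support fusing the decode op at export time, speeding up edge deployment
      • NMS can be exported to TensorRT, improving end-to-end TensorRT deployment speed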
    Source code(tar.gz)
    Source code(zip)
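The EMA item in these notes refers to keeping an exponential moving average of model weights during training. A generic sketch of the update rule follows; it is not PaddleDetection's implementation, and the class name `WeightEMA`, the decay values, and the plain-dict weights are assumptions made for illustration.

```python
from typing import Dict


class WeightEMA:
    """Keeps shadow = decay * shadow + (1 - decay) * weight for each named weight."""

    def __init__(self, weights: Dict[str, float], decay: float = 0.9998) -> None:
        if not 0.0 <= decay < 1.0:
            raise ValueError("decay must be in [0, 1)")
        self.decay = decay
        self.shadow = dict(weights)  # start from a copy of the current weights

    def update(self, weights: Dict[str, float]) -> None:
        """Blend the latest weights into the running average."""
        for name, value in weights.items():
            self.shadow[name] = self.decay * self.shadow[name] + (1.0 - self.decay) * value


# Toy usage: a single scalar "weight" that jumps from 0.0 to 1.0.
ema = WeightEMA({"w": 0.0}, decay=0.5)
for _ in range(3):
    ema.update({"w": 1.0})
print(ema.shadow["w"])  # 0.875
```

The averaged copy lags behind the raw weights, which is why EMA checkpoints are usually saved separately from the weights being optimized.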
  • v2.3.0(Dec 9, 2021)

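    • Detection: PP-PicoDet, a lightweight mobile detection model with state-of-the-art accuracy and speed on mobile devices

    • Keypoints: PP-TinyPose, a lightweight mobile keypoint model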

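    • Model coverage:

      • Detection:

        • Added the Swin-Transformer object detection model
        • Added the TOOD (Task-aligned One-stage Object Detection) model
        • Added the GFL (Generalized Focal Loss) object detection model
        • Released the Sniper small-object detection optimization method, supporting Faster RCNN and PP-YOLO series models
        • Released the PP-YOLO-EB model optimized for EdgeBoard
      • Tracking

        • Released the real-time tracking system PP-Tracking
        • Released high-accuracy, small-scale, and lightweight FairMOT models
        • Released a model zoo of domain-specific real-time tracking models for pedestrians, heads, and vehicles, covering aerial surveillance, autonomous driving, dense crowds, tiny objects, and other scenarios
        • DeepSORT is adapted to more detectors, such as PP-YOLO and PP-PicoDet
      • Keypoints

        • Added the Lite HRNet model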
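    • Inference deployment:

      • YOLOv3 series models support NPU inference deployment
      • FairMOT supports C++ inference deployment
      • Keypoint models support C++ and Paddle Lite inference deployment
    • Documentation:

      • Added English documentation for each model series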
    Source code(tar.gz)
    Source code(zip)
  • v2.2.0(Aug 17, 2021)

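    • Model coverage:

      • Released Transformer detection models: DETR, Deformable DETR, and Sparse RCNN
      • Added the Dark model for keypoint detection and released the Dark HRNet model
      • Released HRNet keypoint detection models for the MPII dataset
      • Released domain-specific head and vehicle tracking models
    • Model optimization:

      • Released an Align Conv optimized version of the rotated box detection model S2ANet, raising mAP on DOTA to 74.0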
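    • Inference deployment

      • Mainstream models support inference deployment with batch size > 1, including YOLOv3, PP-YOLO, Faster RCNN, SSD, TTFNet, and FCOS
      • Added Python inference deployment for multi-object tracking models (JDE, FairMOT, DeepSORT), with TensorRT inference supported
      • Added Python inference deployment for the FairMOT multi-object tracking model combined with keypoint detection models
      • Added inference deployment for keypoint detection models combined with PP-YOLO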
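    • Documentation:

      • Added TensorRT version notes to the Windows inference deployment documentation
      • Updated the FAQ
    • Bug fixes:

      • Fixed training convergence issues in PP-YOLO series models
      • Fixed training on unlabeled data when batch size > 1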
    Source code(tar.gz)
    Source code(zip)
  • v2.1.0(May 20, 2021)

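    • Expanded model coverage:

      • Released the keypoint models HRNet and HigherHRNet
      • Released the multi-object tracking models DeepSORT, FairMOT, and JDE
    • Core framework capabilities:

      • Supports training on images without annotated boxes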
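    • Inference deployment:

      • Paddle Inference supports batch size > 1 inference for YOLOv3 series models
      • The rotated box detection model S2ANet supports inference deployment
      • Added a benchmark for quantized models
      • Added Paddle-Lite demos for dynamic graph and static graph models
    • Detection model compression:

      • Released compressed PP-YOLO series models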
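    • Documentation:

      • Updated the quick start, inference deployment, and other tutorials
      • Added an ONNX model export tutorial
      • Added mobile deployment documentation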
    Source code(tar.gz)
    Source code(zip)
  • v2.0.0(Apr 19, 2021)

    2.0(04.15/2021)

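    Note: Starting with version 2.0, the dynamic graph is the default for PaddleDetection. The former dygraph directory becomes the repository root, and the original static graph implementation moves to the static directory.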

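    • Expanded dynamic graph model coverage:

      • Released the PP-YOLOv2 and PP-YOLO tiny models; PP-YOLOv2 reaches 49.5% accuracy on COCO test at 68.9 FPS on V100
      • Released the rotated box detection model S2ANet
      • Released the practical two-stage model PSS-Det
      • Released the face detection model BlazeFace
    • New basic modules:

      • Added the SENet, GhostNet, and Res2Net backbones
      • Added VisualDL training visualization support
      • Added per-class accuracy computation and PR curve plotting
      • YOLO series models support the NHWC data format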
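    • Inference deployment:

      • Released inference benchmark data for the main models
      • Adapted to TensorRT 6, with support for TensorRT dynamic-shape input and TensorRT int8 quantized inference
      • Python/C++/TRT inference deployment is supported on Linux, Windows, and NV Jetson for 7 model families, including PP-YOLO, YOLOv3, SSD, TTFNet, FCOS, and Faster RCNN
    • Detection model compression:

      • Distillation: added dynamic graph distillation support and released a distilled YOLOv3-MobileNetV1 model
      • Joint strategy: added a dynamic graph pruning + distillation compression scheme and released a pruned and distilled YOLOv3-MobileNetV1 model
      • Bug fix: fixed exporting dynamic graph quantized models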
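    • Documentation:

      • Added English dynamic graph documentation, including the home page, getting started, quick start, model algorithms, and adding new datasets
      • Added dynamic graph installation documentation in Chinese and English
      • Added configuration file templates and configuration option documentation for the dynamic graph RCNN and YOLO series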
    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Nov 27, 2019)

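    • Based on PaddlePaddle v1.6.1.
    • Models include: Faster R-CNN, Mask R-CNN, Faster R-CNN+FPN, Mask R-CNN+FPN, Cascade-Faster-RCNN+FPN, Cascade-Mask-RCNN+FPN, RetinaNet, YOLOv3, and SSD, as well as the face detection models FaceBoxes and BlazeFace.
    • The enhanced YOLOv3 reaches 41.4% accuracy on COCO, and the CBResNet200-vd-FPN-Nonlocal model reaches 53.3% on COCO; pretrained pedestrian and vehicle detection models are included
    • Supports sync-bn, multi-scale training, multi-scale testing, and FP16 training; includes inference benchmarks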
    Source code(tar.gz)
    Source code(zip)
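Multi-scale training, listed in the v0.1.0 notes, typically means picking a different input resolution for each training batch. The sketch below shows that idea in isolation; the candidate sizes (multiples of 32 from 320 to 608, a range commonly used with YOLOv3) and the function name are assumptions, not PaddleDetection configuration values.

```python
import random
from typing import List, Sequence

# Assumed candidate resolutions: 320, 352, ..., 608 (all multiples of 32).
CANDIDATE_SIZES: List[int] = list(range(320, 609, 32))


def pick_batch_sizes(num_batches: int, sizes: Sequence[int] = CANDIDATE_SIZES, seed: int = 0) -> List[int]:
    """Return one square input size per batch, drawn uniformly from sizes."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [rng.choice(sizes) for _ in range(num_batches)]


for step, size in enumerate(pick_batch_sizes(5)):
    print(f"batch {step}: resize images to {size}x{size}")
```

Keeping every size a multiple of the network stride (32 here) means the feature maps stay integer-sized at every resolution.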
N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting

N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting Recent progress in neural forecasting instigated significant improvements in the

Cristian Challu 82 Jan 04, 2023
Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference

RawVSR This repo contains the official codes for our paper: Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference Xiaoh

Xiaohong Liu 23 Oct 08, 2022
Predicting Event Memorability from Contextual Visual Semantics

Predicting Event Memorability from Contextual Visual Semantics

0 Oct 06, 2021
This repository contains several jupyter notebooks to help users learn to use neon, our deep learning framework

neon_course This repository contains several jupyter notebooks to help users learn to use neon, our deep learning framework. For more information, see

Nervana 92 Jan 03, 2023
Using some basic methods to show linkages and transformations of robotic arms

roboticArmVisualizer Python GUI application to create custom linkages and adjust joint angles. In the future, I plan to add 2d inverse kinematics solv

Sandesh Banskota 1 Nov 19, 2021
Official PyTorch Implementation of Convolutional Hough Matching Networks, CVPR 2021 (oral)

Convolutional Hough Matching Networks This is the implementation of the paper "Convolutional Hough Matching Network" by J. Min and M. Cho. Implemented

Juhong Min 70 Nov 22, 2022
Machine Learning Model deployment for Container (TensorFlow Serving)

try_tf_serving ├───dataset │ ├───testing │ │ ├───paper │ │ ├───rock │ │ └───scissors │ └───training │ ├───paper │ ├───rock

Azhar Rizki Zulma 5 Jan 07, 2022
DataCLUE: China's first data-centric AI benchmark (with model analysis report)

DataCLUE: A Benchmark Suite for Data-centric NLP You can get the English version of README. Data-centric AI benchmark (DataCLUE) Contents Section Description Introduction Introduces the data-centric AI benchmark (DataCLUE

CLUE benchmark 135 Dec 22, 2022
Probabilistic Cross-Modal Embedding (PCME) CVPR 2021

Probabilistic Cross-Modal Embedding (PCME) CVPR 2021 Official Pytorch implementation of PCME | Paper Sanghyuk Chun1 Seong Joon Oh1 Rafael Sampaio de R

NAVER AI 87 Dec 21, 2022
Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!

Alias-Free Generative Adversarial Networks (StyleGAN3) Official PyTorch implementation of the NeurIPS 2021 paper Alias-Free Generative Adversarial Net

Diego Porres 185 Dec 24, 2022
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP

CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2

168 Dec 29, 2022
The repository offers the official implementation of our paper in PyTorch.

Cloth Interactive Transformer (CIT) Cloth Interactive Transformer for Virtual Try-On Bin Ren1, Hao Tang1, Fanyang Meng2, Runwei Ding3, Ling Shao4, Phi

Bingoren 49 Dec 01, 2022
KIDA: Knowledge Inheritance in Data Aggregation

KIDA: Knowledge Inheritance in Data Aggregation This project releases our 1st place solution on NeurIPS2021 ML4CO Dual Task. Slide and model weights a

24 Sep 08, 2022
A Pytree Module system for Deep Learning in JAX

Treex A Pytree-based Module system for Deep Learning in JAX Intuitive: Modules are simple Python objects that respect Object-Oriented semantics and sh

Cristian Garcia 216 Dec 20, 2022
Delving into Localization Errors for Monocular 3D Object Detection, CVPR'2021

Delving into Localization Errors for Monocular 3D Detection By Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang. Intr

XINZHU.MA 124 Jan 04, 2023
Source code for paper "Deep Diffusion Models for Robust Channel Estimation", TBA.

diffusion-channels Source code for paper "Deep Diffusion Models for Robust Channel Estimation". Generic flow: Use 'matlab/main.mat' to generate traini

The University of Texas Computational Sensing and Imaging Lab 15 Dec 22, 2022
Background-Click Supervision for Temporal Action Localization

Background-Click Supervision for Temporal Action Localization This repository is the official implementation of BackTAL. In this work, we study the te

LeYang 221 Oct 09, 2022
Simple machine learning library / 簡單易用的機器學習套件

FukuML Simple machine learning library / 簡單易用的機器學習套件 Installation $ pip install FukuML Tutorial Lesson 1: Perceptron Binary Classification Learning Al

Fukuball Lin 279 Sep 15, 2022
Like Dirt-Samples, but cleaned up

Clean-Samples Like Dirt-Samples, but cleaned up, with clear provenance and license info (generally a permissive creative commons licence but check the

TidalCycles 39 Nov 30, 2022
Populating 3D Scenes by Learning Human-Scene Interaction https://posa.is.tue.mpg.de/

Populating 3D Scenes by Learning Human-Scene Interaction [Project Page] [Paper] License Software Copyright License for non-commercial scientific resea

Mohamed Hassan 81 Nov 08, 2022