tf2-keras implement yolov5

Overview

YOLOv5 in tesnorflow2.x-keras

模型测试

  • 训练 COCO2017(val 5k)

  • 检测效果

  • 精度/召回率

Requirements

pip3 install -r requirements.txt

Get start

  1. 训练
python3 train.py
  1. tensorboard
tensorboard --host 0.0.0.0 --logdir ./logs/ --port 8053 --samples_per_plugin=images=40
  1. 查看
http://127.0.0.1:8053
  1. 测试, 修改detect.py里面input_imagemodel_path
python3 detect.py

训练自己的数据

  1. labelme打标自己的数据
  2. 打开data/labelme2coco.py脚本, 修改如下地方
input_dir = '这里写labelme打标时保存json标记文件的目录'
output_dir = '这里写要转CoCo格式的目录,建议建一个空目录'
labels = "这里是你打标时所有的类别名, txt文本即可, 每行一个类, 类名无需加引号"
  1. 执行data/labelme2coco.py脚本会在output_dir生成对应的json文件和图片
  2. 修改train.py文件中coco_annotation_file以及num_class, 注意classes通过CoCoDataGenrator(*).coco.cats[label_id]['name']可获得,由于coco中类别不连续,所以通过coco.cats拿到的数组下标拿到的类别可能不准.
  3. 开始训练, python3 train.py
Comments
  • 关于类别损失计算的问题

    关于类别损失计算的问题

    您好,loss这段不是很理解, https://github.com/yyccR/yolov5_in_tf2_keras/blob/3e6645cbf94d2a1e11c33663e80113daa4590321/loss.py#L142-L152 请问targets最后两位应该是置信度1和最佳的anchor索引吗? https://github.com/yyccR/yolov5_in_tf2_keras/blob/3e6645cbf94d2a1e11c33663e80113daa4590321/loss.py#L288-L293 那这边split出来的true_obj, true_cls应该就是对应的置信度1和最佳的anchor索引吧。 那这个类别损失 https://github.com/yyccR/yolov5_in_tf2_keras/blob/3e6645cbf94d2a1e11c33663e80113daa4590321/loss.py#L356 计算的不是最佳anchor索引吗,是跟obj_mask 有关系吗

    opened by whalefa1I 5
  • sparse_categorical_crossentropy训练时有nan结果

    sparse_categorical_crossentropy训练时有nan结果

    有的数据会在这行出现nan https://github.com/yyccR/yolov5_in_tf2_keras/blob/033a1156c1481f4258bf24a4a8215af39682da94/loss.py#L357 查看了input的is_nan,都正常。而且把sparse_categorical_crossentropy换成binary_crossentropy就好了。 请问这两者在这里计算有差别吗,是否可以进行替换

    opened by whalefa1I 3
  • lebelme2coco处理逻辑有误

    lebelme2coco处理逻辑有误

    我在实际使用您的代码训练自己的数据集时发现,labelme2coco.py 好像缺少对shape_type == "rectangle"时的处理,导致我最后生成的json文件annotations项为空。 以下是labelme2coco.py文件100行到124行代码: ` if shape_type == "polygon": mask = labelme.utils.shape_to_mask( img.shape[:2], points, shape_type ) # cv2.imshow("",np.array(mask, dtype=np.uint8)*255) # cv2.waitKey(0)

                if group_id is None:
                    group_id = uuid.uuid1()
    
                instance = (label, group_id)
                # print(instance)
    
                if instance in masks:
                    masks[instance] = masks[instance] | mask
                else:
                    masks[instance] = mask
                # print(masks[instance].shape)
    
                if shape_type == "rectangle":
                    (x1, y1), (x2, y2) = points
                    x1, x2 = sorted([x1, x2])
                    y1, y2 = sorted([y1, y2])
                    points = [x1, y1, x2, y1, x2, y2, x1, y2]
                if shape_type == "circle": 
                ....
    

    ` 代码永远不会执行到shape_type == "rectangle"或shape_type == "circle"。

    opened by aijialin 2
  • layers.py

    layers.py

    根據ultralytics/yolov5:

    https://github.com/ultralytics/yolov5/blob/63ddb6f0d06f6309aa42bababd08c859197a27af/models/common.py#L70-L73

    這一段程式:

    https://github.com/yyccR/yolov5_in_tf2_keras/blob/46298d7c98073750176d64896ee9dc01b55c5aca/layers.py#L127-L132

    是不是應該改寫成:

        def call(self, inputs, *args, **kwargs):
            y = self.multiheadAttention(self.q(inputs), self.v(inputs), self.k(inputs)) + inputs
            x = self.fc1(x)
            x = self.fc2(x)
            x = x +  y
            return x
    
    opened by AugustusHsu 1
  • What is the mAP on COCO17 val ?

    What is the mAP on COCO17 val ?

    Hi @yyccR, thanks for your repo. I want to know if you can reach the same mAP as in original YOLOV5 (Train on COCO17 train and test on COCO17 val)? And do you have plan to release some pretrained checkpoint ?

    opened by Tyler-D 1
Releases(v1.1)
  • v1.1(Jun 24, 2022)

    v1.1 几个总结:

    • [1]. 调整tf.keras.layers.BatchNormalization的__call__方法中training=True
    • [2]. 新增TFLite/onnx格式导出与验证,详见/data/h5_to_tflite.py, /data/h5_to_onnx.py
    • [3]. 修改backbone网络里batch_size,在训练和测试时需指定,避免tflite导出时FlexOps问题
    • [4]. YoloHead里对类别不再做softmax,直接sigmoid,支持多类别输出
    • [5]. release里的yolov5s-best.h5为kaggle猫狗脸数据集的重新训练权重,训练:测试为8:2,val精度大概如下:

    | class | [email protected] | [email protected]:0.95 | precision | recall | | :-: | :-: | :-: | :-: | :-: | | cat | 0.962680 | 0.672483 | 0.721003 | 0.958333 | | dog | 0.934285 | 0.546893 | 0.770701 | 0.923664 | | total | 0.948482 | 0.609688 | 0.745852 | 0.940999 |

    • [6]. release里的yolov5s-best.tflite为上述yolov5s-best.h5的tflite量化模型,建议用Netron软件打开查看输入输出
    • [7]. release里的yolov5s-best.onnx为上述yolov5s-best.h5的onnx模型,建议用Netron软件打开查看输入输出
    • [8]. android 模型测试效果如下:

    就这样,继续加油!💪🏻💪🏻💪🏻

    Source code(tar.gz)
    Source code(zip)
    yolov5s-best.h5(27.51 MB)
    yolov5s-best.onnx(27.25 MB)
    yolov5s-best.tflite(6.95 MB)
  • v1.0(Jun 21, 2022)

    v1.0 几个总结:

    • [1]. 模型结构总的与 ultralytics/yolov5 v6.0 保持一致
    • [2]. 其中Conv层替换swishRelu
    • [3]. 整体数据增强与 ultralytics/yolov5 保持一致
    • [4]. readme中训练所需的数据集为kaggle公开猫狗脸检测数据集,已放到release列表中
    • [5]. 为什么不训练coco数据集?因为没资源,跑一个coco要很久的,服务器一直都有任务在跑所以没空去跑 - . -
    • [6]. release里的yolov5s-best.h5为上述kaggle猫狗脸数据集的训练权重,训练:测试为8:2,val精度大概如下:

    | class | [email protected] | [email protected]:0.95 | precision | recall | | :-: | :-: | :-: | :-: | :-: | | cat | 0.905156 | 0.584378 | 0.682848 | 0.886555 | | dog | 0.940633 | 0.513005 | 0.724036 | 0.934866 | | total | 0.922895 | 0.548692 | 0.703442 | 0.910710 |

    就这样,继续加油!💪🏻💪🏻💪🏻

    Source code(tar.gz)
    Source code(zip)
    JPEGImages.zip(260.17 MB)
    yolov5s-best.h5(27.51 MB)
Owner
yangcheng
yangcheng
This repo holds the code of TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

TransFuse This repo holds the code of TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation Requirements Pytorch=1.6.0, 1.9.0 (=1.

Rayicer 93 Dec 19, 2022
[NeurIPS 2021] SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning

SSUL - Official Pytorch Implementation (NeurIPS 2021) SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning Sun

Clova AI Research 44 Dec 27, 2022
DNA sequence classification by Deep Neural Network

DNA sequence classification by Deep Neural Network: Project Overview worked on the DNA sequence classification problem where the input is the DNA sequ

Mohammed Jawwadul Islam Fida 0 Aug 02, 2022
PyTorch EO aims to make Deep Learning for Earth Observation data easy and accessible to real-world cases and research alike.

Pytorch EO Deep Learning for Earth Observation applications and research. 🚧 This project is in early development, so bugs and breaking changes are ex

earthpulse 28 Aug 25, 2022
Unsupervised clustering of high content screen samples

Microscopium Unsupervised clustering and dataset exploration for high content screens. See microscopium in action Public dataset BBBC021 from the Broa

60 Dec 05, 2022
Multi-Modal Machine Learning toolkit based on PaddlePaddle.

简体中文 | English PaddleMM 简介 飞桨多模态学习工具包 PaddleMM 旨在于提供模态联合学习和跨模态学习算法模型库,为处理图片文本等多模态数据提供高效的解决方案,助力多模态学习应用落地。 近期更新 2022.1.5 发布 PaddleMM 初始版本 v1.0 特性 丰富的任务

njustkmg 520 Dec 28, 2022
EssentialMC2 Video Understanding

EssentialMC2 Introduction EssentialMC2 is a complete system to solve video understanding tasks including MHRL(representation learning), MECR2( relatio

Alibaba 106 Dec 11, 2022
MLPs for Vision and Langauge Modeling (Coming Soon)

MLP Architectures for Vision-and-Language Modeling: An Empirical Study MLP Architectures for Vision-and-Language Modeling: An Empirical Study (Code wi

Yixin Nie 27 May 09, 2022
Unoffical reMarkable AddOn for Firefox.

reMarkable for Firefox (Download) This repo converts the offical reMarkable Chrome Extension into a Firefox AddOn published here under the name "Unoff

Jelle Schutter 45 Nov 28, 2022
[ICCV 2021] Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation

EPCDepth EPCDepth is a self-supervised monocular depth estimation model, whose supervision is coming from the other image in a stereo pair. Details ar

Rui Peng 110 Dec 23, 2022
✨✨✨An awesome open source toolbox for stereo matching.

OpenStereo This is an awesome open source toolbox for stereo matching. Supported Methods: BM SGM(T-PAMI'07) GCNet(ICCV'17) PSMNet(CVPR'18) StereoNet(E

Wang Qingyu 6 Nov 04, 2022
TensorFlow-based implementation of "Pyramid Scene Parsing Network".

PSPNet_tensorflow Important Code is fine for inference. However, the training code is just for reference and might be only used for fine-tuning. If yo

HsuanKung Yang 323 Dec 20, 2022
Multiview 3D object detection on MultiviewC dataset through moft3d.

Multiview Orthographic Feature Transformation for 3D Object Detection Multiview 3D object detection on MultiviewC dataset through moft3d. Introduction

Jiahao Ma 20 Dec 21, 2022
Experiments and examples converting Transformers to ONNX

Experiments and examples converting Transformers to ONNX This repository containes experiments and examples on converting different Transformers to ON

Philipp Schmid 4 Dec 24, 2022
Code for ACM MM2021 paper "Complementary Trilateral Decoder for Fast and Accurate Salient Object Detection"

CTDNet The PyTorch code for ACM MM2021 paper "Complementary Trilateral Decoder for Fast and Accurate Salient Object Detection" Requirements Python 3.6

CVTEAM 28 Oct 20, 2022
Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework via Self-Supervised Multi-Task Learning. Code will be available soon.

Official-PyTorch-Implementation-of-TransMEF Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fu

117 Dec 27, 2022
TRIQ implementation

TRIQ Implementation TF-Keras implementation of TRIQ as described in Transformer for Image Quality Assessment. Installation Clone this repository. Inst

Junyong You 115 Dec 30, 2022
buildseg is a building extraction plugin of QGIS based on PaddlePaddle.

buildseg buildseg is a Building Extraction plugin for QGIS based on PaddlePaddle. How to use Download and install QGIS and clone the repo : git clone

39 Dec 09, 2022
Reproduced Code for Image Forgery Detection papers.

Image Forgery Detection With over 4.5 billion active internet users, the amount of multimedia content being shared every day has surpassed everyone’s

Umar Masud 15 Dec 06, 2022
Official Pytorch Implementation for Splicing ViT Features for Semantic Appearance Transfer presenting Splice

Splicing ViT Features for Semantic Appearance Transfer [Project Page] Splice is a method for semantic appearance transfer, as described in Splicing Vi

Omer Bar Tal 253 Jan 06, 2023