Overview

YOLOv2 in Keras and Applications

This repo contains the implementation of YOLOv2 in Keras with Tensorflow backend. It supports training the YOLOv2 network with various backends such as MobileNet and InceptionV3. Links to demo applications are shown below. Check out https://experiencor.github.io/yolo_demo/demo.html for a Raccoon Detector demo that runs entirely in the browser with DeepLearn.js and the MobileNet backend (it somehow breaks on Windows). The source code of this demo is located at https://git.io/vF7vG.

Todo list:

  • Warmup training
  • Raccoon detection, Self-driving car, and Kangaroo detection
  • SqueezeNet, MobileNet, InceptionV3, and ResNet50 backends
  • Support python 2.7 and 3.6
  • Multiple-GPU training
  • Multiscale training
  • mAP Evaluation

Some example applications (click for videos):

Raccoon detection

Dataset => https://github.com/experiencor/raccoon_dataset

Kangaroo detection

Dataset => https://github.com/experiencor/kangaroo

Self-driving Car

Dataset => http://cocodataset.org/#detections-challenge2017

Red blood cell detection

Dataset => https://github.com/cosmicad/dataset

Hand detection

Dataset => http://cvrr.ucsd.edu/vivachallenge/index.php/hands/hand-detection/

Usage for python code

0. Requirement

python 2.7

keras >= 2.0.8

imgaug
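
A typical environment can be set up with pip. TensorFlow itself is not listed above, so treat the exact pins below as assumptions; this repo predates TensorFlow 2.x, so a 1.x build is likely required:

pip install "keras>=2.0.8" imgaug
pip install "tensorflow<2.0"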

1. Data preparation

Download the Raccoon dataset from https://github.com/experiencor/raccoon_dataset.

Organize the dataset into 4 folders:

  • train_image_folder <= the folder that contains the train images.

  • train_annot_folder <= the folder that contains the train annotations in VOC format.

  • valid_image_folder <= the folder that contains the validation images.

  • valid_annot_folder <= the folder that contains the validation annotations in VOC format.

There is a one-to-one correspondence by file name between images and annotations. If the validation set is empty, the training set will be automatically split into training and validation sets with a ratio of 0.8.
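
For reference, here is a minimal sketch of reading one VOC-format annotation with Python's standard library; the tag names follow the usual VOC convention, and the file name is hypothetical:

import xml.etree.ElementTree as ET

# parse one VOC-format annotation (standard tags: <filename>, <object>/<name>, <bndbox>)
tree = ET.parse('train_annot_folder/raccoon-1.xml')  # hypothetical file name
root = tree.getroot()

print(root.find('filename').text)              # name of the matching image file
for obj in root.findall('object'):
    label = obj.find('name').text              # e.g. "raccoon"
    box = obj.find('bndbox')
    xmin, ymin = int(float(box.find('xmin').text)), int(float(box.find('ymin').text))
    xmax, ymax = int(float(box.find('xmax').text)), int(float(box.find('ymax').text))
    print(label, xmin, ymin, xmax, ymax)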

2. Edit the configuration file

The configuration file is a JSON file that looks like this:

{
    "model" : {
        "architecture":         "Full Yolo",    # "Tiny Yolo" or "Full Yolo" or "MobileNet" or "SqueezeNet" or "Inception3"
        "input_size":           416,
        "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
        "max_box_per_image":    10,        
        "labels":               ["raccoon"]
    },

    "train": {
        "train_image_folder":   "/home/andy/data/raccoon_dataset/images/",
        "train_annot_folder":   "/home/andy/data/raccoon_dataset/anns/",      
          
        "train_times":          10,             # the number of time to cycle through the training set, useful for small datasets
        "pretrained_weights":   "",             # specify the path of the pretrained weights, but it's fine to start from scratch
        "batch_size":           16,             # the number of images to read in each batch
        "learning_rate":        1e-4,           # the base learning rate of the default Adam rate scheduler
        "nb_epoch":             50,             # number of epoches
        "warmup_epochs":        3,              # the number of initial epochs during which the sizes of the 5 boxes in each cell is forced to match the sizes of the 5 anchors, this trick seems to improve precision emperically

        "object_scale":         5.0 ,           # determine how much to penalize wrong prediction of confidence of object predictors
        "no_object_scale":      1.0,            # determine how much to penalize wrong prediction of confidence of non-object predictors
        "coord_scale":          1.0,            # determine how much to penalize wrong position and size predictions (x, y, w, h)
        "class_scale":          1.0,            # determine how much to penalize wrong class prediction

        "debug":                true            # turn on/off the line that prints current confidence, position, size, class losses and recall
    },

    "valid": {
        "valid_image_folder":   "",
        "valid_annot_folder":   "",

        "valid_times":          1
    }
}

The model section defines the type of model to construct as well as other model parameters such as the input image size and the list of anchors. The labels setting lists the labels to be trained on. Only images that contain the listed labels are fed to the network; the rest are simply ignored. This way, a dog detector can easily be trained on the VOC or COCO dataset by setting labels to ['dog'].

Download pretrained weights for the backend (Tiny Yolo, Full Yolo, SqueezeNet, MobileNet, and InceptionV3) at:

https://drive.google.com/drive/folders/10oym4eL2RxJa0gro26vzXK__TtYOP5Ng

These weights must be put in the root folder of the repository. They are the pretrained weights for the backend only and will be loaded during model creation. The code does not work without these weights.

The pretrained weights for the whole model (both frontend and backend) of the raccoon detector can be downloaded at:

https://drive.google.com/drive/folders/10oym4eL2RxJa0gro26vzXK__TtYOP5Ng

These weights can be used as pretrained weights for any one-class object detector.

3. Generate anchors for your dataset (optional)

python gen_anchors.py -c config.json

Copy the generated anchors printed on the terminal to the anchors setting in config.json.
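
For intuition, YOLOv2-style anchors are typically obtained by running k-means on the ground-truth box sizes with an IoU-based distance. The sketch below illustrates the idea only and is not the actual gen_anchors.py implementation:

import numpy as np

def iou_wh(box, clusters):
    # IoU between one (w, h) box and each (w, h) centroid, assuming shared top-left corners
    w = np.minimum(box[0], clusters[:, 0])
    h = np.minimum(box[1], clusters[:, 1])
    inter = w * h
    return inter / (box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter)

def kmeans_anchors(boxes, k=5, iters=100):
    # boxes: float array of shape (N, 2) with ground-truth (w, h) in grid-cell units
    clusters = boxes[np.random.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # assign each box to the nearest centroid under the distance d = 1 - IoU
        nearest = np.array([1 - iou_wh(b, clusters) for b in boxes]).argmin(axis=1)
        for i in range(k):
            if (nearest == i).any():
                clusters[i] = boxes[nearest == i].mean(axis=0)
    return clusters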

4. Start the training process

python train.py -c config.json

By the end of this process, the code will write the weights of the best model to the file best_weights.h5 (or whatever name is specified in the "saved_weights_name" setting in config.json). The training process stops when the loss on the validation set has not improved for 3 consecutive epochs.
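
This early-stopping and checkpointing behaviour corresponds to standard Keras callbacks; here is a minimal sketch of the equivalent setup, not the repository's exact code:

from keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # stop when val_loss has not improved for 3 consecutive epochs
    EarlyStopping(monitor='val_loss', patience=3, mode='min', verbose=1),
    # keep only the weights of the best model seen so far
    ModelCheckpoint('best_weights.h5', monitor='val_loss',
                    save_best_only=True, mode='min', verbose=1),
]
# the callbacks would then be passed to model.fit_generator(..., callbacks=callbacks)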

5. Perform detection using trained weights on an image by running

python predict.py -c config.json -w /path/to/best_weights.h5 -i /path/to/image/or/video

It carries out detection on the image and writes a copy of the image with the detected bounding boxes to the same folder.
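
Under the hood this follows the same flow as the notebook quoted in the comments below. Here is a minimal sketch, where model, decode_netout, draw_boxes, and ANCHORS come from the notebook, the thresholds are assumptions, and raccoon.jpg is a hypothetical input:

import cv2
import numpy as np

image = cv2.imread('raccoon.jpg')                     # hypothetical input image
image = cv2.resize(image, (416, 416))
input_image = np.expand_dims(image[..., ::-1] / 255., 0)  # BGR -> RGB, scale to [0, 1], add batch dim

dummy_array = np.zeros((1, 1, 1, 1, 10, 4))           # placeholder for the true-boxes input (max_box_per_image = 10)
netout = model.predict([input_image, dummy_array])[0]

boxes = decode_netout(netout, obj_threshold=0.3, nms_threshold=0.3,
                      anchors=ANCHORS, nb_class=1)    # one class: "raccoon"
image = draw_boxes(image, boxes, labels=['raccoon'])
cv2.imwrite('raccoon_detected.jpg', image)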

Usage for jupyter notebook

Refer to the notebook (https://github.com/experiencor/basic-yolo-keras/blob/master/Yolo%20Step-by-Step.ipynb) for a complete walk-through implementation of YOLOv2 from scratch (training, testing, and scoring).

Evaluation of the current implementation:

Train        Test        mAP (with this implementation)   mAP (on released weights)
COCO train   COCO val    28.6                             42.1

The code to evaluate detection results can be found at https://github.com/experiencor/basic-yolo-keras/issues/27.

Copyright

See LICENSE for details.

Comments
  • Quick Questions

    Quick Questions

    Hello

    Are you using multiscale training? Also, do you have pretrained weights on VOC data? Below is an image of a blood smear: [image]

    I want to detect the purple-colored and red-colored cells. I have done the annotations, but I only have 300 images, with 15-20 annotations per image. What do you recommend?

    opened by akshaylamba 32
  • Difficulty in training multiple classes....

    Difficulty in training multiple classes....

    ValueError: Cannot feed value of shape (30,) for Tensor u'Placeholder_41:0', which has shape '(35,)'

    I get this error when I try to train two classes starting from the tiny_yolo_raccoon.h5 pretrained weights. Training one class is not a problem, but with two classes I get this error. I want to use it for traffic sign detection and classification with more than 10 classes. Can I do that? Also, if anyone has weights for multi-class training, please share them; it would be very helpful.

    Can anyone help. Thanks in advance.

    opened by Dhagash4 27
  • Understand how to start

    Understand how to start

    Hi :) My goal is to understand how I can apply fine-tuning to such a neural network, so I want to run your code and play with it ;) My problem is that deep learning is a new topic in my studies, so I have some trouble understanding each part of your software. I have read the YOLO paper and I have an idea of how it works. I have prepared the dataset as you suggest in the section "Usage for python code", but my first question is:

    • Who generates annotations from images in VOC format? Are annotations only labels for images? What is their meaning?

    Once the annotations are generated, I have to generate anchors. Does your gen_anchors script update the anchors entry in the config.json file? Thank you so much for your response!

    opened by frasab 25
  • local variable epoch_logs is not assigned

    local variable epoch_logs is not assigned

    Hi @experiencor, I always get the same error and I can't solve it. The error is: UnboundLocalError: local variable 'epoch_logs' referenced before assignment

    How can I solve it?

    opened by AhmedAAkl 22
  • YOLO version 3 equivalent in Keras

    YOLO version 3 equivalent in Keras

    Can anybody help me build the YOLOv3 architecture in Keras? Here is the link; I think it's a residual network: https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg @experiencor

    opened by hiba007 19
  • No module named expat; use SimpleXMLTreeBuilder instead error

    No module named expat; use SimpleXMLTreeBuilder instead error

    Hi, when I run python train.py -c config.json, the following error appears:

    Using TensorFlow backend.
    Traceback (most recent call last):
      File "train.py", line 140, in <module>
        _main_(args)
      File "train.py", line 76, in _main_
        config['model']['labels'])
      File "/home/julio_jcgc/yolo_tutorial/basic-yolo-keras/preprocessing.py", line 21, in parse_annotation
        tree = ET.parse(ann_dir + ann)
      File "/opt/bitnami/python/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
        tree.parse(source, parser)
      File "/opt/bitnami/python/lib/python2.7/xml/etree/ElementTree.py", line 651, in parse
        parser = XMLParser(target=TreeBuilder())
      File "/opt/bitnami/python/lib/python2.7/xml/etree/ElementTree.py", line 1476, in __init__
        "No module named expat; use SimpleXMLTreeBuilder instead"
    ImportError: No module named expat; use SimpleXMLTreeBuilder instead
    

    Thank you.

    opened by jcgarciaca 18
  • Load VOC 2007+2012 weights

    Load VOC 2007+2012 weights

    I am trying to load the VOC YOLOv2 weights from the yolo website. I am working in the jupyter notebook provided in this repository. Here are the modified parameters:

    LABELS = ['Person', 'Car', 'Bicycle', 'Bus', 'Motorbike', 'Train', 'Aeroplane', 'Chair', 'Bottle', 'Dining Table', 'Potted Plant', 'TV/Monitor', 'Sofa', 'Bird', 'Cat', 'Cow', 'Dog', 'Horse', 'Sheep']
    
    IMAGE_H, IMAGE_W = 416, 416
    GRID_H,  GRID_W  = 13 , 13
    BOX              = 5
    CLASS            = len(LABELS)
    CLASS_WEIGHTS    = np.ones(CLASS, dtype='float32')
    OBJ_THRESHOLD    = 0.3#0.5
    NMS_THRESHOLD    = 0.3#0.45
    ANCHORS          = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828]
    
    NO_OBJECT_SCALE  = 1.0
    OBJECT_SCALE     = 5.0
    COORD_SCALE      = 1.0
    CLASS_SCALE      = 1.0
    
    BATCH_SIZE       = 16
    WARM_UP_BATCHES  = 0
    TRUE_BOX_BUFFER  = 50
    

    Here is the model summary:

    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_5 (InputLayer)            (None, 416, 416, 3)  0                                            
    __________________________________________________________________________________________________
    conv_0 (Conv2D)                 (None, 416, 416, 32) 864         input_5[0][0]                    
    __________________________________________________________________________________________________
    batch_norm_0 (BatchNormalizatio (None, 416, 416, 32) 128         conv_0[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_45 (LeakyReLU)      (None, 416, 416, 32) 0           batch_norm_0[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_11 (MaxPooling2D) (None, 208, 208, 32) 0           leaky_re_lu_45[0][0]             
    __________________________________________________________________________________________________
    conv_1 (Conv2D)                 (None, 208, 208, 64) 18432       max_pooling2d_11[0][0]           
    __________________________________________________________________________________________________
    batch_norm_1 (BatchNormalizatio (None, 208, 208, 64) 256         conv_1[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_46 (LeakyReLU)      (None, 208, 208, 64) 0           batch_norm_1[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_12 (MaxPooling2D) (None, 104, 104, 64) 0           leaky_re_lu_46[0][0]             
    __________________________________________________________________________________________________
    conv_2 (Conv2D)                 (None, 104, 104, 128 73728       max_pooling2d_12[0][0]           
    __________________________________________________________________________________________________
    batch_norm_2 (BatchNormalizatio (None, 104, 104, 128 512         conv_2[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_47 (LeakyReLU)      (None, 104, 104, 128 0           batch_norm_2[0][0]               
    __________________________________________________________________________________________________
    conv_3 (Conv2D)                 (None, 104, 104, 64) 8192        leaky_re_lu_47[0][0]             
    __________________________________________________________________________________________________
    batch_norm_3 (BatchNormalizatio (None, 104, 104, 64) 256         conv_3[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_48 (LeakyReLU)      (None, 104, 104, 64) 0           batch_norm_3[0][0]               
    __________________________________________________________________________________________________
    conv_4 (Conv2D)                 (None, 104, 104, 128 73728       leaky_re_lu_48[0][0]             
    __________________________________________________________________________________________________
    batch_norm_4 (BatchNormalizatio (None, 104, 104, 128 512         conv_4[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_49 (LeakyReLU)      (None, 104, 104, 128 0           batch_norm_4[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_13 (MaxPooling2D) (None, 52, 52, 128)  0           leaky_re_lu_49[0][0]             
    __________________________________________________________________________________________________
    conv_5 (Conv2D)                 (None, 52, 52, 256)  294912      max_pooling2d_13[0][0]           
    __________________________________________________________________________________________________
    batch_norm_5 (BatchNormalizatio (None, 52, 52, 256)  1024        conv_5[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_50 (LeakyReLU)      (None, 52, 52, 256)  0           batch_norm_5[0][0]               
    __________________________________________________________________________________________________
    conv_6 (Conv2D)                 (None, 52, 52, 128)  32768       leaky_re_lu_50[0][0]             
    __________________________________________________________________________________________________
    batch_norm_6 (BatchNormalizatio (None, 52, 52, 128)  512         conv_6[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_51 (LeakyReLU)      (None, 52, 52, 128)  0           batch_norm_6[0][0]               
    __________________________________________________________________________________________________
    conv_7 (Conv2D)                 (None, 52, 52, 256)  294912      leaky_re_lu_51[0][0]             
    __________________________________________________________________________________________________
    batch_norm_7 (BatchNormalizatio (None, 52, 52, 256)  1024        conv_7[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_52 (LeakyReLU)      (None, 52, 52, 256)  0           batch_norm_7[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_14 (MaxPooling2D) (None, 26, 26, 256)  0           leaky_re_lu_52[0][0]             
    __________________________________________________________________________________________________
    conv_8 (Conv2D)                 (None, 26, 26, 512)  1179648     max_pooling2d_14[0][0]           
    __________________________________________________________________________________________________
    batch_norm_8 (BatchNormalizatio (None, 26, 26, 512)  2048        conv_8[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_53 (LeakyReLU)      (None, 26, 26, 512)  0           batch_norm_8[0][0]               
    __________________________________________________________________________________________________
    conv_9 (Conv2D)                 (None, 26, 26, 256)  131072      leaky_re_lu_53[0][0]             
    __________________________________________________________________________________________________
    batch_norm_9 (BatchNormalizatio (None, 26, 26, 256)  1024        conv_9[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_54 (LeakyReLU)      (None, 26, 26, 256)  0           batch_norm_9[0][0]               
    __________________________________________________________________________________________________
    conv_10 (Conv2D)                (None, 26, 26, 512)  1179648     leaky_re_lu_54[0][0]             
    __________________________________________________________________________________________________
    batch_norm_10 (BatchNormalizati (None, 26, 26, 512)  2048        conv_10[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_55 (LeakyReLU)      (None, 26, 26, 512)  0           batch_norm_10[0][0]              
    __________________________________________________________________________________________________
    conv_11 (Conv2D)                (None, 26, 26, 256)  131072      leaky_re_lu_55[0][0]             
    __________________________________________________________________________________________________
    batch_norm_11 (BatchNormalizati (None, 26, 26, 256)  1024        conv_11[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_56 (LeakyReLU)      (None, 26, 26, 256)  0           batch_norm_11[0][0]              
    __________________________________________________________________________________________________
    conv_12 (Conv2D)                (None, 26, 26, 512)  1179648     leaky_re_lu_56[0][0]             
    __________________________________________________________________________________________________
    batch_norm_12 (BatchNormalizati (None, 26, 26, 512)  2048        conv_12[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_57 (LeakyReLU)      (None, 26, 26, 512)  0           batch_norm_12[0][0]              
    __________________________________________________________________________________________________
    max_pooling2d_15 (MaxPooling2D) (None, 13, 13, 512)  0           leaky_re_lu_57[0][0]             
    __________________________________________________________________________________________________
    conv_13 (Conv2D)                (None, 13, 13, 1024) 4718592     max_pooling2d_15[0][0]           
    __________________________________________________________________________________________________
    batch_norm_13 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_13[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_58 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_13[0][0]              
    __________________________________________________________________________________________________
    conv_14 (Conv2D)                (None, 13, 13, 512)  524288      leaky_re_lu_58[0][0]             
    __________________________________________________________________________________________________
    batch_norm_14 (BatchNormalizati (None, 13, 13, 512)  2048        conv_14[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_59 (LeakyReLU)      (None, 13, 13, 512)  0           batch_norm_14[0][0]              
    __________________________________________________________________________________________________
    conv_15 (Conv2D)                (None, 13, 13, 1024) 4718592     leaky_re_lu_59[0][0]             
    __________________________________________________________________________________________________
    batch_norm_15 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_15[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_60 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_15[0][0]              
    __________________________________________________________________________________________________
    conv_16 (Conv2D)                (None, 13, 13, 512)  524288      leaky_re_lu_60[0][0]             
    __________________________________________________________________________________________________
    batch_norm_16 (BatchNormalizati (None, 13, 13, 512)  2048        conv_16[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_61 (LeakyReLU)      (None, 13, 13, 512)  0           batch_norm_16[0][0]              
    __________________________________________________________________________________________________
    conv_17 (Conv2D)                (None, 13, 13, 1024) 4718592     leaky_re_lu_61[0][0]             
    __________________________________________________________________________________________________
    batch_norm_17 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_17[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_62 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_17[0][0]              
    __________________________________________________________________________________________________
    conv_18 (Conv2D)                (None, 13, 13, 1024) 9437184     leaky_re_lu_62[0][0]             
    __________________________________________________________________________________________________
    batch_norm_18 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_18[0][0]                    
    __________________________________________________________________________________________________
    conv_20 (Conv2D)                (None, 26, 26, 64)   32768       leaky_re_lu_57[0][0]             
    __________________________________________________________________________________________________
    leaky_re_lu_63 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_18[0][0]              
    __________________________________________________________________________________________________
    batch_norm_20 (BatchNormalizati (None, 26, 26, 64)   256         conv_20[0][0]                    
    __________________________________________________________________________________________________
    conv_19 (Conv2D)                (None, 13, 13, 1024) 9437184     leaky_re_lu_63[0][0]             
    __________________________________________________________________________________________________
    leaky_re_lu_65 (LeakyReLU)      (None, 26, 26, 64)   0           batch_norm_20[0][0]              
    __________________________________________________________________________________________________
    batch_norm_19 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_19[0][0]                    
    __________________________________________________________________________________________________
    lambda_4 (Lambda)               (None, 13, 13, 256)  0           leaky_re_lu_65[0][0]             
    __________________________________________________________________________________________________
    leaky_re_lu_64 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_19[0][0]              
    __________________________________________________________________________________________________
    concatenate_3 (Concatenate)     (None, 13, 13, 1280) 0           lambda_4[0][0]                   
                                                                     leaky_re_lu_64[0][0]             
    __________________________________________________________________________________________________
    conv_21 (Conv2D)                (None, 13, 13, 1024) 11796480    concatenate_3[0][0]              
    __________________________________________________________________________________________________
    batch_norm_21 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_21[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_66 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_21[0][0]              
    __________________________________________________________________________________________________
    conv2d_3 (Conv2D)               (None, 13, 13, 120)  123000      leaky_re_lu_66[0][0]             
    __________________________________________________________________________________________________
    reshape_3 (Reshape)             (None, 13, 13, 5, 24 0           conv2d_3[0][0]                   
    __________________________________________________________________________________________________
    input_4 (InputLayer)            (None, 1, 1, 1, 50,  0                                            
    __________________________________________________________________________________________________
    lambda_5 (Lambda)               (None, 13, 13, 5, 24 0           reshape_3[0][0]                  
                                                                     input_4[0][0]                    
    ==================================================================================================
    Total params: 50,670,936
    Trainable params: 50,650,264
    Non-trainable params: 20,672
    __________________________________________________________________________________________________
    

    I "succesfully" read the weights (succesfully as in no errors):

    weight_reader = WeightReader('yolov2-voc.weights')
    
    for index in range(conv_count):
        conv_layer = model.get_layer('conv_%i' % index)
        norm_layer = model.get_layer('batch_norm_%i' % index)
        
        size = np.prod(norm_layer.get_weights()[0].shape) # get product of shape (total values)
        
        # read sizes
        beta  = weight_reader.read(size)
        gamma = weight_reader.read(size)
        mean  = weight_reader.read(size)
        var   = weight_reader.read(size)
        
        norm_layer.set_weights([gamma, beta, mean, var])
        
        if len(conv_layer.get_weights()) > 1: 
            bias   = weight_reader.read(np.prod(conv_layer.get_weights()[1].shape)) 
            kernel = weight_reader.read(np.prod(conv_layer.get_weights()[0].shape))
            kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape))) 
            kernel = kernel.transpose([2, 3, 1, 0])
            conv_layer.set_weights([kernel, bias])
        else:
            kernel = weight_reader.read(np.prod(conv_layer.get_weights()[0].shape))
            kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape)))
            kernel = kernel.transpose([2, 3, 1, 0])
            conv_layer.set_weights([kernel])
    

    When I try to run the following code, no boxes are created:

    img = cv2.imread("dog-cycle-car.png")
    img = cv2.resize(img, (416, 416)) # resize to the input dimension
    img = img / 255
    img = img[..., ::-1] # .transpose((2, 0, 1))  # BGR -> RGB | H X W C -> C X H X W 
    img_input = np.array([img])
    
    dummy_array = np.zeros((1, 1, 1, 1, TRUE_BOX_BUFFER, 4))
    
    test_prediction = model.predict([img_input, dummy_array])
    boxes = decode_netout(test_prediction[0],
                          obj_threshold=OBJ_THRESHOLD,
                          nms_threshold=NMS_THRESHOLD,
                          anchors=ANCHORS, 
                          nb_class=CLASS)
    img = draw_boxes(img, boxes, labels=LABELS)
    img.shape
    plt.imshow(img)
    boxes
    

    I assume there must be an issue with how I am loading the weights, because the summary of the model looks the same and that is the only thing I have modified in the notebook. I would be super grateful if someone could shed some light on why this may be happening and what I am doing wrong.

    Edit

    If it would be helpful to see my Jupyter notebook, just ask for a link.

    opened by zoecarver 17
  • train on raccoon

    train on raccoon

    Hi, I want to train the Full Yolo model with one GTX 1080 on the raccoon dataset. First, please see my config.json:

    {
        "model" : {
            "architecture":         "Full Yolo",
            "input_size":           416,
            "anchors":            [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
            "max_box_per_image":    20,        
            "labels":               ["raccoon"]
    
        },
    
        "train": {
            "train_image_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/images/train/",
            "train_annot_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/annotations/train/",     
              
            "train_times":          10,
            "pretrained_weights":   "",
            "batch_size":           12,
            "learning_rate":        1e-4,
            "nb_epoch":             50,
            "warmup_epochs":        3,
    
            "object_scale":         5.0 ,
            "no_object_scale":      1.0,
            "coord_scale":          1.0,
            "class_scale":          1.0,
    
            "saved_weights_name":   "full_yolo_raccoon.h5",
            "debug":                true
        },
    
        "valid": {
            "valid_image_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/images/val/",
            "valid_annot_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/annotations/val/",
    
            "valid_times":          1
        }
    }
    

    Now I train. Epoch 1: current recalls are above 98%: http://uupload.ir/files/jv2c_screenshot_from_2018-03-14_21-54-08.png

    Epoch 1, a couple of iterations later: recall goes to zero: http://uupload.ir/files/pxfl_screenshot_from_2018-03-14_21-54-27.png

    Epoch 3, end of training: recall is still 0: http://uupload.ir/files/i2qp_screenshot_from_2018-03-14_21-57-35.png

    When I tested on raccoon images, no object was found: http://uupload.ir/files/sgms_screenshot_from_2018-03-14_22-21-59.png

    Why? Where is the problem?

    opened by PythonImageDeveloper 17
  • Support single channel images

    Support single channel images

    Hi, it seems that your script always assumes that the image has 3 channels:

    input_image = Input(shape=(self.input_size, self.input_size, 3))

    It would be nice if you changed the script so that it also supports single-channel images.
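
    In the meantime, a common workaround is to stack a grayscale image to three channels before feeding it in; a minimal sketch (the file name is hypothetical):

    import cv2
    import numpy as np

    gray = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)     # single-channel image, shape (H, W)
    three_channel = np.stack([gray, gray, gray], axis=-1)   # duplicate the channel, shape (H, W, 3)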

    Thanks for the nice work!

    enhancement 
    opened by thorstenwagner 17
  • Issues in training on own dataset

    Issues in training on own dataset

    I've made some changes to train the model on my own dataset (4 classes). The model architecture is built, but during epoch 1 the following error occurs:

    Epoch 1/100000
    Traceback (most recent call last):
      File "train.py", line 137, in <module>
        main(args)
      File "train.py", line 133, in main
        debug = config['train']['debug'])
      File "/home/bhanu/Yolo-Keras/frontend.py", line 447, in train
        max_queue_size = 8)
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/legacy/interfaces.py", line 87, in wrapper
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 2114, in fit_generator
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 1826, in train_on_batch
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 1411, in _standardize_user_data
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 153, in _standardize_input_data
    ValueError: Error when checking target: expected lambda_2 to have shape (None, 11, 11, 5, 9) but got array with shape (1, 11, 11, 5, 6)

    Please help me with this.

    opened by bhanu223 16
  • Issue in training on Raccoon dataset

    Issue in training on Raccoon dataset

    Hi. I have a problem when training with the Raccoon dataset. I don't know why it always stops after running for a few minutes. I re-trained many times but the result doesn't change.

    This is my config:

    {
        "model" : {
            "architecture":         "Tiny Yolo",
            "input_size":           416,
            "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
            "max_box_per_image":    10,        
            "labels":               ["raccoon"]
        },
    
        "train": {
            "train_image_folder":   "../raccoon_dataset/train/images/",
            "train_annot_folder":   "../raccoon_dataset/train/annotations/",     
              
            "train_times":          10,
            "pretrained_weights":   "",
            "batch_size":           16,
            "learning_rate":        1e-4,
            "nb_epoch":             50,
            "warmup_epochs":        0,
    
            "object_scale":         5.0 ,
            "no_object_scale":      1.0,
            "coord_scale":          1.0,
            "class_scale":          1.0,
    
            "saved_weights_name":   "tiny_yolo_raccoon_save.h5",
            "debug":                false
        },
    
        "valid": {
            "valid_image_folder":   "../raccoon_dataset/valid/images/",
            "valid_annot_folder":   "../raccoon_dataset/valid/annotations/",
            "valid_times":          1
        }
    }
    
    
    Epoch 1/50
     10/100 [==>...........................] - ETA: 1:11 - loss: 11.5398Epoch 00001: val_loss improved from inf to 5.79008, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 1:22 - loss: 11.5398 - val_loss: 0.0000e+00Epoch 2/50
     10/100 [==>...........................] - ETA: 44s - loss: 9.3070Epoch 00002: val_loss improved from 5.79008 to 5.75633, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 55s - loss: 9.3070 - val_loss: 0.0000e+00Epoch 3/50
     10/100 [==>...........................] - ETA: 44s - loss: 6.9476Epoch 00003: val_loss improved from 5.75633 to 4.91244, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 55s - loss: 6.9476 - val_loss: 0.0000e+00Epoch 4/50
     10/100 [==>...........................] - ETA: 44s - loss: 4.7331Epoch 00004: val_loss improved from 4.91244 to 4.88484, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 55s - loss: 4.7331 - val_loss: 0.0000e+00Epoch 5/50
     10/100 [==>...........................] - ETA: 44s - loss: 3.8645Epoch 00005: val_loss did not improve
     10/100 [==>...........................] - ETA: 53s - loss: 3.8645 - val_loss: 0.0000e+00Epoch 6/50
     10/100 [==>...........................] - ETA: 44s - loss: 3.8343Epoch 00006: val_loss did not improve
     10/100 [==>...........................] - ETA: 52s - loss: 3.8343 - val_loss: 0.0000e+00Epoch 7/50
     10/100 [==>...........................] - ETA: 44s - loss: 3.4981Epoch 00007: val_loss did not improve
     10/100 [==>...........................] - ETA: 53s - loss: 3.4981 - val_loss: 0.0000e+00Epoch 00007: early stopping
    

    It trains in only a few minutes, which seems too fast! Is something wrong here? The val_loss is 0.0000e+00, but no boxes are found when I predict. I also have a question: why does the progress bar only run to 10/100 in each epoch? I have never seen it reach 100/100. Thank you so much; I hope someone replies. I have tried many times but the results are not good :(

    opened by khiemntu 14
  • How to train my Custom Dataset with this notebook

    How to train my Custom Dataset with this notebook

    Hi, I have been trying for a long time to train my own dataset with this notebook. I have faced many issues and solved some of them. First, my dataset is as follows: an images folder with training images of size 640x512, and a labels folder with a text file for each image containing lines like 0 311.489379882812 204.399459838867 36.0547180175781 24.1059265136719, i.e. class label, x, y, w, h (these values are not normalized).

    I have modified preprocessing.py to output the dataset as required by the notebook, but I still get this error:

    c:\Users\Usman\anaconda3\envs\tf-gpu\lib\site-packages\numpy\core\fromnumeric.py:43 _wrapit
        result = getattr(asarray(obj), method)(*args, **kwds)
    
    ValueError: cannot reshape array of size 10 into shape (1,1,1,1,2)
    

    and I can't seem to find where my error is.

    I have been stuck on this for about 4 days now; please help.

    opened by 316usman 0
  • Bump tensorflow-gpu from 1.3 to 2.9.3

    Bump tensorflow-gpu from 1.3 to 2.9.3

    Bumps tensorflow-gpu from 1.3 to 2.9.3.

    Release notes

    Sourced from tensorflow-gpu's releases.

    TensorFlow 2.9.3

    Release 2.9.3

    This release introduces several vulnerability fixes:

    TensorFlow 2.9.2

    Release 2.9.2

    This release introduces several vulnerability fixes:

    ... (truncated)

    Changelog

    Sourced from tensorflow-gpu's changelog.

    Release 2.9.3

    This release introduces several vulnerability fixes:

    Release 2.8.4

    This release introduces several vulnerability fixes:

    ... (truncated)

    Commits
    • a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2
    • 258f9a1 Update py_func.cc
    • cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474
    • 3e75385 Update version numbers to 2.9.3
    • bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695
    • 3506c90 Update RELEASE.md
    • 8dcb48e Update RELEASE.md
    • 4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...
    • 6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple
    • 5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • tf.Print is deprecated

    tf.Print is deprecated

    tf.Print is deprecated, so I get a TensorFlow warning telling me to replace it with tf.print, which unfortunately doesn't work either for some reason. After a bit of online research I think I understand how tf.print works in TensorFlow 2, but I'm still confused about its behaviour in TensorFlow 1. I run Yolo_Step_by_Step.ipynb with TensorFlow 1.x. Has anyone been able to make these prints work?

    opened by eirini5th 0
  • NotFoundError: 2 root error(s) found when trying to run the code on Tensorflow 2

    NotFoundError: 2 root error(s) found when trying to run the code on Tensorflow 2

    I am using the notebook on colab and I want to run it with TF2. However, I come across this error on calling model.fit_generator:

    NotFoundError: 2 root error(s) found.
      (0) Not found: Resource localhost/loss/lambda_3_loss/Variable/N10tensorflow3VarE does not exist.
    	 [[{{node loss/lambda_3_loss/AssignAddVariableOp}}]]
    	 [[Func/training/Adam/gradients/gradients/norm_6_1/cond_grad/StatelessIf/then/_1694/input/_4425/_2667]]
      (1) Not found: Resource localhost/loss/lambda_3_loss/Variable/N10tensorflow3VarE does not exist.
    	 [[{{node loss/lambda_3_loss/AssignAddVariableOp}}]]
    0 successful operations.
    0 derived errors ignored.
    

    The problem seems to be caused by the custom loss function, since I tried using a simple dummy loss function with no errors.

    The changes I've made (to no avail) are these two:

    1. Including the lines from tensorflow.python.framework.ops import disable_eager_execution and disable_eager_execution() before creating the model. Before adding these lines I came across this error:
    TypeError: Cannot convert a symbolic Keras input/output to a numpy array.
    This error may indicate that you're trying to pass a symbolic value to a NumPy call,
    which is not supported.
    Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching,
    preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
    
    2. Using tensorflow.keras instead of keras, following suggestions from similar GitHub issues and Stack Overflow posts.

    I am also adding the custom_loss code to include a few changes I made to use TF2 instead of TF1 (basically some tf.compat.v1.* additions).

    def custom_loss(y_true, y_pred):
        mask_shape = tf.shape(y_true)[:4]
        
        cell_x = tf.compat.v1.to_float(tf.reshape(tf.tile(tf.range(GRID_W), [GRID_H]), (1, GRID_H, GRID_W, 1, 1)))
        cell_y = tf.transpose(cell_x, (0,2,1,3,4))
    
        cell_grid = tf.tile(tf.concat([cell_x,cell_y], -1), [BATCH_SIZE, 1, 1, 5, 1])
        
        coord_mask = tf.zeros(mask_shape)
        conf_mask  = tf.zeros(mask_shape)
        class_mask = tf.zeros(mask_shape)
        
        seen = tf.Variable(0.)
        total_recall = tf.Variable(0.)
        
        """
        Adjust prediction
        """
        ### adjust x and y      
        pred_box_xy = tf.sigmoid(y_pred[..., :2]) + cell_grid
        
        ### adjust w and h
        pred_box_wh = tf.exp(y_pred[..., 2:4]) * np.reshape(ANCHORS, [1,1,1,BOX,2])
        
        ### adjust confidence
        pred_box_conf = tf.sigmoid(y_pred[..., 4])
        
        ### adjust class probabilities
        pred_box_class = y_pred[..., 5:]
        
        """
        Adjust ground truth
        """
        ### adjust x and y
        true_box_xy = y_true[..., 0:2] # relative position to the containing cell
        
        ### adjust w and h
        true_box_wh = y_true[..., 2:4] # number of cells across, horizontally and vertically
        
        ### adjust confidence
        true_wh_half = true_box_wh / 2.
        true_mins    = true_box_xy - true_wh_half
        true_maxes   = true_box_xy + true_wh_half
        
        pred_wh_half = pred_box_wh / 2.
        pred_mins    = pred_box_xy - pred_wh_half
        pred_maxes   = pred_box_xy + pred_wh_half       
        
        intersect_mins  = tf.maximum(pred_mins,  true_mins)
        intersect_maxes = tf.minimum(pred_maxes, true_maxes)
        intersect_wh    = tf.maximum(intersect_maxes - intersect_mins, 0.)
        intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]
        
        true_areas = true_box_wh[..., 0] * true_box_wh[..., 1]
        pred_areas = pred_box_wh[..., 0] * pred_box_wh[..., 1]
    
        union_areas = pred_areas + true_areas - intersect_areas
        iou_scores  = tf.truediv(intersect_areas, union_areas)
        
        true_box_conf = iou_scores * y_true[..., 4]
        
        ### adjust class probabilities
        true_box_class = tf.argmax(y_true[..., 5:], -1)
        
        """
        Determine the masks
        """
        ### coordinate mask: simply the position of the ground truth boxes (the predictors)
        coord_mask = tf.expand_dims(y_true[..., 4], axis=-1) * COORD_SCALE
        
        ### confidence mask: penalize predictors + penalize boxes with low IOU
        # penalize the confidence of the boxes, which have IOU with some ground truth box < 0.6
        true_xy = true_boxes[..., 0:2]
        true_wh = true_boxes[..., 2:4]
        
        true_wh_half = true_wh / 2.
        true_mins    = true_xy - true_wh_half
        true_maxes   = true_xy + true_wh_half
        
        pred_xy = tf.expand_dims(pred_box_xy, 4)
        pred_wh = tf.expand_dims(pred_box_wh, 4)
        
        pred_wh_half = pred_wh / 2.
        pred_mins    = pred_xy - pred_wh_half
        pred_maxes   = pred_xy + pred_wh_half    
        
        intersect_mins  = tf.maximum(pred_mins,  true_mins)
        intersect_maxes = tf.minimum(pred_maxes, true_maxes)
        intersect_wh    = tf.maximum(intersect_maxes - intersect_mins, 0.)
        intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]
        
        true_areas = true_wh[..., 0] * true_wh[..., 1]
        pred_areas = pred_wh[..., 0] * pred_wh[..., 1]
    
        union_areas = pred_areas + true_areas - intersect_areas
        iou_scores  = tf.truediv(intersect_areas, union_areas)
    
        best_ious = tf.reduce_max(iou_scores, axis=4)
        conf_mask = conf_mask + tf.compat.v1.to_float(best_ious < 0.6) * (1 - y_true[..., 4]) * NO_OBJECT_SCALE
        
        # penalize the confidence of the boxes that are responsible for the corresponding ground truth boxes
        conf_mask = conf_mask + y_true[..., 4] * OBJECT_SCALE
        
        ### class mask: simply the position of the ground truth boxes (the predictors)
        class_mask = y_true[..., 4] * tf.gather(CLASS_WEIGHTS, true_box_class) * CLASS_SCALE       
        
        """
        Warm-up training
        """
        no_boxes_mask = tf.compat.v1.to_float(coord_mask < COORD_SCALE/2.)
        seen = tf.compat.v1.assign_add(seen, 1.)
        
        true_box_xy, true_box_wh, coord_mask = tf.cond(tf.less(seen, WARM_UP_BATCHES), 
                              lambda: [true_box_xy + (0.5 + cell_grid) * no_boxes_mask, 
                                       true_box_wh + tf.ones_like(true_box_wh) * np.reshape(ANCHORS, [1,1,1,BOX,2]) * no_boxes_mask, 
                                       tf.ones_like(coord_mask)],
                              lambda: [true_box_xy, 
                                       true_box_wh,
                                       coord_mask])
        
        """
        Finalize the loss
        """
        nb_coord_box = tf.reduce_sum(tf.compat.v1.to_float(coord_mask > 0.0))
        nb_conf_box  = tf.reduce_sum(tf.compat.v1.to_float(conf_mask  > 0.0))
        nb_class_box = tf.reduce_sum(tf.compat.v1.to_float(class_mask > 0.0))
        
        loss_xy    = tf.reduce_sum(tf.square(true_box_xy-pred_box_xy)     * coord_mask) / (nb_coord_box + 1e-6) / 2.
        loss_wh    = tf.reduce_sum(tf.square(true_box_wh-pred_box_wh)     * coord_mask) / (nb_coord_box + 1e-6) / 2.
        loss_conf  = tf.reduce_sum(tf.square(true_box_conf-pred_box_conf) * conf_mask)  / (nb_conf_box  + 1e-6) / 2.
        loss_class = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=true_box_class, logits=pred_box_class)
        loss_class = tf.reduce_sum(loss_class * class_mask) / (nb_class_box + 1e-6)
        
        loss = loss_xy + loss_wh + loss_conf + loss_class
        
        nb_true_box = tf.reduce_sum(y_true[..., 4])
        nb_pred_box = tf.reduce_sum(tf.compat.v1.to_float(true_box_conf > 0.5) * tf.compat.v1.to_float(pred_box_conf > 0.3))
    
        """
        Debugging code
        """    
        current_recall = nb_pred_box/(nb_true_box + 1e-6)
        total_recall = tf.compat.v1.assign_add(total_recall, current_recall) 
    
        loss = tf.compat.v1.Print(loss, [tf.zeros((1))], message='Dummy Line \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_xy], message='Loss XY \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_wh], message='Loss WH \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_conf], message='Loss Conf \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_class], message='Loss Class \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss], message='Total Loss \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [current_recall], message='Current Recall \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [total_recall/seen], message='Average Recall \t', summarize=1000)
        
        return loss
    
    opened by eirini5th 1
Releases: v0.1