C3D Model PyTorch Source Code Sentence-by-Sentence Analysis (I)
2022-07-25 10:45:00 【zzh1370894823】
Paper link: http://vlg.cs.dartmouth.edu/c3d/c3d_video.pdf
Code link: https://github.com/jfzhang95/pytorch-video-recognition
1. Source code preparation
git clone --recursive https://github.com/jfzhang95/pytorch-video-recognition.git
After the download completes, you have the C3D source code.
2. Source structure
| File name | Function |
|---|---|
| train.py | Training script |
| mypath.py | Configures the paths of the dataset and the pretrained model |
| dataset.py | Data reading and preprocessing script |
| C3D_model.py | Builds the C3D network structure |
| ucf101-caffe.pth | Pretrained model weights |
3. Source code analysis
3.1 Data reading and processing script (dataset.py)
Note that the data must be preprocessed on the first run, i.e. pass preprocess=True.
def __init__(self, dataset='ucf101', split='train', clip_len=16, preprocess=False):
    self.root_dir, self.output_dir = Path.db_dir(dataset)  # source path and output path of the dataset
    folder = os.path.join(self.output_dir, split)           # path of the corresponding split, i.e. the train/val/test directory
    self.clip_len = clip_len                                 # number of frames per clip
    self.split = split

    # The following three parameters are chosen as described in section 4.1 of the paper.
    # Each frame goes through h*w --> 128*171 --> 112*112: resize first, then crop.
    self.resize_height = 128
    self.resize_width = 171
    self.crop_size = 112                                     # crop size
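The three sizes above describe the spatial pipeline that every frame goes through. As a minimal sketch of that resize-then-crop step (the helper name and the random crop offsets are illustrative, not the exact code of this repository):

```python
import cv2
import numpy as np

def resize_then_crop(frame, resize_h=128, resize_w=171, crop_size=112):
    """Resize a frame to 128x171, then cut out a random 112x112 patch."""
    frame = cv2.resize(frame, (resize_w, resize_h))  # cv2.resize expects (width, height)
    # choose a top-left corner so that the crop stays inside the resized frame
    top = np.random.randint(0, resize_h - crop_size)
    left = np.random.randint(0, resize_w - crop_size)
    return frame[top:top + crop_size, left:left + crop_size, :]
```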
check_integrity()
check_integrity() checks whether the source path of the dataset exists; if it does not, an error is raised.
    # check_integrity() checks that the dataset source path exists; raise an error otherwise
    if not self.check_integrity():
        raise RuntimeError('Dataset not found or corrupted.' +
                           ' You need to download it from official website.')
    if (not self.check_preprocess()) or preprocess:  # decide whether preprocessing is needed
        print('Preprocessing of {} dataset, this will take long, but it will be done only once.'.format(dataset))
        self.preprocess()
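Neither check_integrity() nor check_preprocess() is listed in this post. A simplified sketch of both, assuming check_integrity() only tests that the source directory exists (which matches the error message above) and check_preprocess() only tests that the output directory with a train split exists; the functions in the repository may do more thorough checks on the extracted frames:

```python
import os

def check_integrity(self):
    # The dataset is considered present if its source directory exists on disk
    return os.path.exists(self.root_dir)

def check_preprocess(self):
    # Preprocessing is considered done if the output directory with a 'train' split already exists
    if not os.path.exists(self.output_dir):
        return False
    if not os.path.exists(os.path.join(self.output_dir, 'train')):
        return False
    return True
```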
preprocess()
Preprocesses the videos: splits them into train/val/test and extracts their frames to disk.
def preprocess(self):  # preprocess the videos
    # Create the output directory and its train/val/test subdirectories
    if not os.path.exists(self.output_dir):
        os.mkdir(self.output_dir)
        os.mkdir(os.path.join(self.output_dir, 'train'))
        os.mkdir(os.path.join(self.output_dir, 'val'))
        os.mkdir(os.path.join(self.output_dir, 'test'))
    # Split train/val/test sets
    for file in os.listdir(self.root_dir):                      # iterate over each action-class folder under the dataset root (the ucf101 directory)
        file_path = os.path.join(self.root_dir, file)            # path of this action class
        video_files = [name for name in os.listdir(file_path)]   # list of the video file names of this class

        # Split the videos of this class: 20% go to the test set, random_state=42 fixes the random seed.
        # The split is done once per class, i.e. once per loop iteration.
        train_and_valid, test = train_test_split(video_files, test_size=0.2, random_state=42)
        train, val = train_test_split(train_and_valid, test_size=0.2, random_state=42)
train, val, and test are three lists holding the video file names of the current class.
        # Create the per-class folder in each split
        train_dir = os.path.join(self.output_dir, 'train', file)
        val_dir = os.path.join(self.output_dir, 'val', file)
        test_dir = os.path.join(self.output_dir, 'test', file)

        if not os.path.exists(train_dir):
            os.mkdir(train_dir)
        if not os.path.exists(val_dir):
            os.mkdir(val_dir)
        if not os.path.exists(test_dir):
            os.mkdir(test_dir)

        for video in train:                     # train is a list of video file names
            self.process_video(video, file, train_dir)
        for video in val:
            self.process_video(video, file, val_dir)
        for video in test:
            self.process_video(video, file, test_dir)

    print('Preprocessing finished.')
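Because train_test_split is applied twice (first 80/20, then the remaining 80% again into 80/20), the final proportions per class are roughly 64% train, 16% val and 20% test. A quick standalone check of that arithmetic (the file names are made up for the example):

```python
from sklearn.model_selection import train_test_split

video_files = ['v_{:02d}.avi'.format(k) for k in range(100)]   # 100 fake videos of one class
train_and_valid, test = train_test_split(video_files, test_size=0.2, random_state=42)
train, val = train_test_split(train_and_valid, test_size=0.2, random_state=42)
print(len(train), len(val), len(test))                         # -> 64 16 20
```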
process_video() function
Processes one video: reads it frame by frame with OpenCV (each frame is a numpy ndarray) and saves the sampled frames as .jpg images.
def process_video(self, video, action_name, save_dir):
    video_filename = video.split('.')[0]
    if not os.path.exists(os.path.join(save_dir, video_filename)):
        os.mkdir(os.path.join(save_dir, video_filename))

    capture = cv2.VideoCapture(os.path.join(self.root_dir, action_name, video))  # open the video
    frame_count = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))                     # total number of frames
    frame_width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))                     # frame width
    frame_height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))                   # frame height
The split('.') call strips the extension from the file name: for example, with video = 'v_YoYo_g17_c03.avi', removing '.avi' returns 'v_YoYo_g17_c03'.
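A quick check of that string operation:

```python
video = 'v_YoYo_g17_c03.avi'
print(video.split('.')[0])   # -> v_YoYo_g17_c03
```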
    # Sample one frame every EXTRACT_FREQUENCY frames, keeping at least 16 frames per video
    # (a worked example of this step-down logic follows after this listing)
    EXTRACT_FREQUENCY = 4   # default sampling interval; lowered if the video is too short to yield 16 frames
    if frame_count // EXTRACT_FREQUENCY <= 16:
        EXTRACT_FREQUENCY -= 1
        if frame_count // EXTRACT_FREQUENCY <= 16:
            EXTRACT_FREQUENCY -= 1
            if frame_count // EXTRACT_FREQUENCY <= 16:
                EXTRACT_FREQUENCY -= 1
    count = 0
    i = 0
    retaining = True

    while (count < frame_count and retaining):
        retaining, frame = capture.read()
        if frame is None:
            continue

        # keep only one frame every EXTRACT_FREQUENCY frames
        if count % EXTRACT_FREQUENCY == 0:
            if (frame_height != self.resize_height) or (frame_width != self.resize_width):
                frame = cv2.resize(frame, (self.resize_width, self.resize_height))
            cv2.imwrite(filename=os.path.join(save_dir, video_filename, '0000{}.jpg'.format(str(i))), img=frame)
            i += 1
        count += 1

    # Release the VideoCapture once it is no longer needed
    capture.release()
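To make the EXTRACT_FREQUENCY step-down logic above concrete: for a 50-frame video, 50 // 4 = 12 ≤ 16, so the interval drops to 3; 50 // 3 = 16 ≤ 16, so it drops again to 2; 50 // 2 = 25 > 16, so every 2nd frame is kept. The same adjustment can be written as a small standalone helper (the function name is just for illustration):

```python
def choose_extract_frequency(frame_count, start=4, min_frames=16):
    """Lower the sampling interval until the video yields more than
    min_frames sampled frames (or the interval reaches 1)."""
    frequency = start
    while frequency > 1 and frame_count // frequency <= min_frames:
        frequency -= 1
    return frequency

print(choose_extract_frequency(50))    # -> 2
print(choose_extract_frequency(300))   # -> 4
```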
count counts how many times capture.read() has been called, i.e. which frame of the video is currently being read.
i counts how many frames have actually been saved and is used to name the output images.
capture.read() reads the video frame by frame; ret and frame are its two return values.
ret is a boolean: it is True if the frame was read successfully and False once the end of the file is reached.
frame is the image of the current frame: a three-dimensional matrix of type ndarray, i.e. one picture.
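The read loop pattern described above can be tried on its own; a minimal, self-contained sketch (the file path is a placeholder):

```python
import cv2

capture = cv2.VideoCapture('some_video.avi')   # placeholder path
frames_read = 0
while True:
    ret, frame = capture.read()
    if not ret:                # False once the end of the file is reached
        break
    frames_read += 1           # frame is an ndarray of shape (height, width, 3)
capture.release()
print('read {} frames'.format(frames_read))
```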