C3D model PyTorch source code: line-by-line analysis (II)

2022-07-25 10:45:00 【zzh1370894823】

3.1 Source code analysis

Building the list of video paths and action labels

        self.fnames, labels = [], []
        for label in sorted(os.listdir(folder)): 
            for fname in os.listdir(os.path.join(folder, label)):
                self.fnames.append(os.path.join(folder, label, fname))
                labels.append(label)

folder = 'xx\data_process\ucf101\test'
label is the name of a video action category, e.g. label = 'ApplyEyeMakeup'.
labels is a list built from the label values, one entry per video; it holds the classification label of each video.
fname is the name of a single video, e.g. fname = 'v_ApplyEyeMakeup_g03_c02'.
self.fnames is a list of the paths to the individual videos, 8460 elements in total.
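
The directory walk above can be tried on a small, made-up directory tree. This is only a sketch: the helper name collect_videos, the temporary folder, and the video name v_Haircut_g01_c02 are assumptions for illustration, and the inner listing is sorted here (unlike the source) so the result is deterministic.

```python
import os
import tempfile


def collect_videos(folder):
    """Return (fnames, labels) for a tree laid out as folder/<label>/<video>."""
    fnames, labels = [], []
    for label in sorted(os.listdir(folder)):
        # sorted() on the inner listing is an addition for reproducibility
        for fname in sorted(os.listdir(os.path.join(folder, label))):
            fnames.append(os.path.join(folder, label, fname))
            labels.append(label)
    return fnames, labels


# Build a tiny hypothetical tree that stands in for the real dataset.
demo_root = tempfile.mkdtemp()
for label, video in [("ApplyEyeMakeup", "v_ApplyEyeMakeup_g03_c02"),
                     ("Haircut", "v_Haircut_g01_c01"),
                     ("Haircut", "v_Haircut_g01_c02")]:
    os.makedirs(os.path.join(demo_root, label, video))

fnames, labels = collect_videos(demo_root)
print(labels)
```

Each entry of labels lines up with the path at the same position in fnames, which is what the later assert checks.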

assert (assertion)
assert evaluates an expression and raises an exception (AssertionError) when the expression is false.
Here it checks that the number of videos equals the number of labels, i.e. that they correspond one to one.

        assert len(labels) == len(self.fnames)
        print('Number of {} videos: {:d}'.format(split, len(self.fnames)))
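
To see what a failing assertion looks like, here is a minimal sketch with deliberately mismatched toy lists. The list contents and the error message are made up for illustration.

```python
labels = ["Haircut", "Haircut"]
fnames = ["v_Haircut_g01_c01"]  # one path is missing on purpose

try:
    # The optional second operand becomes the AssertionError message.
    assert len(labels) == len(fnames), "labels and fnames differ in length"
    outcome = "passed"
except AssertionError as err:
    outcome = f"failed: {err}"

print(outcome)
```

Note that Python skips assert statements when run with the -O flag, so they suit sanity checks like this one rather than input validation.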

Converting labels to integer indices

        # Prepare a mapping between the label names (strings) and indices (ints)
        self.label2index = {label: index for index, label in enumerate(sorted(set(labels)))}
        # Convert the list of label names into an array of label indices
        self.label_array = np.array([self.label2index[label] for label in labels], dtype=int)

a = set(labels)
a = {set: 101} {'HorseRiding', 'PullUps', …}
A set is a collection that does not allow duplicate elements. Its elements have no defined order, so it cannot be indexed.
b = sorted(set(labels))
sorted() sorts any iterable and returns the result as a new list.
enumerate() walks an iterable (such as a list, tuple, or string) and yields each element together with its index.
self.label_array is an ndarray of 8460 elements that stores the integer label of every video.
self.label2index is a dict with 101 entries that records the mapping between each action label name and its integer label.
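
The set, sorted, and enumerate steps can be traced on a short list. This is a sketch with four hand-picked labels rather than the full 8460-element list.

```python
import numpy as np

labels = ["PullUps", "HorseRiding", "PullUps", "ApplyEyeMakeup"]

# set() removes duplicates, sorted() fixes the order, enumerate() numbers the names.
label2index = {label: index for index, label in enumerate(sorted(set(labels)))}

# Look every label up to get one integer per video.
label_array = np.array([label2index[label] for label in labels], dtype=int)

print(label2index)
print(label_array)
```

Because the names are sorted before numbering, the mapping is the same on every run, regardless of the order in which the set happens to iterate.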

Overriding the __getitem__ method

    def __getitem__(self, index):
        # Loading and preprocessing.
        buffer = self.load_frames(self.fnames[index])  # one of 8460 video folders in total
        buffer = self.crop(buffer, self.clip_len, self.crop_size)
        labels = np.array(self.label_array[index])

        # NOTE: the source applies the flip on the 'test' split; augmentation is
        # more commonly restricted to the training split.
        if self.split == 'test':
            # Perform data augmentation
            buffer = self.randomflip(buffer)
        buffer = self.normalize(buffer)
        buffer = self.to_tensor(buffer)
        return torch.from_numpy(buffer), torch.from_numpy(labels)  # returns x and y as tensors

buffer is an ndarray that stores the frame data from the indexed video folder.
labels is an ndarray that stores the label of the indexed video.
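
A map-style PyTorch dataset is accessed through __getitem__ and __len__, which is why the method has to be overridden. The indexing protocol itself can be shown without PyTorch; ToyDataset and its contents below are hypothetical stand-ins, not part of the source.

```python
class ToyDataset:
    """Hypothetical stand-in for the video dataset, backed by plain lists."""

    def __init__(self, samples, targets):
        assert len(samples) == len(targets)
        self.samples = samples
        self.targets = targets

    def __len__(self):
        # Number of samples available
        return len(self.samples)

    def __getitem__(self, index):
        # Called whenever the object is indexed, e.g. toy[1]
        return self.samples[index], self.targets[index]


toy = ToyDataset(["clip_a", "clip_b", "clip_c"], [0, 2, 1])
sample, target = toy[1]
print(len(toy), sample, target)
```

In the real class the same hook loads, crops, and normalizes the frames before returning them with the label.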

The load_frames function

    def load_frames(self, file_dir):
        frames = sorted([os.path.join(file_dir, img) for img in os.listdir(file_dir)])
        frame_count = len(frames)  # number of extracted frames for this video
        buffer = np.empty((frame_count, self.resize_height, self.resize_width, 3), np.dtype('float32'))
        for i, frame_name in enumerate(frames):
            frame = np.array(cv2.imread(frame_name)).astype(np.float64)
            buffer[i] = frame  # store the frame in buffer
        return buffer

file_dir = 'H:\python_load\C3D\data_process\ucf101\train\Haircut\v_Haircut_g01_c01', the folder for a single video.
frames = {list: 34} is the list of paths to all frames in that video folder; here there are 34 frame images.
buffer = {ndarray: (34, 128, 171, 3)} is the ndarray that stores all of the frame data.
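
The shape and dtype of buffer can be checked without any image files. In this sketch, randomly generated uint8 arrays stand in for what cv2.imread would return; that substitution is an assumption made to keep the example self-contained.

```python
import numpy as np

resize_height, resize_width = 128, 171
frame_count = 34

# Synthetic uint8 frames of shape (H, W, 3) stand in for decoded images.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(resize_height, resize_width, 3), dtype=np.uint8)
          for _ in range(frame_count)]

buffer = np.empty((frame_count, resize_height, resize_width, 3), np.dtype('float32'))
for i, frame in enumerate(frames):
    # The float64 frame is cast to float32 when written into buffer.
    buffer[i] = frame.astype(np.float64)

print(buffer.shape, buffer.dtype)
```

The axis order is (frames, height, width, channels), which is the layout the crop step slices.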

The crop function
Purpose: select 16 consecutive frames of a fixed spatial size from the processed video.
Method:
This is equivalent to cutting the video with a 16 * 112 * 112 cuboid. Only the position of the cuboid's starting corner needs to be determined, and it is generated randomly.
time_index: the starting frame along the time axis
height_index: the starting position along the height
width_index: the starting position along the width

    def crop(self, buffer, clip_len, crop_size):
        # randomly select time index for temporal jittering
        time_index = np.random.randint(buffer.shape[0] - clip_len)  # start of the window that cuts out 16 consecutive frames

        # Randomly select start indices in order to crop the video
        height_index = np.random.randint(buffer.shape[1] - crop_size)
        width_index = np.random.randint(buffer.shape[2] - crop_size)

        buffer = buffer[time_index:time_index + clip_len,  # temporal position
                 height_index:height_index + crop_size,  # height position
                 width_index:width_index + crop_size, :]  # width position

        return buffer
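
The cropping logic can be run on its own as a standalone function. This is a sketch that mirrors the method above on an all-zero array of the shape quoted earlier; the values 16 and 112 come from the description of the crop.

```python
import numpy as np


def crop(buffer, clip_len, crop_size):
    """Cut a random clip_len x crop_size x crop_size block out of buffer."""
    # np.random.randint(n) draws an integer from the half-open range [0, n)
    time_index = np.random.randint(buffer.shape[0] - clip_len)
    height_index = np.random.randint(buffer.shape[1] - crop_size)
    width_index = np.random.randint(buffer.shape[2] - crop_size)
    return buffer[time_index:time_index + clip_len,
                  height_index:height_index + crop_size,
                  width_index:width_index + crop_size, :]


video = np.zeros((34, 128, 171, 3), dtype=np.float32)
clip = crop(video, clip_len=16, crop_size=112)
print(clip.shape)
```

With 34 frames the start index is drawn from 0 to 17. If a video had exactly clip_len frames, the argument to np.random.randint would be 0 and NumPy would raise a ValueError, so every video needs more than 16 frames for this to work.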

The randomflip function

    def randomflip(self, buffer):
        """Horizontally flip the given image and ground truth randomly with a probability of 0.5."""
        # Flip the whole clip with probability 0.5 to augment the dataset
        if np.random.random() < 0.5:
            for i, frame in enumerate(buffer):
                # Flip once only: a second horizontal flip would restore the original frame
                buffer[i] = cv2.flip(frame, flipCode=1)
        return buffer

Each clip is flipped horizontally with probability 0.5 to augment the dataset.
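
cv2.flip with flipCode=1 mirrors an image left to right, which for a (frames, height, width, channels) array means reversing the width axis. The sketch below uses NumPy slicing as a stand-in for cv2.flip, an assumption made so the example needs no OpenCV install, and shows that flipping twice gives back the original.

```python
import numpy as np

# A tiny clip: 2 frames, height 2, width 3, 1 channel.
clip = np.arange(2 * 2 * 3 * 1, dtype=np.float32).reshape(2, 2, 3, 1)

# Reverse the width axis of every frame (horizontal mirror).
flipped = clip[:, :, ::-1, :]

# Mirroring a second time undoes the first flip.
flipped_twice = flipped[:, :, ::-1, :]

print(clip[0, 0, :, 0], flipped[0, 0, :, 0])
```

This is why each frame should be flipped exactly once per call: two flips in a row leave the clip unchanged and no augmentation takes place.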

Copyright notice
This article was written by [zzh1370894823]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/206/202207250923216229.html
