
A look at the red-hot field of multimodal projects

2022-06-21 09:59:00 【woshicver】

Multimodal machine learning (MMML) aims to use machine learning methods to process and understand information from multiple source modalities.

Each source or form of information can be called a modality. For example, humans have touch, hearing, sight, and smell; information is carried by media such as speech, video, and text; and there are many kinds of sensors, such as radar, infrared, and accelerometers. Each of these can be called a modality.

Modality can also be defined very broadly. For example, two different languages can be treated as two modalities, and even datasets collected under two different conditions can be treated as two modalities.

Today, multimodal technology has a wide range of application scenarios, such as Taobao search, AI subtitles, AI virtual digital humans, human-like interaction, intelligent assistants, product recommendation and in-feed advertising, image vector retrieval over video frames and face bounding boxes, and voice interaction.

We are honored to have invited Clark, a senior algorithm researcher currently working in industry, to spend about an hour systematically walking you through multimodal technology.

Live sharing


01 The development trend of multimodal models
02 Multimodal datasets
03 Common multimodal downstream tasks

Lecturer

Clark, senior algorithm researcher

Live time

June 22, 20:00-21:00 (evening)

If you are interested in multimodal technology, scan the QR code below to reserve a place in the live stream.

Scanning the code and paying 0.1 yuan confirms your reservation.

Staff will contact you on the day of the live stream.


Multimodal learning path

01 Fundamentals of multimodal theory

Study papers on multimodal pre-training: CLIP, ALIGN, ViLT.

02 Self-supervised algorithms

Learn some self-supervised methods that may be used in multimodal pre-training: MAE, DINO, MoCo.

03 Introduction to multimodal downstream tasks

Focus on understanding the VQA and NLVR tasks.

04 Multimodal applications

An image captioning case study and an Alibaba e-commerce cross-modal retrieval case study. Each covers the task introduction, building a baseline, model optimization, and a presentation of results.

05 Multimodal project

AI smart copywriting; mobile photo album management and retrieval based on a multimodal pre-trained model; AI lip reading; autonomous driving based on deep multimodal object detection and semantic segmentation.
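The retrieval-style applications in the learning path above (cross-modal retrieval, photo album search) generally rest on one idea: text and images are mapped into a shared embedding space, and images are ranked by their similarity to the text query. The sketch below illustrates only that ranking step. The 3-dimensional vectors, the file names, and the helper names `cosine_similarity` and `rank_images` are made up for illustration and do not come from any real model or from the course material.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return image names sorted from most to least similar to the text query."""
    scored = [
        (cosine_similarity(text_embedding, vec), name)
        for name, vec in image_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored]


# Hypothetical image embeddings; a real system would obtain these
# from a pre-trained image encoder.
album = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.2],
    "receipt.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding standing in for an encoded text query
# such as "a photo of a cat".
query = [0.2, 0.8, 0.1]

ranking = rank_images(query, album)
print(ranking)
```

With these toy vectors, `cat.jpg` ranks first because its embedding points in nearly the same direction as the query. A production system would replace the hand-written vectors with encoder outputs and the linear scan with a vector index.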



Copyright notice
This article was created by [woshicver]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/172/202206210949202920.html