
A look at the red-hot multimodal projects

2022-06-21 09:59:00 woshicver

Multimodal machine learning (MultiModal Machine Learning, MMML) aims to give machines the ability to process and understand information from multiple modalities by means of machine learning.

Each source or form of information can be called a modality. For example, people have touch, hearing, sight, and smell; information is carried by media such as speech, video, and text; and there are many kinds of sensors, such as radar, infrared, and accelerometers. Each of these can be called a modality.

A modality can also be defined very broadly. For example, two different languages can be treated as two modalities, and even datasets collected under two different conditions can be treated as two modalities.

At present, multimodal technology has a wide range of applications, such as Taobao search, AI subtitling, AI virtual digital humans, human-like interaction, intelligent assistants, product recommendation and news-feed advertising, image vector retrieval over video frames and face bounding boxes, and voice interaction.

We are honored to have invited Clark, a practicing senior algorithm researcher, to give you a systematic overview of multimodal technology in about one hour.

Live sharing

01

PART

  • 01 The development trend of multimodal models  

  • 02 Multimodal datasets

  • 03 Common multimodal downstream tasks

02

PART

Lecturer

[Image: lecturer profile]

Live time

03

PART

  • June 22, 20:00-21:00 (evening)

If you are interested in multimodal technology, scan the QR code below to reserve a spot for the live broadcast.

[Image: QR code for reserving the live broadcast]

Scan the code and pay 0.1 yuan to confirm your reservation.

Staff will contact you when the live broadcast begins.

04

PART

Multimodal learning path

[Image: multimodal learning path]

01  Fundamentals of multimodal theory

Study papers on multimodal pre-training: CLIP, ALIGN, ViLT.

02  Self-supervised algorithms

Learn self-supervised methods that may be used in multimodal pre-training: MAE, DINO, MoCo.

03  Introduction to multimodal downstream tasks

Focus on understanding the VQA and NLVR tasks.

04  Multimodal applications

An image captioning case study and an Alibaba e-commerce cross-modal retrieval case study. Each covers the task introduction, building a baseline, model optimization, and presenting the results.
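To make the idea of cross-modal retrieval concrete, here is a minimal sketch, not the course's actual baseline. It assumes text and images have already been encoded into a shared embedding space (as CLIP-style models do) and ranks images by cosine similarity to a text query. The file names and the 3-dimensional vectors are made-up placeholders; real embeddings would come from a trained encoder and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, gallery, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical image embeddings (placeholders, not real model outputs).
image_gallery = {
    "red_dress.jpg": [0.9, 0.1, 0.0],
    "blue_sneakers.jpg": [0.1, 0.9, 0.1],
    "red_sneakers.jpg": [0.6, 0.7, 0.1],
}

# Hypothetical embedding standing in for an encoded text query.
text_query = [0.8, 0.2, 0.0]

results = retrieve(text_query, image_gallery)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In a real system the linear scan would typically be replaced by an approximate nearest-neighbor index, but the ranking principle is the same.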

05  Multimodal project

AI smart copywriting; mobile photo album management and retrieval based on a multimodal pre-trained model; AI lip reading; autonomous driving based on deep multimodal object detection and semantic segmentation.


Original site

Copyright notice
This article was written by [woshicver]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/172/202206210949202920.html