当前位置:网站首页>2022kdd pre lecture | 11 first-class scholars take you to unlock excellent papers in advance

2022kdd pre lecture | 11 first-class scholars take you to unlock excellent papers in advance

2022-07-04 13:44:00 Aitime theory

Click on the blue words

4b3ca00b3508ca7bc59981bb3801a61f.jpeg

Pay attention to our

AI TIME Welcome to everyone AI Fans join in !

7 month 6 We invite you to 11 position KKD A scholar interprets excellent papers online for you !

30c6c1cae53dcde02e2ed2987c9edb54.jpeg

Bili Bili live channel

Sweep yards attention AI TIME Bli bli official account reservation live broadcast

46b9e391ff4190dff85c24e2f1268af5.gif

7 month 6 Japan 10:00-11:00

eaa6c10f1b66e073c560868553ef7101.gif

1e3efabb1fe069fbd46ab9fe01436616.png

Introduction to the speaker

Zhao Chen :

The United States Kitware Senior R & D Engineer of the company , He graduated from the University of Texas at Dallas, majoring in computer . Main research direction: fairness learning in data mining , machine learning , Research and application in deep learning . Include KDD,AAAI,WWW,ICDM Many papers have been published in conferences and journals , And was invited to serve KDD,AAAI,ICDM,AISTATS And other top international conference program members and reviewers in the field of artificial intelligence , And organize and serve KDD workshop The host of . Personal home page :https://charliezhaoyinpeng.github.io/homepage/

Share content

Online meta learning of fairness awareness to adapt to environmental changes

Introduction to the report

The fairness awareness online learning framework has become a powerful tool for continuous lifelong learning settings . The goal of learners is to learn new tasks sequentially , These new tasks appear one after another as time goes by , Learners ensure that new tasks are performed in different protected subspecies ( Such as race and gender ) Statistical equality . A major drawback of existing methods is that they use a lot of data i.i.d Assumptions , So as to provide static regret analysis for the framework . However , In a changing environment where tasks are sampled from heterogeneous distributions , Low static regret does not mean good performance . In order to solve the problem of fairness perception online learning in a changing environment , This paper first adds long-term fairness constraints to the loss regret of strong adaptation , A new regret measure is constructed FairSAR. Besides , In order to determine a good model parameter in each round , We propose a new adaptive fair sense online meta learning algorithm FairSAOML, The algorithm can adapt to the changing environment in terms of deviation control and model accuracy . The problem is expressed in the form of bi level convex concave optimization , The original parameters and coupling parameters related to the accuracy and fairness of the model respectively . Theoretical analysis gives the sublinear upper bound of loss regret and violation of cumulative fairness constraint . Our experimental evaluation on different real data sets shows , What is proposed in this paper FairSAOML It is obviously superior to other relevant online learning methods .

954ff25b1d2b77418f0ff01526555d52.png

Introduction to the speaker

Jian Yi :

Eli Chien ( Jian Yi ) At present, it is ECE,UIUC Of Ph.D. candidate. In the past Nokia Bell Labs And Amazon Search Summer research internship , At present, his main research direction is to study geometry deep learning from a theoretical point of view (geometric deep learning), It includes graph machine learning and machine learning in non Euclidean space . For example, he once proposed a broad sense with his collaborators PageRank (GPR), Then it is put forward in combination with graph neural network GPRGNN It solves the problem of learning and over smoothing on non homologous graphs . At the same time , His self supervised node feature extraction method published this year GIANT-XRT Also in Open Graph Benchmark Leaderboard List 1 is obtained on the classification data set of three nodes . Besides , Previous research interests also include statistical model analysis and active learning on graphs and hypergraphs (active learning)、 Semi supervised K-means Clustering and support set estimation (support estimation) And so on .Eli Chien Our research is mainly published in machine learning 、 Data mining and information theory summit (NeurIPS, ICLR, AISTATS, AAAI, KDD, ICDM, ISIT, TIT... etc. ). Personal home page :https://sites.google.com/view/eli-chien/home

Share content

HyperAid: Using hyperbolic space to strengthen tree learning and hierarchical clustering

Introduction to the report

Take any distance (metric) Take the shortest distance on the tree (i.e. tree-metric) Approximation has received much attention in theoretical computer and machine learning . Although many existing methods can restore the optimal tree from a given tree distance , But if the input distance is not the tree distance , There are still many unsolved problems in how to find the optimal tree . We put forward HyperAid frame , Firstly, the embedding in learning hyperbolic space is used to make the distance between input points more convenient " image " Tree distance . Here we use Gromov δ hyperbolicity To describe the similarity between any distance and tree distance . Then use Neighbor Joining Wait for the method to get the tree itself . At the same time, our problem is related to hierarchical clustering , And point out the use lp norm As an objective function Dasgupta Advantages of objective function . We have achieved better performance in both manual generation and actual data sets , For example, on five actual data sets , We HyperAid The framework can improve Neighbor Joining The performance reaches 125.94%.

619233f02691a12b2066430a520bd265.png

Introduction to the speaker

Wei Tianxin :

University of Illinois at Champagne (UIUC) First year doctoral students . The main research direction is trusted machine learning 、 Graph data mining and its application in real scenes , stay KDD, SIGIR, ICDM He has published many papers at the summit in the field of machine learning and data mining . Personal home page :https://weitianxin.github.io/

Share content

Integrated fair cold start recommendation system

Introduction to the report

Cold start is a common challenge in recommendation systems , Because for new users in the system , Their observable interactions are very limited . And in order to meet this challenge , Many recent works have begun to introduce the idea of meta learning into recommendation scenarios , They aim to acquire a priori knowledge with strong generalization by learning the preferences of different users , So that the model can quickly adapt to new users in the system through a small amount of training data . However , Recommendation systems are easily influenced by bias and unfairness , Although meta learning has achieved success in cold start recommendation performance , However, the issue of fairness has been largely ignored . In this paper , We put forward a proposal called CLOVER Comprehensive fair meta learning framework , To ensure the fairness of the cold start recommendation model . We systematically study three kinds of fairness in recommendation systems —— Individual equity 、 Counterfactual fairness and group fairness , And propose the method of multi task confrontation learning to improve these three kinds of fairness . Experimental evaluation on different real data sets shows , What is proposed in this paper CLOVER The scheme is obviously superior to other relevant methods , The comprehensive fairness of the model is successfully improved without reducing the cold start recommendation performance .

7 month 6 Japan 14:00-17:00

2239c9af59e725b41b8d7866669eedfe.gif

ddeba2064d06b82fa941106802f01e7b.png

Introduction to the speaker

Zhang Zaixi :

First year doctoral student, School of computer science, University of science and technology of China , Learn from Liu Qi     professor . His main research interests include graph representation learning , Machine learning security and privacy , Prediction and generation of molecular properties etc. . With the first author in NeurIPS,AAAI,IJCAI,  KDD And other academic conferences .

Share content

FLDetector: Defend against model pollution attacks in federated learning by detecting malicious users

Introduction to the report

Federated learning is vulnerable to model pollution : Malicious users can destroy the training of the model by changing the gradient of the model uploaded to the central server . The existing defense mainly depends on Byzantine robust methods , A good model can be trained in the case of a small number of malicious users . However, it is still difficult to train a good model when there are many malicious users . our FLDetecor The above problem can be solved by detecting and eliminating malicious nodes .FLDetector Mainly based on the following observations : In model pollution attacks , The gradient of the model uploaded by malicious users is inconsistent in multiple iterations .FLDetector Determine malicious users and eliminate them in time by detecting the above consistency , The remaining users can learn a good model through federated learning . We're in multiple benchmark Data sets and model pollution attacks are verified FLDetector The effectiveness of the .

9f9a9d0f78a7547e4f5edefb0fc88a27.png

Introduction to the speaker

Trison :

Third year doctor, School of information, Tsinghua University , Studied under Professor zhangchangshui of Tsinghua University . The main research direction is reliable machine learning , Including algorithm fairness , Robustness , Interpretability , And privacy protection . As the first author in SIGKDD,NeurIPS Many papers will be published at the machine learning conference , As a ICML,NeurIPS Wait for the reviewer of machine learning conference , Personal home page https://cuis15.github.io/.

Share content

Cooperative equilibrium in federal learning

Introduction to the report

Federal learning (federated learning) It refers to the learning paradigm of distributed multi-source model training on the premise of protecting data privacy . Due to the statistical heterogeneity of various data sources, it widely exists in the real scene , At the same time, statistical heterogeneity also has a negative impact on cooperative model learning under federated learning , It will even damage the performance of the model . This also brings a basic problem : An institution (client) Can you gain by joining the cooperative network , That is, whether participating in cooperation means improving the performance of its own model . in fact , An organization does not always work with all organizations to maximize its own performance . We established the cooperative equilibrium theory under federal learning , Each of these institutions only cooperates with institutions that are beneficial to it , Avoid the impact of negative migration to the greatest extent , So as to maximize the performance of its own model . say concretely , We propose the following two axioms to characterize cooperative equilibrium :1. Selfishness principle ; No interest , There is no cooperation ;2. The principle of reason ; Organizations are committed to maximizing the performance of their models . We propose a gain graph (benefit graph) The concept of , Describes the best collaborators for each institution , And a method based on Pareto optimization is proposed to determine the optimal collaborator . We prove the existence of cooperative equilibrium in theory , A method based on graph theory is proposed , Realization O(V+E) Cooperative equilibrium of time complexity .

49052ab9ff6d12da830aa5141570d62e.png

Introduction to the speaker

Shaozezhi :

Third year doctor, Institute of computing technology, Chinese Academy of Sciences , The main research direction is multivariable time series prediction 、 Spatiotemporal neural networks 、 Heterograph neural network, etc . As the first author in SIGKDD、VLDB Wait for the data mining conference to publish papers

Share content

STEP: A spatiotemporal map neural network for pre training and enhancement of multivariate time series

Introduction to the report

Multivariate time series (Multivarite Time Series,MTS) It is a kind of typical spatiotemporal data , Contains multiple interrelated time series ,MTS Learning and forecasting in traffic 、 Environmental Science 、 Electric power 、 National defense and other applications play a vital role . lately , Spatiotemporal neural networks  (Spatial-Temporal Graph Neural Networks,STGNNs) It has become more and more popular MTS Prediction method .STGNNs Through graph neural network and sequence model MTS Joint modeling of spatio-temporal patterns , The prediction accuracy is significantly improved . But limited by the complexity of the model , majority STGNN Consider only short-term history MTS data , For example, the data of the past hour . However , Patterns of time series and their dependencies ( That is, time and space patterns ) According to the long history MTS Data analysis . To solve this problem , We put forward one called STEP Novel framework , among STGNNs Through the scalable time series pre training model ( be called TSFormer) Enhanced . say concretely , We designed a pre training model TSFormer, From very long historical time series ( for example , In the past two weeks MTS) Learning time pattern , And generate segment level representation . These are expressed as STGNN The short-term time series input of provides context information , It also promotes the modeling of dependencies between time series . Experiments on three open source real-world datasets show that , Our framework can significantly enhance downstream STGNNs, And our pre training model properly captures the time pattern .

dda47b6f45c789e7f7a4f4b3b06c1a3e.png

Introduction to the speaker

Lintin :

Algorithm Engineer of Alibaba Dharma Academy , He graduated from the computer department of Tsinghua University with a master's degree . His main research interests are natural language understanding , Oral dialogue system , Multimodal emotion, etc . With the first author in KDD、ACL、AAAI He has published many papers at the summit in the field of natural language processing and data mining . Personal home page :https://scholar.google.com/citations?user=XNdFVMAAAAAJ&hl=en

Share content

Duplex Conversation: Exploration of full duplex dialogue with voice semantic integration

Introduction to the report

We integrate voice and text , A full duplex dialogue system integrating speech and semantics is proposed Duplex Conversation, To achieve more efficiency 、 Precise dialogue and interaction . First , We model human like interaction through three subtasks , Including user status detection 、 Feedback selection and interruption detection . secondly , We propose multimodal data enhancement and semi supervised learning methods , Improve the generalization ability of the model by introducing massive unlabeled data . Experiments show that , The proposed method is in the middle of each subtask baseline Compared with both of them, they have achieved significant improvement . Last , We will integrate the proposed audio and semantic capabilities into Alibaba cloud intelligent customer service on a large scale , on-line A/B Experiments show that , The proposed system can significantly reduce the machine response delay 50%, To create the next generation of voice interaction (Voice User Interface,  VUI) Take the first step .

6989208787d0e0ebd241728b655d000c.png

Introduction to the speaker

Zhang Yifei :

Chinese University of Hong Kong (CUHK) Second year doctoral students , Learn from IEEE Fellow Irwin King professor , He was a senior algorithm engineer of Alibaba , The main research direction is graph data mining and its application in real scenes , stay KDD, WWW, AAAI, CIKM, NAACL He has published many papers at the summit in the field of machine learning and data mining . Personal home page :https://yifeiacc.github.io/

Share content

COSTA: A covariance preserving feature enhancement method for graph contrast learning

Introduction to the report

In comparative learning (Contrastive Leaning,CL) in , The performance of the model will often be enhanced by the deviation in the data (Bias) Affected by . In this paper , We point out and define the deviation problem in data enhancement for the first time . Based on this , We observed comparative learning in Figure (Graph CL, GCL) in , Figure enhance (Graph Augmentation, GA) Will introduce a large number of deviations and affect GCL The final performance of the model . We cleverly propose a method based on feature enhancement to alleviate DA The deviation problem in improves the performance of the next task .

85059b697b7f3297c44a4181faf8aa4f.png

Introduction to the speaker

Li Kuan :

Institute of computing, Chinese Academy of Sciences (ICT) Second year master . The main research direction is graph representation learning , The work mainly focuses on the robustness of graph Neural Networks , The class imbalance problem of dynamic graph modeling and semi supervised node classification is expanded . Already in KDD,WWW And publish papers at the top conference of data mining . Home link :https://likuanppd.github.io/

Share content

STABLE- An unsupervised high robustness graph structure learning framework

Introduction to the report

Graph neural network performs well in many downstream tasks based on graph data , However, in recent years, studies have found that graph neural networks are very vulnerable to malicious structural disturbances . An intuitive way to enhance the robustness of graph confrontation is structural learning , Optimize the tampered graph structure to mitigate the negative impact of the attack . Most of the existing methods are based on the original features or supervisory signals to carry out structural learning . But there are some problems in both methods , The former lacks structural information , The latter is attacked because of the classifier , The quality of characterization also decreases . Based on this , We propose an unsupervised framework based on contrastive learning to obtain high-quality representations for confrontation robustness , So as to optimize the structure . On the other hand , We also found that GCN Re parameterization of trick Will make the model more fragile , Based on this, we simply modified GCN, A more robust downstream classifier is obtained .

9cd10a2ae746578635eb82a165949961.png

Introduction to the speaker

Chen Yankai :

Chinese University of Hong Kong (CUHK) Third year doctoral student . The main research direction is around search and recommendation application and optimization , Including graph data mining recommendation system 、 Neural network quantification technology 、 Neural sequencing model combined with natural language understanding . stay KDD, ICDE, WSDM, IJCAI, TKDE He has published many papers at the summit of data mining . Personal home page : https://yankai-chen.github.io/.

Share content

BiGeaR: An online oriented Top-K The recommended graph eigenvalue binarization model

Introduction to the report

Learning vectorization representation is for users - The core of various recommendation system models for commodity matching . To perform fast online reasoning , Characterization binarization (Representation Binarization), The purpose is to embed potential object features by using finite binary digital sequences , Recently, the potential of optimizing memory and computing overhead has been shown . However , The existing work only focuses on the transformation of numerical level , And ignore the accompanying information loss , As a result, the performance of the model is significantly reduced . In order to deal with such problems , We propose a novel and effective framework for graph eigenvalue binarization . We are in the early stage of binary representation learning 、 In the middle and later stages, various quantitative reinforcement technologies have been introduced , This largely preserves the amount of information for binary representation . In addition to saving memory , It also further develops a reliable online reasoning acceleration through bitwise operation , It provides alternative flexibility for actual deployment . Our empirical results on five real data sets show ,BiGeaR Compared with the most advanced recommendation system model based on binary representation learning, it achieves about 22%~40% Performance improvement of . At the same time with the acquisition SOTA The effect of the full precision model is compared ,BiGeaR Optimization in time and space overhead exceeds 8 On a multiple basis , It can reach about 95%~102% The ability to predict .

0b4afe97917f7761e17a29cca3d657bb.png

Introduction to the speaker

Huibinyuan :

Algorithm Engineer of Alibaba Dharma Academy , His research direction is semantic parsing 、 Dialogue system 、 Pre training model, etc , I was in ACL / AAAI / KDD And publish papers in conferences and journals .

Share content

Semantic parsing method based on knowledge detection and knowledge utilization

Introduction to the report

We propose a new pre training model utilization framework , Through the detection process , From large-scale pre training language model (PLM) Extract the relationship structure , And use the induction relationship to expand the current graph based parser , To achieve better mode linking . Compared with the common rule-based schema linking methods , We found that , Even if the surface form of the mentioned and entity is different , Detecting relationships can also effectively capture semantic correspondence . Besides , Our knowledge detection process is completely unsupervised , No additional parameters are required . A lot of experiments show that , Our framework has reached the latest on three benchmarks SOTA performance .

After the live broadcast, you can ask questions in the group , Please add “AI TIME Little helper ( WeChat ID :AITIME_HY)”, reply “KDD”, Will pull you into “AI TIME KDD Communication group ”!

476c17037bf5d29f77c703e09ad1e945.gif

AI TIME Wechat assistant

d4324fba50b391161069796ef24b547e.jpeg

Lord         do :AI TIME 

Associated Media :AI Data pie 、 Academic headlines

partners : Wisdom spectrum ·AI、 Chinese Academy of Engineering Zhiling live 、 School Online 、 Kou enjoys academic 、AMiner、 Ever Chain action 、 Scientific research cloud

Excellent articles in the past are recommended

ce4f9cf37951667c11e04859decc632e.jpeg

Remember to pay attention to us ! There is new knowledge every day !

  About AI TIME 

AI TIME From 2019 year , It aims to carry forward the spirit of scientific speculation , Invite people from all walks of life to the theory of artificial intelligence 、 Explore the essence of algorithm and scenario application , Strengthen the collision of ideas , Link the world AI scholars 、 Industry experts and enthusiasts , I hope in the form of debate , Explore the contradiction between artificial intelligence and human future , Explore the future of artificial intelligence .

so far ,AI TIME Has invited 600 Many speakers at home and abroad , Held more than 300 An event , super 210 10000 people watch .

fd7bcbd64531d8ec709409510fbe30f0.png

I know you.

Looking at

Oh

~

4a3cd026128731a5138429c79a243d1d.gif

Click on Read the original   Reservation live broadcast !

原网站

版权声明
本文为[Aitime theory]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/185/202207041019425459.html