DGraph: a large-scale dynamic graph dataset
2022-07-02 16:57:00 【Aitime theory】
Webpage: https://dgraph.xinye.com/
GitHub: https://github.com/DGraphXinye/

Recently, Yang Yang's research group at Zhejiang University (yangy.org) and Xinye Technology jointly released DGraph, a large-scale dynamic graph dataset. It is aimed at researchers working on graph neural networks, graph mining, social networks, and anomaly detection, and provides large-scale data from real-world scenarios. DGraph can serve as a standard benchmark for evaluating graph models, and it can also support research such as user profiling and network analysis.
Dataset description
The source data for DGraph is provided by Xinye Technology. DGraph is a directed, unweighted dynamic graph containing more than 3.7 million nodes and 4.3 million dynamic edges. As shown in the figure below, each node represents a financial lending user served by Xinye Technology, and a directed edge indicates an emergency contact relationship. Each node carries desensitized attribute features and a label indicating whether the user is a financial fraudster.
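To make the description concrete, here is a minimal sketch of how a directed, unweighted dynamic graph with node features and fraud labels could be held in memory. It is not the dataset's actual file format or loader: the class names `Edge` and `DynamicGraph`, the integer `timestamp` field, the label encoding (1 = fraud, 0 = normal), and the edge direction (from a user to that user's emergency contact) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Edge:
    src: int        # assumed: the user who listed an emergency contact
    dst: int        # assumed: the user listed as the emergency contact
    timestamp: int  # hypothetical integer time index of when the edge appeared


@dataclass
class DynamicGraph:
    features: Dict[int, List[float]] = field(default_factory=dict)
    labels: Dict[int, int] = field(default_factory=dict)  # assumed: 1 = fraud, 0 = normal
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node_id: int, feats: List[float], label: int) -> None:
        self.features[node_id] = feats
        self.labels[node_id] = label

    def add_edge(self, src: int, dst: int, timestamp: int) -> None:
        # Edges are unweighted: only direction and time are stored.
        self.edges.append(Edge(src, dst, timestamp))

    def out_neighbors(self, node_id: int) -> List[int]:
        # Direction matters: only edges leaving node_id are followed.
        return [e.dst for e in self.edges if e.src == node_id]


# Toy graph with made-up values, three users and three timed edges.
g = DynamicGraph()
g.add_node(0, [0.1, 0.7], label=0)
g.add_node(1, [0.4, 0.2], label=1)
g.add_node(2, [0.9, 0.3], label=0)
g.add_edge(0, 1, timestamp=3)
g.add_edge(2, 1, timestamp=8)
g.add_edge(1, 0, timestamp=11)
```

At the scale of the real dataset, an array-based edge list would be a more practical choice than Python objects; the dataclasses here only illustrate which pieces of information each node and edge carries.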

Data features
Real-world scenario
DGraph comes from a real financial business scenario, and its construction logic is close to how such systems are deployed in industry. This gives users of the dataset an opportunity to explore how graph models can be extended to the financial field. Specifically, the ratio of abnormal to normal users in DGraph is about 1:100. This label imbalance matches real-world conditions and supports research on anomaly detection and imbalanced node classification.
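The practical consequence of a roughly 1:100 label ratio can be shown with a small calculation. The sketch below uses made-up labels (10 fraud users and 1,000 normal users) and a trivial classifier that predicts "normal" for everyone; the helper names `accuracy` and `recall` are illustrative and are not part of any DGraph tooling.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the classifier actually found."""
    positives = sum(1 for t in y_true if t == positive)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    return hits / positives if positives else 0.0


# Synthetic labels at a 1:100 ratio: 10 fraud users (1) and 1000 normal users (0).
y_true = [1] * 10 + [0] * 1000

# A trivial classifier that labels every user as normal.
y_pred = [0] * len(y_true)

acc = accuracy(y_true, y_pred)  # 1000 / 1010, about 0.99
rec = recall(y_true, y_pred)    # no fraud user is detected
```

The trivial classifier reaches about 99% accuracy while detecting no fraud at all, which is why imbalanced node classification is usually evaluated with measures that focus on the rare class rather than plain accuracy.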
Dynamic structure
The user relationships in DGraph are sampled from a business scenario spanning 27 months, and the network structure evolves over time. This provides data support for current research on dynamic graph models and dynamic graph mining.
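One common way to work with a graph whose structure evolves is to cut it into time-based snapshots. The sketch below assumes each edge is a `(src, dst, timestamp)` tuple with a hypothetical integer month index; the functions `snapshot` and `new_edges_between` are illustrative helpers, not part of the DGraph release.

```python
from typing import List, Tuple

TimedEdge = Tuple[int, int, int]  # (src, dst, timestamp); timestamp is a hypothetical month index


def snapshot(edges: List[TimedEdge], t: int) -> List[Tuple[int, int]]:
    """Return the static graph made of all edges that exist at time t."""
    return [(s, d) for s, d, ts in edges if ts <= t]


def new_edges_between(edges: List[TimedEdge], t_start: int, t_end: int) -> List[Tuple[int, int]]:
    """Return the edges that appeared after t_start and no later than t_end."""
    return [(s, d) for s, d, ts in edges if t_start < ts <= t_end]


# Toy edge list with made-up timestamps.
edges = [(1, 2, 1), (2, 3, 5), (3, 1, 9), (4, 2, 14)]

early = snapshot(edges, 5)                # graph as of month 5
late = snapshot(edges, 14)                # graph as of month 14
added = new_edges_between(edges, 5, 14)   # structure that appeared in between
```

Comparing `early` and `late` shows how the same node set gains relationships over time, which is the kind of structural evolution a dynamic graph model is meant to exploit.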
Large scale
DGraph contains 3.7 million desensitized real financial lending users and 4.3 million dynamic relationships. It is about 17 times the size of Elliptic, currently the largest dynamic graph dataset in the financial field, and supports the research and evaluation of large-scale graph models. In addition, 60% of the nodes in DGraph are "background nodes": nodes that are not themselves objects of classification or analysis, but that really exist and have an indirect impact on business logic. These nodes play an important role in maintaining the connectivity of the network and are common in industrial settings. Handling background nodes sensibly can effectively improve data storage usage and model efficiency in large-scale data scenarios. DGraph contains more than 2 million background nodes, which allows researchers to explore their properties.
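The connectivity role of background nodes can be illustrated on a toy graph. The sketch below is only an illustration under stated assumptions: the `is_background` flags, the edge list, and the `reachable` helper are hypothetical, and edge direction is ignored when checking connectivity.

```python
from collections import defaultdict, deque


def reachable(edges, start, allowed):
    """Nodes reachable from start, ignoring direction, using only nodes in `allowed`."""
    adj = defaultdict(set)
    for u, v in edges:
        if u in allowed and v in allowed:
            adj[u].add(v)
            adj[v].add(u)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen


# Toy chain 1 -> 2 -> 3 -> 4, where nodes 2 and 3 are background nodes.
edges = [(1, 2), (2, 3), (3, 4)]
is_background = {1: False, 2: True, 3: True, 4: False}

all_nodes = set(is_background)
foreground = {n for n, bg in is_background.items() if not bg}

with_bg = reachable(edges, 1, all_nodes)      # background nodes kept
without_bg = reachable(edges, 1, foreground)  # background nodes dropped
```

With background nodes kept, user 1 is connected to user 4 through nodes 2 and 3; once they are dropped, user 1 is isolated. This is the trade-off the paragraph describes: removing background nodes saves storage and computation but can destroy structure that links the labeled users.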
Open-source community maintenance
Leaderboard
DGraph provides a performance leaderboard that users can submit to and refresh at any time, in order to track the research progress of the latest graph models. The leaderboard provides a unified evaluation process, and all results are open and transparent.
Research results
DGraph has rich characteristics and supports graph research in multiple directions.
Algorithm competition
Xinye Technology held the seventh Xinye Technology Cup algorithm competition around DGraph. The task is the same as fraud user identification in DGraph. The competition is open to the public: universities, research institutes, and Internet companies in China and abroad can all sign up. The prize pool totals 310,000 yuan.
Interested colleagues are welcome to visit the DGraph public data website and to work together to provide rich application data for the field of artificial intelligence and to build an open data ecosystem.

Related paper:
DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection. Xuanwen Huang, Yang Yang*, Yang Wang, Chunping Wang, Zhisheng Zhang, Jiarong Xu, and Lei Chen. Preprint.
Paper link:
http://yangy.org/works/dgraph/dgraph_2022.pdf
About AI TIME
AI TIME was founded in 2019. It aims to promote the spirit of scientific inquiry and debate by inviting people from all walks of life to explore the fundamental questions of artificial intelligence theory, algorithms, and application scenarios. It seeks to encourage the collision of ideas and to connect AI scholars, industry experts, and enthusiasts around the world. Through debate, it hopes to examine the tensions between artificial intelligence and the future of humanity, and to explore the future of artificial intelligence.
To date, AI TIME has invited more than 600 speakers from China and abroad, held more than 300 events, and drawn more than 2.1 million viewers.

