当前位置:网站首页>Dgraph: large scale dynamic graph dataset
Dgraph: large scale dynamic graph dataset
2022-07-04 13:55:00 【PaperWeekly】
In recent days, , Yang Yang's scientific research group of Zhejiang University (yangy.org) Hexin also jointly released a large-scale dynamic graph data set DGraph, Aimed at service graph neural network 、 Graph mining 、 Social networks 、 Researchers in the direction of anomaly detection , Provide large-scale data of real scenes .DGraph On the one hand, it can be used as the standard data to verify the performance of the correlation graph model , On the other hand, it can also be used to carry out user portrait 、 Network analysis and other research work .
Dataset home page :
https://dgraph.xinye.com/
Github:
https://github.com/DGraphXinye/
Related papers :
DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection. Xuanwen Huang, Yang Yang*, Yang Wang, Chunping Wang, Zhisheng Zhang, Jiarong Xu, and Lei Chen. Preprint, 2022. (http://yangy.org/works/dgraph/dgraph_2022.pdf)
Data set description
DGraph The source data of is provided by Xinye Technology .DGraph It is a directed dynamic graph with no right , Contains more than 370 Ten thousand nodes and 430 Ten thousand dynamic edges . As shown in the figure below ,DGraph The node in represents the financial lending user of Xinye technology service , A directed edge indicates an urgent contact relationship , Each node contains the attribute characteristics after desensitization , And a label indicating whether it is a financial fraud user .
Data features
The scene is real
DGraph It comes from the real financial business scenario , Its construction logic is close to the industrial landing , It provides an opportunity for users of data sets to explore how to extend the graph model to the financial field . To be specific ,DGraph The proportion of abnormal and normal users in is about 1:100, Its “ The label is unbalanced ” The characteristics of the are in line with the real scene , Support exception detection 、 Research on classification of unbalanced nodes .
Structural dynamics
DGraph User relationships in are sampled from across 27 A business scenario for months , And the network structure will evolve over time , It provides data support for the current dynamic graph model and mining research .
Large scale
DGraph contain 370 Thousands of desensitized real financial lending users and 430 Ten thousand dynamic relationships , Its scale is about the largest dynamic graph data in the financial field Elliptic Of 17 times , Support the research and evaluation of large-scale graph models . Besides ,DGraph Contained in the 60% Of “ Background node ”, That is, it is not a classification or analysis object, but it actually exists 、 Nodes that have an indirect impact on business logic . These nodes play an important role in maintaining the connectivity of the network , Widely exists in industry . Reasonable processing of background nodes can effectively improve the storage space of data and the operation efficiency of the model in large-scale data scenarios .DGraph It contains more than 200 10000 background nodes , It can support researchers to explore the properties of background nodes .
Open source community maintenance
Ranking List
DGraph Users can submit at any time 、 Refreshed performance leaderboard (leaderboard), To track the research progress of the latest graph model . The list provides a unified evaluation process , All results are open and transparent .
Research results
DGraph It has rich characteristics , Support graph research in multiple directions .
Algorithm contest
Xinye technology revolves around DGraph The seventh Xinye Technology Cup algorithm competition was held , Task and DGraph The fraud user identification in is consistent . The competition is open to the whole society , Colleges and universities at home and abroad 、 Scientific research institutes 、 Internet enterprises can sign up for the competition , The bonus pool is abundant , total 31 Thousands of yuan .
Welcome interested colleagues to patronize DGraph Public data website , Work together to provide rich application data for the field of artificial intelligence , Work together to build an open digital ecosystem .
Cooperation platform
Read more
# cast draft through Avenue #
Let your words be seen by more people
How to make more high-quality content reach the reader group in a shorter path , How about reducing the cost of finding quality content for readers ? The answer is : People you don't know .
There are always people you don't know , Know what you want to know .PaperWeekly Maybe it could be a bridge , Push different backgrounds 、 Scholars and academic inspiration in different directions collide with each other , There are more possibilities .
PaperWeekly Encourage university laboratories or individuals to , Share all kinds of quality content on our platform , It can be Interpretation of the latest paper , It can also be Analysis of academic hot spots 、 Scientific research experience or Competition experience explanation etc. . We have only one purpose , Let knowledge really flow .
The basic requirements of the manuscript :
• The article is really personal Original works , Not published in public channels , For example, articles published or to be published on other platforms , Please clearly mark
• It is suggested that markdown Format writing , The pictures are sent as attachments , The picture should be clear , No copyright issues
• PaperWeekly Respect the right of authorship , And will be adopted for each original first manuscript , Provide Competitive remuneration in the industry , Specifically, according to the amount of reading and the quality of the article, the ladder system is used for settlement
Contribution channel :
• Send email :[email protected]
• Please note your immediate contact information ( WeChat ), So that we can contact the author as soon as we choose the manuscript
• You can also directly add Xiaobian wechat (pwbot02) Quick contribution , remarks : full name - contribute
△ Long press add PaperWeekly Small make up
Now? , stay 「 You know 」 We can also be found
Go to Zhihu home page and search 「PaperWeekly」
Click on 「 Focus on 」 Subscribe to our column
·
·
边栏推荐
- Optional values and functions of the itemized contenttype parameter in the request header
- 面试官:Redis中哈希数据类型的内部实现方式是什么?
- C array supplement
- Use fail2ban to prevent password attempts
- Distributed base theory
- C语言图书租赁管理系统
- Automatic filling of database public fields
- Getting started with the go language is simple: go implements the Caesar password
- Comparative study of the gods in the twilight Era
- 安装trinity、解决报错
猜你喜欢
CTF competition problem solution STM32 reverse introduction
The only core indicator of high-quality software architecture
N++ is not reliable
高质量软件架构的唯一核心指标
Flet tutorial 03 basic introduction to filledbutton (tutorial includes source code) (tutorial includes source code)
Personalized online cloud database hybrid optimization system | SIGMOD 2022 selected papers interpretation
Understanding and difference between viewbinding and databinding
Xue Jing, director of insight technology solutions: Federal learning helps secure the flow of data elements
7 月数据库排行榜:MongoDB 和 Oracle 分数下降最多
OpenHarmony应用开发之如何创建DAYU200预览器
随机推荐
"Pre training weekly" issue 52: shielding visual pre training and goal-oriented dialogue
2022年起重机械指挥考试模拟100题模拟考试平台操作
Web知识补充
ASP.NET Core入门一
2022KDD预讲 | 11位一作学者带你提前解锁优秀论文
Iptables foundation and Samba configuration examples
[AI system frontier dynamics, issue 40] Hinton: my deep learning career and research mind method; Google refutes rumors and gives up tensorflow; The apotheosis framework is officially open source
Flet教程之 03 FilledButton基础入门(教程含源码)(教程含源码)
E-week finance | Q1 the number of active people in the insurance industry was 86.8867 million, and the licenses of 19 Payment institutions were cancelled
7 月数据库排行榜:MongoDB 和 Oracle 分数下降最多
Interviewer: what is the internal implementation of hash data type in redis?
动画与过渡效果
SQL statement syntax error in test SQL statement deletion in eclipse linked database
A data person understands and deepens the domain model
C语言程序设计选题参考
#yyds干货盘点# 解决名企真题:连续最大和
Flet tutorial 03 basic introduction to filledbutton (tutorial includes source code) (tutorial includes source code)
C foundation in-depth learning II
js中的变量提升和函数提升
XML入门一