DGraph: a large-scale dynamic graph dataset
2022-07-04 13:12:00 【Zhiyuan community】
Yang Yang's research group at Zhejiang University (yangy.org) and Xinye Technology recently released DGraph, a large-scale dynamic graph dataset. It is aimed at researchers working on graph neural networks, graph mining, social networks, and anomaly detection, and provides large-scale data from a real-world scenario. DGraph can serve as a standard benchmark for evaluating graph models, and it can also support research such as user profiling and network analysis.
Dataset home page: https://dgraph.xinye.com/
GitHub: https://github.com/DGraphXinye/
Related paper:
DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection. Xuanwen Huang, Yang Yang*, Yang Wang, Chunping Wang, Zhisheng Zhang, Jiarong Xu, and Lei Chen. Preprint, 2022. (http://yangy.org/works/dgraph/dgraph_2022.pdf)
Dataset description

The source data of DGraph is provided by Xinye Technology. DGraph is a directed, unweighted dynamic graph containing more than 3.7 million nodes and 4.3 million dynamic edges. As shown in the figure below, each node in DGraph represents a financial lending user served by Xinye Technology, and a directed edge indicates an emergency-contact relationship. Each node carries anonymized (desensitized) attribute features and a label indicating whether the user is a financial fraudster.
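The structure described above can be sketched in a few lines. This is only an illustrative toy, not the DGraph file format or loader: the class `DynamicGraph`, its fields, and the integer timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DynamicGraph:
    """Toy directed, unweighted graph whose edges carry timestamps (hypothetical)."""
    features: Dict[int, List[float]]   # node id -> anonymized feature vector
    labels: Dict[int, int]             # node id -> 1 (fraud) or 0 (normal)
    edges: List[Tuple[int, int, int]] = field(default_factory=list)  # (src, dst, time)

    def add_edge(self, src: int, dst: int, timestamp: int) -> None:
        # An edge src -> dst means "src lists dst as an emergency contact".
        self.edges.append((src, dst, timestamp))

    def snapshot(self, until: int) -> List[Tuple[int, int]]:
        """Return the directed edges that exist at time `until` (inclusive)."""
        return [(s, d) for s, d, t in self.edges if t <= until]

    def out_neighbors(self, node: int, until: int) -> List[int]:
        """Return the contacts that `node` has listed by time `until`."""
        return [d for s, d in self.snapshot(until) if s == node]


# Three made-up users; user 2 is the only one labeled as fraud.
graph = DynamicGraph(
    features={0: [0.1, 0.7], 1: [0.4, 0.2], 2: [0.9, 0.3]},
    labels={0: 0, 1: 0, 2: 1},
)
graph.add_edge(0, 1, timestamp=1)
graph.add_edge(2, 1, timestamp=5)
graph.add_edge(0, 2, timestamp=9)

early = graph.snapshot(until=5)   # structure before the last edge appears
late = graph.snapshot(until=9)    # structure after the graph has evolved
```

Taking snapshots at different times is one simple way to see how the same set of users ends up with a different neighborhood structure as edges accumulate.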
Data features
Real-world scenario
DGraph comes from a real financial business scenario, and its construction logic is close to industrial practice, which gives dataset users an opportunity to explore how graph models can be extended to the financial domain. Specifically, the ratio of abnormal to normal users in DGraph is about 1:100. This label imbalance matches real-world conditions and supports research on anomaly detection and imbalanced node classification.
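To see why a ratio of roughly 1:100 matters, here is a minimal sketch on synthetic labels (not DGraph data). It shows that a classifier which always predicts "normal" scores high accuracy while catching no fraud, and it computes inverse-frequency class weights using the common "balanced" heuristic n / (k * count). The function names are hypothetical helpers written for this example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that were predicted as positive."""
    positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in positives) / len(positives)


def balanced_class_weights(labels):
    """Weight each class by n / (k * count), so rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Synthetic labels with the 1:100 ratio mentioned above: 0 = normal, 1 = fraud.
labels = [0] * 100 + [1]
always_normal = [0] * len(labels)

acc = accuracy(labels, always_normal)   # high, despite the model being useless
rec = recall(labels, always_normal)     # no fraud user is found
weights = balanced_class_weights(labels)
```

This is why work on imbalanced data usually reports metrics that are sensitive to the minority class, and why the loss for the minority class is often up-weighted during training.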
Dynamic structure
User relationships in DGraph are sampled from a business scenario spanning 27 months, and the network structure evolves over time. This provides data support for current research on dynamic graph models and dynamic graph mining.
Large scale
DGraph contains 3.7 million anonymized real financial lending users and 4.3 million dynamic relationships. Its scale is about 17 times that of Elliptic, the largest existing dynamic graph dataset in the financial domain, so it supports the study and evaluation of large-scale graph models. In addition, about 60% of the nodes in DGraph are "background nodes": nodes that are not themselves targets of classification or analysis but that really exist and indirectly affect business logic. These nodes play an important role in keeping the network connected and are common in industrial settings. Handling background nodes sensibly can improve storage efficiency and model runtime efficiency on large-scale data. DGraph contains more than 2 million background nodes, which allows researchers to explore their properties.
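The connectivity role of background nodes can be illustrated with a small sketch. The graph below is made up, and the `is_background` flag is a hypothetical marker introduced for this example rather than a field of the dataset. The code counts weakly connected components with and without the background nodes.

```python
from collections import defaultdict, deque


def weakly_connected_components(nodes, edges):
    """Count components among `nodes`, ignoring edge direction."""
    nodes = set(nodes)
    adjacency = defaultdict(set)
    for src, dst in edges:
        # Keep an edge only if both endpoints are retained.
        if src in nodes and dst in nodes:
            adjacency[src].add(dst)
            adjacency[dst].add(src)

    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


# Toy chain 0 -> 1 -> 2 -> 3 -> 4 where nodes 1 and 3 are background nodes.
is_background = {0: False, 1: True, 2: False, 3: True, 4: False}
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

all_nodes = list(is_background)
foreground = [node for node, bg in is_background.items() if not bg]

with_background = weakly_connected_components(all_nodes, edges)
without_background = weakly_connected_components(foreground, edges)
```

In this toy case, dropping the background nodes splits one connected chain into three isolated users, which is the trade-off between saving storage and preserving structure that the paragraph above describes.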
Open-source community maintenance
Leaderboard
DGraph provides a performance leaderboard that users can submit to at any time and that is continuously refreshed, making it possible to track the progress of the latest graph models. The leaderboard uses a unified evaluation process, and all results are open and transparent.
Research results
DGraph has rich properties and supports graph research in multiple directions.
Algorithm contest
Xinye Technology organized the seventh Xinye Technology Cup algorithm competition around DGraph. The competition task is the same as fraud-user identification in DGraph. It is open to the public: universities, research institutes, and internet companies in China and abroad can all sign up. The prize pool totals 310,000 yuan.
Interested colleagues are welcome to visit the DGraph public data website and to work together to provide rich application data for the field of artificial intelligence and to build an open digital ecosystem.









