DGraph: a large-scale dynamic graph dataset
2022-07-04 13:12:00 【Zhiyuan community】
Recently, Yang Yang's research group at Zhejiang University (yangy.org) and Xinye Technology jointly released DGraph, a large-scale dynamic graph dataset. It is aimed at researchers working on graph neural networks, graph mining, social networks, and anomaly detection, and provides large-scale data from real-world scenarios. DGraph can serve as standard benchmark data for evaluating graph models, and it can also support research such as user profiling and network analysis.
Dataset home page: https://dgraph.xinye.com/
GitHub: https://github.com/DGraphXinye/
Related paper:
DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection. Xuanwen Huang, Yang Yang*, Yang Wang, Chunping Wang, Zhisheng Zhang, Jiarong Xu, and Lei Chen. Preprint, 2022. (http://yangy.org/works/dgraph/dgraph_2022.pdf)
Dataset description

The source data for DGraph is provided by Xinye Technology. DGraph is an unweighted, directed dynamic graph with more than 3.7 million nodes and 4.3 million dynamic edges. As the accompanying figure shows, each node represents a financial lending user served by Xinye Technology, and a directed edge indicates an emergency-contact relationship. Each node carries anonymized (desensitized) attribute features and a label indicating whether the user is a financial fraudster.
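To make the structure described above concrete, here is a minimal sketch of an unweighted, directed, timestamped graph with per-node features and labels. The class `DynamicGraph`, its method names, and the toy values are all hypothetical illustrations; they are not the file format or API of the released dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DynamicGraph:
    """Toy container (hypothetical) for an unweighted, directed, timestamped graph."""
    features: Dict[int, List[float]] = field(default_factory=dict)
    labels: Dict[int, Optional[int]] = field(default_factory=dict)
    # Each edge is (source, target, timestamp); no weight is stored.
    edges: List[Tuple[int, int, int]] = field(default_factory=list)

    def add_node(self, node: int, feats: List[float], label: Optional[int] = None) -> None:
        self.features[node] = feats
        self.labels[node] = label

    def add_edge(self, src: int, dst: int, time: int) -> None:
        self.edges.append((src, dst, time))

    def out_neighbors(self, node: int, until: int) -> List[int]:
        """Targets of edges leaving `node` whose timestamp is <= `until`."""
        return sorted(d for s, d, t in self.edges if s == node and t <= until)


# Toy data: label 1 marks a fraud user, 0 a normal user (assumed convention).
toy = DynamicGraph()
toy.add_node(0, [0.1, 0.5], label=0)
toy.add_node(1, [0.9, 0.2], label=1)
toy.add_node(2, [0.4, 0.4])
toy.add_edge(0, 1, time=1)  # user 0 names user 1 as a contact at time 1
toy.add_edge(0, 2, time=5)
toy.add_edge(2, 1, time=3)

print(toy.out_neighbors(0, until=1))  # [1]
print(toy.out_neighbors(0, until=5))  # [1, 2]
```

Filtering edges by timestamp is what makes the graph "dynamic" in this sketch: the neighborhood of a node depends on the point in time at which it is queried.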
Dataset features
Real-world scenario
DGraph comes from a real financial business scenario, and its construction logic is close to industrial practice, which gives dataset users an opportunity to explore how graph models can be extended to the financial domain. Specifically, the ratio of abnormal to normal users in DGraph is about 1:100. This label imbalance matches real-world conditions and supports research on anomaly detection and imbalanced node classification.
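A small sketch of why a 1:100 label ratio needs care: a classifier that always predicts "normal" scores high accuracy while finding no fraud at all. The helper names and the toy labels are illustrative assumptions, and the inverse-frequency weighting shown is one common heuristic, not a method prescribed by the dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


def balanced_class_weights(y_true):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y_true)
    return {c: len(y_true) / (len(counts) * k) for c, k in counts.items()}


# Toy labels at the 1:100 ratio: one fraud user (1) and 100 normal users (0).
y_true = [1] + [0] * 100
always_normal = [0] * len(y_true)

print(round(accuracy(y_true, always_normal), 3))  # 0.99
print(recall(y_true, always_normal))              # 0.0
print(balanced_class_weights(y_true)[1])          # 50.5
```

The weights show how much more each fraud sample would have to count in a loss function for the two classes to contribute equally, which is why ranking-based or recall-oriented metrics are usually preferred over plain accuracy on data like this.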
Dynamic structure
User relationships in DGraph are sampled from business scenarios spanning 27 months, and the network structure evolves over time, providing data support for current research on dynamic graph models and dynamic graph mining.
Large scale
DGraph contains 3.7 million anonymized real financial lending users and 4.3 million dynamic relationships. That is about 17 times the size of Elliptic, the largest existing dynamic graph dataset in the financial domain, so DGraph supports the research and evaluation of large-scale graph models. In addition, about 60% of the nodes in DGraph are "background nodes": nodes that are not themselves targets of classification or analysis but that really exist and indirectly affect business logic. Such nodes play an important role in maintaining network connectivity and are common in industry. Handling background nodes sensibly can reduce storage requirements and improve model efficiency on large-scale data. With more than 2 million background nodes, DGraph lets researchers explore their properties.
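The connectivity role of background nodes can be illustrated with a tiny reachability check. This is a sketch under assumed conventions: a label of `None` marks a background node, and the function `reachable` and the three-node toy graph are hypothetical, not taken from the dataset.

```python
from collections import deque


def reachable(edges, start, allowed):
    """Nodes reachable from `start` along directed edges, visiting only `allowed` nodes."""
    adjacency = {}
    for src, dst in edges:
        if src in allowed and dst in allowed:
            adjacency.setdefault(src, []).append(dst)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Toy graph: node 1 is a background node (no label) sitting between two labeled users.
labels = {0: 0, 1: None, 2: 1}
edges = [(0, 1), (1, 2)]

all_nodes = set(labels)
foreground = {n for n, y in labels.items() if y is not None}

print(sorted(reachable(edges, 0, all_nodes)))   # [0, 1, 2]
print(sorted(reachable(edges, 0, foreground)))  # [0]
```

Dropping the background node cuts the only path between the two labeled users. One option this suggests is to keep background nodes in the graph for message passing while computing losses and metrics only over the foreground set.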
Open-source community maintenance
Leaderboard
DGraph provides a performance leaderboard that users can submit to at any time and that is refreshed continuously, making it possible to track research progress on the latest graph models. The leaderboard uses a unified evaluation process, and all results are open and transparent.
Research results
DGraph has rich characteristics and supports graph research in multiple directions.
Algorithm competition
Xinye Technology held the seventh Xinye Technology Cup algorithm competition around DGraph, with a task that matches fraud-user identification on DGraph. The competition is open to the public: universities, research institutes, and internet companies in China and abroad can register. The prize pool totals 310,000 yuan.
Interested colleagues are welcome to visit the DGraph public data website and join in providing rich application data for the field of artificial intelligence and in building an open digital ecosystem.
