DGraph: a large-scale dynamic graph dataset
2022-07-04 13:55:00 【PaperWeekly】
Yang Yang's research group at Zhejiang University (yangy.org) and Xinye Technology recently released DGraph, a large-scale dynamic graph dataset. It is aimed at researchers working on graph neural networks, graph mining, social networks, and anomaly detection, and it provides large-scale data from a real-world scenario. DGraph can serve as a standard benchmark for evaluating graph models, and it can also support research such as user profiling and network analysis.
Dataset home page:
https://dgraph.xinye.com/
GitHub:
https://github.com/DGraphXinye/
Related paper:
DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection. Xuanwen Huang, Yang Yang*, Yang Wang, Chunping Wang, Zhisheng Zhang, Jiarong Xu, and Lei Chen. Preprint, 2022. (http://yangy.org/works/dgraph/dgraph_2022.pdf)
Dataset description
The source data for DGraph is provided by Xinye Technology. DGraph is a directed, unweighted dynamic graph with more than 3.7 million nodes and 4.3 million dynamic edges. As shown in the figure below, each node represents a user of Xinye Technology's financial lending service, and a directed edge indicates an emergency-contact relationship. Each node carries anonymized attribute features and a label indicating whether the user is a financial fraudster.
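To make this structure concrete, here is a minimal sketch of how a directed, unweighted, timestamped graph with per-node features and labels could be held in memory. The names `Node`, `DynamicGraph`, `features`, `is_fraud`, and `timestamp` are illustrative assumptions and do not reflect the actual DGraph file format; using `None` for a node without a fraud label is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    features: List[float]      # anonymized attribute features
    is_fraud: Optional[bool]   # None when no fraud label is available (assumption)


@dataclass
class DynamicGraph:
    nodes: Dict[int, Node] = field(default_factory=dict)
    # Each edge is (source, target, timestamp): directed, with no weight.
    edges: List[Tuple[int, int, int]] = field(default_factory=list)

    def add_node(self, node_id: int, features: List[float],
                 is_fraud: Optional[bool] = None) -> None:
        self.nodes[node_id] = Node(features, is_fraud)

    def add_edge(self, src: int, dst: int, timestamp: int) -> None:
        self.edges.append((src, dst, timestamp))

    def out_neighbors(self, node_id: int) -> List[int]:
        # Follow edge direction only: src -> dst.
        return [dst for src, dst, _ in self.edges if src == node_id]


# Toy data, invented for illustration only.
g = DynamicGraph()
g.add_node(0, [0.1, 0.5], is_fraud=False)
g.add_node(1, [0.9, 0.2], is_fraud=True)
g.add_node(2, [0.3, 0.3])
g.add_edge(0, 1, timestamp=5)   # user 0 names user 1 as an emergency contact
g.add_edge(0, 2, timestamp=9)
```

An edge list is the simplest choice for a sketch; for millions of edges, an adjacency index or a sparse matrix would be more practical.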
Data features
Real-world scenario
DGraph comes from a real financial business scenario, and its construction logic is close to industrial practice, which gives users of the dataset an opportunity to explore how graph models can be extended to the financial domain. Specifically, the ratio of abnormal to normal users in DGraph is about 1:100. This label imbalance matches real-world conditions and supports research on anomaly detection and imbalanced node classification.
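The arithmetic behind a 1:100 imbalance is worth spelling out. The sketch below shows why plain accuracy is misleading at that ratio and computes inverse-frequency class weights, one common way to compensate. The function names are hypothetical, and nothing here is taken from the DGraph evaluation code.

```python
def majority_baseline_accuracy(n_fraud: int, n_normal: int) -> float:
    """Accuracy of a classifier that labels every user as normal."""
    return n_normal / (n_fraud + n_normal)


def inverse_frequency_weights(n_fraud: int, n_normal: int) -> dict:
    """Per-class weights so that both classes contribute equally in total."""
    total = n_fraud + n_normal
    return {
        "fraud": total / (2 * n_fraud),
        "normal": total / (2 * n_normal),
    }


# One abnormal user for every 100 normal users, as described above.
acc = majority_baseline_accuracy(1, 100)
weights = inverse_frequency_weights(1, 100)
print(f"always-normal accuracy: {acc:.4f}")
print(weights)
```

A model that never flags fraud already scores about 99% accuracy here, which is why ranking metrics such as AUC are usually preferred for this kind of task.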
Dynamic structure
User relationships in DGraph are sampled from a business scenario spanning 27 months, and the network structure evolves over time. This provides data support for current research on dynamic graph models and dynamic graph mining.
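One simple way to work with an evolving graph is to cut it into snapshots by time. The sketch below assumes each edge carries an integer timestamp; the `snapshot` helper and the time unit are hypothetical and are not part of the DGraph release.

```python
from typing import List, Tuple


def snapshot(edges: List[Tuple[int, int, int]], t: int) -> List[Tuple[int, int]]:
    """Return the (source, target) pairs of all edges created at or before time t."""
    return [(src, dst) for src, dst, ts in edges if ts <= t]


# Toy edges as (source, target, timestamp); values invented for illustration.
edges = [(0, 1, 1), (1, 2, 3), (0, 2, 7)]

early = snapshot(edges, 3)   # the graph as it looked at time 3
late = snapshot(edges, 7)    # the graph as it looked at time 7
```

Comparing successive snapshots shows how the structure grows, and training only on edges before a cutoff avoids leaking future information into a model.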
Large scale
DGraph contains 3.7 million anonymized real financial lending users and 4.3 million dynamic relationships. It is about 17 times the size of Elliptic, until now the largest dynamic graph dataset in the financial domain, and it supports the research and evaluation of large-scale graph models. In addition, about 60% of the nodes in DGraph are "background nodes": nodes that are not themselves targets of classification or analysis, but that really exist and indirectly affect the business logic. Such nodes play an important role in keeping the network connected and are common in industrial settings. Handling background nodes sensibly can reduce storage requirements and improve model efficiency in large-scale data scenarios. With more than 2 million background nodes, DGraph lets researchers explore their properties.
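A common way to treat background nodes is to keep them in the graph structure, so that connectivity is preserved, while excluding them from the set of nodes a model is trained and evaluated on. The sketch below illustrates that idea on toy data; the `"background"` label string and both helper functions are assumptions for illustration, not DGraph's actual encoding.

```python
from typing import Dict, List, Tuple


def target_nodes(labels: Dict[int, str]) -> List[int]:
    """Nodes that are actual classification targets (everything except background)."""
    return sorted(n for n, y in labels.items() if y != "background")


def neighbors(node: int, edges: List[Tuple[int, int]]) -> List[int]:
    """In- and out-neighbors of a node, background nodes included."""
    outgoing = {dst for src, dst in edges if src == node}
    incoming = {src for src, dst in edges if dst == node}
    return sorted(outgoing | incoming)


# Toy data, invented for illustration only.
labels = {0: "normal", 1: "fraud", 2: "background", 3: "background", 4: "normal"}
edges = [(2, 0), (0, 1), (3, 1), (4, 3)]

targets = target_nodes(labels)   # nodes the classifier is scored on
context = neighbors(1, edges)    # node 1 still sees background node 3
```

In this toy graph, node 4 is linked to node 1 only through background node 3, so dropping background nodes outright would cut that path.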
Open-source community maintenance
Leaderboard
DGraph provides a performance leaderboard that users can submit to at any time and that is continuously refreshed, making it possible to track research progress on the latest graph models. The leaderboard uses a unified evaluation process, and all results are open and transparent.
Research results
DGraph has rich characteristics and supports graph research in multiple directions.
Algorithm competition
Xinye Technology has organized the 7th Xinye Technology Cup algorithm competition around DGraph, with a task that matches fraud-user identification in DGraph. The competition is open to the public: universities, research institutes, and internet companies in China and abroad may all register. The prize pool totals 310,000 yuan.
Interested colleagues are welcome to visit the DGraph public data website and to work together to provide rich application data for the field of artificial intelligence and to build an open digital ecosystem.