How to analyze fans' interests?
2022-07-07 03:00:00 【Monkey data analysis】

【Subject】
There is a "Fan attention table" with 3 fields: user id, Follow the media id, and date.

【Problem】
In the "Fan attention table", one user may follow several media at once, with all of their ids stored in a single field. For example, the user whose user id is A001 has the Follow the media id value 1010,1020,1031. To make later analysis of fans' interests easier, split each such row into multiple rows, one media id per row.
For example, the single row for user A001 (A001; 1010,1020,1031) becomes three rows: (A001, 1010), (A001, 1020) and (A001, 1031).

【Solution approach】
Problems like this are called "column-to-row conversion". In MySQL they are generally handled in three steps:
1) Create a "Sequence table";
2) Join the tables so that each row of the original table is copied into multiple rows;
3) Use the substring_index function to extract the final result.
Step 1: Create the Sequence table
A "Sequence table" has only one field, which stores a sequence of numbers such as 1, 2, 3.

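The maximum value of "Sequence" is the largest number of media that any single user follows in this problem. A field with N commas holds N + 1 media ids, so the count is the length of the field minus its length with the commas removed, plus 1:
select max(length(`Follow the media id`) - length(replace(`Follow the media id`, ',', '')) + 1) as max_media_count
from `Fan attention table`;
The returned value is the largest number the Sequence table needs to hold.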

The new "Sequence table" therefore holds the numbers from 1 up to that maximum.
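As a rough sketch of this step, the Python snippet below uses the standard-library sqlite3 module as a stand-in for MySQL, since the counting expression relies only on length and replace, which both engines provide. The table and column names follow the article; the second sample row and all dates are hypothetical, added only so the query has something to run on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE `Fan attention table` "
    "(`user id` TEXT, `Follow the media id` TEXT, `date` TEXT)"
)
conn.executemany(
    "INSERT INTO `Fan attention table` VALUES (?, ?, ?)",
    [
        ("A001", "1010,1020,1031", "2022-07-01"),  # ids from the article; date is hypothetical
        ("A002", "1040", "2022-07-02"),            # hypothetical row
    ],
)

# Same counting expression as in the article: number of commas + 1.
(max_count,) = conn.execute(
    """
    SELECT max(length(`Follow the media id`)
               - length(replace(`Follow the media id`, ',', '')) + 1)
    FROM `Fan attention table`
    """
).fetchone()

# Fill the Sequence table with 1 .. max_count.
conn.execute("CREATE TABLE `Sequence table` (`Sequence` INTEGER)")
conn.executemany(
    "INSERT INTO `Sequence table` VALUES (?)",
    [(n,) for n in range(1, max_count + 1)],
)

sequence = [
    row[0]
    for row in conn.execute(
        "SELECT `Sequence` FROM `Sequence table` ORDER BY `Sequence`"
    )
]
print(max_count, sequence)
```

With this sample data the largest field holds three media ids, so the Sequence table ends up with the values 1, 2 and 3.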

Step 2: Multi-table join

A multi-table join against the "Sequence table" turns each row of the "Fan attention table" into multiple rows.
There are two points to note here:
1) To make sure no row of the original table is lost, use a left join with the original table as the left table;
2) The number of copies is limited in the join condition. The limit is the number of media the user follows, that is, the number of commas in the "Follow the media id" field plus 1.
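select t1.`user id`,
       t1.`Follow the media id`,
       t1.`date`,
       t2.`Sequence`
from `Fan attention table` t1
left join `Sequence table` t2
  on t2.`Sequence` <= (length(t1.`Follow the media id`) - length(replace(t1.`Follow the media id`, ',', '')) + 1);
In the result, each original row is repeated once per media id the user follows, and Sequence numbers the copies starting from 1.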

Step 3: Use the function to get the result
The next step is to cut the individual media ids out of the field, which requires the string function SUBSTRING_INDEX.
SUBSTRING_INDEX(string, delimiter, count)
Here, the delimiter is the "," that separates the media ids in this problem. The count says how many delimited items to keep: a positive count keeps that many media ids counting from the left, and a negative count keeps that many media ids counting from the right. Nesting two calls isolates a single id: the inner call with count N keeps the first N ids, and the outer call with count -1 keeps the last of those, which is the N-th id.
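To make the nested call easier to follow, here is a small Python approximation of SUBSTRING_INDEX. The function below is an illustrative stand-in written for this example, not MySQL's implementation, and it assumes a non-empty delimiter.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Approximate MySQL's SUBSTRING_INDEX for a non-empty delimiter."""
    if count == 0:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Keep the first `count` items, counting from the left.
        return delim.join(parts[:count])
    # Negative count: keep the last `-count` items, counting from the right.
    return delim.join(parts[count:])


media_ids = "1010,1020,1031"

first_two = substring_index(media_ids, ",", 2)   # "1010,1020"
last_one = substring_index(media_ids, ",", -1)   # "1031"

# Inner call keeps the first 2 ids, outer call keeps the last of those.
second = substring_index(substring_index(media_ids, ",", 2), ",", -1)  # "1020"

print(first_two, last_one, second)
```

Replacing the literal 2 with the Sequence value from the join picks the first, second, third id in turn, which is what the query does.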
The complete query is:
select t1.`user id`,
       substring_index(substring_index(t1.`Follow the media id`, ',', t2.`Sequence`), ',', -1) as `Follow the media id`,
       t1.`date`
from `Fan attention table` t1
left join `Sequence table` t2
  on t2.`Sequence` <= (length(t1.`Follow the media id`) - length(replace(t1.`Follow the media id`, ',', '')) + 1);
The result has one row per user and media id.
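The whole pipeline can be tried without a MySQL server using the sketch below. It runs essentially the same query in Python's sqlite3 module; because SQLite has no SUBSTRING_INDEX, a Python stand-in is registered under that name with create_function. The A002 row, the dates, and the stand-in function are assumptions made for the example.

```python
import sqlite3


def substring_index(text, delim, count):
    # Stand-in for MySQL's SUBSTRING_INDEX (SQLite has no such built-in).
    if count == 0:
        return ""
    parts = text.split(delim)
    return delim.join(parts[:count] if count > 0 else parts[count:])


conn = sqlite3.connect(":memory:")
conn.create_function("substring_index", 3, substring_index)
conn.executescript(
    """
    CREATE TABLE `Fan attention table`
        (`user id` TEXT, `Follow the media id` TEXT, `date` TEXT);
    INSERT INTO `Fan attention table` VALUES
        ('A001', '1010,1020,1031', '2022-07-01'),  -- ids from the article, date hypothetical
        ('A002', '1040', '2022-07-02');            -- hypothetical row
    CREATE TABLE `Sequence table` (`Sequence` INTEGER);
    INSERT INTO `Sequence table` VALUES (1), (2), (3);
    """
)

rows = conn.execute(
    """
    SELECT t1.`user id`,
           substring_index(
               substring_index(t1.`Follow the media id`, ',', t2.`Sequence`),
               ',', -1),
           t1.`date`
    FROM `Fan attention table` t1
    LEFT JOIN `Sequence table` t2
      ON t2.`Sequence` <= (length(t1.`Follow the media id`)
                           - length(replace(t1.`Follow the media id`, ',', '')) + 1)
    ORDER BY t1.`user id`, t2.`Sequence`
    """
).fetchall()

for row in rows:
    print(row)
```

The ORDER BY clause is added here only to make the output order deterministic; it is not part of the article's query.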

【Key points tested】
1) Understanding of the sequence table;
2) Understanding of the string function SUBSTRING_INDEX;
3) Understanding of multi-table joins.

