How to analyze fans' interests?
2022-07-07 03:00:00 【Monkey data analysis】
【Question】
There is a "Fan attention table" with 3 fields: user id, Follow the media id, and date.
【Problem】In the "Fan attention table", one user may follow several media at the same time. For example, for the user whose user id is A001, the "Follow the media id" field holds 1010,1020,1031. To make later analysis of fans' interests easier, split each such row in the table into multiple rows, one per media id.
For example, the single row for user A001 becomes three rows, with 1010, 1020, and 1031 as the media id respectively, each keeping the original user id and date.
【Solution approach】
This kind of problem is called "column-to-row conversion". In MySQL it is generally handled in three steps:
1) Create a "Sequence table";
2) Join the tables so that each row of the original table is copied into multiple rows;
3) Use the substring_index function to get the final result.
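Step 1: Create the sequence table
A "Sequence table" is a table with only one field, which stores a sequence of consecutive integers starting from 1.
Here, the maximum value of "Sequence" is the largest number of media followed by any single user in this problem. The number of media in one row equals the number of commas in "Follow the media id" plus 1, and the number of commas is the length of the string minus its length after the commas are removed: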
select max(length(`Follow the media id`) - length(replace(`Follow the media id`, ',', '')) + 1) as `The maximum number of media attention`
from `Fan attention table`;
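The query returns a single number: the maximum number of media followed by any one user.
So the "Sequence table" we need contains the integers from 1 up to that maximum.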
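As a rough illustration of the comma-counting idea, here is a minimal Python sketch. The function names and the sample values other than the A001 example are hypothetical, made up for illustration; they do not come from the original table.

```python
def media_count(media_ids):
    """Number of comma-separated ids: commas + 1 (mirrors the SQL length/replace trick)."""
    return len(media_ids) - len(media_ids.replace(",", "")) + 1


def build_sequence(rows):
    """Integers 1..max count, i.e. what the sequence table should contain."""
    return list(range(1, max(media_count(r) for r in rows) + 1))


# "1010,1020,1031" is the A001 example from the text; the other values are made up.
sample = ["1010,1020,1031", "1010", "1020,1031"]
sequence = build_sequence(sample)
print(sequence)  # [1, 2, 3]
```

With this sample the longest row holds three media ids, so the sequence runs from 1 to 3.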
Step 2: Multi-table join
With a multi-table join, the "Sequence table" can be used to turn each row of the "Fan attention table" into multiple rows.
Two points need attention here:
1) To make sure no row of the original table is lost, use a left join with the original table as the left table;
2) The number of copies is limited in the join condition: the limit is the number of media the user follows, that is, the number of commas in the "Follow the media id" field plus 1.
select t1.`user id`,
       t1.`Follow the media id`,
       t1.`date`,
       t2.`Sequence`
from `Fan attention table` t1
left join `Sequence table` t2
  on t2.`Sequence` <= (length(t1.`Follow the media id`) - length(replace(t1.`Follow the media id`, ',', '')) + 1);
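The result contains each original row once per followed media, with "Sequence" running from 1 up to that user's media count.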
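The row-copying effect of this join can be sketched in plain Python as a nested loop. This is only an illustration of the join condition, not how MySQL executes it; the function names, the user A002, and the dates are hypothetical.

```python
def media_count(media_ids):
    """Commas + 1 = number of media ids in the field."""
    return media_ids.count(",") + 1


def expand_rows(fan_rows, sequence):
    """Pair each fan row with every sequence number n where n <= its media count."""
    result = []
    for user_id, media_ids, date in fan_rows:
        matches = [n for n in sequence if n <= media_count(media_ids)]
        # Left-join behaviour: a row with no matching n is kept once, with None.
        for n in matches or [None]:
            result.append((user_id, media_ids, date, n))
    return result


# A001's media ids come from the text; A002 and both dates are made up.
fan_rows = [
    ("A001", "1010,1020,1031", "2022-01-01"),
    ("A002", "1010", "2022-01-02"),
]
for row in expand_rows(fan_rows, [1, 2, 3]):
    print(row)
```

A001 appears three times (with 1, 2, 3) and A002 once (with 1), which is the shape the next step relies on.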
Step 3: Use the function to get the result
The next step is to extract each media id, which requires the string function SUBSTRING_INDEX.
SUBSTRING_INDEX(string, delimiter, count)
Here, the delimiter is the "," that separates the media ids in this problem. The count says how many media ids to take, counting from left to right, after splitting on the delimiter; if the count is negative, the media ids are taken counting from right to left. The inner call below keeps the first "Sequence" media ids, and the outer call with -1 keeps the last of those, which is exactly the media id at position "Sequence".
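To make the nested call easier to follow, here is a small Python stand-in for SUBSTRING_INDEX. It is a sketch of the behaviour described above, assuming a non-empty delimiter; it is not MySQL's implementation, and the helper nth_item is a hypothetical name.

```python
def substring_index(s, delim, count):
    """Python stand-in for MySQL SUBSTRING_INDEX (assumes delim is non-empty)."""
    if count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        # Keep the first `count` pieces, counting from the left.
        return delim.join(parts[:count])
    # Negative count: keep the last `-count` pieces, counting from the right.
    return delim.join(parts[count:])


def nth_item(s, n):
    """Inner call keeps the first n ids; outer call with -1 keeps the last of them."""
    return substring_index(substring_index(s, ",", n), ",", -1)


ids = "1010,1020,1031"  # the A001 example from the text
print(substring_index(ids, ",", 2))   # 1010,1020
print(substring_index(ids, ",", -1))  # 1031
print(nth_item(ids, 2))               # 1020
```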
select t1.`user id`,
       substring_index(substring_index(t1.`Follow the media id`, ',', t2.`Sequence`), ',', -1) as `Follow the media id`,
       t1.`date`
from `Fan attention table` t1
left join `Sequence table` t2
  on t2.`Sequence` <= (length(t1.`Follow the media id`) - length(replace(t1.`Follow the media id`, ',', '')) + 1);
The result has one row per user and followed media id, which is the split table the problem asks for.
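For readers without a MySQL instance at hand, the whole pipeline can be tried out with Python's built-in sqlite3 module. This is a sketch under several assumptions: SQLite has no built-in SUBSTRING_INDEX, so a Python stand-in is registered as a user-defined function; the table and column names (fan_follow, seq, user_id, media_ids, follow_date, n) are hypothetical stand-ins for the article's tables; and the user A002 and the dates are made up.

```python
import sqlite3


def substring_index(s, delim, count):
    """Python stand-in for MySQL SUBSTRING_INDEX, registered in SQLite below."""
    if s is None or count is None:
        return None
    if count == 0:
        return ""
    parts = s.split(delim)
    return delim.join(parts[:count] if count > 0 else parts[count:])


def split_follows():
    conn = sqlite3.connect(":memory:")
    conn.create_function("substring_index", 3, substring_index)
    conn.executescript("""
        CREATE TABLE fan_follow (user_id TEXT, media_ids TEXT, follow_date TEXT);
        CREATE TABLE seq (n INTEGER);
        INSERT INTO fan_follow VALUES
            ('A001', '1010,1020,1031', '2022-01-01'),
            ('A002', '1010', '2022-01-02');
        INSERT INTO seq VALUES (1), (2), (3);
    """)
    # Same shape as the article's final query: left join on the comma count,
    # then nested substring_index to pick the n-th media id.
    rows = conn.execute("""
        SELECT t1.user_id,
               substring_index(substring_index(t1.media_ids, ',', t2.n), ',', -1),
               t1.follow_date
        FROM fan_follow t1
        LEFT JOIN seq t2
          ON t2.n <= length(t1.media_ids) - length(replace(t1.media_ids, ',', '')) + 1
        ORDER BY t1.user_id, t2.n
    """).fetchall()
    conn.close()
    return rows


for row in split_follows():
    print(row)
```

The ORDER BY clause is added here only to make the output order predictable; the article's query does not specify one.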
【Key points tested】
1) Understanding of the sequence table;
2) Understanding of the string function SUBSTRING_INDEX;
3) Understanding of multi-table joins.