
[Data analysis and visualization] Key points of data drawing 11: precautions for radar charts

2022-06-13 02:35:00 The winter holiday of falling marks


Representing data accurately with a radar chart takes some care. This article covers the main points to watch out for.

Drawing radar charts

Basic radar chart

A radar chart, also called a spider chart or web chart, is a two-dimensional chart type designed to plot one or more series of values over multiple quantitative variables. Each variable has its own axis, and all axes are joined at the center of the figure. Consider a student's exam results: he has a mark between 0 and 20 in each of ten subjects such as math, sport and statistics. The radar chart gives each subject its own axis, and the resulting shape shows in which subjects the student does well or poorly.

#  Load the library 
library(tidyverse)
library(viridis)
library(patchwork)
library(hrbrthemes)
library(fmsb)
library(colormap)


#  Create data 
#  Set the random seed 
set.seed(42) 
#  Generate ten random marks between 2 and 20 
data <- as.data.frame(matrix( sample( 2:20 , 10 , replace=T) , ncol=10)) 
#  Add column names 
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )
#  Add the maximum and minimum of each column as the first two rows, as radarchart expects 
data <- rbind(rep(20,10) , rep(0,10) , data)
data
A data.frame: 3 × 10 (all columns <dbl>)
math  english  biology  music  R-coding  data-viz  french  physic  statistic  sport
  20       20       20     20        20        20      20      20         20     20
   0        0        0      0         0         0       0       0          0      0
  18        6        2     11         5        19      18      16          8      5

#  Create the radar chart 
par(mar=c(0,0,0,0))
radarchart( data, axistype=1, 
           #  Customize the polygon 
           # pcol sets the polygon border color, pfcol the fill color, plwd the border width 
           pcol=rgb(0.9,0.6,0.5,0.9) , pfcol=rgb(0.2,0.5,0.5,0.5) , plwd=4, 
           #  Customize the grid 
           # cglcol sets the grid line color, cglty the grid line type, axislabcol the axis label color 
           # caxislabels sets the labels drawn along the center axis, cglwd the grid line width 
           cglcol="black", cglty=2, axislabcol="blue", caxislabels=seq(0,20,5), cglwd=0.8,
           # vlcex sets the size of the variable labels 
           vlcex=1.2 
)

[Figure: radar chart of one student's marks in the ten subjects]

Radar chart with several series

The previous chart plotted a single series, showing one student's results. A common task is to compare several individuals. With only a few series, you can draw them all on the same chart. In the chart below, Shirley scores higher than Sonia in R-coding, data-viz, French, physics and sport, while Sonia is ahead in math, English, biology and music.

#  Create data 
set.seed(1)
data <-as.data.frame(matrix( c( sample( 2:20 , 10 , replace=T), sample( 2:9 , 10 , replace=T)) , ncol=10, byrow=TRUE))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )
#  Set the data of the second row and the second column as 19
data[2,2]=19
#  Add the maximum and minimum ranges of each column to the data 
data <- rbind(rep(20,10) , rep(0,10) , data)
data
A data.frame: 4 × 10 (all columns <dbl>)
math  english  biology  music  R-coding  data-viz  french  physic  statistic  sport
  20       20       20     20        20        20      20      20         20     20
   0        0        0      0         0         0       0       0          0      0
   5        8        2      3        12        15      19      20          2     11
   7       19        3      8         2         8       6       6          2      2
#  Set the color 
colors_border=c( rgb(0.2,0.5,0.5,0.9), rgb(0.8,0.2,0.5,0.9)  )
colors_in=c( rgb(0.2,0.5,0.5,0.4), rgb(0.8,0.2,0.5,0.4)  )

#  Create the radar chart 
radarchart( data, axistype=1, 
           #  Customize the polygons 
           pcol=colors_border , pfcol=colors_in , plwd=4, plty=1 , 
           #  Customize the grid 
           cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=1.1,
           #  Customize the variable labels 
           vlcex=0.8 )

#  Add a legend 
# bty sets the type of box drawn around the legend, pch the symbol, col the symbol color 
# text.col sets the text color, cex the text size, pt.cex the symbol size 
legend(x=0.85, y=1, legend = c("Shirley", "Sonia"), bty = "n", pch=20 , col=colors_border , text.col = "black", cex=0.9, pt.cex=1.6)

[Figure: radar chart comparing Shirley and Sonia]

With more than two series, it is good practice to use small multiples to avoid a cluttered figure. Each student gets his own radar chart. This makes it easy to read the profile of a particular individual, and looking for similar shapes lets you find students with similar profiles.

#  Create data 
set.seed(1)
data <-as.data.frame(matrix( sample( 2:20 , 60 , replace=T) , ncol=10, byrow=TRUE))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )
#  Add the maximum and minimum ranges of each column to the data 
data <- rbind(rep(20,10) , rep(0,10) , data)
data
A data.frame: 8 × 10 (all columns <dbl>)
math  english  biology  music  R-coding  data-viz  french  physic  statistic  sport
  20       20       20     20        20        20      20      20         20     20
   0        0        0      0         0         0       0       0          0      0
   5        8        2      3        12        15      19      20          2     11
  15       11        8     10        16         6      10      15          6      6
   3       11       13     16         2         4       7      11         11      7
  16       13        7      9        13         7       8      20         11      7
  15        3       14     19        15         7       2      20         20      9
   7       13        7      9         8        12      18       5         14      9
#  Set the color 
colors_border=colormap(colormap=colormaps$viridis, nshades=6, alpha=1)
colors_in=colormap(colormap=colormaps$viridis, nshades=6, alpha=0.3)

#  Set title 
mytitle <- c("Max", "George", "Xue", "Tom", "Alice", "bob")

#  Lay out the panels in a 2 x 3 grid 
par(mar=rep(0.8,4))
par(mfrow=c(2,3))

#  Draw one chart per student 
for(i in 1:6){
    radarchart( data[c(1,2,i+2),], axistype=1, 
    pcol=colors_border[i] , pfcol=colors_in[i] , plwd=4, plty=1 , 
    cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=0.8,
    vlcex=0.8,
    title=mytitle[i]
    )
}

[Figure: six radar charts, one per student (Max, George, Xue, Tom, Alice, bob)]

Problems and solutions of radar charts

Problems with radar charts

1 A circular layout is harder to read

Quantitative values are easier to compare when they are laid out along a single vertical or horizontal axis. This is a general criticism of circular layouts. The two figures below show the data of a single student, first as a radar chart and then as a lollipop chart (a variant of the bar chart). Values are easier to compare, and can be read more accurately, in the lollipop chart.

#  Create data 
set.seed(1)
data <-as.data.frame(matrix( sample( 2:20 , 10 , replace=T) , ncol=10))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )

#  Add the maximum and minimum ranges of each column to the data 
data <-rbind(rep(20,10) , rep(0,10) , data)
data
A data.frame: 3 × 10 (all columns <dbl>)
math  english  biology  music  R-coding  data-viz  french  physic  statistic  sport
  20       20       20     20        20        20      20      20         20     20
   0        0        0      0         0         0       0       0          0      0
   5        8        2      3        12        15      19      20          2     11
#  Create radar map 
par(mar=c(0,0,0,0))
p1 <- radarchart( data, axistype=1, 
                 pcol=rgb(0.2,0.5,0.5,0.9) , pfcol=rgb(0.2,0.5,0.5,0.5) , plwd=4 , 
                 cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=0.8,vlcex=1.3 )

[Figure: radar chart of one student's marks]


#  Create a lollipop chart, ordered by mark 
#  (rownames_to_column() replaces the deprecated add_rownames()) 
data %>% slice(3) %>% t() %>% as.data.frame() %>% rownames_to_column() %>% arrange(V1) %>% mutate(rowname=factor(rowname, rowname)) %>%
  ggplot( aes(x=rowname, y=V1)) +
    geom_segment( aes(x=rowname ,xend=rowname, y=0, yend=V1), color="grey") +
    geom_point(size=5, color="#69b3a2") +
    coord_flip() +
    theme(
      panel.grid.minor.y = element_blank(),
      panel.grid.major.y = element_blank(),
      axis.text = element_text( size=32),
      legend.position="none"
    ) +
    ylim(0,20) +
    ylab("mark") +
    xlab("")

[Figure: ordered lollipop chart of the same student's marks]

2 Ranking is not supported

In the example above, the lollipop chart is ordered, so you can immediately see which subject has the highest mark and how the subjects rank. This is much harder on a radar chart, which has no start point and no end point.

3 Category order has a huge impact

Readers of a radar chart tend to focus on the overall shape. This can be misleading, because the shape depends heavily on the order of the categories around the circle. The three charts below are drawn from exactly the same data; only the order of the categories differs.

#  Create data 
set.seed(7)
data <- as.data.frame(matrix( sample( 2:20 , 10 , replace=T) , ncol=10))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )
data[1,1:3]=rep(19,3)
data[1,6:8]=rep(4,3)
data <- rbind(rep(20,10) , rep(0,10) , data)
data
A data.frame: 3 × 10 (all columns <dbl>)
math  english  biology  music  R-coding  data-viz  french  physic  statistic  sport
  20       20       20     20        20        20      20      20         20     20
   0        0        0      0         0         0       0       0          0      0
  19       19       19      3        16         4       4       4         16      9
#  Shuffle the column order to create two more data sets 
data2 <- data[,sample(1:10,10, replace=FALSE)]
data2
data3 <- data[,sample(1:10,10, replace=FALSE)]
data3
A data.frame: 3 × 10 (all columns <dbl>)
sport  biology  music  english  data-viz  statistic  math  R-coding  french  physic
   20       20     20       20        20         20    20        20      20      20
    0        0      0        0         0          0     0         0       0       0
    9       19      3       19         4         16    19        16       4       4
A data.frame: 3 × 10 (all columns <dbl>)
french  english  music  data-viz  biology  physic  math  sport  statistic  R-coding
    20       20     20        20       20      20    20     20         20        20
     0        0      0         0        0       0     0      0          0         0
     4       19      3         4       19       4    19      9         16        16
#  Plot the three versions 
par(mar=c(0,0,0,0))
par(mfrow=c(3,1))
radarchart( data, axistype=1, pcol=rgb(0.2,0.5,0.5,0.9) , pfcol=rgb(0.2,0.5,0.5,0.5) , plwd=4 ,   
           cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=0.8, vlcex=0.8  )
radarchart( data2, axistype=1, pcol=rgb(0.2,0.5,0.5,0.9) , pfcol=rgb(0.2,0.5,0.5,0.5) , plwd=4 ,  
           cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=0.8, vlcex=0.8  )
radarchart( data3, axistype=1, pcol=rgb(0.2,0.5,0.5,0.9) , pfcol=rgb(0.2,0.5,0.5,0.5) , plwd=4 ,   
           cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=0.8, vlcex=0.8  )

[Figure: three radar charts of the same data with different category orders]

4 Unclear scales

A radar chart shows the values of several quantitative variables, each on its own axis. In the previous examples, all variables share the same scale (0 to 20) and the same unit. But a radar chart can also show completely different variables. In that case, do not forget to show an explicit scale on each axis; otherwise the reader will assume that all axes share the same scale.
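One possible way to make each scale explicit is sketched below. The student record, the variable names (exam_mark, attendance_pct, study_hours) and the axis limits are all hypothetical values chosen for illustration. The sketch relies on the same convention used throughout this article, where the first two rows passed to radarchart hold the maximum and minimum of each column, and it writes each range into the variable label through the vlabels argument.

```r
# Hypothetical student record: three variables on three different scales.
student <- data.frame(exam_mark = 14, attendance_pct = 85, study_hours = 12)

# Assumed axis limits for each variable (maximum first, then minimum),
# in the row order that fmsb::radarchart() expects.
limits <- data.frame(
  exam_mark      = c(20, 0),
  attendance_pct = c(100, 0),
  study_hours    = c(40, 0)
)
radar_data <- rbind(limits, student)

# Write each variable's range into its label so the scale is explicit.
axis_labels <- sprintf("%s\n(%g to %g)", names(radar_data),
                       unlist(radar_data[2, ]), unlist(radar_data[1, ]))
print(axis_labels)

# Draw only if fmsb is installed. axistype = 0 hides the shared center-axis
# labels, which would be misleading when the scales differ.
if (requireNamespace("fmsb", quietly = TRUE)) {
  fmsb::radarchart(radar_data, axistype = 0, vlabels = axis_labels,
                   pcol = rgb(0.2, 0.5, 0.5, 0.9),
                   pfcol = rgb(0.2, 0.5, 0.5, 0.5), plwd = 4)
}
```

Putting the range in the label is only one option; the point is that the reader should never have to guess the scale of an axis.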

5 Differences are overestimated

The area of the shape in a radar chart grows quadratically, not linearly, with the values, which can lead viewers to believe that a change is larger than it really is. In the following example, the student in the first chart scored 7 in every subject and the student in the second chart scored 14. The second student's marks are exactly twice the first's, yet the area of the second shape is more than twice that of the first.

#  Create data 
data <- as.data.frame(matrix( c(7,7,7,7,7) , ncol=5))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding")
#  Add the maximum and minimum of each of the 5 columns 
data <- rbind(rep(20,5) , rep(0,5) , data)
data
#  Same layout, with every mark doubled 
data2 <- data
data2[3,] <- rep(14,5)
data2
A data.frame: 3 × 5 (all columns <dbl>)
math  english  biology  music  R-coding
  20       20       20     20        20
   0        0        0      0         0
   7        7        7      7         7
A data.frame: 3 × 5 (all columns <dbl>)
math  english  biology  music  R-coding
  20       20       20     20        20
   0        0        0      0         0
  14       14       14     14        14
#  Plot both charts, one above the other 
par(mar=rep(0,4))
par(mfrow=c(2,1))
radarchart( data, axistype=1, pcol=rgb(0.2,0.5,0.5,0.9) , pfcol=rgb(0.2,0.5,0.5,0.5) , 
           plwd=4 , cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=0.8, vlcex=0.8  )
radarchart( data2, axistype=1, pcol=rgb(0.2,0.5,0.5,0.9) , pfcol=rgb(0.2,0.5,0.5,0.5) , 
           plwd=4 , cglcol="grey", cglty=1, axislabcol="grey", caxislabels=seq(0,20,5), cglwd=0.8, vlcex=0.8  )

[Figure: two radar charts, one with all marks equal to 7 and one with all marks equal to 14]
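The size of this effect can be checked with a short calculation. The sketch below is not part of the original code: radar_area and score_to_radius are hypothetical helpers, and the gap value of 0.2 is an assumed illustrative figure for charts that do not draw the minimum exactly at the center (fmsb exposes this choice through the centerzero argument of radarchart, which is FALSE by default). A regular polygon with n vertices at radius r has area n * r^2 * sin(2*pi/n) / 2, so the area scales with the square of the radius.

```r
# Area of a regular radar polygon whose n vertices all sit at radius r:
# n triangles, each with area r^2 * sin(2 * pi / n) / 2.
radar_area <- function(r, n) {
  n * r^2 * sin(2 * pi / n) / 2
}

# Hypothetical helper: map a score on a 0..max_score axis to a plotting radius.
# `gap` is the fraction of the radius left empty at the center
# (0 means the minimum sits exactly at the center).
score_to_radius <- function(score, max_score = 20, gap = 0) {
  gap + (score / max_score) * (1 - gap)
}

n <- 5  # five subjects, as in the example above

# Axes starting at the center: doubling the score quadruples the area.
ratio_no_gap <- radar_area(score_to_radius(14), n) /
  radar_area(score_to_radius(7), n)

# Assumed gap of 0.2 at the center: the ratio is smaller, but still above 2.
ratio_gap <- radar_area(score_to_radius(14, gap = 0.2), n) /
  radar_area(score_to_radius(7, gap = 0.2), n)

print(ratio_no_gap)
print(round(ratio_gap, 2))
```

Either way, a mark that is twice as high produces a shape whose area is more than twice as large, which is why the difference looks bigger than it is.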

Solutions

If you want to display a single series and all quantitative variables share the same scale, use a bar chart or a lollipop chart and order the variables by value:

#  Create data 
set.seed(1)
data <-as.data.frame(matrix( sample( 2:20 , 10 , replace=T) , ncol=10))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )
data <-rbind(rep(20,10) , rep(0,10) , data)

#  Draw a lollipop chart, ordered by mark 
#  (rownames_to_column() replaces the deprecated add_rownames()) 
data %>% slice(3) %>% t() %>% as.data.frame() %>% rownames_to_column() %>% arrange(V1) %>% mutate(rowname=factor(rowname, rowname)) %>%
  ggplot( aes(x=rowname, y=V1)) +
    geom_segment( aes(x=rowname ,xend=rowname, y=0, yend=V1), color="grey") +
    geom_point(size=5, color="#69b3a2") +
    coord_flip() +
    theme(
      panel.grid.minor.y = element_blank(),
      panel.grid.major.y = element_blank(),
      axis.text = element_text( size=32 ),
      legend.position="none"
    ) +
    ylim(0,20) +
    ylab("mark") +
    xlab("")

[Figure: ordered lollipop chart of one student's marks]

If you have two series to plot, you can still use a bar chart or a lollipop chart. Here is an example with 2 series. It puts the emphasis on the first student (dark points) while still letting you see how the other student (light points) performs.

#  Create data 
set.seed(1)
data <-as.data.frame(matrix( sample( 2:20 , 20 , replace=T) , ncol=10))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )
data <-rbind(rep(20,10) , rep(0,10) , data)

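#  Plot, ordered by the first student's marks 
#  (rownames_to_column() replaces the deprecated add_rownames()) 
data %>% slice(c(3,4)) %>% t() %>% as.data.frame() %>% rownames_to_column() %>% arrange(V1) %>% mutate(rowname=factor(rowname, rowname)) %>%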
  ggplot( aes(x=rowname, y=V1)) +
    #  Draw lines 
    geom_segment( aes(x=rowname ,xend=rowname, y=V2, yend=V1), color="grey") +
    geom_point(size=5, color="#69b3a2") +
    #  Set transparency 
    geom_point(aes(y=V2), size=5, color="#69b3a2", alpha=0.5) +
    coord_flip() +
    theme(
      panel.grid.minor.y = element_blank(),
      panel.grid.major.y = element_blank(),
      axis.text = element_text( size=32 )
    ) +
    ylim(0,20) +
    ylab("mark") +
    xlab("")

[Figure: lollipop chart comparing two students]

If you have more than 2 series to plot, faceting a bar chart or a lollipop chart may solve the problem:

#  Create data 
set.seed(1)
data <-as.data.frame(matrix( sample( 2:20 , 40 , replace=T) , ncol=10))
colnames(data) <- c("math" , "english" , "biology" , "music" , "R-coding", "data-viz" , "french" , "physic", "statistic", "sport" )
data <-rbind(rep(20,10) , rep(0,10) , data)
rownames(data) <- c("-", "--", "John", "Angli", "Baptiste", "Alfred")

#  Reshape the data to long format 
#  (rownames_to_column() replaces the deprecated add_rownames()) 
data <- data %>% slice(c(3:6)) %>% 
  t() %>% 
  as.data.frame() %>% 
  rownames_to_column() %>% 
  arrange() %>% 
  mutate(rowname=factor(rowname, rowname)) %>% 
  gather(key=name, value=mark, -1)

#  Label each series with the student's name 
data$name <- recode(data$name, V1 = "John", V2 = "Angli", V3 = "Baptiste", V4 = "Alfred")

#  Plot 
data %>% ggplot( aes(x=rowname, y=mark)) +
    geom_bar(stat="identity", fill="#69b3a2", width=0.6) +
    coord_flip() +
    theme(
      panel.grid.minor.y = element_blank(),
      panel.grid.major.y = element_blank(),
      axis.text = element_text( size=16 )
    ) +
    ylim(0,20) +
    ylab("mark") +
    xlab("") +
    facet_wrap(~name, ncol=4)

[Figure: faceted bar charts for John, Angli, Baptiste and Alfred]

If you have many series to plot, or if your variables do not share the same scale, the best option is probably to switch to a parallel coordinates plot.

library(GGally)
#  Use the built-in iris data 
data <- iris

#  Plot 
data %>% 
ggparcoord(
columns = 1:4, groupColumn = 5, order = "anyClass",
showPoints = TRUE, 
title = "Parallel Coordinate Plot for the Iris Data",
alphaLines = 0.3
) + 
scale_color_viridis(discrete=TRUE) 

[Figure: parallel coordinates plot of the iris data]
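When the variables are on different scales, they need to be put on a common footing before they are drawn side by side. The sketch below shows one option, under stated assumptions: rescale01 is a hypothetical helper that maps each variable to the range 0 to 1, and the final call uses the scale = "uniminmax" option of ggparcoord, which requests the same per-variable rescaling from GGally. Other choices, such as standardizing each variable, are equally valid.

```r
# Hypothetical helper: rescale a numeric vector so its minimum maps to 0
# and its maximum maps to 1.
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Rescale the four numeric iris columns independently of each other.
iris_scaled <- iris
iris_scaled[1:4] <- lapply(iris[1:4], rescale01)

# Every variable now spans 0..1, so the parallel axes are comparable.
print(sapply(iris_scaled[1:4], range))

# With GGally installed, the same rescaling can be requested directly.
if (requireNamespace("GGally", quietly = TRUE)) {
  p <- GGally::ggparcoord(iris, columns = 1:4, groupColumn = 5,
                          scale = "uniminmax", alphaLines = 0.3)
  print(p)
}
```

Because each axis is rescaled separately, the plot compares relative positions within each variable, not raw values, so the original ranges should be stated somewhere in the figure or its caption.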

Copyright notice
This article was written by [The winter holiday of falling marks]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202280540284164.html