[Data analysis and visualization] Key points of data plotting 4: the problem with pie charts
2022-06-13 02:33:00 【The winter holiday of falling marks】
Key points of data plotting 4: the problem with pie charts
This article looks at the most criticized chart type in history: the pie chart.
Bad by definition
A pie chart is a circle divided into sectors, each of which represents a part of the whole. It is usually used to display percentages, where the sectors sum to 100%. The problem is that humans are very bad at reading angles. In the pie charts below, try to find the largest group and to sort the groups by value. You will probably find this difficult, which is why pie charts should be avoided. Let's compare three pie charts: try to work out which group has the highest value in each chart, and how the values of the groups change from one chart to the next.
# Libraries
library(tidyverse)
library(hrbrthemes)
library(viridis)
library(patchwork)
# Create three data frames
data1 <- data.frame( name=letters[1:5], value=c(17,18,20,22,24) )
data2 <- data.frame( name=letters[1:5], value=c(20,18,21,20,20) )
data3 <- data.frame( name=letters[1:5], value=c(24,23,21,19,18) )
# View the data
data1
data2
data3
| name | value |
|---|---|
| <fct> | <dbl> |
| a | 17 |
| b | 18 |
| c | 20 |
| d | 22 |
| e | 24 |

| name | value |
|---|---|
| <fct> | <dbl> |
| a | 20 |
| b | 18 |
| c | 21 |
| d | 20 |
| e | 20 |

| name | value |
|---|---|
| <fct> | <dbl> |
| a | 24 |
| b | 23 |
| c | 21 |
| d | 19 |
| e | 18 |
# Define the drawing function
plot_pie <- function(data, vec){
ggplot(data, aes(x="name", y=value, fill=name)) +
# A pie chart is a bar chart
geom_bar(width = 1, stat = "identity") +
# Change to polar coordinate system
coord_polar("y", start=0, direction = -1) +
# Set fill color
scale_fill_viridis(discrete = TRUE, direction=-1) +
# Add the text labels (size is a constant, so it is set outside aes())
geom_text(aes(y = vec, label = rev(name), color = c("white", rep("black", 4))), size = 4) +
scale_color_manual(values=c("black", "white")) +
theme(
legend.position="none",
plot.title = element_text(size=14),
panel.grid = element_blank(),
axis.text = element_blank()
) +
xlab("") +
ylab("")
}
a <- plot_pie(data1, c(10,35,55,75,93))
b <- plot_pie(data2, c(10,35,53,75,93))
c <- plot_pie(data3, c(10,29,50,75,93))
a + b + c
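A minimal sketch of why these three pies are hard to compare: each slice's angle is its share of the total multiplied by 360 degrees, so values that differ by one or two units end up only a few degrees apart. The helper name `slice_angles` is hypothetical (it is not part of the original code); the values are the same as in `data1`.

```r
# Same values as data1 above
data1 <- data.frame(name = letters[1:5], value = c(17, 18, 20, 22, 24))

# Hypothetical helper: convert raw values to pie-slice angles in degrees
slice_angles <- function(value) {
  value / sum(value) * 360
}

data1$angle <- slice_angles(data1$value)

# Angle difference between each group and the previous one, in degrees
data1$diff_from_previous <- c(NA, diff(data1$angle))

print(data1)
```

For `data1`, neighbouring slices differ by less than 8 degrees, which is hard to judge by eye, whereas the same differences appear directly as differences in bar height.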

Now let's represent exactly the same data with a bar chart (barplot):
# Define the drawing function
plot_bar <- function(data){
ggplot(data, aes(x=name, y=value, fill=name)) +
# Draw a bar graph
geom_bar(stat = "identity") +
# Set fill color
scale_fill_viridis(discrete = TRUE, direction=-1) +
scale_color_manual(values=c("black", "white")) +
theme(
legend.position="none",
plot.title = element_text(size=14),
panel.grid = element_blank()
) +
ylim(0,25) +
xlab("") +
ylab("")
}
a <- plot_bar (data1)
b <- plot_bar (data2)
c <- plot_bar (data3)
a + b + c

Let's recall why we use charts in the first place.
- A chart is a way to present information so that it is easier to understand.
- Generally speaking, the purpose of a chart is to make different sets of data easier to compare.
- A good chart conveys as much information as possible without adding complexity.
As the comparison of the two figures shows, pie charts make the differences between values hard to see, whereas bar charts make them obvious. A pie chart makes it hard to compare values, and it leaves no room to convey additional information.
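The "value evolution" question asked earlier can also be answered numerically. The sketch below, in base R, puts the three data frames side by side and computes each group's change from the first dataset to the third. The column names `first`, `second`, `third` and `change`, and the object names `evolution` and `top_groups`, are illustrative choices, not part of the original code.

```r
# Same values as data1, data2 and data3 above
data1 <- data.frame(name = letters[1:5], value = c(17, 18, 20, 22, 24))
data2 <- data.frame(name = letters[1:5], value = c(20, 18, 21, 20, 20))
data3 <- data.frame(name = letters[1:5], value = c(24, 23, 21, 19, 18))

# One row per group, one column per dataset
evolution <- data.frame(
  name   = as.character(data1$name),
  first  = data1$value,
  second = data2$value,
  third  = data3$value,
  stringsAsFactors = FALSE
)

# Change of each group between the first and the third dataset
evolution$change <- evolution$third - evolution$first

# Group with the highest value in each dataset
top_groups <- sapply(
  evolution[c("first", "second", "third")],
  function(v) evolution$name[which.max(v)]
)

print(evolution)
print(top_groups)
```

Groups a and b grow while d and e shrink, which is the pattern the bar charts show at a glance and the pie charts hide.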
Solution
The bar chart is the best alternative to the pie chart. If you have many values to display, you can also consider a lollipop chart, which I find more elegant. The following example is based on the number of items sold by a few countries and regions of the world:
# Load the data from GitHub
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/7_OneCatOneNum.csv", header=TRUE, sep=",")
# Remove rows with missing values
data <- filter(data,!is.na(Value))
nrow(data)
head(data)
# Sort the rows by Value
data<- arrange(data,Value)
# Convert Country to a factor whose levels follow the sorted order
data<- mutate(data,Country=factor(Country, Country))
# Draw the plot
ggplot(data,aes(x=Country, y=Value) ) +
# Draw the stem of each lollipop, from 0 to the value
geom_segment( aes(x=Country ,xend=Country, y=0, yend=Value), color="grey") +
# Draw the points
geom_point(size=3, color="#69b3a2") +
# Swap the x and y axes
coord_flip() +
# Set the theme
theme(
# Remove the horizontal grid lines
panel.grid.minor.y = element_blank(),
panel.grid.major.y = element_blank(),
legend.position="none"
) +
# After coord_flip() the original x axis is drawn vertically; set its title to empty
xlab("")
38

| | Country | Value |
|---|---|---|
| | <fct> | <int> |
| 1 | United States | 12394 |
| 2 | Russia | 6148 |
| 3 | Germany (FRG) | 1653 |
| 4 | France | 2162 |
| 5 | United Kingdom | 1214 |
| 6 | China | 1131 |
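The ordering step in the code above relies on how factors work: ggplot2 lays out a discrete axis in factor-level order, so sorting the rows and then calling `factor(Country, Country)` fixes the levels in sorted order. A minimal base-R sketch of that idea, using only the six rows printed by `head(data)` (the name `df` is an illustrative choice):

```r
# The six rows shown in the head(data) output above
df <- data.frame(
  Country = c("United States", "Russia", "Germany (FRG)",
              "France", "United Kingdom", "China"),
  Value   = c(12394, 6148, 1653, 2162, 1214, 1131),
  stringsAsFactors = FALSE
)

# By default, factor levels are sorted alphabetically, ignoring Value
print(levels(factor(df$Country)))

# Sort the rows by Value, then use the sorted names as the levels
df <- df[order(df$Value), ]
df$Country <- factor(df$Country, levels = df$Country)
print(levels(df$Country))
```

Because the first level is drawn at the bottom of the vertical axis after `coord_flip()`, ascending order puts the largest value at the top of the lollipop chart.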

If your goal is to describe the composition of the whole, another possibility is to create a treemap.
# Load the treemap package
library(treemap)
# Draw the treemap
treemap(data,
# Data
index="Country",
vSize="Value",
type="index",
# Title and color palette
title="",
palette="Dark2",
# Border color
border.col=c("black"),
# Border line width
border.lwds=3,
# Label color
fontcolor.labels="white",
# Label font face (2 = bold)
fontface.labels=2,
# Label position
align.labels=c("left", "top"),
# Scale labels with tile size: the larger the tile, the larger the label
inflate.labels=TRUE,
# Label font size
fontsize.labels=5
)
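Tile areas in a treemap are proportional to `vSize`, so each tile effectively encodes a share of the total. As a rough sketch, those shares can be computed directly. Only the six rows printed earlier are used here, so the percentages are relative to that subset and not to all 38 countries; the names `df` and `share` are illustrative.

```r
# The six rows shown in the head(data) output above
df <- data.frame(
  Country = c("United States", "Russia", "Germany (FRG)",
              "France", "United Kingdom", "China"),
  Value   = c(12394, 6148, 1653, 2162, 1214, 1131),
  stringsAsFactors = FALSE
)

# Share of each country in the total of this subset, in percent
df$share <- 100 * df$Value / sum(df$Value)

# Largest share first, rounded for display
df <- df[order(-df$share), ]
print(data.frame(Country = df$Country, share = round(df$share, 1)))
```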
