Posture recognition and simple behavior recognition based on mediapipe
2022-07-27 23:55:00 【Wind dwelling willow poplar】
Learning goals
1. Be able to recognize the key points of the human pose
2. Be able to recognize human actions through joint-angle measurements (custom-defined)
One. Installing MediaPipe
This part is actually very simple. In a Windows command-line environment, just run
pip install mediapipe
and that is all.
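To confirm the installation worked, a quick check like the following is enough (a minimal sketch; the exact version string printed depends on what pip installed in your environment):

# Verify that the package imports and exposes the pose solution used later in this article
import mediapipe as mp

print(mp.__version__)        # e.g. 0.8.x, depending on the installed release
print(mp.solutions.pose)     # the pose solution module we will use below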
Two. Using MediaPipe to detect key points
1. Introduction to MediaPipe
MediaPipe is a framework for building machine-learning pipelines that process video, audio, and other time-series data. This cross-platform framework runs on desktop/server, Android, iOS, and various embedded devices.
MediaPipe currently contains 16 solutions:
Face Detection
Face Mesh
Iris
Hands
Pose
Holistic
Selfie Segmentation
Hair Segmentation
Object Detection
Box Tracking
Instant Motion Tracking
Objectron (3D object detection)
KNIFT (feature matching)
AutoFlip
MediaSequence
YouTube-8M
The one used in this article is the pose solution; Google officially calls this human pose estimation method BlazePose.
(0) Preparations before testing
''' Import the basic libraries '''
import cv2
import mediapipe as mp
import time
import random
from tqdm import tqdm
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageFont, ImageDraw

# ------------------------------------------------
# Initialize mediapipe
# This step is required, because the classes defined below are used throughout
# ------------------------------------------------
mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils
pose = mp_pose.Pose(static_image_mode=True)
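A side note (my own addition, not in the original post): static_image_mode=True treats every input as an independent photo. When processing video later, it is usually faster and smoother to create a detector with tracking enabled; a minimal sketch using the standard MediaPipe Pose parameters:

# Optional: a Pose instance tuned for video, where landmarks are tracked between frames
pose_video = mp_pose.Pose(static_image_mode=False,      # track instead of re-detecting every frame
                          model_complexity=1,            # 0, 1 or 2; higher is slower but more accurate
                          min_detection_confidence=0.5,
                          min_tracking_confidence=0.5)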
(1) Detecting an image
def process_frame(img):
    start_time = time.time()
    h, w = img.shape[0], img.shape[1]  # image height and width
    # scale the font and line thickness with the image size
    tl = round(0.005 * (img.shape[0] + img.shape[1]) / 2) + 1
    tf = max(tl - 1, 1)
    # BGR --> RGB
    img_RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # feed the RGB image to the model and get the key-point predictions
    results = pose.process(img_RGB)
    keypoints = ['' for i in range(33)]
    if results.pose_landmarks:
        mp_drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
        for i in range(33):
            cx = int(results.pose_landmarks.landmark[i].x * w)
            cy = int(results.pose_landmarks.landmark[i].y * h)
            keypoints[i] = (cx, cy)  # the final 33 key points in pixel coordinates
    else:
        print("NO PERSON")
        struction = "NO PERSON"
        img = cv2.putText(img, struction, (25, 100), cv2.FONT_HERSHEY_SIMPLEX, 1.25,
                          (255, 255, 0), 6)
    end_time = time.time()
    process_time = end_time - start_time  # key-point prediction time for this image
    fps = 1 / process_time                # frame rate
    colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(33)]
    radius = [random.randint(8, 15) for _ in range(33)]
    for i in range(33):
        if keypoints[i] == '':  # skip drawing when no person was detected
            continue
        cx, cy = keypoints[i]
        img = cv2.circle(img, (cx, cy), radius[i], colors[i], -1)
    # str_pose = get_pos(keypoints)  # get the posture label (see Section Three)
    # cv2.putText(img, "POSE-{}".format(str_pose), (12, 100), cv2.FONT_HERSHEY_TRIPLEX,
    #             tl / 3, (255, 0, 0), thickness=tf)
    cv2.putText(img, "FPS-{}".format(str(int(fps))), (12, 100), cv2.FONT_HERSHEY_SIMPLEX,
                tl / 3, (255, 255, 0), thickness=tf)
    return img
To run the code, call the function from the main block shown below.
if __name__ == '__main__':
    # read the image; cv2.imdecode is used because the path may contain Chinese characters
    image_path = "./data/outImage--20.jpg"
    img0 = cv2.imdecode(np.fromfile(image_path, dtype=np.uint8), -1)
    img = img0.copy()
    # detect the key points; the returned image carries the visualization
    image = process_frame(img)
    # plot the result with matplotlib
    plt.rcParams["font.sans-serif"] = ['SimHei']   # allow Chinese characters in the titles
    plt.rcParams["axes.unicode_minus"] = False
    fig, axes = plt.subplots(nrows=1, ncols=2)
    axes[0].imshow(img0[:, :, ::-1])
    axes[0].set_title("Original image")
    axes[1].imshow(image[:, :, ::-1])
    axes[1].set_title("Detection result")
    plt.show()
    fig.savefig("./data/out.png")
Finally, the detection result is attached.
(2) Detecting a video
For any computer-vision method that does not involve 3D convolution, detecting a video really just means detecting images, because a video is nothing more than many frames fused together.
For a 30-fps video, every second is simply 30 images stacked one after another.
Detect each of those frames separately and then merge the detected frames back together, and the result is the detected video.
With that in mind, we wrap the image-detection procedure in a function and call it on every frame of the video.
OpenCV is usually used to split a video into frames; sample code is given below:
def video2image(videoPath="./video/demo1.mp4",
                image_dir="./image"):
    '''videoPath is the video path, image_dir is the folder where the frames are saved'''
    # first pass: count the total number of frames
    cap = cv2.VideoCapture(videoPath)
    frame_count = 0
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        frame_count += 1
    print("Total number of frames in the video:", frame_count)
    cap.release()
    # second pass: save one frame out of every 20
    cap = cv2.VideoCapture(videoPath)
    count = 0
    with tqdm(total=frame_count - 1) as pbar:
        try:
            while cap.isOpened():
                success, frame = cap.read()
                if not success:
                    break
                # process the frame
                try:
                    if count % 20 == 0:
                        cv2.imwrite("{}/outImage--{}.jpg".format(image_dir, count), frame)
                except:
                    print("error")
                if success:
                    pbar.update(1)
                count += 1
        except:
            print("Interrupted in the middle")
    cv2.destroyAllWindows()
    cap.release()
    print("Video processing finished, move on to the next step!!!")
To implement what this article is after, simply apply the image-detection function to each frame decoded from the video.
The code is as follows:
def process_video(video_path="./Data.mp4"):
    video_flag = False
    out_path = "./out_Data.mp4"
    print("Start processing the video ...")
    # first pass: count the total number of frames
    cap = cv2.VideoCapture(video_path)
    frame_count = 0
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        frame_count += 1
    cap.release()
    print("Total number of frames =", frame_count)
    # second pass: run detection on every frame and write the result to a new video
    cap = cv2.VideoCapture(video_path)
    if video_flag == False:
        frame_size = cap.get(cv2.CAP_PROP_FRAME_WIDTH), cap.get(cv2.CAP_PROP_FRAME_HEIGHT)  # size of the processed frames
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # save the output video in mp4 format
        fps = cap.get(cv2.CAP_PROP_FPS)
        out = cv2.VideoWriter(out_path, fourcc, fps, (int(frame_size[0]), int(frame_size[1])))  # handle to the output video
    with tqdm(total=frame_count - 1) as pbar:
        try:
            while cap.isOpened():
                success, frame = cap.read()
                if success:
                    pbar.update(1)
                    frame = process_frame(frame)  # frame is the captured frame; process_frame runs the detection
                    cv2.namedWindow("frame", cv2.WINDOW_NORMAL)
                    cv2.imshow("frame", frame)
                    out.write(frame)
                    if cv2.waitKey(1) == 27:  # press Esc to stop early
                        break
                else:
                    break
        except:
            print("Interrupted in the middle")
    cap.release()
    cv2.destroyAllWindows()
    out.release()
    print("The video has been saved to", out_path)
With the video-processing code in place, you can call it from the main block; the visualized output is not shown here.
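For example, a minimal main block (my own sketch; the video path is a placeholder for your own file):

if __name__ == '__main__':
    # run pose detection on every frame of the video and save the annotated result
    process_video(video_path="./video/demo1.mp4")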
Three. Using MediaPipe BlazePose to detect custom simple behaviors
1. Principle
Using MediaPipe for full behavior detection is a complicated matter; if you do it this way, the accuracy of the behavior detection depends entirely on the accuracy of MediaPipe's key-point detection.
Instead, we can detect a person's position and posture from the joint angles shown in the figure below.
For example, when a hand is raised, the angle between the upper arm and the horizontal must be greater than 0 degrees.
With the hands on the hips and the arms lowered, the angle between the upper arm and the forearm is greater than 60 degrees and less than 120 degrees.
In this way we can classify some basic actions.
I only list a few relatively simple ones:
(1) both hands up (2) left hand up (3) right hand up (4) hands on hips (akimbo) (5) triangle pose
First, the result images:

2. Implementation
The first thing to know is that a vector is obtained from two coordinates simply by subtracting them.
The angle between two vectors v1 and v2 then follows from cos(theta) = (v1 · v2) / (|v1| |v2|), i.e. theta = arccos((v1 · v2) / (|v1| |v2|)).
In code this becomes:
# the key points are numpy arrays, so a vector is just the difference of two coordinates
v1 = np.array([x1, y1]) - np.array([x2, y2])
v2 = np.array([x0, y0]) - np.array([x2, y2])

def get_angle(v1, v2):
    # cosine of the angle between v1 and v2
    angle = np.dot(v1, v2) / (np.sqrt(np.sum(v1 * v1)) * np.sqrt(np.sum(v2 * v2)))
    angle = np.arccos(angle) / np.pi * 180
    # the sign of the 2D cross product makes the angle signed
    cross = v2[0] * v1[1] - v2[1] * v1[0]
    if cross < 0:
        angle = -angle
    return angle
This gives the signed angle between the two vectors; a quick sanity check is shown below.
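A small sanity check of the sign convention (my own example, not from the original post), using image coordinates where y grows downward:

v1 = np.array([1, 0])                       # pointing right
v2 = np.array([0, 1])                       # pointing down in image coordinates
print(get_angle(v1, v2))                    # -90.0: the cross product is negative
print(get_angle(v1, np.array([0, -1])))     # 90.0: v2 now points up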
The behavior can then be judged from these angles. The rules used here are:
Both hands up: the left-arm angle is less than 0 and the right-arm angle is greater than 0
Left hand up: the left-arm angle is less than 0 and the right-arm angle is less than 0
Right hand up: the left-arm angle is greater than 0 and the right-arm angle is greater than 0
Triangle: both hands are up and, in addition, the angle between the upper arm and the forearm is less than 120 degrees on both sides
Normal: the left-arm angle is greater than 0 and the right-arm angle is less than 0
Akimbo (hands on hips): in the normal case, the left-elbow angle is less than 120 degrees and the right-elbow angle is also less than 120 degrees
The sample code is as follows:
def get_pos(keypoints):
    str_pose = ""
    keypoints = np.array(keypoints)
    # angle of the left upper arm relative to the shoulder line (roughly horizontal)
    v1 = keypoints[12] - keypoints[11]
    v2 = keypoints[13] - keypoints[11]
    angle_left_arm = get_angle(v1, v2)
    # angle of the right upper arm relative to the shoulder line
    v1 = keypoints[11] - keypoints[12]
    v2 = keypoints[14] - keypoints[12]
    angle_right_arm = get_angle(v1, v2)
    # angle of the left elbow (between upper arm and forearm)
    v1 = keypoints[11] - keypoints[13]
    v2 = keypoints[15] - keypoints[13]
    angle_left_elbow = get_angle(v1, v2)
    # angle of the right elbow
    v1 = keypoints[12] - keypoints[14]
    v2 = keypoints[16] - keypoints[14]
    angle_right_elbow = get_angle(v1, v2)
    if angle_left_arm < 0 and angle_right_arm < 0:
        str_pose = "LEFT_UP"
    elif angle_left_arm > 0 and angle_right_arm > 0:
        str_pose = "RIGHT_UP"
    elif angle_left_arm < 0 and angle_right_arm > 0:
        str_pose = "ALL_HANDS_UP"
        if abs(angle_left_elbow) < 120 and abs(angle_right_elbow) < 120:
            str_pose = "TRIANGLE"
    elif angle_left_arm > 0 and angle_right_arm < 0:
        str_pose = "NORMAL"
        if abs(angle_left_elbow) < 120 and abs(angle_right_elbow) < 120:
            str_pose = "AKIMBO"
    return str_pose
The returned str_pose is the behavior label, and it can be drawn onto the frame inside process_frame, as sketched below.
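A minimal sketch of that step, assuming it is placed inside process_frame after the 33 key points have been collected (it essentially restores the two commented-out lines shown earlier; the label is drawn a little lower here so it does not overlap the FPS text):

# inside process_frame, after the key points have been collected:
if results.pose_landmarks:                        # only classify when a person was detected
    str_pose = get_pos(keypoints)                 # e.g. "ALL_HANDS_UP", "AKIMBO", ...
    cv2.putText(img, "POSE-{}".format(str_pose), (12, 150),
                cv2.FONT_HERSHEY_TRIPLEX, tl / 3, (255, 0, 0), thickness=tf)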
At this point, both key-point detection and simple behavior detection have been covered. If you cannot reproduce it, you can read the full source code in my code repository.
Plan: a future blog post will combine a wxPython-based UI with MediaPipe to build a visual, interactive workflow. Stay tuned.
Learning is like sailing against the current. Keep going!!!