Posture recognition and simple behavior recognition based on mediapipe
2022-07-27 23:55:00 【Wind dwelling willow poplar】
Learning goals
1. Recognize the key points of a human pose
2. Recognize human actions from joint angles (custom rules)
One. Installing mediapipe
This part is simple. In a Windows command line (or any other terminal), run
pip install mediapipe
That's all.
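Before moving on, it can help to confirm that the packages used later are importable from the interpreter you plan to run. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of mediapipe.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the current interpreter."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Packages imported by the code in this article
    for name in ("mediapipe", "cv2", "numpy", "tqdm", "matplotlib"):
        print(name, "OK" if is_installed(name) else "missing")
```

If a package is reported as missing, the usual cause is that pip installed it into a different Python environment than the one running the script.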
Two. Detecting key points with mediapipe
1. Introduction to mediapipe
Mediapipe is a framework for building machine learning pipelines that process video, audio and other time-series data. It is cross-platform and runs on desktop/server, Android, iOS and various embedded devices.
At the time of writing, mediapipe includes 16 solutions:
Face Detection
Face Mesh
Iris
Hands
Pose
Holistic
Selfie Segmentation
Hair Segmentation
Object Detection
Box Tracking
Instant Motion Tracking
Objectron (3D object detection)
KNIFT (feature matching)
AutoFlip
MediaSequence
YouTube-8M
This article uses the Pose solution. Google officially calls this human pose estimation method BlazePose.
(0) Preparation before detection
# Import the basic libraries
import random
import time
import cv2
import matplotlib.pyplot as plt
import mediapipe as mp
import numpy as np
from PIL import Image, ImageFont, ImageDraw
from tqdm import tqdm
# ------------------------------------------------
# mediapipe initialization
# This step is required, because the functions defined below use these objects
# ------------------------------------------------
mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils
# static_image_mode=True runs full detection on every image
pose = mp_pose.Pose(static_image_mode=True)
(1) Detecting key points in an image
def process_frame(img):
    start_time = time.time()
    h, w = img.shape[0], img.shape[1]  # height and width
    # Scale the font with the image size
    tl = round(0.005 * (h + w) / 2) + 1
    tf = max(tl - 1, 1)
    # BGR --> RGB
    img_RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Feed the RGB image to the model to get the key point predictions
    results = pose.process(img_RGB)
    keypoints = []  # stays empty when no person is detected
    if results.pose_landmarks:
        mp_drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
        for i in range(33):
            cx = int(results.pose_landmarks.landmark[i].x * w)
            cy = int(results.pose_landmarks.landmark[i].y * h)
            keypoints.append((cx, cy))  # the final 33 key points, in pixels
    else:
        print("NO PERSON")
        struction = "NO PERSON"
        img = cv2.putText(img, struction, (25, 100), cv2.FONT_HERSHEY_SIMPLEX, 1.25, (255, 255, 0),
                          6)
    end_time = time.time()
    process_time = end_time - start_time  # key point prediction time for this image
    fps = 1 / max(process_time, 1e-6)  # frame rate, guarded against division by zero
    colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(33)]
    radius = [random.randint(8, 15) for _ in range(33)]
    # Draw a filled circle on every detected key point (skipped when keypoints is empty)
    for i, (cx, cy) in enumerate(keypoints):
        img = cv2.circle(img, (cx, cy), radius[i], colors[i], -1)
    # Uncomment once get_pos is defined (see the behavior detection section):
    # str_pose = get_pos(keypoints)  # get the pose label
    # cv2.putText(img, "POSE-{}".format(str_pose), (12, 100), cv2.FONT_HERSHEY_TRIPLEX, tl / 3, (255, 0, 0), thickness=tf)
    cv2.putText(img, "FPS-{}".format(str(int(fps))), (12, 100), cv2.FONT_HERSHEY_SIMPLEX,
                tl / 3, (255, 255, 0), thickness=tf)
    return img
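The multiplication by `w` and `h` in `process_frame` is there because the pose landmarks are normalized by the image width and height. For a person partly outside the frame, the normalized values may fall slightly outside the 0 to 1 range, which would give pixel coordinates outside the image. The following is a minimal sketch of the same conversion with clamping added; `landmark_to_pixel` is a hypothetical helper name, and the clamping is an assumption of this sketch rather than something the code above does.

```python
def landmark_to_pixel(x_norm, y_norm, width, height):
    """Convert normalized landmark coordinates to integer pixel coordinates.

    Values outside [0, 1] are clamped to the image border.
    """
    cx = min(max(int(x_norm * width), 0), width - 1)
    cy = min(max(int(y_norm * height), 0), height - 1)
    return cx, cy


if __name__ == "__main__":
    print(landmark_to_pixel(0.5, 0.5, 640, 480))   # centre of a 640x480 image
    print(landmark_to_pixel(1.2, -0.1, 640, 480))  # out-of-range values are clamped
```

Whether to clamp or to discard such points depends on the application: clamping keeps all 33 points available for drawing, while discarding avoids computing angles from points the model could not really see.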
To run the code, put the following in the main block at the end of the script:
if __name__ == '__main__':
    # Read the picture. The path may contain Chinese characters, which cv2.imread
    # can fail on under Windows, so decode the file from its raw bytes instead
    image_path = "./data/outImage--20.jpg"
    img0 = cv2.imdecode(np.fromfile(image_path, dtype=np.uint8), cv2.IMREAD_COLOR)
    img = img0.copy()
    # Detect key points; image is the picture after detection
    image = process_frame(img)
    # Plot with matplotlib
    # The SimHei font is only needed when the titles contain Chinese characters
    plt.rcParams["font.sans-serif"] = ['SimHei']
    plt.rcParams["axes.unicode_minus"] = False
    fig, axes = plt.subplots(nrows=1, ncols=2)
    axes[0].imshow(img0[:, :, ::-1])
    axes[0].set_title("Original picture")
    axes[1].imshow(image[:, :, ::-1])
    axes[1].set_title("Picture after detection")
    # Save before showing, so the saved figure is not empty after the window closes
    fig.savefig("./data/out.png")
    plt.show()
Finally, the test results are attached .
(2) Detecting key points in a video
For any machine vision method that does not involve 3D convolution, detection on a video is really detection on images, because a video is a sequence of image frames.
For a video at 30 frames per second, each second of it consists of 30 images.
Each of these images is detected separately, and the detected images are then reassembled; the result is the processed video.
On this basis, the image detection process can be written as a function and called on every frame of the video.
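The relation between frame indices and time can be made concrete with a little arithmetic. This is a small illustrative sketch; `frame_to_seconds` and `frames_in` are hypothetical helper names, and a constant frame rate is assumed.

```python
def frame_to_seconds(frame_index, fps):
    """Timestamp in seconds of a frame, counting frames from 0."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index / fps


def frames_in(duration_seconds, fps):
    """Number of whole frames contained in a clip of the given duration."""
    return int(duration_seconds * fps)


if __name__ == "__main__":
    print(frames_in(1, 30))          # frames in one second of a 30 fps video
    print(frame_to_seconds(45, 30))  # time at which frame 45 is shown
```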
Typically the opencv library is used to split a video into image frames. The sample code is as follows:
def video2image(videoPath="./video/demo1.mp4",
                image_dir="./image"):
    '''videoPath is the video path; image_dir is the folder where the pictures are saved'''
    # First pass: count the frames so the progress bar knows the total
    cap = cv2.VideoCapture(videoPath)
    frame_count = 0
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        frame_count += 1
    print("Total frames of video:", frame_count)
    cap.release()
    # Second pass: save every 20th frame as a JPEG
    cap = cv2.VideoCapture(videoPath)
    count = 0
    with tqdm(total=frame_count) as pbar:
        try:
            while cap.isOpened():
                success, frame = cap.read()
                if not success:
                    break
                # Process frames
                if count % 20 == 0:
                    # imwrite returns False when the write fails,
                    # for example when image_dir does not exist
                    if not cv2.imwrite("{}/outImage--{}.jpg".format(image_dir, count), frame):
                        print("error")
                pbar.update(1)
                count += 1
        except Exception:
            print("Interrupted")
    cv2.destroyAllWindows()
    cap.release()
    print("Video processing has ended, proceed to the next step!")
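To see which frames `video2image` keeps without decoding a video, the sampling rule (every 20th frame, named after its index) can be reproduced on its own. This is a minimal sketch; `sampled_frames` is a hypothetical helper name, and the file name pattern is copied from the function above.

```python
def sampled_frames(total_frames, step=20, image_dir="./image"):
    """List (frame_index, file_path) pairs for every `step`-th frame."""
    return [
        (i, "{}/outImage--{}.jpg".format(image_dir, i))
        for i in range(0, total_frames, step)
    ]


if __name__ == "__main__":
    # A 65-frame clip yields frames 0, 20, 40 and 60
    for index, path in sampled_frames(65):
        print(index, path)
```

A smaller step keeps more frames at the cost of disk space; a step equal to the frame rate keeps roughly one picture per second.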
To get the functionality this article is after, run the image detection function on each frame read from the video.
The code is as follows:
def process_video(video_path="./Data.mp4"):
    cap = cv2.VideoCapture(video_path)
    out_path = "./out_Data.mp4"
    print("Video processing started...")
    # First pass: count the frames
    frame_count = 0
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        frame_count += 1  # count only frames that were actually read
    cap.release()
    print("Total number of frames =", frame_count)
    cap = cv2.VideoCapture(video_path)
    # The output size must match the size of the processed frames
    frame_size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # save the video file in mp4 format
    fps = cap.get(cv2.CAP_PROP_FPS)
    out = cv2.VideoWriter(out_path, fourcc, fps, frame_size)  # output video writer
    cv2.namedWindow("frame", cv2.WINDOW_NORMAL)
    with tqdm(total=frame_count) as pbar:
        try:
            while cap.isOpened():
                success, frame = cap.read()
                if not success:
                    break
                pbar.update(1)
                frame = process_frame(frame)  # run key point detection on the captured frame
                cv2.imshow("frame", frame)
                out.write(frame)
                if cv2.waitKey(1) == 27:  # stop when Esc is pressed
                    break
        except Exception:
            print("Interrupted")
    cap.release()
    cv2.destroyAllWindows()
    out.release()
    print("The video has been saved to", out_path)
With the video code in place, you can call it from the main block. The visual result is not shown here.
Three. Detecting simple custom behaviors with mediapipe BlazePose
1. Principle
Using Mediapipe for behavior detection is an involved task, and the accuracy of the behavior detection then depends entirely on how accurately Mediapipe detects the key points.
A person's pose can be classified from the joint angles shown in the figure below.
For example, when a hand is raised, the angle between the upper arm and the horizontal direction must be greater than 0 degrees.
With hands on hips, the arms hang down and the angle between the upper arm and the forearm is greater than 60 degrees and less than 120 degrees.
In this way, some basic actions can be classified.
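As an illustration of the elbow condition, the angle at a joint can be computed from three points (shoulder, elbow, wrist) and compared against the 60 to 120 degree range mentioned above. This is a sketch in plain Python under the assumption that the points are (x, y) pixel coordinates; `joint_angle` and `elbow_is_bent` are hypothetical helper names, not the functions used later in the article.

```python
import math


def joint_angle(a, b, c):
    """Unsigned angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("a and c must differ from the vertex b")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp against rounding error before calling acos
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def elbow_is_bent(shoulder, elbow, wrist, low=60.0, high=120.0):
    """True when the elbow angle lies strictly between `low` and `high` degrees."""
    return low < joint_angle(shoulder, elbow, wrist) < high


if __name__ == "__main__":
    print(elbow_is_bent((0, 0), (1, 0), (1, 1)))  # right angle at the elbow
    print(elbow_is_bent((0, 0), (1, 0), (2, 0)))  # straight arm
```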
Only a few relatively simple ones are listed here:
(1) Raise both hands (2) Raise left hand (3) Raise right hand (4) Hands on hips (5) Make a triangle with the arms
First, a look at the results:

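2. Implementation
First, a vector is obtained from two points by subtracting their coordinates.
The angle between two vectors then follows from the dot product: cos(theta) = (v1 . v2) / (|v1| * |v2|)
In code:
# v1 and v2 share the vertex (x2, y2), for example as numpy arrays:
# v1 = np.array([x1, y1]) - np.array([x2, y2])
# v2 = np.array([x0, y0]) - np.array([x2, y2])
def get_angle(v1, v2):
    # Cosine of the angle, from the dot product formula
    cos_angle = np.dot(v1, v2) / (np.sqrt(np.sum(v1 * v1)) * np.sqrt(np.sum(v2 * v2)))
    # Clip to [-1, 1] so rounding error cannot push arccos out of its domain
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    # The sign of the 2D cross product gives the direction of rotation
    cross = v2[0] * v1[1] - v2[1] * v1[0]
    if cross < 0:
        angle = -angle
    return angle
This gives the signed angle between the two vectors, in degrees.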
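The same signed angle can also be computed with `atan2`, which avoids the division by the vector lengths. This is an alternative formulation offered as a sketch, not the method used in this article; `signed_angle` is a hypothetical name, and it follows the same cross product sign convention as `get_angle`.

```python
import math


def signed_angle(v1, v2):
    """Signed angle in degrees between 2D vectors v1 and v2, in [-180, 180]."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v2[0] * v1[1] - v2[1] * v1[0]  # same sign convention as get_angle
    return math.degrees(math.atan2(cross, dot))


if __name__ == "__main__":
    print(signed_angle((1, 0), (0, 1)))
    print(signed_angle((0, 1), (1, 0)))
```

Note that image coordinates have the y axis pointing down, so which sign corresponds to "up" depends on the order of the two vectors.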
Next, the behavior can be judged from these angles. The rules here are:
Raise both hands: the left arm angle is less than 0 and the right arm angle is greater than 0
Raise left hand: the left arm angle is less than 0 and the right arm angle is less than 0
Raise right hand: the left arm angle is greater than 0 and the right arm angle is greater than 0
Triangle: both hands are raised, and the angle between the upper arm and the forearm is less than 120 degrees on both sides
Normal: the left arm angle is greater than 0 and the right arm angle is less than 0
Hands on hips: as in the normal case, and both the left and the right elbow angle are less than 120 degrees
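These rules can be written as a pure function of the four angles, which makes them easy to check without running the pose model. This is a minimal sketch; `classify_pose` and its `bend_limit` parameter are hypothetical names, and the labels are the ones used in the article's own code.

```python
def classify_pose(left_arm, right_arm, left_elbow, right_elbow, bend_limit=120.0):
    """Map four signed angles in degrees to a behavior label; "" if no rule matches."""
    elbows_bent = abs(left_elbow) < bend_limit and abs(right_elbow) < bend_limit
    if left_arm < 0 and right_arm < 0:
        return "LEFT_UP"
    if left_arm > 0 and right_arm > 0:
        return "RIGHT_UP"
    if left_arm < 0 and right_arm > 0:
        # Both hands raised; bent elbows turn it into a triangle
        return "TRIANGLE" if elbows_bent else "ALL_HANDS_UP"
    if left_arm > 0 and right_arm < 0:
        # Arms down; bent elbows mean hands on hips
        return "AKIMBO" if elbows_bent else "NORMAL"
    return ""


if __name__ == "__main__":
    print(classify_pose(-30, 30, 100, 100))
    print(classify_pose(30, -30, 170, 170))
```

Keeping the thresholds in one place like this also makes it simpler to tune them for a particular camera angle.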
The example code is as follows:
def get_pos(keypoints):
    str_pose = ""
    keypoints = np.array(keypoints)
    # No usable key points (for example, no person was detected)
    if keypoints.ndim != 2 or keypoints.shape[0] < 17:
        return str_pose
    # Calculate the angle between the left arm and the horizontal (the shoulder line)
    v1 = keypoints[12] - keypoints[11]
    v2 = keypoints[13] - keypoints[11]
    angle_left_arm = get_angle(v1, v2)
    # Calculate the angle between the right arm and the horizontal (the shoulder line)
    v1 = keypoints[11] - keypoints[12]
    v2 = keypoints[14] - keypoints[12]
    angle_right_arm = get_angle(v1, v2)
    # Calculate the angle at the left elbow
    v1 = keypoints[11] - keypoints[13]
    v2 = keypoints[15] - keypoints[13]
    angle_left_elbow = get_angle(v1, v2)
    # Calculate the angle at the right elbow
    v1 = keypoints[12] - keypoints[14]
    v2 = keypoints[16] - keypoints[14]
    angle_right_elbow = get_angle(v1, v2)
    if angle_left_arm < 0 and angle_right_arm < 0:
        str_pose = "LEFT_UP"
    elif angle_left_arm > 0 and angle_right_arm > 0:
        str_pose = "RIGHT_UP"
    elif angle_left_arm < 0 and angle_right_arm > 0:
        str_pose = "ALL_HANDS_UP"
        if abs(angle_left_elbow) < 120 and abs(angle_right_elbow) < 120:
            str_pose = "TRIANGLE"
    elif angle_left_arm > 0 and angle_right_arm < 0:
        str_pose = "NORMAL"
        if abs(angle_left_elbow) < 120 and abs(angle_right_elbow) < 120:
            str_pose = "AKIMBO"
    return str_pose
The resulting str_pose is the behavior label as a string; it can be drawn onto each picture frame in process_frame.
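The indices 11 to 16 used in get_pos are the shoulders, elbows and wrists in the pose landmark numbering (mediapipe exposes the same names through `mp_pose.PoseLandmark`). Naming them makes the angle code easier to read. The sketch below uses only the standard library; `ArmLandmark` and `arm_points` are hypothetical names introduced for illustration.

```python
from enum import IntEnum


class ArmLandmark(IntEnum):
    """Indices of the six arm key points within the 33 pose landmarks."""
    LEFT_SHOULDER = 11
    RIGHT_SHOULDER = 12
    LEFT_ELBOW = 13
    RIGHT_ELBOW = 14
    LEFT_WRIST = 15
    RIGHT_WRIST = 16


def arm_points(keypoints):
    """Pick the six arm key points out of a list of 33 (x, y) tuples, by name."""
    return {lm.name: keypoints[lm] for lm in ArmLandmark}


if __name__ == "__main__":
    # Dummy key points, only to show the lookup
    dummy = [(i, i * 2) for i in range(33)]
    print(arm_points(dummy)["LEFT_ELBOW"])
```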
At this point, both key point detection and simple behavior detection have been covered. If you cannot reproduce the results, you can look at the source code directly in my code repository.
Plan: in a future post, a UI built with wxpython will be combined with Mediapipe to make the process interactive. Stay tuned.
Learning is like sailing against the current. Keep going!