MarkTechPost@AI · March 26
A Code Implementation for Advanced Human Pose Estimation Using MediaPipe, OpenCV and Matplotlib

 

This article presents a hands-on approach to human pose estimation using MediaPipe and OpenCV. After installing the required libraries, including mediapipe, opencv-python-headless, and matplotlib, and importing the relevant modules, you can build a pose-detection system. The article walks through the detect_pose function, which reads an image, detects human pose landmarks with MediaPipe, and returns the annotated image along with the landmark information. It also provides a visualize_pose function that displays the original image and the pose-annotated image side by side. Finally, the extract_keypoints function extracts each landmark's coordinates and visibility, showing how raw image data can be turned into a detailed understanding of human movement.

🧐 First, install and import the required libraries, including mediapipe, opencv-python-headless, and matplotlib. These libraries form the foundation of the pose-estimation pipeline.

💡 Next, define the detect_pose function, which reads an image and processes it with MediaPipe to detect human pose landmarks, returning the annotated image and the detected landmark information. Internally, the function reads the image with cv2.imread, converts it to RGB for MediaPipe processing, and relies on a pose model initialized with mp_pose.Pose, configured with parameters such as static image mode, model complexity, segmentation, and minimum detection confidence.

🖼️ Then, define the visualize_pose function, which displays the original image and the pose-annotated image. It uses matplotlib to show the two images side by side, making it easy to compare and inspect the pose-estimation results.

🔑 In addition, the article defines the extract_keypoints function, which converts the detected pose landmarks into a dictionary containing each keypoint's x, y, z coordinates and visibility, so the keypoint data is easy to access and reuse.

🚀 Finally, the article shows how to load an image, detect pose landmarks, visualize the results, and extract the keypoint information. With these steps, a raw image can be turned into a detailed skeletal map for analyzing human posture.

Human pose estimation is a cutting-edge computer vision technology that transforms visual data into actionable insights about human movement. By utilizing advanced machine learning models like MediaPipe’s BlazePose and powerful libraries such as OpenCV, developers can track body key points with high accuracy. In this tutorial, we explore the seamless integration of these tools, demonstrating how Python-based frameworks enable sophisticated pose detection across various domains, from sports analytics to healthcare monitoring and interactive applications.

First, we install the essential libraries:

!pip install mediapipe opencv-python-headless matplotlib

Then, we import the important libraries needed for our implementation:

import cv2
import mediapipe as mp
import matplotlib.pyplot as plt
import numpy as np

We initialize the MediaPipe Pose model in static image mode with segmentation enabled and a minimum detection confidence of 0.5. We also grab references to MediaPipe's utilities for drawing landmarks and applying drawing styles.

mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles

pose = mp_pose.Pose(
    static_image_mode=True,
    model_complexity=1,
    enable_segmentation=True,
    min_detection_confidence=0.5
)

Here, we define the detect_pose function, which reads an image, processes it to detect human pose landmarks using MediaPipe, and returns the annotated image along with the detected landmarks. If landmarks are found, they are drawn using default styling.

def detect_pose(image_path):
    image = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = pose.process(image_rgb)
    annotated_image = image_rgb.copy()
    if results.pose_landmarks:
        mp_drawing.draw_landmarks(
            annotated_image,
            results.pose_landmarks,
            mp_pose.POSE_CONNECTIONS,
            landmark_drawing_spec=mp_drawing_styles.get_default_pose_landmarks_style()
        )
    return annotated_image, results.pose_landmarks

We define the visualize_pose function, which displays the original and pose-annotated images side by side using matplotlib. The extract_keypoints function converts detected pose landmarks into a dictionary of named keypoints with their x, y, z coordinates and visibility scores.

def visualize_pose(original_image, annotated_image):
    plt.figure(figsize=(16, 8))
    plt.subplot(1, 2, 1)
    plt.title('Original Image')
    plt.imshow(cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.subplot(1, 2, 2)
    plt.title('Pose Estimation')
    plt.imshow(annotated_image)
    plt.axis('off')
    plt.tight_layout()
    plt.show()

def extract_keypoints(landmarks):
    if landmarks:
        keypoints = {}
        for idx, landmark in enumerate(landmarks.landmark):
            keypoints[mp_pose.PoseLandmark(idx).name] = {
                'x': landmark.x,
                'y': landmark.y,
                'z': landmark.z,
                'visibility': landmark.visibility
            }
        return keypoints
    return None
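Note that MediaPipe reports landmark x and y values normalized to the [0, 1] range relative to the image's width and height. As a small sketch (the to_pixel_coords helper below is our own addition, not part of the tutorial's code), the dictionary produced by extract_keypoints can be mapped back to pixel positions like this:

```python
def to_pixel_coords(keypoints, image_width, image_height):
    """Map normalized MediaPipe keypoints to integer pixel coordinates.

    Assumes the dictionary layout produced by extract_keypoints above:
    {name: {'x': ..., 'y': ..., 'z': ..., 'visibility': ...}}.
    """
    return {
        name: (int(kp['x'] * image_width), int(kp['y'] * image_height))
        for name, kp in keypoints.items()
    }

# Example with a synthetic keypoint on a 640x480 image:
sample = {'NOSE': {'x': 0.5, 'y': 0.25, 'z': -0.1, 'visibility': 0.99}}
print(to_pixel_coords(sample, 640, 480))  # {'NOSE': (320, 120)}
```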

Finally, we load an image from the specified path, detect and visualize human pose landmarks using MediaPipe, and then extract and print the coordinates and visibility of each detected keypoint.

image_path = '/content/Screenshot 2025-03-26 at 12.56.05 AM.png'
original_image = cv2.imread(image_path)
annotated_image, landmarks = detect_pose(image_path)
visualize_pose(original_image, annotated_image)
keypoints = extract_keypoints(landmarks)
if keypoints:
    print("Detected Keypoints:")
    for name, details in keypoints.items():
        print(f"{name}: {details}")
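Once the keypoints are extracted, they can feed simple kinematic analysis of the kind used in sports or healthcare applications. As an illustrative sketch (the joint_angle helper and the synthetic coordinates below are our own, not from the article), a joint angle such as the elbow can be computed from three keypoints:

```python
import math

def joint_angle(a, b, c):
    """Angle at point b (in degrees) formed by segments b->a and b->c.

    Each point is a dict with normalized 'x' and 'y' values, in the
    layout produced by extract_keypoints above.
    """
    v1 = (a['x'] - b['x'], a['y'] - b['y'])
    v2 = (c['x'] - b['x'], c['y'] - b['y'])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against floating-point drift outside [-1, 1]
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Synthetic keypoints describing a right angle at the elbow:
shoulder = {'x': 0.3, 'y': 0.2}
elbow = {'x': 0.3, 'y': 0.4}
wrist = {'x': 0.5, 'y': 0.4}
print(round(joint_angle(shoulder, elbow, wrist)))  # 90
```

With real output from the pipeline, these inputs would come from the extraction step, e.g. keypoints['LEFT_SHOULDER'], keypoints['LEFT_ELBOW'], and keypoints['LEFT_WRIST']; each landmark's visibility score should be checked before trusting the angle.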
Sample Processed Output

In this tutorial, we explored human pose estimation using MediaPipe and OpenCV, demonstrating a comprehensive approach to body keypoint detection. We implemented a robust pipeline that transforms images into detailed skeletal maps, covering key steps including library installation, pose detection function creation, visualization techniques, and keypoint extraction. Using advanced machine learning models, we showcased how developers can transform raw visual data into meaningful movement insights across various domains like sports analytics and healthcare monitoring.


Here is the Colab Notebook.
