This document describes how to integrate Pose APIs into your service with a REST API. Pose APIs are available only through the REST API because the Kakao SDK does not support them.
You can test some features described in this document in [Tools] > [REST API Test].
The Analyzing image API detects people in the given image and extracts 17 key points for each person (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles) to determine their pose.
Add your REST API key to the request header, and send a POST request. You can either upload an image file (file) or specify an image URL (image_url) to request the image analysis.
If the request is successful, the Person object containing the coordinates and confidence score of each key point is returned in JSON format. If there are multiple people in the image, a list of Person objects containing the information of the detected people is returned.
POST /pose HTTP/1.1
Host: cv-api.kakaobrain.com
Authorization: KakaoAK ${REST_API_KEY}
Name | Description | Required |
---|---|---|
Authorization | REST API key. You can check your app's REST API key in [My Application] > [App Keys]. | O |
Name | Type | Description | Required |
---|---|---|---|
image_url | String | Image URL. | O* |
file | Binary | Image file. | O* |
* Either 'image_url' or 'file' is required.
- Use either the 'image_url' or the 'file' parameter to request the image analysis.
- An image uploaded as 'file' or specified as 'image_url' must be in JPEG, PNG, HEIC, or WebP format.
- You can upload an image of up to 2 MB.
- The image length must be at least 2048 pixels, and the width must be at least 320 pixels.
- The image aspect ratio must be between 16:9 and 9:16.
- When uploading an image file with the 'file' parameter, set 'Content-Type' to 'multipart/form-data'.
- When requesting with the 'image_url' parameter, set 'Content-Type' to 'application/x-www-form-urlencoded'.
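Some of the constraints above can be checked locally before a request is sent. The following is a minimal sketch rather than an official client: `check_image_file`, `MAX_IMAGE_BYTES`, and `ALLOWED_EXTENSIONS` are hypothetical names, "2 MB" is assumed to mean 2,000,000 bytes, and the file extension is used only as a rough stand-in for the actual image format. Pixel size and aspect ratio are not checked here.

```python
import os

# Assumption: "2 MB" is interpreted as 2,000,000 bytes.
MAX_IMAGE_BYTES = 2 * 1000 * 1000
# Assumption: common file extensions for JPEG, PNG, HEIC, and WebP.
ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.heic', '.webp'}


def check_image_file(path):
    """Return a list of problems found; an empty list means the local checks passed."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append('unsupported extension: ' + (extension or '(none)'))
    if os.path.getsize(path) > MAX_IMAGE_BYTES:
        problems.append('file is larger than 2 MB')
    return problems
```

Calling `check_image_file('example_pose.jpg')` before uploading returns an empty list when neither check fails. The server remains the authority on whether an image is accepted.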
Name | Type | Description |
---|---|---|
area | Float | Area of the bounding box that includes all key points. |
bbox | Float[] | X and Y coordinates of the upper-left corner of the bounding box, followed by its width (w) and height (h), in [x, y, w, h] format. |
category_id | Int | Fixed as 1 (1: person). |
keypoints | Float[] | Array containing the coordinates (x, y) and confidence (score) of the 17 key points detected in the image, in [x_1, y_1, score_1, x_2, y_2, score_2, ..., x_17, y_17, score_17] format. The number from 1 to 17 after x, y, and score is the key point ID of each body part of the detected person (refer to Keypoints below). Example: [X coordinate of the nose, Y coordinate of the nose, confidence of the nose key point, X coordinate of the left eye, Y coordinate of the left eye, confidence of the left eye key point, ... , X coordinate of the right ankle, Y coordinate of the right ankle, confidence of the right ankle key point] |
score | Float | Confidence score of each key point. A value between 0 and 1.0. |
Key point ID | Body part |
---|---|
1 | nose |
2 | left_eye |
3 | right_eye |
4 | left_ear |
5 | right_ear |
6 | left_shoulder |
7 | right_shoulder |
8 | left_elbow |
9 | right_elbow |
10 | left_wrist |
11 | right_wrist |
12 | left_hip |
13 | right_hip |
14 | left_knee |
15 | right_knee |
16 | left_ankle |
17 | right_ankle |
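The flat `keypoints` array is easier to work with once each triple is paired with its body part. The sketch below groups the values using the key point IDs from the table above; `KEYPOINT_NAMES` and `name_keypoints` are hypothetical helper names, not part of the API.

```python
# Body part names in key point ID order (ID 1 is the first entry).
KEYPOINT_NAMES = [
    'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
    'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
    'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
    'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
]


def name_keypoints(flat):
    """Group [x_1, y_1, score_1, ...] into {body_part: (x, y, score)}."""
    if len(flat) != 3 * len(KEYPOINT_NAMES):
        raise ValueError('expected 51 values, got %d' % len(flat))
    # Every third value starting at offset 0, 1, and 2 gives x, y, and score.
    triples = zip(flat[0::3], flat[1::3], flat[2::3])
    return dict(zip(KEYPOINT_NAMES, triples))
```

For a Person object `person`, `name_keypoints(person['keypoints'])['left_wrist']` then gives the (x, y, score) tuple for key point ID 10.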
curl -v -X POST "https://cv-api.kakaobrain.com/pose" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Authorization: KakaoAK ${REST_API_KEY}" \
--data-urlencode "image_url=https://example.com/example.jpg"
curl -v -X POST "https://cv-api.kakaobrain.com/pose" \
-H "Content-Type: multipart/form-data" \
-H "Authorization: KakaoAK ${REST_API_KEY}" \
-F "file=@example_pose.jpg"
[
{
"area": 101090.2833,
"bbox": [719.4526, 244.1255, 182.7314, 553.2178],
"category_id": 1,
"keypoints": [
805.4897, 256.4165, 0.8422, 819.5366, 245.0034, 0.8773, 795.8325, 244.1255, 0.8664, 845.8745, 254.6606, 0.8105, 788.8091, 251.1489, 0.0631, 885.3813, 320.5054, 0.7525, 749.3022, 331.9185, 0.7706, 898.5503, 377.5708, 0.7825, 719.4526, 414.4438, 0.7897, 901.1841, 435.5142, 0.7782, 749.3022, 443.4155, 0.8086, 852.02, 504.8706, 0.6854, 785.2974, 511.894, 0.6738, 833.5835, 644.4614, 0.7899, 800.2222, 659.3862, 0.7655, 833.5835, 796.3433, 0.7055, 824.8042, 743.6675, 0.5165
],
"score": 0.7185
}
]
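As an illustration of how `bbox` and `area` relate, the sketch below converts the sample `bbox` into corner coordinates and compares width times height with the reported `area`. It assumes that (x, y) is the upper-left corner of the box, which is consistent with the sample values above; `bbox_corners` is a hypothetical helper name.

```python
# Values copied from the sample response above.
person = {
    'area': 101090.2833,
    'bbox': [719.4526, 244.1255, 182.7314, 553.2178],
}


def bbox_corners(bbox):
    """Convert [x, y, w, h] into (left, top, right, bottom)."""
    x, y, w, h = bbox
    return x, y, x + w, y + h


left, top, right, bottom = bbox_corners(person['bbox'])
width, height = person['bbox'][2], person['bbox'][3]

# In this sample, width * height agrees with 'area' up to rounding.
area_matches = abs(width * height - person['area']) < 0.5
print(left, top, right, bottom, area_matches)
```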
The Analyzing video API detects people in each frame of the requested video and extracts key points. Because this API only submits the video for analysis, you must additionally call the Checking video analysis result API with the job_id received in this API's response to check the job result. It is recommended to set callback_url so that you are notified when the video analysis is completed before calling the Checking video analysis result API.
Add your REST API key to the request header, and send a POST request. You can either upload a video file (file) or specify a video URL (video_url) to request the video analysis.
If the request is successful, the job_id assigned to the analysis job is returned.
POST /pose/job HTTP/1.1
Host: cv-api.kakaobrain.com
Authorization: KakaoAK ${REST_API_KEY}
Name | Description | Required |
---|---|---|
Authorization | REST API key. You can check your app's REST API key in [My Application] > [App Keys]. | O |
Name | Type | Description | Required |
---|---|---|---|
video_url | String | Used to specify a video URL to request the video analysis. HTTP (port 80) and HTTPS (port 443) are supported. | O* |
file | Binary | Used to upload a video file to request the video analysis. | O* |
smoothing | Boolean | Whether to apply the smoothing process to the position of the key points between the detected frames. true or false (Default: true) | X |
callback_url | String | Used to set a callback URL to receive a callback when the video analysis is completed. It is recommended to use HTTPS (port 443). The callback is sent only once in the format of Sample: Callback and is not re-sent even if it fails. | X |
* Either 'video_url' or 'file' is required.
- You can upload a video of up to 50 MB free of charge. If you upload a video over the maximum file size, an error is returned. To process a larger video, a partnership arrangement is required.
- Without a partnership arrangement, you can analyze up to 30 seconds of a video. If you upload a video that exceeds the maximum number of frames, only the first 30 seconds of the video are processed. To process a longer video, a partnership arrangement is required.
- The video length must be at least 2048 pixels, and the video width must be at least 320 pixels.
- The video aspect ratio must be between 16:9 and 9:16.
- When uploading a video file with the 'file' parameter, set 'Content-Type' to 'multipart/form-data'.
- When requesting with the 'video_url' parameter, set 'Content-Type' to 'application/x-www-form-urlencoded'.
When the video analysis is completed in the Kakao Brain server, the Kakao Brain server makes a POST request to the specified callback URL with the Job ID regardless of the analysis result. We recommend calling the Checking video analysis result API after getting notified that the video analysis is completed through the callback URL.
Name | Type | Description |
---|---|---|
job_id | String | Job ID of the requested video. |
curl -v -X POST "https://cv-api.kakaobrain.com/pose/job" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Authorization: KakaoAK ${REST_API_KEY}" \
--data-urlencode "video_url=http://example.com/example.mp4"
curl -v -X POST "https://cv-api.kakaobrain.com/pose/job" \
-H "Content-Type: multipart/form-data" \
-H "Authorization: KakaoAK ${REST_API_KEY}" \
-F "file=@example.mp4"
{
"job_id":"bb91c265-341d-4661-813b-870cff0de1d3"
}
curl -v -X GET "https://api.example.com/pose/callback?job_id=bb91c265-341d-4661-813b-870cff0de1d3"
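On the receiving side, a callback endpoint only needs to pick up the job ID and acknowledge the request. The following is a rough sketch built on Python's standard `http.server`, not a production server. The description above mentions a POST request while the sample callback passes `job_id` in the query string, so this sketch accepts either method and assumes `job_id` always arrives as a query parameter. `extract_job_id`, `CallbackHandler`, `serve`, and port 8000 are hypothetical choices.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def extract_job_id(request_path):
    """Return the job_id query parameter of a callback request path, or None."""
    values = parse_qs(urlparse(request_path).query).get('job_id')
    return values[0] if values else None


class CallbackHandler(BaseHTTPRequestHandler):
    def _handle(self):
        job_id = extract_job_id(self.path)
        if job_id is None:
            self.send_response(400)
        else:
            # A real service would now call the Checking video analysis result API.
            print('analysis finished for job', job_id)
            self.send_response(200)
        self.end_headers()

    # Assumption: the callback may arrive as either GET or POST.
    do_GET = _handle
    do_POST = _handle


def serve(port=8000):
    """Run the callback receiver until interrupted."""
    HTTPServer(('', port), CallbackHandler).serve_forever()
```

Because the callback is sent only once, a service that relies on it may still want a polling fallback for jobs whose callback never arrives.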
The Checking video analysis result API returns the processing status and the video analysis results processed through the Analyzing video API.
Add your REST API key to the request header, and send a GET request. To check the result of the video analysis, you must add the job_id passed in the response of the Analyzing video API to the request URL.
If the video analysis is successfully completed, objects containing the key point information for each person in each frame are returned in JSON format.
GET /pose/job/{job_id} HTTP/1.1
Host: cv-api.kakaobrain.com
Authorization: KakaoAK ${REST_API_KEY}
Name | Description | Required |
---|---|---|
Authorization | REST API key. You can check your app's REST API key in [My Application] > [App Keys]. | O |
Name | Type | Description | Required |
---|---|---|---|
job_id | String | Job ID passed in the response of the Analyzing video API. | O |
Name | Type | Description |
---|---|---|
job_id | String | Job ID of the requested video. |
status | String | Response status, one of waiting, processing, success, failed, or not found. waiting: the server is on standby; processing: the video analysis is in progress; success: successfully processed; failed: failed to process; not found: the job ID cannot be found, or the requested video has exceeded the seven-day storage limit. |
annotations | Annotation[] | List of objects containing the coordinates and score of the key points detected in each frame, as an array whose size is the number of frames. Refer to Annotations below. Only returned if the value of status is success. |
categories | Category[] | Object containing the information about key points. Refer to Categories below. Only returned if the value of status is success. |
info | Info | Object containing information about the analyzed video, such as version, creation date, URL, and description. Only returned if the value of status is success. |
video | Video | Object containing information about the frames of the requested video, such as the number of frames per second, the total number of frames, and the video frame size. Only returned if the value of status is success. |
description | String | Reason why the request failed. Only returned if status is failed. Example: "Failed to get video" |
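When no callback is used, the `status` values above can drive a polling loop. The sketch below keeps polling while the job is `waiting` or `processing` and gives up after a timeout. `wait_for_job`, its parameters, and the default interval and timeout are hypothetical, and `fetch_status` stands for any callable that returns the decoded JSON body of this API for a given job ID.

```python
import time

# Status values that mean the job has not finished yet.
PENDING = {'waiting', 'processing'}


def wait_for_job(fetch_status, job_id, interval=10, timeout=600, sleep=time.sleep):
    """Poll until the job leaves the pending states or the timeout is reached."""
    waited = 0
    while True:
        result = fetch_status(job_id)
        if result['status'] not in PENDING:
            # success, failed, or not found: let the caller decide what to do.
            return result
        if waited >= timeout:
            raise TimeoutError('job %s is still %s after %d seconds'
                               % (job_id, result['status'], waited))
        sleep(interval)
        waited += interval
```

With a `requests` session that carries the Authorization header, `fetch_status` could be as small as `lambda job_id: session.get('https://cv-api.kakaobrain.com/pose/job/' + job_id).json()`.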
Name | Type | Description |
---|---|---|
frame_num | Int | Frame number in the requested video, from 0 to n-1 (n = number of frames). |
objects | Person[] | List of objects containing the coordinates and score of the detected key points. To see more about parameters, refer to Person in the Analyzing image API. |
Name | Type | Description |
---|---|---|
id | Int | Fixed as 1 (1: person). |
keypoints | String[] | Array containing the English body part names of the 17 key points. ["nose", "left_eye", "right_eye", "left_ear", "right_ear", "left_shoulder", "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist", "left_hip", "right_hip", "left_knee", "right_knee", "left_ankle", "right_ankle"] |
name | String | Fixed as person. |
skeleton | List<Int[]> | List of arrays, each containing the IDs of two connected key points. Example: [1, 2] indicates a line connecting the nose and the left eye. |
supercategory | String | Fixed as person. |
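Since `skeleton` holds 1-based key point IDs, each pair can be translated into body part names by indexing the `keypoints` list of the same category. A small sketch follows, with `skeleton_edges` as a hypothetical helper name and the category values copied from the description above:

```python
CATEGORY = {
    'id': 1,
    'name': 'person',
    'keypoints': ['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
                  'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
                  'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
                  'left_knee', 'right_knee', 'left_ankle', 'right_ankle'],
    'skeleton': [[1, 2], [1, 3], [2, 3], [2, 4], [3, 5], [4, 6], [5, 7],
                 [6, 7], [6, 8], [6, 12], [7, 9], [7, 13], [8, 10], [9, 11],
                 [12, 13], [14, 12], [15, 13], [16, 14], [17, 15]],
}


def skeleton_edges(category):
    """Translate 1-based key point ID pairs into (body_part, body_part) pairs."""
    names = category['keypoints']
    # IDs start at 1, so subtract 1 to index the Python list.
    return [(names[a - 1], names[b - 1]) for a, b in category['skeleton']]


for start, end in skeleton_edges(CATEGORY):
    print(start, '-', end)
```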
curl -v -X GET "https://cv-api.kakaobrain.com/pose/job/bb91c265-341d-4661-813b-870cff0de1d3" \
-H "Authorization: KakaoAK ${REST_API_KEY}"
{
"annotations": [
{
"frame_num": 0,
"objects": [
{
"area": 211350.1765,
"bbox": [340.11, 22.28, 302.92, 697.72],
"category_id": 1,
"keypoints": [517.0, 185.81, 1.0, 524.27, 171.27, 1.0, 517.0, 171.27, 0.86, 560.61, 200.35, 1.0, 0.0, 0.0, 0.0, 582.41, 265.76, 1.0, 473.4, 243.95, 0.97, 596.95, 345.7, 1.0, 407.99, 164.01, 1.0, 524.27, 309.36, 1.0, 371.65, 84.06, 1.0, 546.08, 447.45, 0.87, 480.66, 432.92, 0.88, 531.54, 600.08, 1.0, 480.66, 541.94, 1.0, 524.27, 701.83, 1.0, 473.4, 607.35, 1.0],
"score": 0.9563
}
]
}
... // The results after the first frame are omitted.
],
"categories": [
{
"id": 1,
"keypoints": ["nose", "left_eye", "right_eye", "left_ear", "right_ear", "left_shoulder", "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist", "left_hip", "right_hip", "left_knee", "right_knee", "left_ankle", "right_ankle"],
"name": "person",
"skeleton": [[1, 2], [1, 3], [2, 3], [2, 4], [3, 5], [4, 6], [5, 7], [6, 7], [6, 8], [6, 12], [7, 9], [7, 13], [8, 10], [9, 11], [12, 13], [14, 12], [15, 13], [16, 14], [17, 15]],
"supercategory": "person"
}
],
"info": {
"contributor": "Kakao Brain Corp.",
"date_created": "2020/05/26",
"description": "Human pose estimation result from Kakao Brain",
"url": "https://www.kakaobrain.com",
"version": "191227",
"year": 2020
},
"job_id": "bb91c265-341d-4661-813b-870cff0de1d3",
"status": "success",
"video": {
"fps": 29.97,
"frames": 30,
"height": 720,
"width": 1280
}
}
{
"description": "Failed to get video",
"job_id": "32b1dc9e-16c6-426b-9b06-d1c3e2abdfb4",
"status": "failed"
}
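The two response shapes above can be handled in one place. The sketch below returns one row per frame for a successful result and raises an error otherwise. `summarize_result` is a hypothetical name, and the time offset is an assumption that a frame's position in seconds is `frame_num` divided by `fps`.

```python
def summarize_result(result):
    """Return (frame_num, seconds, person_count) rows for a successful job."""
    if result.get('status') != 'success':
        # 'description' is only present when the status is failed.
        reason = result.get('description', 'status: ' + str(result.get('status')))
        raise ValueError(reason)
    fps = result['video']['fps']
    rows = []
    for frame in result['annotations']:
        # Assumption: frame_num / fps approximates the frame's time offset.
        seconds = round(frame['frame_num'] / fps, 3)
        rows.append((frame['frame_num'], seconds, len(frame['objects'])))
    return rows
```

For the failure sample above, `summarize_result` raises `ValueError` with the message "Failed to get video".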
Here are code snippets showing how to use Pose APIs.
This is an example of implementing Pose APIs using Python. It is recommended to use Python 3.5 or higher.
import requests
APP_KEY = '${REST_API_KEY}'
IMAGE_URL = 'http://example.com/example.jpg'
IMAGE_FILE_PATH = 'example.jpg'
session = requests.Session()
session.headers.update({'Authorization': 'KakaoAK ' + APP_KEY})
# Requesting with image URL
response = session.post('https://cv-api.kakaobrain.com/pose', data={'image_url': IMAGE_URL})
print(response.status_code, response.json())
# Requesting with image file
with open(IMAGE_FILE_PATH, 'rb') as f:
    response = session.post('https://cv-api.kakaobrain.com/pose', files=[('file', f)])
print(response.status_code, response.json())
import os
import requests
APP_KEY = '${REST_API_KEY}'
VIDEO_URL = 'http://example.com/example.mp4'
VIDEO_FILE_PATH = 'example.mp4'
session = requests.Session()
session.headers.update({'Authorization': 'KakaoAK ' + APP_KEY})
# Requesting with video URL
response = session.post('https://cv-api.kakaobrain.com/pose/job', data={'video_url': VIDEO_URL})
print(response.status_code, response.json())
job_id = response.json()['job_id']
# Requesting with video file
assert os.path.getsize(VIDEO_FILE_PATH) < 5e7
with open(VIDEO_FILE_PATH, 'rb') as f:
    response = session.post('https://cv-api.kakaobrain.com/pose/job', files=[('file', f)])
print(response.status_code, response.json())
job_id = response.json()['job_id']
import requests
APP_KEY = '${REST_API_KEY}'
session = requests.Session()
session.headers.update({'Authorization': 'KakaoAK ' + APP_KEY})
# job_id is the value returned in the response of the Analyzing video API
response = session.get('https://cv-api.kakaobrain.com/pose/job/' + job_id)
print(response.status_code, response.json())
This is an example of visualizing the image and video analysis result using COCO API.
To use COCO API, you need the pycocotools package. Install Cython and NumPy before installing pycocotools, because pycocotools depends on them.
pip install Cython numpy
pip install matplotlib pycocotools requests pillow opencv-python
import cv2
import numpy as np
from matplotlib import pyplot as plt
from pycocotools.coco import COCO
from requests import Session
APP_KEY = '${REST_API_KEY}'
session = Session()
session.headers.update({'Authorization': 'KakaoAK ' + APP_KEY})
def inference(filename):
    with open(filename, 'rb') as f:
        response = session.post('https://cv-api.kakaobrain.com/pose', files={'file': f})
    response.raise_for_status()
    return response.json()
def visualize(filename, annotations, threshold=0.2):
    # Ignore key points with low confidence
    for annotation in annotations:
        keypoints = np.asarray(annotation['keypoints']).reshape(-1, 3)
        low_confidence = keypoints[:, -1] < threshold
        keypoints[low_confidence, :] = [0, 0, 0]
        annotation['keypoints'] = keypoints.reshape(-1).tolist()
    # Visualization using COCO API
    image = cv2.cvtColor(cv2.imread(filename, cv2.IMREAD_COLOR), cv2.COLOR_BGR2RGB)
    plt.imshow(image)
    plt.axis('off')
    coco = COCO()
    coco.dataset = {
        "categories": [
            {
                "supercategory": "person",
                "id": 1,
                "name": "person",
                "keypoints": ["nose", "left_eye", "right_eye", "left_ear", "right_ear", "left_shoulder",
                              "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist", "left_hip",
                              "right_hip", "left_knee", "right_knee", "left_ankle", "right_ankle"],
                "skeleton": [[1, 2], [1, 3], [2, 3], [2, 4], [3, 5], [4, 6], [5, 7], [6, 7], [6, 8], [6, 12], [7, 9],
                             [7, 13], [8, 10], [9, 11], [12, 13], [14, 12], [15, 13], [16, 14], [17, 15]]
            }
        ]
    }
    coco.createIndex()
    coco.showAnns(annotations)
    plt.show()
IMAGE_FILE_PATH = 'example_pose.jpg'
result = inference(IMAGE_FILE_PATH)
visualize(IMAGE_FILE_PATH, result)
import os
import time
import numpy as np
from matplotlib import pyplot as plt
from pycocotools.coco import COCO
from requests import Session
APP_KEY = '${REST_API_KEY}'
session = Session()
session.headers.update({'Authorization': 'KakaoAK ' + APP_KEY})
def submit_job_by_url(video_url):
    response = session.post('https://cv-api.kakaobrain.com/pose/job', data={'video_url': video_url})
    response.raise_for_status()
    return response.json()
def submit_job_by_file(video_file_path):
    assert os.path.getsize(video_file_path) < 5e7
    with open(video_file_path, 'rb') as f:
        response = session.post('https://cv-api.kakaobrain.com/pose/job', files=[('file', f)])
    response.raise_for_status()
    return response.json()
# It is recommended to implement a callback when applying it in your real service.
def get_job_result(job_id):
    while True:
        response = session.get('https://cv-api.kakaobrain.com/pose/job/' + job_id)
        response.raise_for_status()
        response = response.json()
        if response['status'] in {'waiting', 'processing'}:
            time.sleep(10)
        else:
            return response
def visualize(resp, threshold=0.2):
    # Visualization using COCO API
    coco = COCO()
    coco.dataset = {'categories': resp['categories']}
    coco.createIndex()
    width, height = resp['video']['width'], resp['video']['height']
    # Ignore key points with low confidence
    for frame in resp['annotations']:
        for annotation in frame['objects']:
            keypoints = np.asarray(annotation['keypoints']).reshape(-1, 3)
            low_confidence = keypoints[:, -1] < threshold
            keypoints[low_confidence, :] = [0, 0, 0]
            annotation['keypoints'] = keypoints.reshape(-1).tolist()
        plt.axis('off')
        plt.title("frame: " + str(frame['frame_num'] + 1))
        plt.xlim(0, width)
        plt.ylim(height, 0)
        coco.showAnns(frame['objects'])
        plt.show()
VIDEO_URL = 'http://example.com/example.mp4'
VIDEO_FILE_PATH = 'example.mp4'
# Used when specifying video URL(video_url)
submit_result = submit_job_by_url(VIDEO_URL)
# Used when uploading video file(file)
submit_result = submit_job_by_file(VIDEO_FILE_PATH)
job_id = submit_result['job_id']
job_result = get_job_result(job_id)
if job_result['status'] == 'success':
    visualize(job_result)
else:
    print(job_result)