Paper:
Rogez, Grégory, and Cordelia Schmid. “MoCap-guided data augmentation for 3D pose estimation in the wild.” Advances in Neural Information Processing Systems. 2016.
Key:
Dummy human pose for augmentation
- Basics
- Data augmentation for 3D pose estimation.
- Input: 3D motion capture (MoCap) data.
- Combine selected images into a new synthetic image by stitching local image patches together under kinematic constraints.
- Cluster the training data into a large number of pose classes, which turns pose estimation into a K-way classification problem.
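The clustering step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: these notes do not say which clustering algorithm is used, so plain k-means is an assumption, and the function names and the flat-list pose representation are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two flattened 3D poses."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pose_classes(poses, k, iterations=20, seed=0):
    """Cluster flattened 3D poses into k pose classes with plain k-means.

    ASSUMPTION: k-means and this pose representation (a flat list of
    3 * n_joints floats) are illustrative choices, not taken from the paper.
    Returns (labels, centroids); labels[i] is the pose class of poses[i].
    """
    rng = random.Random(seed)
    # Initialise the centroids with k distinct training poses.
    centroids = [list(p) for p in rng.sample(poses, k)]
    labels = [0] * len(poses)
    for _ in range(iterations):
        # Assignment step: each pose joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in poses
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(poses, labels) if lab == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids
```

Each label can then serve as the target class for the K-way classifier, with the centroid acting as a representative pose for that class.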
- Main methods
- Cluster 3D poses into K pose classes, then generate “dummy” pose images. An image only needs an outline that looks like a human in the target pose; it does not have to be realistic.
- Input: two training sources, images with annotated 2D poses and 3D MoCap data.
- Two processes
- MoCap-guided mosaic construction: stitches image patches together.
- Input: a 3D pose with n joints and its projected 2D joints in one view.
- Output: for each joint, a training image whose annotated 2D pose matches the projected pose around that joint.
- Compute the transformation that aligns the 1st pose to the 2nd around the joint, then measure the similarity between the joints of the 2nd pose and the aligned joints of the 1st pose.
- Give neighboring joints a larger weight in this similarity measure.
- Transform the cropped image to the target pose, then select patches from it to compose the new image.
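The per-joint matching described above can be sketched as a weighted rigid alignment followed by a weighted distance. This is a rough sketch under stated assumptions: the exponential weight decay over kinematic-tree hops, the closed-form 2D rotation-plus-translation fit, and all function names are hypothetical choices for illustration, not the paper's exact formulation.

```python
import math


def neighbor_weights(hops_from_j, decay=0.5):
    """Normalised weights that shrink with kinematic-tree distance to joint j.

    ASSUMPTION: the exponential decay is illustrative; the notes only say
    that neighboring joints get a larger weight.
    """
    raw = [decay ** h for h in hops_from_j]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_centroid(points, w):
    total = sum(w)
    return (sum(wi * x for wi, (x, y) in zip(w, points)) / total,
            sum(wi * y for wi, (x, y) in zip(w, points)) / total)


def align_rigid_2d(q, p, w):
    """Weighted least-squares rotation + translation mapping pose q onto p."""
    qx0, qy0 = weighted_centroid(q, w)
    px0, py0 = weighted_centroid(p, w)
    dot = cross = 0.0
    for wi, (qx, qy), (px, py) in zip(w, q, p):
        ax, ay = qx - qx0, qy - qy0
        bx, by = px - px0, py - py0
        dot += wi * (ax * bx + ay * by)
        cross += wi * (ax * by - ay * bx)
    theta = math.atan2(cross, dot)  # optimal rotation angle in 2D
    c, s = math.cos(theta), math.sin(theta)
    return [(c * (qx - qx0) - s * (qy - qy0) + px0,
             s * (qx - qx0) + c * (qy - qy0) + py0) for qx, qy in q]


def local_pose_distance(p, q, w):
    """Weighted joint-to-joint distance after aligning q to p."""
    aligned = align_rigid_2d(q, p, w)
    return sum(wi * math.hypot(ax - px, ay - py)
               for wi, (ax, ay), (px, py) in zip(w, aligned, p))


def best_image_for_joint(target, candidates, w):
    """Index of the candidate 2D pose that best matches target around joint j."""
    return min(range(len(candidates)),
               key=lambda i: local_pose_distance(target, candidates[i], w))
```

Calling `best_image_for_joint` once per joint, each time with weights centred on that joint, yields one source image per joint. Patches from those images are then stitched into the mosaic.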
- Pose-aware blending: improves image quality and erases patch seams.
- Handles the boundaries between image regions that come from different source images.
- For each pixel, select a surrounding square region and evaluate how much each image should contribute to that pixel. The final image is the weighted sum over all aligned images.
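The blending rule above can be illustrated with a small sketch. It makes several simplifying assumptions not stated in these notes: images are grayscale nested lists, the square window has a fixed radius, and each image's contribution is the share of window pixels assigned to it. The function name and the `index_map` argument are hypothetical.

```python
def blend_mosaic(index_map, aligned_images, radius=1):
    """Blend aligned source images into one mosaic.

    index_map[y][x] is the index of the source image chosen for pixel (x, y).
    ASSUMPTION: a fixed-size square window and count-based weights are
    illustrative simplifications of the blending step.
    """
    h, w = len(index_map), len(index_map[0])
    n = len(aligned_images)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Count how often each source image occurs in the square window.
            counts = [0] * n
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    counts[index_map[yy][xx]] += 1
            total = sum(counts)
            # Final pixel: weighted sum over all aligned images.
            out[y][x] = sum(c / total * img[y][x]
                            for c, img in zip(counts, aligned_images))
    return out
```

Pixels deep inside a region keep the value of their own source image, while pixels near a seam mix the neighboring images, which is what smooths the boundary.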
- CNN for full-body 3D pose estimation
- Shows that a CNN trained only on synthetic data can still obtain good performance.
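Since pose estimation is cast as K-way classification, the classifier output still has to be mapped back to a 3D pose. The sketch below shows one simple way to do that, assuming each class has a representative 3D pose (for example its cluster centroid). Returning the pose of the top-scoring class is an assumption for illustration, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pose_from_class_scores(scores, class_poses):
    """Map K class scores to a 3D pose estimate.

    ASSUMPTION: the estimate is the representative pose of the most
    probable class; these notes do not specify the read-out rule.
    Returns (class index, class probability, 3D pose).
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best], class_poses[best]
```

With this read-out, accuracy is bounded by how finely the K classes cover pose space, which is one reason a large number of classes is useful.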
- Take home messages.
- Other methods mentioned.
- Data augmentation
- Jittering
- Complex affine transformations
- 3D pose estimation
- CNNs — trained on 3D MoCap data in constrained environments.
- Estimate 3D pose from 2D pose data.
- 2D pose detector
- Or jointly learn 2D and 3D pose.
- Dual source approach — combines 2D pose estimation and 3D pose retrieval.
- Synthetic pose data.
- Data augmentation