Text2Motion

Goal

To develop an end-to-end pipeline that automates 3D character animation: given a static mesh, the system performs automatic rigging and skinning, then synthesizes a motion sequence aligned with a natural-language description.

Technical Approach

The system is organized in three stages:

  1. Automatic Rigging & Skinning: A PointNet++ model predicts joint positions and skinning weights directly from 3D mesh geometry, eliminating manual rigging (see the first sketch after this list).
  2. Text-Conditioned Motion Generation: A topology-aware transformer diffusion model generates motion sequences conditioned on natural-language descriptions, using SBERT and T5 text encodings (second sketch below).
  3. Loss & Training: The model is trained with a combination of a geodesic rotation loss and an InfoNCE contrastive loss so that generated motion stays topologically consistent and semantically aligned with the prompt (third sketch below).
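
As a rough illustration of stage 1, the sketch below shows a rigging/skinning head in PyTorch: a per-point encoder (standing in for the actual PointNet++ set-abstraction backbone) regresses joint positions from a pooled global feature and predicts a per-vertex weight distribution over joints. The class name, joint count, and dimensions are illustrative assumptions, not the project's architecture.

```python
import torch
import torch.nn as nn

class RigSkinHead(nn.Module):
    """Toy rigging/skinning head (hypothetical): predicts J joint positions
    and per-point skinning weights from a sampled point cloud."""
    def __init__(self, num_joints: int = 24, feat_dim: int = 128):
        super().__init__()
        self.num_joints = num_joints
        # Per-point feature extractor; a real system would use
        # PointNet++ set-abstraction layers here.
        self.encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Joint regressor operates on a max-pooled global feature.
        self.joint_head = nn.Linear(feat_dim, num_joints * 3)
        # Skinning head predicts a weight distribution over joints per point.
        self.skin_head = nn.Linear(feat_dim, num_joints)

    def forward(self, points: torch.Tensor):
        # points: (B, N, 3) mesh vertices sampled as a point cloud
        feats = self.encoder(points)                  # (B, N, F)
        global_feat = feats.max(dim=1).values         # (B, F)
        joints = self.joint_head(global_feat).view(-1, self.num_joints, 3)
        weights = torch.softmax(self.skin_head(feats), dim=-1)  # rows sum to 1
        return joints, weights

model = RigSkinHead()
joints, weights = model(torch.randn(2, 1024, 3))
print(joints.shape, weights.shape)  # (2, 24, 3) and (2, 1024, 24)
```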
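
For stage 2, the second sketch shows one plausible way to combine SBERT and T5 encodings into a conditioning vector for a transformer-based diffusion denoiser. The checkpoint names (`all-MiniLM-L6-v2`, `t5-small`), the token-concatenation conditioning scheme, and all dimensions are assumptions; the actual model is topology-aware in ways this sketch does not capture.

```python
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, T5EncoderModel

# Frozen text encoders (assumed checkpoints).
sbert = SentenceTransformer("all-MiniLM-L6-v2")
t5_tok = AutoTokenizer.from_pretrained("t5-small")
t5 = T5EncoderModel.from_pretrained("t5-small").eval()

def encode_text(prompt: str) -> torch.Tensor:
    """Concatenate a pooled SBERT embedding with a mean-pooled T5 encoding."""
    s = torch.as_tensor(sbert.encode(prompt))           # (384,)
    ids = t5_tok(prompt, return_tensors="pt")
    with torch.no_grad():
        t = t5(**ids).last_hidden_state.mean(dim=1)[0]  # (512,)
    return torch.cat([s, t])                            # (896,)

class MotionDenoiser(nn.Module):
    """Transformer that predicts the noise added to a motion sequence,
    conditioned on a text embedding and a diffusion timestep (hypothetical)."""
    def __init__(self, pose_dim=72, d_model=256, text_dim=896, steps=1000):
        super().__init__()
        self.in_proj = nn.Linear(pose_dim, d_model)
        self.text_proj = nn.Linear(text_dim, d_model)  # text -> conditioning token
        self.t_embed = nn.Embedding(steps, d_model)    # timestep -> conditioning token
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.out_proj = nn.Linear(d_model, pose_dim)

    def forward(self, noisy_motion, t, text_emb):
        # noisy_motion: (B, T, pose_dim); t: (B,); text_emb: (B, text_dim)
        x = self.in_proj(noisy_motion)
        cond = torch.stack([self.text_proj(text_emb), self.t_embed(t)], dim=1)
        h = self.backbone(torch.cat([cond, x], dim=1))
        return self.out_proj(h[:, 2:])                 # drop the two cond tokens

emb = encode_text("a person jumps forward").unsqueeze(0)
eps_hat = MotionDenoiser()(torch.randn(1, 60, 72), torch.tensor([500]), emb)
print(eps_hat.shape)  # (1, 60, 72)
```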
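
For stage 3, the two losses can be sketched directly: the geodesic loss measures the rotation angle between predicted and ground-truth joint rotations, and InfoNCE pulls paired motion/text embeddings together while pushing mismatched pairs apart within a batch. The 0.1 weighting in the usage example is an assumed hyperparameter, not the project's setting.

```python
import torch
import torch.nn.functional as F

def geodesic_loss(R_pred, R_gt, eps=1e-7):
    """Mean geodesic angle between rotation matrices. R_*: (..., 3, 3)."""
    R = R_pred.transpose(-1, -2) @ R_gt
    trace = R.diagonal(dim1=-2, dim2=-1).sum(-1)
    # Clamp for numerical safety before acos.
    cos = ((trace - 1.0) / 2.0).clamp(-1.0 + eps, 1.0 - eps)
    return torch.acos(cos).mean()

def info_nce_loss(motion_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired motion/text embeddings.
    motion_emb, text_emb: (B, D); matching pairs share a row index."""
    m = F.normalize(motion_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = m @ t.T / temperature          # (B, B) similarity matrix
    labels = torch.arange(m.size(0))        # diagonal entries are positives
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

# Usage with dummy data; the 0.1 weight is an assumption.
R = torch.eye(3).expand(8, 24, 3, 3)
loss = geodesic_loss(R, R) + 0.1 * info_nce_loss(
    torch.randn(8, 256), torch.randn(8, 256))
print(loss.item())
```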

Key Metrics & Results

  • Achieved 92% armature classification accuracy on the Truebones Zoo dataset.
  • The pipeline automates the full workflow from static mesh to a text-driven animated character.
  • Published results in the “Text2Motion” article on Medium.

Tech Stack

  • Python, PyTorch, Transformers, CUDA, OpenCV, SBERT, T5