4D multi-view reconstruction of moving actors has many applications in the entertainment industry and, although studios providing such services are becoming more accessible, effort is still required to improve the underlying technology and produce high-quality 4D content. In this paper, we present a method to derive a time-evolving surface representation from a sequence of binary volumetric data representing an arbitrary motion, in order to introduce coherence into the data. The context is provided by an indoor multi-camera system that performs synchronized video captures from multiple viewpoints in a chroma-key studio. Our input is given by a volumetric silhouette-based reconstruction algorithm that generates a visual hull at each frame of the video sequence. These 3D volumetric models lack temporal coherence, in terms of structure and topology, as each frame is generated independently. This prevents easy post-production editing with 3D animation tools. Our goal is to transform this input sequence of independent 3D volumes into a single dynamic structure that is directly usable in post-production. Our approach is based on a motion estimation procedure. An unsigned distance function on the volumes is used as the main shape descriptor, and a 3D surface matching algorithm minimizes the interference between unrelated surface regions. Experiments on our multi-view datasets show that our method outperforms other approaches based on optical flow when considering robustness over several frames.
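The unsigned distance descriptor mentioned above can be illustrated with a minimal sketch: for each voxel of a binary occupancy grid, the distance to the nearest voxel of opposite occupancy approximates the distance to the visual-hull surface. The function name and brute-force search below are hypothetical illustrations, not the paper's implementation; a real pipeline would use a fast distance transform.

```python
import numpy as np

def unsigned_distance(vol):
    """Brute-force unsigned distance to the occupancy boundary of a
    binary voxel grid. Hypothetical helper for illustration only:
    a production system would use a linear-time distance transform."""
    occ = np.argwhere(vol > 0)    # coordinates of occupied voxels
    emp = np.argwhere(vol == 0)   # coordinates of empty voxels
    d = np.empty(vol.shape)
    for idx in np.ndindex(vol.shape):
        # nearest voxel of the opposite occupancy approximates the
        # distance to the surface separating inside from outside
        targets = emp if vol[idx] else occ
        diffs = targets - np.array(idx)
        d[idx] = np.sqrt((diffs ** 2).sum(axis=1)).min()
    return d

# Toy visual hull: a 3x3x3 occupied cube inside a 7x7x7 grid
vol = np.zeros((7, 7, 7))
vol[2:5, 2:5, 2:5] = 1
dist = unsigned_distance(vol)
```

Because the descriptor is unsigned, matching treats interior and exterior voxels symmetrically; the field is smooth across the surface, which is what makes it usable as a per-voxel shape signature for the motion estimation step.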
Multi-view reconstruction, Motion flow, Dynamic mesh, Voxel matching, Mesh animation