We created a photorealistic dynamic macaque face avatar, which laid the groundwork for animating a highly realistic three-dimensional monkey avatar with veridical full-body motion data obtained by tracking real animals in their home environment. We also constructed a novel set of dynamic, naturalistic stimuli showing monkey and human bodies. These stimuli were presented to monkey and human subjects in fMRI studies to identify brain regions that respond specifically to dynamic bodies (“body patches”) and to examine the body patch network.

A 7T human fMRI study revealed previously unknown dynamic body patches, and a network analysis identified two body-selective networks. One of these networks showed high species selectivity and was tuned specifically to human bodies. In monkeys, we performed a full fMRI mapping of dynamic body patches using realistic dynamic monkey body stimuli. We then identified the shape and motion features that drive single-unit responses to bodies in these patches. We generated a large set of monkey avatars varying in pose and viewpoint and examined the feature selectivity of single neurons in the body patches. In parallel, we characterized selectivity for body pose features in a human body-selective area using human avatars and modeling.

We developed a neural model that accounts for monkey single-unit responses to bodies moving behind a narrow slit. This model is based on a novel nonlinear mechanism that suppresses clutter generated by the slit. We also implemented models for body motion and pose recognition and tested them with the stimuli used in the physiological experiments. One model combines a deep network architecture with a recurrent neural field-based mechanism for sequence recognition. We extended and modified deep neural network models by combining them with neurodynamic models to account for the dynamics of the responses of body-selective neurons.
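The general idea of a recurrent, neural field-like mechanism for sequence recognition can be illustrated with a toy simulation. The sketch below is not the model developed in the project. It assumes a chain of hypothetical "snapshot" units, each driven by one frame of a movement (an idealized stand-in for the output of a deep network), with asymmetric lateral coupling: each unit weakly excites its successor and strongly inhibits its predecessor. The function name `simulate` and all parameter values are illustrative assumptions.

```python
def simulate(frame_order, n_neurons=10, steps_per_frame=20, dt=0.1,
             rest=0.3, excite=0.3, inhibit=2.0):
    """Return the time-integrated rectified activity of a chain of units.

    Unit k receives feedforward drive while frame k is shown. Each unit
    weakly excites its successor (k -> k+1) and strongly inhibits its
    predecessor (k -> k-1), so activity builds up more easily when frames
    arrive in ascending order. Time is in units of the membrane time
    constant; all values are illustrative assumptions.
    """
    u = [-rest] * n_neurons              # potentials start at resting level
    total = 0.0
    for frame in frame_order:
        for _ in range(steps_per_frame):
            r = [max(x, 0.0) for x in u]     # threshold-linear firing rates
            new_u = []
            for k in range(n_neurons):
                drive = 1.0 if k == frame else 0.0
                from_prev = excite * r[k - 1] if k > 0 else 0.0
                from_next = inhibit * r[k + 1] if k < n_neurons - 1 else 0.0
                du = -u[k] + drive + from_prev - from_next - rest
                new_u.append(u[k] + dt * du)  # forward Euler step
            u = new_u
            total += dt * sum(r)             # integrate population activity
    return total


forward = simulate(range(10))             # frames in ascending order
backward = simulate(reversed(range(10)))  # same frames, reversed order
print(f"forward: {forward:.2f}, reversed: {backward:.2f}")
```

In this toy setting the summed response is larger for the ascending order than for the reversed one, because in the reversed order each newly driven unit is suppressed by the lingering activity of the unit driven just before it. Order selectivity here arises purely from the asymmetry of the lateral coupling.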