Summary for Hands & Anatomy
This category remains one of the toughest stress tests for AI image generators. The data reveals a clear divide between models that generate "shaped" hands and those that generate "biological" hands.
🏆 Key Findings
- Top Performer: Nano Banana Pro proved exceptional, scoring a perfect 10 on the difficult Mirror Reflection and Runner prompts, demonstrating superior logic and texture.
- Texture is the New Battleground: Many models now get the finger count right (solving the 'polydactyly' issue), but widely used models like DALL-E 3 consistently scored lower (in the 3 to 5 range) because of a distinct 'plastic' or waxy skin texture that fails realism tests.
- Interaction Difficulty: Models still struggle significantly when hands touch. Prompts like High Five caused finger-merging issues for several models, whereas single-hand prompts like Hand Drawing saw much higher success rates.
- The 'Group of 5' Failure: Almost every model failed the counting constraint in the Group Joining Hands prompt, showing that numeracy remains a weak point even in advanced models.
Patterns and Performance Analysis
🖐️ The 'Uncanny Valley' of Skin Texture
A recurring theme in the evaluations is the penalty for 'smooth' or 'airbrushed' skin.
🧩 Anatomical Logic vs. Visual Quality
High artistic merit does not save a model from anatomical failure.
- Mirror Logic: The Mirror Reflection prompt was a logic trap. Most models (including Flux and Ideogram) inverted the prompt, showing a reflected front instead of a back. Nano Banana (2.5 Flash) was the only one to score a 10/10 by correctly handling the geometry.
- Limb Coherence: In the Yoga Pose prompt, models often struggled with 'fully extended' limbs. However, Grok Imagine managed a perfect 10/10 here, showing that newer models are gaining a better understanding of skeletal structure.
🔀 Interaction Issues
When two hands interact (touching, holding), the boundary lines often blur.
- Merging: In the Group Joining Hands prompt, models like Recraft V3 created 'mashed' piles of hands.
- Clean Separation: Seedream 3.0 scored a perfect 10 on the High Five prompt, proving it is possible to render touching skin without the two hands merging into one.
Best Models by Scenario
📸 For Photorealistic Close-Ups
Winner: Recraft V3 & Nano Banana Pro
- Why: These models handle micro-details (pores, wrinkles, fabric weave) better than competitors. In the Handshake evaluation, Recraft V3 was noted for "perfect anatomical structure" and "incredible detail in skin texture."
🧘‍♀️ For Complex Body Poses (Yoga/Sports)
Winner: Grok Imagine
- Why: While inconsistent elsewhere, this model earned a "perfect execution" rating (10/10) on the Yoga Pose prompt, showing zero artifacts in a complex twisting posture where others failed.
💡 For Complex Logic (Reflections/Geometry)
Winner: Nano Banana (2.5 Flash)
- Why: It was the standout performer on the Mirror Reflection task, correctly interpreting the spatial relationship between the camera, subject, and mirror.
🖌️ For Artistic Interpretation
Winner: Midjourney v7
- Why: Even when it misses a count constraint, the texture work is often described as "stunning" or "top-tier," as seen in the Hand Holding Apple prompt where it handled water droplets on skin perfectly.