Image Battle

Compare AI Image Generators for your use-case

Summary for Hands & Anatomy

This category remains one of the toughest stress tests for AI image generators. The data reveals a clear divide between models that generate "shaped" hands and those that generate "biological" hands.

🏆 Key Findings

  • Top Performer: Nano Banana Pro proved exceptional, scoring a perfect 10 on the difficult Mirror Reflection and Runner prompts, demonstrating superior logic and texture.
  • Texture is the New Battleground: While many models now get the finger count right (solving the 'polydactyly' issue), widely used models like DALL-E 3 consistently scored lower (in the 3-5 range) because of a distinct 'plastic' or waxy skin texture that fails realism tests.
  • Interaction Difficulty: Models still struggle significantly when hands touch. Prompts like High Five caused finger-merging issues for several models, whereas single-hand prompts like Hand Drawing saw much higher success rates.
  • The 'Group of 5' Failure: Almost every model failed the counting constraint in the Group Joining Hands prompt, showing that numeracy remains a weak point even in advanced models.

Patterns and Performance Analysis

🖐️ The 'Uncanny Valley' of Skin Texture

A recurring theme in the evaluations is the penalty for 'smooth' or 'airbrushed' skin.

  • The Waxy Look: Models like DALL-E 3 and Grok 2 Image frequently received deductions for skin that looked like rubber or plastic. For example, in the Handshake prompt, DALL-E 3 scored a 4 due to 'waxy skin,' while Recraft V3 scored a 9 for 'top-tier' surface detail.
  • Hyper-Realism: Midjourney v7 and Nano Banana Pro excel at rendering veins, knuckles, and dirt (seen in the Hand Drawing prompt), pushing them out of the uncanny valley.

🧩 Anatomical Logic vs. Visual Quality

High artistic merit does not save a model from anatomical failure.

  • Mirror Logic: The Mirror Reflection prompt was a logic trap. Most models (including Flux and Ideogram) inverted the prompt, showing a reflected front instead of a back. Nano Banana (2.5 Flash) scored a 10/10 by correctly handling the geometry.
  • Limb Coherence: In the Yoga Pose prompt, models often struggled with 'fully extended' limbs. However, Grok Imagine scored a perfect 10/10 here, suggesting that newer models are gaining a better grasp of skeletal structure.

🔀 Interaction Issues

When two hands interact (touching, holding), the boundaries between them often blur.

  • Merging: In the Group Joining Hands prompt, models like Recraft V3 created 'mashed' piles of hands.
  • Clean Separation: Seedream 3.0 scored a perfect 10 on the High Five prompt, proving it is possible to render touching skin without the two hands merging into one.

Best Models by Scenario

📸 For Photorealistic Close-Ups

Winners: Recraft V3 & Nano Banana Pro

  • Why: These models handle micro-details (pores, wrinkles, fabric weave) better than competitors. In the Handshake evaluation, Recraft V3 was noted for "perfect anatomical structure" and "incredible detail in skin texture."

🧘‍♀️ For Complex Body Poses (Yoga/Sports)

Winner: Grok Imagine

  • Why: While inconsistent elsewhere, this model achieved "perfect execution" (10/10) on the Yoga Pose prompt, showing zero artifacts in a complex twisting posture where others failed.

💡 For Complex Logic (Reflections/Geometry)

Winner: Nano Banana (2.5 Flash)

  • Why: It was the standout performer on the Mirror Reflection task, correctly interpreting the spatial relationship between the camera, subject, and mirror.

🖌️ For Artistic Interpretation

Winner: Midjourney v7

  • Why: Even when it misses a count constraint, its texture work is often described as "stunning" or "top-tier," as seen in the Hand Holding Apple prompt, where it rendered water droplets on skin perfectly.
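
The per-scenario picks above come down to finding the highest-scoring model for each prompt. The following is a minimal Python sketch of that lookup, not the site's actual method. It uses only the handful of scores quoted in this summary, and the `SCORES` list and `best_per_prompt` helper are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Only scores quoted in this summary; the full results table is not reproduced here.
# Each entry is (prompt, model, score out of 10).
SCORES = [
    ("Handshake", "DALL-E 3", 4),
    ("Handshake", "Recraft V3", 9),
    ("Yoga Pose", "Grok Imagine", 10),
    ("High Five", "Seedream 3.0", 10),
    ("Runner", "Nano Banana Pro", 10),
]


def best_per_prompt(scores):
    """Return {prompt: (top_score, [models tied at that score])}."""
    by_prompt = defaultdict(list)
    for prompt, model, score in scores:
        by_prompt[prompt].append((model, score))

    result = {}
    for prompt, entries in by_prompt.items():
        top = max(score for _, score in entries)
        # Keep every model that reached the top score, so ties stay visible.
        result[prompt] = (top, sorted(m for m, s in entries if s == top))
    return result


if __name__ == "__main__":
    for prompt, (top, models) in best_per_prompt(SCORES).items():
        print(f"{prompt}: {', '.join(models)} ({top}/10)")
```

Returning a list of tied models rather than a single name keeps ties visible when more than one model reaches the top score on a prompt.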