If you’ve been using AI tools such as MidJourney, Stable Diffusion (DreamStudio) or Leonardo, you’ve probably noticed that even the best AI image creation tools still tend to heavily distort hands and feet.
Look, for instance, at a few renders I got from MidJourney and Leonardo for the prompt “Feet Close Up”.
MidJourney doesn’t add extra toes: it plays it safe on three images out of four by showing shoes instead of bare feet. Still, the result isn’t that convincing, even for the most premium provider on the market. Here’s another generation from MidJourney for the same prompt, “Feet Close Up”. The feet in three of the four pictures are completely distorted.
The result isn’t any better if you ask the AI to generate a close-up of human hands… Here’s what you get from Leonardo.
It looks a bit better with MidJourney, but if you pay attention to the details, you’ll notice multiple anatomical errors. On a side note, MidJourney seems to lean towards elderly subjects when asked to generate hand details, probably because hands are more often photographed in close-up on older people in the training dataset fed to the AI engine…
Let’s submit the following prompt to MidJourney: young woman hands close up.
To be fair, we were luckier with this one: I don’t see any missing or extra fingers.
Let’s submit the same prompt to DreamStudio (Stable Diffusion).
It’s slightly weirder 🙂 You understand why MidJourney is still 10x more expensive than DreamStudio…
Now that we’ve seen a few examples of AI failing epically to generate decent feet and hands, let’s try to understand why it tends to be so bad at rendering the extremities of our precious limbs.
I put this existential question to my favourite AI assistant, Perplexity.AI, and here is its detailed answer, along with the sources:
AI-generated images of hands and feet are often distorted due to several reasons.
- Lack of visibility in source images: Hands and feet are less visible in source images compared to faces, and they tend to be much smaller in the images.
- Complex geometry: Hands and feet are one of the most complex shapes to generate, and there is no standard set of lines or shapes that AI can recognize as a hand or foot. In order to generate realistic hands and feet, AI needs to collate multiple different shapes and arrangements.
- Human perception: There are some things that visually we’re willing to give a free pass, but hands and feet are different. There’s something about them that we’re very sensitive to, and know so instinctually. So if they’re wrong we notice right away.
- Difficulty in rendering details: Hands and feet are detail-heavy, and if an AI generates a hand or foot with too few or too many fingers or toes, it doesn’t look realistic.
AI developers have made progress in the struggle to teach artificial intelligence tools how human hands and feet should look, but there’s still plenty of room for improvement.
At least a smart AI can explain the blunders of another AI 😉
Foot fetishists will have to keep on worshipping authentic human limbs for a little while…