What can DALL-E, Stable Diffusion and ChatGPT tell us about Dressage Horse Stereotypes?
December 2022
There has been a lot of noise around the release of the AI tools DALL-E and ChatGPT. DALL-E converts a text prompt into an image (“draw a horse doing machine learning in steampunk style”), while ChatGPT answers your questions in a conversational style. If you know what this is about, skip the next paragraph. If it has escaped your attention so far, read on.
Artificial Intelligence (AI) and Machine Learning (ML) have been evolving rapidly over the last few years, driven by big companies spending large sums and by techies all over the world going home from their day jobs to play with and improve the latest open source offerings in this field. So far, while this has been fascinating for us techies, the results for everyone else have been underwhelming. The promise of AI has remained as elusive as nuclear fusion. But, as with fusion, it’s all starting to get a bit more real. The current buzz is the ability to generate images from just a text prompt: no artistic skill required at all, just the ability to articulate your idea. ChatGPT can answer questions that you might ask Google, such as “What is Machine Learning?”. But where Google gives you links that might answer the question, ChatGPT holds a conversation: you can refine your question or ask related ones, and the answer takes account of your earlier questions and requests. You can even ask it “Give me an outline for a blog post on Machine Learning”. Where does learning end and cheating start?
Have a go yourself. Here are the links:
https://labs.openai.com/ (DALL-E 2)
https://stablediffusionweb.com/ (Stable Diffusion)
https://chat.openai.com/chat (ChatGPT)
So, having my first play with DALL-E 2, I tried “horse doing dressage in the rain” and this is what I got.
I thought it interesting that these horses all appear to be behind the vertical and one is sticking its tongue out!
For non-horsey readers: it is considered correct for the horse to have its poll (the bit between the ears) as the highest point and its face on or in front of the vertical. There is an ongoing debate about whether this is reflected in the judging of dressage competitions. Sticking the tongue out is a clear no-no.
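For the techies, here is a rough sketch of what “on or in front of the vertical” means as geometry. It assumes you have somehow marked two points on a side-on photo, the forehead and the nose, and that y increases upwards. The function name, the point format and the `facing` flag are all made up for illustration; this is not how judges or any of the AI tools measure it.

```python
import math

def face_angle_from_vertical(forehead, nose, facing=1):
    """Signed angle in degrees between the line of the face and the vertical.

    Positive: nose in front of the vertical. Zero: on the vertical.
    Negative: nose behind the vertical.

    Assumptions (hypothetical, for illustration only):
    - forehead and nose are (x, y) points with y increasing upwards
    - facing is +1 if the horse faces the +x direction, -1 otherwise
    """
    # How far the nose is ahead of the forehead, in the direction of travel
    forward = (nose[0] - forehead[0]) * facing
    # How far the nose sits below the forehead
    drop = forehead[1] - nose[1]
    return math.degrees(math.atan2(forward, drop))

# A nose tucked back towards the chest gives a negative angle
angle = face_angle_from_vertical(forehead=(0, 10), nose=(-2, 0))
print("behind the vertical" if angle < 0 else "on or in front of the vertical")
```

Image files usually count y downwards from the top, so the points would need flipping before using something like this on a real photo.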
DALL-E and other image generation tools learn from a huge set of images with text descriptions and then use this association to create new images that match the meaning of the text. This requires that the AI both understands the meaning of the text and can correctly work out whether an image matches it. We would therefore expect that asking for a dressage horse would tend to produce stereotypical images of what it has learned a dressage horse looks like.
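One common way to score whether an image matches a piece of text is to turn both into lists of numbers (embeddings) and measure how closely they point in the same direction. The sketch below shows only that comparison step. The three-number vectors are invented for illustration; a real system learns much longer vectors from its training images, and I am not claiming this is exactly what DALL-E or Stable Diffusion do internally.

```python
import math

def cosine_similarity(a, b):
    """Return 1.0 when two vectors point the same way, near 0 when unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings for two captions (hypothetical values)
caption_embeddings = {
    "horse doing dressage": [0.9, 0.1, 0.2],
    "horse jumping a fence": [0.2, 0.9, 0.1],
}

# Made-up embedding for one image (hypothetical values)
image_embedding = [0.8, 0.2, 0.3]

def best_caption(image_vec, captions):
    """Pick the caption whose embedding is most similar to the image's."""
    return max(captions, key=lambda c: cosine_similarity(image_vec, captions[c]))

print(best_caption(image_embedding, caption_embeddings))
```

The point for the stereotype question is that the vectors are learned from whatever images carry the label “dressage”, so the typical labelled image shapes what the tool produces.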
So I started looking at more Google Images results and generating more images from text to see what the computer thinks a dressage horse looks like.
Now, clearly there is a lack of understanding around the physics of a horse. It easily confuses horse and human front legs, for example, and as it turns out, these were some of the best images. We are way off deep fake horses, but in this fast-moving world, way off may only be a matter of months. So I switched to my favourite steampunk style of horse to illustrate this article…
Adding dressage usually gets you a rider too.
The TL;DR is that it’s too soon for the image generation tools to differentiate between dressage and non-dressage. Some of the weird dressage horses generated are wearing jumping tack and are not always doing dressage movements, so the system is heavily influenced by non-dressage images. Worryingly, the bulk of real images from a Google search or in the stock libraries show outlines that do not meet the current stated standard of having the face on or in front of the vertical.
How long has this preference for a highly arched neck been going on? A very long time it seems, looking at equestrian statues through the ages, but that is a whole other topic for another day.
So then I wondered what ChatGPT might say about dressage. Because its first answer was so competition orientated, I asked it to try again and then to refine further.
Compared to Wikipedia
While ChatGPT’s answer is not wrong, it misses the essence of dressage, which is the training of a horse for any discipline. Predetermined movements are a feature of competitions, so this suggests that most of the content on dressage is about competitive dressage. However, with a bit of prompting (see full conversation below) it provides a fuller answer.
But now I have more questions, so I asked ChatGPT:
where does dall e get the images it learns from?
DALL-E gets its images from a variety of sources, including the internet, social media platforms, and image databases. These images serve as the training data for the artificial intelligence system, allowing it to learn and recognize patterns and features in order to generate new images based on text prompts.
How does ChatGPT learn about horses?
ChatGPT is a variant of GPT-3, which is a large language generation model developed by OpenAI. GPT-3 is trained on a vast amount of text data from the internet and other sources, so it has learned about a wide variety of topics, including horses. It is able to generate coherent and coherent text that is informed by this knowledge, but it is important to note that GPT-3 is a machine learning model, so it does not have the same level of understanding and knowledge as a human. It can generate text that is factually accurate, but it does not have the ability to apply that knowledge in the same way that a human would.
How can I correct chatgpt if I find something incorrect?
As a large language generation model, GPT-3 is not able to learn from new input or feedback, so it is not possible to directly correct it if you find something incorrect. However, you can use GPT-3 to generate text based on a prompt that includes accurate information, and in this way, you can ensure that the text it generates is based on accurate information.
It is also important to keep in mind that GPT-3 is a machine learning model, and like all machine learning models, it is not perfect. It is trained on a large dataset of text, but it may still generate text that is inaccurate or that contains errors. It is always a good idea to fact-check any information that you obtain from GPT-3 or any other source to ensure that it is accurate and up-to-date.
If someone decided they wanted to change the stereotype for a dressage horse outline, how many images over what period of time would it take?
It’s difficult to say exactly how many images or how much time would be required to change the stereotype for a dressage horse outline in DALL-E or stable diffusion, as it would depend on a number of factors, including the specific goals and objectives of the project, the quality and diversity of the images being used, and the complexity of the task.
However, it is generally accepted that the more data you have, the better the results you can achieve with machine learning models. So if you are trying to change the stereotype for a dressage horse outline, it would likely be helpful to gather a large and diverse set of images and work with them over an extended period of time in order to train a model that can accurately and effectively reflect the desired changes.
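As a very rough back-of-envelope, you can at least work out how many new correct images it would take to shift the share of behind-the-vertical images in a collection. This is only arithmetic on proportions, with invented numbers, and it assumes every image counts equally, which is almost certainly not how these models weigh their training data.

```python
import math

def images_needed(existing_behind, existing_total, target_fraction):
    """New correct images needed so behind-the-vertical images make up
    at most target_fraction of the whole collection.

    All inputs are hypothetical: nobody outside the companies knows the
    real counts in the training sets.
    """
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be between 0 and 1")
    if existing_behind <= target_fraction * existing_total:
        return 0  # already at or below the target
    # Solve existing_behind / (existing_total + new) <= target_fraction
    return math.ceil(existing_behind / target_fraction - existing_total)

# Invented example: 800 of 1,000 dressage images are behind the vertical,
# and we want that share down to half.
print(images_needed(existing_behind=800, existing_total=1000, target_fraction=0.5))
```

The lower the target share, the faster the number grows: getting from 80% down to 10% in this made-up example would take seven times the size of the original collection.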
I’m only just starting to get my head around this. How will I use it? What will it displace? Who are the real influencers and how do we change stereotypes? So many questions.
Do have a play. What are your questions and how well did ChatGPT answer them? What are the implications of this technology? How influential will it be? Can you tell if you are talking to a machine? When will it start talking with horses?
Phoebe Bright
Horse Technologist