August 3, 2023

AI in Digital Marketing. Creating Realistic Images Using AI

In last week’s article, we looked at the potential of AI to create graphic images for marketing purposes. In this one, we’ll share the results of testing generative neural networks to create realistic images for Mobio Group projects. As in the last article, we tested the two most common tools: Midjourney and DALL-E.

Celebrity Images and Memes

The first task was to create recognizable meme images without copying them exactly, to avoid the legal implications of using the originals.

For the test, we tried to get images of Dwayne “The Rock” Johnson and a famous meme with a smiling cat.

Midjourney

Getting a meme image of Dwayne Johnson was not difficult, either by text query or by using the original image. Interestingly, using the original image produces the desired visual less than 50% of the time, but it often creates a usable “base of the scene”.

To get good results, add a “tail” to the query after the main description, listing the features of a good image.
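A minimal sketch of how such a query could be assembled, assuming the main description and the tail are simply joined with commas. The function name `build_prompt` and the tail terms are hypothetical; the terms we actually used are not listed here.

```python
def build_prompt(description: str, tail: list[str]) -> str:
    """Join the main description with a comma-separated quality tail."""
    # Skip empty tail entries so the prompt has no stray commas.
    parts = [description.strip()] + [t.strip() for t in tail if t.strip()]
    return ", ".join(parts)


# Hypothetical tail terms, for illustration only.
QUALITY_TAIL = ["photorealistic", "sharp focus", "high detail"]

prompt = build_prompt("smiling cat meme", QUALITY_TAIL)
print(prompt)
```

Keeping the tail in one reusable list makes it easy to apply the same quality terms to every query in a test series and to change them in one place.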

One important feature is worth noting here: MJ generates famous people with ease. There is a whole trend of “what ifs” on the Internet: “What Harry Potter characters would look like if they were cats”, “Post-holiday Harry Potter characters”, “Lord of the Rings characters if they were cats”, “Lord of the Rings in Cyberpunk style”, etc.

It turned out to be a little more complicated with the cat.

Here we definitely had to use a ready-made image as a base and try many variations of the query. All of the cats turned out very cute and cuddly, but very different from the original image.

After many attempts, we still managed to get an interesting result, suitable for our purposes ⬇️

DALL-E

Here we were in for an absolute delight. This network is just made for copyright pirates. Taking the original images as the basis, we got incredibly accurate but altered versions. DALL-E picked up the finest details of the style of the original images and reproduced them faithfully in its variants, while still making significant modifications. In MJ, by contrast, it took us more than one request, and the scene and styling were still very different.

The kitty came out on the first try.

Dwayne came out a little worse, but overall we were pleased with the result.

Creating a Creative with a Complex Scene

The next test task was to create a creative built around a complex scene featuring a person on a given theme. Our work often involves tasks for which suitable stock photographs are hard to find, so we tried to get a photorealistic picture from the following description: tired woman of 40+ sitting in the kitchen, drinking tea and eating cookies, gloomy tones.

Midjourney

Several dozen attempts to compose a proper query yielded some results, but revealed a number of difficulties:

For complex scenes, a reference image is needed to construct the overall scene of the image.

On the other hand, the most acceptable result was obtained using only a text description. With a picture as a base, MJ tries to repeat facial features, pose and everything else. In the end the result is more predictable, as it is more similar to the original image, but the quality of the “human image” suffers greatly.

It is very difficult to generate something in a person’s mouth. We tried different queries and selected references as a basis, but we could not get an acceptable result.

Another big, well-known MJ problem is hands and fingers. The way MJ draws hands has already become something of a meme, and artists on ArtStation protesting the use of AI have even created art on this subject ⬇️

After many attempts, we were unable to get a quality image with the right scene.

Interestingly, if you remove the item about cookies from the query, you get much better results.

From this we can conclude that the less complicated the plot, the better the result you can get.
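One way to check this systematically is to generate simplified variants of the query, each with a single detail removed, and compare the results. The sketch below only builds the text variants; the function name `ablation_variants` and the split of our description into a subject and a list of details are assumptions made for illustration.

```python
def ablation_variants(subject: str, details: list[str]) -> list[str]:
    """Return the full prompt plus one variant per detail with that detail removed."""
    variants = [", ".join([subject] + details)]
    for i in range(len(details)):
        # Keep every detail except the i-th one.
        kept = details[:i] + details[i + 1:]
        variants.append(", ".join([subject] + kept))
    return variants


variants = ablation_variants(
    "tired woman of 40+ sitting in the kitchen",
    ["drinking tea", "eating cookies", "gloomy tones"],
)
for v in variants:
    print(v)
```

The variant without “eating cookies” corresponds to the simplified query described above.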

DALL-E

There is little to say about DALL-E here: the result is similar to MJ’s, and even somewhat worse.

Faces are distorted, and small but important elements present in the query are not generated at all. DALL-E is clearly not good at building complex scenes.

Evaluation of Results and Mobio Group Conclusions

Of course, it is too early to draw serious conclusions from just a few tasks, but our testing has yielded some results. What we learned:

As for the generation of images of famous personalities and memes, we were pleased with the results. Both neural networks were able to cope with this task.

We can clearly see a correlation: the more recognizable the character, the better the result. Neural networks are also great at generating images of cats and dogs. But the more unique the request is, the worse the result.

We were not able to get the desired result when generating an image with a complex scene and a person. Perhaps the problem lies in the complexity of the subject we need. An overly detailed request confuses the artificial intelligence, which badly affects the quality of the result.

The problem with the image of hands has not yet been solved.

It is practically impossible to make the network generate something in a person’s mouth.

Applying MJ in Practice: Mobio Group Conclusions

For the work of the creative department, the ability to create photorealistic images from clear references is a great opportunity to expand creative approaches.
Often creators come up with ideas that are interesting, but time-consuming to produce. It is difficult to find a stock photo, and legal restrictions do not allow using a suitable image copied from the Internet.
The possibilities of MJ and DALL-E simplify the realization of non-standard ideas, although the tools still have limitations.
Olga Mazur — Head of Creative, Mobio Group
