Image Generation in Python with OpenAI DALL-E

In this tutorial, you will learn how to use the OpenAI DALL-E model to generate images from text prompts in Python. DALL-E is a neural network that creates diverse and realistic images from natural language descriptions. The original DALL-E was built on the GPT architecture and trained on a large corpus of text-image pairs. Through the OpenAI API, DALL-E can produce images in several sizes, edit existing images, and create variations of images.

You will perform the following tasks in this tutorial:

Create new images using text prompts.
Edit existing images.
Create variations of existing images.

Installing OpenAI Library

To use the DALL-E model in Python, you need to install the OpenAI Python library and set up your API key. You can get your API key from the OpenAI website after creating an account.

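The scripts in this tutorial use the library’s pre-1.0 interface (openai.Image.create and related functions), which was removed in version 1.0 of the library. The following command installs a compatible version of the OpenAI Python library:

pip install "openai<1.0"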

The following script imports the openai module and sets your OpenAI API key. Simply replace the string with your actual API key.

import openai

openai.api_key = "YOUR_API_KEY_HERE"
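
Hardcoding the key in a script makes it easy to leak through version control. Here is a minimal sketch of an alternative, assuming the key has been exported in an environment variable named OPENAI_API_KEY. Both the variable name and the load_api_key helper are illustrative choices and not part of the OpenAI library.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    The helper and the default variable name are assumptions made for
    this sketch, not something the OpenAI library requires.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With this helper, the assignment above can be written as openai.api_key = load_api_key().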

Generating Images

The OpenAI library provides the Image.create function for creating images from text prompts. This function takes the following parameters:

prompt: The text description of the image you want to generate. This can be anything from a simple noun to a complex sentence.
n: The number of images to generate for the given prompt.
size: The size of the generated images. With DALL-E 2, the API’s default image model, the valid sizes are ‘256x256’, ‘512x512’, and ‘1024x1024’. The sizes ‘1024x1792’ and ‘1792x1024’ apply only to DALL-E 3. The default value is ‘1024x1024’.

The function returns a JSON object that contains a list of image URLs in the data key. We can use these URLs to display the images or download them.
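
To make the shape of that return value concrete, here is a small sketch that pulls the URLs out of a response-shaped dictionary. The sample_response below is a hand-written stand-in with placeholder URLs, not real API output, and extract_urls is a hypothetical helper.

```python
def extract_urls(response):
    """Collect the 'url' field of every entry in response['data']."""
    return [item["url"] for item in response["data"]]


# Hand-written stand-in for an API response; the URLs are placeholders.
sample_response = {
    "created": 1700000000,
    "data": [
        {"url": "https://example.com/image-1.png"},
        {"url": "https://example.com/image-2.png"},
    ],
}

for url in extract_urls(sample_response):
    print(url)
```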

Let’s try an example prompt: “horse jumping a water stream.” The following script will generate two images of size 1024x1024 for this prompt. Though the images were originally made as size 1024x1024, I post-processed them down to 512x512 to reduce the download size.

response = openai.Image.create(
  prompt="A horse jumping a water stream",
  n=2,
  # valid sizes for DALL-E 2: '256x256', '512x512', '1024x1024'
  size="1024x1024"
)

You can iterate through the response['data'] list to print the URL of each generated image. Each URL is valid for only one hour.

for image_url in response['data']:
    print(image_url['url'])
    print("---------------")

Output:

url 1

---------------
url 2

---------------
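
Since the URLs expire after an hour, you may want to save the images locally as soon as they are generated. The following is a minimal sketch that uses only the standard library. The save_image helper and the file names in the usage comment are hypothetical.

```python
import urllib.request
from pathlib import Path


def save_image(url, destination):
    """Download the image at `url` and write it to `destination`."""
    destination = Path(destination)
    # Create the target folder if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as remote:
        destination.write_bytes(remote.read())
    return destination


# Possible usage with the response from the script above:
# for index, item in enumerate(response['data'], start=1):
#     save_image(item['url'], f"horse_{index}.png")
```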

Click the output URLs to view the generated images. In my case, the following images were generated.

Output Image 1:

horse jumping a stream

Output Image 2:

horse jumping a stream

As you can see, the DALL-E model generated two different images of a horse jumping a water stream, with different backgrounds, colors, and perspectives. The images look realistic and detailed, and match the prompt well. Your images will differ from mine because the model produces new output on every request, but they should match the prompt in the same way.

You can also modify the prompt to add more details or constraints, such as the color of the horse or the background. For example, let’s try “A blue horse jumping a water stream in front of white mountains.” We will generate one image of size 1024x1024 for this prompt.

response = openai.Image.create(
  prompt="A blue horse jumping a water stream in front of white mountains",
  n=1,
  # valid sizes for DALL-E 2: '256x256', '512x512', '1024x1024'
  size="1024x1024"
)

Output Image:

horse jumping a stream

The DALL-E model generated an image of a blue horse jumping a water stream in front of white mountains, as specified by the prompt. The image looks surreal and artistic, and demonstrates the creativity and flexibility of the model.

Editing Existing Images

The DALL-E model can also edit existing images by applying text prompts to masked regions of the image. This can be useful for modifying or enhancing images, or creating new compositions.

To edit an existing image, you can use the Image.create_edit function. This function takes a few parameters:

image: The original image that you want to edit. This must be a square PNG file smaller than 4 MB, passed as a file object opened in binary mode.
mask: The mask image that marks the region of the original image you want to edit. The mask must be a PNG file with the same dimensions as the original image, and the region to be edited must consist of fully transparent pixels.
prompt: The text description of the edited image. The description should be for the complete image and not just for the masked region.
n: The number of images to generate for the given prompt and mask.
size: The size of the generated images. The valid sizes are ‘256x256’, ‘512x512’, and ‘1024x1024’. The default value is ‘1024x1024’.

The Image.create_edit function returns a JSON object that contains a list of image URLs in the data key. You can use these URLs to display the images or download them.
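
A request with a mismatched image and mask is rejected by the API, so it can help to check the two files locally first. Below is a rough sketch that reads only the PNG header with the standard library. The read_png_header and check_edit_inputs helpers are hypothetical, and the square-image and 4 MB checks reflect the endpoint limits as I understand them from the API documentation. The check is conservative: it ignores palette transparency (tRNS chunks) and does not verify that the masked region is actually transparent.

```python
import os
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # assumed 4 MB upload limit


def read_png_header(path):
    """Return (width, height, has_alpha) from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a PNG file")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    # Color types 4 (grayscale + alpha) and 6 (RGBA) carry an alpha channel.
    return width, height, color_type in (4, 6)


def check_edit_inputs(image_path, mask_path):
    """Return a list of problems; an empty list means the pair looks usable."""
    problems = []
    img_w, img_h, _ = read_png_header(image_path)
    mask_w, mask_h, mask_has_alpha = read_png_header(mask_path)
    if (img_w, img_h) != (mask_w, mask_h):
        problems.append("image and mask have different dimensions")
    if img_w != img_h:
        problems.append("image is not square")
    if not mask_has_alpha:
        problems.append("mask has no alpha channel for transparent pixels")
    for path in (image_path, mask_path):
        if os.path.getsize(path) >= MAX_BYTES:
            problems.append(f"{path} is 4 MB or larger")
    return problems
```

Calling check_edit_inputs with the paths of the original and masked images before the request returns an empty list when no problem is detected.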

Let’s try an example. We will take the image generated from the blue horse prompt, mask out the horse, and replace it with a lion using the prompt “A lion jumping a water stream.”

Masked Image:

masked horse jumping a stream

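# Open both PNG files in binary mode; the with block closes them after the request.
with open(r"C:\Datasets\original_image.png", "rb") as image_file, \
     open(r"C:\Datasets\masked_image.png", "rb") as mask_file:
    response = openai.Image.create_edit(
        image=image_file,
        mask=mask_file,
        prompt="A lion jumping a water stream",
        n=1,
        size="1024x1024"
    )

# Print the URL of the first (and only) edited image.
image_url = response['data'][0]['url']
print(image_url)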

As shown below, the DALL-E model generated an image of a lion jumping a water stream, replacing the horse in the original image. The image looks realistic and consistent, and it matches the prompt well.

Edited Image:

lion jumping a stream

Creating Image Variations

The DALL-E model can also create variations of existing images by applying random transformations or perturbations to them. This can be useful for generating new images from existing ones, or exploring different possibilities or styles.

To create variations of an existing image, you can use the Image.create_variation function. Pass the original image, a square PNG file, to the image parameter. In addition, you need to pass values for the n and size parameters as before.

Let’s try an example. We will use the same image of a blue horse jumping a water stream and create one variation of size 1024x1024.

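# Open the source PNG in binary mode; the with block closes it after the request.
with open(r"C:\Datasets\original_image.png", "rb") as image_file:
    response = openai.Image.create_variation(
        image=image_file,
        n=1,
        size="1024x1024"
    )

# Print the URL of the first (and only) variation.
image_url = response['data'][0]['url']
print(image_url)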

New Image:

horse jumping a stream - variation

The DALL-E model generated a new image of a blue horse jumping a water stream in front of white mountains, applying some random transformations to the original image. The new image looks similar to the original one, but has some differences in the colors, shapes, and textures.
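
Because the returned URLs are only valid for an hour, it can be convenient to receive the image data directly. As far as I know, the image functions accept response_format="b64_json", in which case each entry in data holds a b64_json field with Base64-encoded image bytes instead of a url. Here is a minimal sketch of decoding such an entry. The save_b64_image helper and the sample entry are hypothetical.

```python
import base64
from pathlib import Path


def save_b64_image(item, destination):
    """Decode item['b64_json'] and write the resulting bytes to `destination`."""
    destination = Path(destination)
    destination.write_bytes(base64.b64decode(item["b64_json"]))
    return destination


# Stand-in entry: a real response carries the Base64 text of a PNG file here.
sample_item = {"b64_json": base64.b64encode(b"not a real image").decode("ascii")}

# Possible usage with a real response:
# save_b64_image(response['data'][0], "variation.png")
```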

Conclusion

In this tutorial, we’ve learned how to use the OpenAI DALL-E model to generate images from text prompts, edit existing images, and create variations of existing images. We have seen that the DALL-E model can produce diverse and realistic images that match the given prompts well. We have also seen that it can create surreal and imaginative images, which demonstrates its creativity and flexibility.

I hope you enjoyed this tutorial and learned something new! Thank you for reading!

The Python Tutorials Blog