Remember when we thought texting with AI was cool? Well, things just got a whole lot cooler. ChatGPT now doesn’t just talk; it paints, sketches, and dazzles with images. Thanks to its new buddy, DALL·E 3!
For those in the back, DALL·E 3 is OpenAI’s state-of-the-art image generation model. It doesn’t just create images; it crafts masterpieces from mere words. Imagine a unicorn tap-dancing on a rainbow. Got it? DALL·E 3 can paint it!
How It Works: Magic? Maybe!
With a sprinkle of neural networks, a dash of algorithms, and a whole lot of computational genius, DALL·E 3 translates textual descriptions into vivid images. It’s like having a personal artist in your pocket.
Be Descriptive, Darling!
In the world of AI, a “prompt” is like the starting line of a race. It’s the nudge, the cue, the “Hey, do this!” to the AI. For DALL·E 3, the prompt is a description of the image you want to conjure. Think of it as placing a very specific order at a magical art café.
Simply saying “Draw a cat” is like asking a chef to “make food.” Of course, you’ll get something, but will it be the tuna tartare you were craving? Probably not. The more detailed and descriptive you are, the better. Now, “Steampunk cat sipping tea on a Sunday afternoon” is a prompt with pizzazz!
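If you like to think in code, here is a minimal sketch of the same idea: a detailed prompt is just a subject with a style, an action, and a setting attached. The `build_prompt` helper and its parameter names are made up for illustration; nothing here is part of ChatGPT or DALL·E 3, it only assembles a string you could paste into the chat.

```python
def build_prompt(subject, style="", action="", setting=""):
    """Join the non-empty pieces of a description into one prompt string.

    Hypothetical helper for illustration only; it does not call any AI service.
    """
    parts = [style, subject, action, setting]
    # Skip pieces that were left empty so we don't end up with double spaces.
    return " ".join(p.strip() for p in parts if p.strip())


# The vague order: you'll get *a* cat, but who knows which one.
vague = build_prompt("cat")

# The specific order: style, subject, action, and setting all spelled out.
detailed = build_prompt(
    "cat",
    style="Steampunk",
    action="sipping tea",
    setting="on a Sunday afternoon",
)

print(vague)
print(detailed)
```

The point of splitting the prompt into named pieces is simply to remind yourself which details you have left out before you hit send.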
Currently, there are three primary resolutions available:
- Wide: 1792×1024 pixels
- Square: 1024×1024 pixels
- Tall: 1024×1792 pixels
If you don’t specify a size, the default is the Square format, which is 1024×1024 pixels. (If you write “portrait of” in your prompt without stating a size in pixels, you will get a 1024×1792 pixel image.)
When making a request or providing a description, you can mention your preferred resolution alongside the image description. For example:
- “Generate a wide image of a serene beach during sunset.”
- “I’d like a tall portrait of a knight in shining armor standing beside a dragon.”
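To make the size rules concrete, here is a toy sketch of how the wording of a request maps onto the three resolutions listed above. It is only a model of the behavior described in this post, not how ChatGPT actually decides; the `SIZES` table and the `pick_size` function are hypothetical names, and the keyword matching is an assumption.

```python
import re

# The three resolutions from the list above, as (width, height) in pixels.
SIZES = {
    "wide": (1792, 1024),
    "square": (1024, 1024),
    "tall": (1024, 1792),
}


def pick_size(prompt):
    """Guess the resolution a prompt is asking for (illustrative assumption)."""
    text = prompt.lower()
    if re.search(r"\bwide\b", text):
        return SIZES["wide"]
    # "portrait of" without an explicit size yields the tall format.
    if re.search(r"\btall\b", text) or "portrait of" in text:
        return SIZES["tall"]
    # No size mentioned: fall back to the Square default.
    return SIZES["square"]


print(pick_size("Generate a wide image of a serene beach during sunset."))
print(pick_size("I'd like a tall portrait of a knight in shining armor."))
print(pick_size("Draw a cat"))
```

In practice the safest habit is the one the sketch skips: state the size you want explicitly rather than hoping a keyword is picked up.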
There are certain limitations and guidelines when generating images using DALL·E 3.
Here’s a rundown:
- Number of Images: You can request up to four images in a single request. Sometimes you get only three; “up to four” is a maximum limit, not a fixed number.
- Public Figures: It cannot generate images of politicians or other public figures. If you ask for one, ChatGPT offers suggestions for alternative ideas instead.
- Artistic Styles: Styles of famous artists whose last work was created over 100 years ago (e.g., Van Gogh, Klimt) can be referenced, but the styles of artists who worked within the last 100 years aren’t directly referenced.
- Specificity: The more specific and descriptive the prompt, the better the resulting image usually is. Vague prompts may produce unexpected or generalized results.
- Bias and Representation: Efforts are made to ensure that depictions are diverse, inclusive, and exploratory. For instance, when generating images with people, there’s a focus on creating diverse scenes in terms of gender, race, and other attributes.
- Rate Limits: There are occasional rate limits due to high demand, which means sometimes there’s a wait time before more images can be generated.
- Detail and Complexity: While DALL·E 3 can generate detailed images, there might be occasional discrepancies or unexpected elements in complex scenes. The technology is still evolving, and while it’s powerful, it might not always capture every nuance of a highly intricate prompt perfectly.
- Privacy: DALL·E 3 doesn’t have the ability to recall personal data of users, ensuring user privacy. Conversations are stateful only for the current session.
- Artistic Interpretation: Just like any artist, DALL·E 3 sometimes brings its own flair to an image, which might not always align 100% with the user’s vision but often results in unique and creative outputs.
- Technical Limitations: The resolutions available are predefined, and while the model can generate a wide variety of images, it operates within its trained parameters.
The Magic Word: Prompt!
Here are two prompt samples used to generate these images:
- Steampunk poster-style photo where a detailed black cat with a grand steampunk hat takes center stage, prominently positioned in the middle of a foggy Victorian train station. Behind the cat, intricately designed vintage trains are parked. Luggage is tidily placed on the platform, clear of the rails. The setting lacks any people. The cat’s intense gaze, combined with the ethereal lighting and the station’s enhanced details, crafts a captivating ambiance.
- Steampunk-themed poster image displaying a striking black cat donning a magnificent steampunk hat, centrally located in a misty Victorian-era train hub. In the backdrop, ornate vintage trains are stationary. Luggage is organized neatly on the platform, separate from the tracks. The scene is free from human presence. The cat’s sharp eyes, the mystical lighting, and the enriched details of the station together produce a mesmerizing effect.
A Picture-Perfect Union… with a 24-Minute Wait!
I collaborated with ChatGPT for these prompts. The process was straightforward, with ChatGPT providing helpful suggestions.
However, I hit a small snag when it unexpectedly generated 1024×1024 pixel images instead of the 1024×1792 ones I expected. I had to ask about the error before ChatGPT acknowledged it. Additionally, with DALL·E 3 being so new and so popular, I ran into a delay, at one point waiting a whole 24 minutes for the next batch of images.
Oh, there’s plenty more to delve into about ChatGPT and DALL·E 3, but let’s save that for a later discussion.