I want you to imagine a cat. Yes. A cat 🐈
An orange, fuzzy cat sitting on the roof of a house in Tokyo, Japan. It's a clear summer night. The cat is staring at the full moon in the sky dreaming about what it's like to be up there. The style is anime, with a light influence of steampunk.
Does it look like this?
Or perhaps like this?
Or maybe you dream in Technicolor?
An idea. A few keystrokes. The image in your head is suddenly on the screen.
It's magic. It's a power to create visual art, now in the hands of those who could not before.
Like me. I peaked as an artist when I was 12:
But I’m no longer limited by my skill. If I can imagine it, and if I can describe it - I can create it. My words gained a new superpower. No longer confined to being letters on a screen, they can now take on a life of their own as visual representations of my imagination.
My words became the most creative tool in my toolbox.
With my words I can imagine a new character and bring them to life, not just in the reader’s imagination, but in their retinas.
With my words I can build a brand new world, and show you what the landscape looks like - verbally and visually.
My words can take my product and place it in different environments - creating a photoshoot without having to touch a camera.
Or my words can make snazzy Zoom backgrounds and custom images for my Substack posts.
Since I got comfortable with generating AI art, my days have been more colorful, more creative, and more fun.
The Agenda
At first, I was overwhelmed. There are hundreds of models, a myriad of competing services, and the internet is full of opinionated people who aren’t exactly known for being reserved.
So, to the person who would rather create than read, I offer a straightforward map to AI art.
In later posts, we’ll talk about prompting, editing, automation and philosophy.
But today, we start with tools.
The Top AI Art Tools
According to … me
When I’m trying to hang a painting, sometimes I’ll reach for my DeWalt drill, and sometimes I’ll reach for my screwdriver. Most of the time, either will do the job.
The writing below is a toolbox. A beginner toolset from Ikea. I use these tools regularly. But I did choose to leave certain ones out. Some, I’m not as familiar with. Others are too hard to set up and have to be run on your local machine. And others are too specialized.
But I can guarantee that you have everything you need here to hang a painting or put together a coffee table. We’ll build custom furniture in future posts.
GPT-4o: The Corporate Multitasker
GPT-4o is the brand new kid on the block. It just moved into a new condo in a sleek high-rise with that “modern home” feel. The apartment is clean, efficient, and streamlined, but lacks any real personality.
This corporate multitasker is a replacement for its older sibling, DALL-E. DALL-E was a separate image generation model made by OpenAI, and while it still exists, it will likely go away soon.
GPT-4o is extremely productive, and treats image generation as just another thing you can do during conversation. It's able to go back and forth and refine the image in the flow of conversation, like an extremely patient design intern who never gets annoyed with your indecisiveness.
For example:
I asked it to generate an orange cat on the roof in Tokyo.
Then I asked it to put a top hat on the cat.
Then, I asked it to add a street sign to the roof where our cat is sitting.
These are almost the exact same image. The only differences are that the cat wears a hat in the second one, and there’s a sign in the third.
There are two key differences between GPT-4o and every other model on the market:
It allows for back-and-forth image adjustments. This is the only platform that allows this as of the date of publication.
GPT-4o is extremely good at writing text, which we can see from the new sign.
Just like everything else OpenAI makes, GPT-4o feels a bit sterile and buttoned down. But it’s only devoid of personality by default. You can infuse it with personality via prompting, and by building up context through conversation. Just don’t expect it to surprise or delight you on its own.
Pros:
✅ Creates text in images you can actually read.
✅ Already included in your ChatGPT subscription.
✅ Refines images through natural conversation.
✅ Can generate detailed infographics and visualizations.
Cons:
❌ Images look corporate and sterile by default.
❌ Takes its sweet time generating (up to a minute per image).
❌ Editing and remixing of images is only possible by prompting.
Pricing:
💰 Limited on the free plan.
💰 Expanded limits on the $20/mo ChatGPT plan, no limits on the $200/mo plan.
💰 TLDR: $20/mo plan is fine for me.
Midjourney: The Hipster Artist
Midjourney is the artsy one. The hipster lives in a sprawling Manhattan loft, with half-finished masterpieces covering every possible surface. There’s a can of paint hanging from the ceiling over a deliberately punctured canvas, occupying the space where most people would have a coffee table. It’s impossible to tell how he can afford to live there (trust fund perhaps?). But the art is so captivating, the economics aren’t worth caring about.
Midjourney is too cool to integrate via API. Everything is artisanal and bespoke, tailored to your style and artistic preferences. You can feed it a combination of reference images (for character and style inspiration), text prompts, and a collection of Midjourney-specific parameters to get the artistry you’re looking for.
For example, say I like this rose:
If I want to make an image of a cat in the same style, I can provide this image to Midjourney as a style reference along with my prompt. I can also pass “--sw 100” along with my prompt to control how heavily (0 - 1000) my reference image influences the result.
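If you like seeing this spelled out, here is a minimal sketch in Python of how that prompt string fits together. It's only a string builder, not an official Midjourney integration (there isn't one). The function name and the example.com URL are placeholders I made up, and I'm assuming `--sref` is the flag that carries the style reference image, alongside the `--sw` weight.

```python
def build_midjourney_prompt(text: str, style_ref_url: str, style_weight: int = 100) -> str:
    """Assemble a prompt string with a style reference and a style weight.

    Hypothetical helper for illustration only. It assumes `--sref` takes the
    reference image URL and `--sw` takes a weight between 0 and 1000.
    """
    if not 0 <= style_weight <= 1000:
        raise ValueError("style weight must be between 0 and 1000")
    # Parameters go at the end of the prompt, after the descriptive text.
    return f"{text.strip()} --sref {style_ref_url} --sw {style_weight}"


prompt = build_midjourney_prompt(
    "An orange, fuzzy cat sitting on the roof of a house in Tokyo",
    "https://example.com/rose.png",  # placeholder URL standing in for the rose image
    style_weight=100,
)
print(prompt)
```

The printed line is what you would paste into Midjourney's prompt box yourself. Raising or lowering the weight is the knob to turn when the reference image is influencing the result too little or too much.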
The result is a cat image with the same feel as the rose:
Midjourney is extraordinarily powerful. But it’s easy to get sucked in, and realize that you just spent an entire evening crafting a Zoom background (I speak from experience).
Pros:
✅ Very powerful, especially for truly making art or creating something unique.
✅ A wide variety of levers to pull for precise control.
✅ Personalization allows Midjourney to get to know what you like over time and match your style.
✅ Powerful editing and remixing interface, letting you replace or regenerate parts of the image.
Cons:
❌ No “official” API integrations. You want an image, you come to the loft.
❌ Prompt parameters are very powerful, but come with a learning curve.
❌ Requires an additional subscription (on top of your other AI tools).
Pricing:
💰 No free plan.
💰 TLDR: I paid $96 for the year, and it’s been enough.
Ideogram: The Tech Savvy Designer
Ideogram is the tech-savvy designer living in a minimalist San Francisco apartment. The place is meticulously organized with modern, sleek furniture and font-art on the walls. There’s a pristine workstation featuring an iMac that’s displaying a design mockup. No need to guess economics here - she’s employed by a top AI startup, which is how she pays her SF rent bill.
Ideogram is deeply practical, but still creative. It can read your handwriting when you sketch ideas on a napkin, and turn it into something presentable. Like GPT-4o, it’s good at text generation (but GPT-4o is better). With its “magic prompt” feature, you can mash a prompt into the keyboard, and get a professional result back.
For example, my prompt:
“An orange, fuzzy cat sitting on the roof of a house in Tokyo, Japan. It's a clear summer night. The cat is staring at the full moon in the sky dreaming about what it's like to be up there. The style is anime, with a light influence of steampunk.”
Turns into a Magic Prompt:
“An anime-style illustration with a light steampunk influence, featuring an orange, fuzzy cat sitting on the roof of a house in Tokyo, Japan. The cat is staring intently at a full moon in the clear summer night sky, its eyes reflecting the moon's glow, and its tail curled neatly around it. The rooftop is adorned with steampunk elements like gears and pipes, while the background showcases the city's neon lights and traditional architecture, creating a magical and dreamy atmosphere.”
This prompt then turns into:
Ideogram is the one you call to help you make the blog post illustration. You ask it for the cool Insta post. Don’t expect mind-blowing art from Ideogram. Its strength lies in the ability to create images for publishing online.
Pros:
✅ Excels at creating images with readable, coherent text.
✅ Really tuned for making the type of images you would post on Instagram, in blog posts, and in other online publications.
✅ Delivers consistently usable results without a lot of complicated prompting.
✅ Powerful editing and remixing interface, letting you replace or regenerate parts of the image.
Cons:
❌ Less “artsy” style than Midjourney.
❌ A separate subscription outside of other AI tools.
❌ Fewer advanced customization options for control freaks.
Pricing:
💰 No free plan.
💰 TLDR: I got the $7/mo plan, and it’s been fine.
Canva Magic Media: The Cheery Roommate
Canva's AI art component - Magic Media - is the cheerful roommate who lives in the spare room of Canva's well-decorated apartment. It approaches image generation like a personal shopper - delivering social media graphics in a pinch, presentation visuals that match your brand colors, and quick newsletter headers that integrate seamlessly into your existing designs.
Magic Media's strength comes from its deep integration with the Canva platform. If you're using Canva as a small business with a set of brand colors, Magic Media will follow those styles in the images it generates. When stock photos fail you, Magic Media is there to help create exactly what you need.
The results won’t win any awards. But they’ll look good on LinkedIn or your newsletter. Magic Media knows exactly what it is - a practical tool for everyday needs. And it’s very content in that playground.
Pros:
✅ Seamlessly integrates with Canva's existing design platform, letting images fit directly into designs and utilize brand kits.
✅ Creates usable results even with simple prompts.
✅ No separate subscription needed if you already use Canva Pro.
Cons:
❌ Limited creative range compared to dedicated AI image platforms.
❌ Fewer customization options and prompt parameters.
❌ Image remixing / editing functionality exists, but only as part of the overall Canva platform.
Pricing:
💰 Pretty solid free plan, you can do most things.
💰 TLDR: I’m on the free plan. But if I run out of credits, the $120/yr Pro plan would be worth it to me with everything Canva can do.
Mix and Match
I don’t have a perfect rule for when to use which tool. But by giving these tools personalities, I am able to think of who I want to work with on a creative task.
Sometimes, I’ll give the Hipster Artist a call, use the Tech Savvy Designer to make some edits to the output, and stitch together several images with the Cheery Roommate.
There’s no right way. There’s the way that feels right to you and produces the result you want. And the only way to know what feels right is to dive in.
Whenever I use these tools, I get a rush of creativity and a flood of ideas into my brain. This technology opened the door in me to something I didn’t know was there. I found a visual imagination that was previously trapped by limited artistic skills.
What was once inaccessible to me is now at my fingertips. And I freaking love it.
My Challenge to You
So here’s my challenge to you:
Pick a tool.
Make something every day for a week.
Comment on this post with your creations.
You will find a joy in creativity - a familiar feeling that many of us haven’t had since childhood. A feeling of openness, possibility, a world where we don’t yet know where the limits are.
After a week, try another tool. See what’s different. Try to make them work together. Eventually, you’ll build up a sense of what works, and start to play with your own ideas.
In the meantime, if you’re really struggling for prompts:
Imagine an orange, fuzzy cat sitting on the roof of a house in Tokyo, Japan. It's a clear summer night. The cat is staring at the full moon in the sky dreaming about what it's like to be up there. The style is anime, with a light influence of steampunk.
🎨 Let the Tech Savvy Designer create it. 🎨
🎭 Have the Hipster Artist style it. 🎭
🔧 Tweak it with the Corporate Multitasker. 🔧
🖌️ And share it with the Cheery Roommate. 🖌️
A tremendous thank you for the help and patience of:
, , , , , , , , . This piece has been 3 months in the making and I’m pretty sure they’re all tired of me talking about it.