Closer

Closer #20

GPT-5 is out and underwhelming. Our six-week mini-experiment has come to an end. There are only five more regular editions of this newsletter before we wrap up with some preliminary conclusions in edition #25.

Mitchel Lensink

Aug 20, 2025

Welcome to Closer,

A weekly experiment in AI interpretation: I share a photo, ask AI to describe it, then generate an image from that description. This is edition #020. Prompts are in grey boxes. Descriptions are in italics.

Now that our unfruitful mini-experiment has come to an end, we can conclude that we shouldn't ask ChatGPT to write the prompts we then use to describe input images. Unless you want underwhelming results.

Also, yes, you read that correctly: I've now definitively decided to hit pause on this newsletter in five weeks. We've gathered sufficient sample data on the state of AI image generation in 2025. Significant advancements have to be made in the field before we pick this back up.

Please stay subscribed if you want; I won't message you in the meantime. The moment something interesting happens with the technology that has an impact on photographers, writers, and artists in general, you'll hear from me again.

Everything I make is self-funded — driven by curiosity, shared to add a little good to the world. Want to support that?

You can:
- purchase a print
- buy a book
- become a member

Thanks so much in advance.

LLM models used

GPT-5 for both image interpretation and image generation.

The input

Ivar Blaauw, who has contributed to this newsletter before, shared an interesting image-generation approach with me that he came across on social media. If you haven't unsubscribed these past six weeks over the lack of 'hot-take tech' or generally interesting results, perhaps this hyper-technical edition will do it! 😅 (plz don't though)

I recommend you watch the video linked above to get a quick understanding of the concept. If you don't wanna (which I understand), let's first just 'do the thing,' and then I'll explain what is happening here.

✍️

provide me with an advanced JSON context profile

The output

{
  "scene": {
    "location_type": "European town square",
    "environment": {
      "setting": "outdoor",
      "ground": "cobblestone street",
      "vegetation": ["large trees with green foliage", "hanging flower baskets"],
      "lighting": "soft daylight, overcast"
    },
    "architecture": {
      "buildings": "historic European style, white facades",
      "outdoor_seating": true,
      "cafes_restaurants": true
    }
  },
  "subjects": {
    "foreground": {
      "main_couple": {
        "approximate_age": "elderly",
        "clothing_style": "elegant, classic",
        "man": {
          "outfit": "white shirt, brown suspenders, beige trousers",
          "accessory": "dark leather briefcase"
        },
        "woman": {
          "outfit": "white blouse with tie-back, beige trousers, white sneakers",
          "accessory": "small handbag"
        },
        "gesture": "walking together, holding hands"
      }
    },
    "midground": [
      {
        "group": "families, children, stroller",
        "activity": "casual walking, socializing"
      },
      {
        "dog": {
          "breed_type": "small companion dog",
          "action": "walking alongside owner"
        }
      }
    ],
    "background": {
      "people": "seated at outdoor café tables",
      "activity": "dining, talking, socializing"
    }
  },
  "objects": {
    "vehicle": {
      "type": "delivery truck",
      "branding": "pharma logistics",
      "license_plate_visible": true
    },
    "street_elements": [
      "lamp posts with hanging flowers",
      "no parking sign",
      "restaurant signage"
    ],
    "furniture": ["tables", "chairs", "strollers"]
  },
  "atmosphere": {
    "mood": "leisurely, social, relaxed",
    "tone": "warm, nostalgic, communal"
  },
  "possible_metadata": {
    "image_type": "documentary / street photography",
    "time_period": "contemporary",
    "season": "summer (based on light clothing)",
    "location_guess": "Western Europe (likely Netherlands or Belgium)"
  }
}

A piece of JSON code that's generated based on the input image.

What we did here is break the photo down into simple building blocks, almost like a recipe card for the scene. Instead of a flowing story, it lists:

the where (a town square),
the who (an elderly couple, families, a dog),
the what (a truck, flowers, café tables),
and the vibe (relaxed, social, summery).

This format is called JSON in the tech world, but you can think of it as a neat way to organize messy reality into clear categories. It’s a way of describing a moment so that both people and computers can make sense of it.
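For the curious, here is a minimal sketch of what "computers can make sense of it" might look like in practice. It uses Python's built-in json module to read a trimmed copy of the profile and sort it back into the where, who, what and vibe. The summarize helper and the four labels are my own illustration and an assumption, not something GPT-5 produced or part of the approach from the video.

```python
import json

# A trimmed copy of the profile; keys and values are taken from the JSON output.
profile_text = """
{
  "scene": {"location_type": "European town square"},
  "subjects": {
    "foreground": {"main_couple": {"approximate_age": "elderly"}},
    "midground": [
      {"group": "families, children, stroller"},
      {"dog": {"breed_type": "small companion dog"}}
    ]
  },
  "objects": {
    "vehicle": {"type": "delivery truck"},
    "furniture": ["tables", "chairs", "strollers"]
  },
  "atmosphere": {"mood": "leisurely, social, relaxed"}
}
"""

# json.loads turns the text into ordinary nested dictionaries and lists.
profile = json.loads(profile_text)


def summarize(profile):
    """Hypothetical helper: collapse the profile into four simple buckets."""
    subjects = profile["subjects"]
    objects = profile["objects"]
    return {
        "where": profile["scene"]["location_type"],
        "who": [
            subjects["foreground"]["main_couple"]["approximate_age"] + " couple",
            subjects["midground"][0]["group"],
            subjects["midground"][1]["dog"]["breed_type"],
        ],
        "what": [objects["vehicle"]["type"]] + objects["furniture"],
        "vibe": profile["atmosphere"]["mood"],
    }


summary = summarize(profile)
for label, value in summary.items():
    print(f"{label}: {value}")
```

Because every detail sits under a named key, a program can look up "the vibe" directly instead of having to interpret a paragraph of prose.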

✍️

Generate an image with an aspect ratio of 3:2 based on this description: {the description previously generated}
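In this newsletter the description is pasted into the prompt by hand. If you wanted to automate that step, a rough sketch could look like the one below. The build_prompt function and the tiny example profile are assumptions for illustration only, and the sketch stops at building the prompt text; it does not call any image-generation service.

```python
import json

# The prompt from the grey box, with its placeholder kept verbatim.
TEMPLATE = (
    "Generate an image with an aspect ratio of 3:2 "
    "based on this description: {the description previously generated}"
)
PLACEHOLDER = "{the description previously generated}"


def build_prompt(profile: dict) -> str:
    """Hypothetical helper: drop a JSON profile into the prompt template."""
    description = json.dumps(profile, indent=2, ensure_ascii=False)
    # A plain replace avoids clashes between the JSON braces and format fields.
    return TEMPLATE.replace(PLACEHOLDER, description)


# A tiny stand-in for the full profile, using values from the output above.
profile = {
    "atmosphere": {
        "mood": "leisurely, social, relaxed",
        "tone": "warm, nostalgic, communal",
    }
}

prompt = build_prompt(profile)
print(prompt)
```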
