
The Semantic Camera

People keep asking whether AI can make art. This is the wrong question. It was the wrong question when they asked it about photography 150 years ago and it is the wrong question now.

In 1983, Vilém Flusser wrote Towards a Philosophy of Photography. He proposed that the camera is not a tool. A hammer is a tool. A tool extends the human body to act on the world. The camera is something else: an apparatus, a black box that operates according to its own internal program, a combinatorial space of possible outputs that the human operator (Flusser called them functionaries) can explore but never fully exhaust. The photographer does not use the camera the way a carpenter uses a hammer. The photographer plays within the camera's program.

An LLM is a camera. Not metaphorically. Structurally. "Camera" here names a class of apparatus: a black box with a program that turns an operator's framing into a technical artifact.

The camera takes photons from the world, passes them through glass and chemistry, and on the other end produces a technical image: the apparatus's interpretation of reality filtered through its program. The LLM takes language from a human who is in the world, passes it through billions of weighted connections shaped by training, and on the other end produces a technical output: the apparatus's interpretation of that reality filtered through its own program.

  • Camera: reality → light → optics → technical image

  • LLM: reality → language → weights → technical output

Photography starts with contact: light touching a sensor. The semantic camera starts one abstraction layer deeper, with language already compressing reality before the program even begins. The black box has even more room to launder its output into "truth."

In both cases, the human is the one who points the apparatus at something. The photographer points the camera at a street scene. The prompter points the model at a concept, a feeling, a question, a fragment of lived experience. The human points. The apparatus encodes. And in both cases, what comes out the other end is not reality. It is reality as encoded by the apparatus's program.

Flusser kept insisting: people mistake photographs for windows onto the world, but they are products of the black box. The same mistake happens with LLMs. People read the output as if it were "the answer" or "the truth." It is the model's programmatic transformation of the input. It is a technical text. It looks like language the way a photograph looks like reality, but it is a product of the apparatus.

Different models are different cameras. Text models and image models are different species of the same genus: generative apparatuses with distinct programs. DALL·E, Midjourney, Stable Diffusion, Claude, GPT, Gemini: each has its own optics, its own program, its own way of sampling and processing reality. A portrait taken with a Leica looks different from one taken with a Canon, not because reality changed but because the apparatus interprets differently. An image from Midjourney looks different from one from DALL·E, not because the prompt changed but because the weights, the architecture, the training corpus constitute a different lens.

The question was never "can the camera make art." The question was always "what is the human doing with the apparatus, and does it resist the program's tendency toward redundancy." By redundancy I mean the outputs the program wants to produce: the high-probability defaults, the almost-duplicates, the auto mode of meaning.

Same question. New camera.


Notes from Inside the Flood, 2026

References: