What defines a mixed-use space?
How do we visually recognise the coexistence of residential, commercial, and public spaces in a city?
Can AI perceive these complex urban conditions as we do?
This workshop explores how Vision-Language Models (VLMs) interact with urban imagery, using street view images to assess land-use patterns. By comparing Barcelona and Montreal, we examine how mixed-use elements appear, overlap, and influence urban classification.

Identifying Mixed-Use Space

What are the key indicators of mixed-use environments?
- Storefronts & signage → Commercial activity
- Apartment balconies → Residential presence
- Public plazas & open spaces → Accessibility
- Streetlights, seating & pedestrian areas → Infrastructure for social interaction

Ranking Mixed-Use Potential Perceived from Images
Different groups assigned varying scores based on perceived mixed-use density, each working with a set of instructions for ranking the images. But what made an image rank higher or lower?
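
One way to make a ranking like this explicit is to write the rubric down as code. The sketch below is only an illustration, not the procedure any group followed: it assumes each image has already been annotated (by a person or a VLM) with the indicator categories listed earlier, and it scores an image by how many distinct categories are visible. The names `ImageObservation`, `mixed_use_score`, and `rank_images`, the category labels, and the example image names are all hypothetical.

```python
from dataclasses import dataclass

# Indicator categories taken from the list of key indicators above.
# Treating them as equally weighted is an assumption of this sketch.
INDICATORS = frozenset({
    "commercial",             # storefronts and signage
    "residential",            # apartment balconies
    "public_space",           # plazas and open spaces
    "social_infrastructure",  # streetlights, seating, pedestrian areas
})


@dataclass(frozen=True)
class ImageObservation:
    """Hypothetical record of which indicators were seen in one image."""
    name: str
    present: frozenset


def mixed_use_score(obs: ImageObservation) -> int:
    """Count the distinct known indicator categories visible in the image."""
    return len(obs.present & INDICATORS)


def rank_images(observations):
    """Order images from highest to lowest score; ties keep input order."""
    return sorted(observations, key=mixed_use_score, reverse=True)


# Placeholder annotations, not workshop data.
observations = [
    ImageObservation("image_a", frozenset({"residential"})),
    ImageObservation("image_b", frozenset({"commercial", "residential", "public_space"})),
    ImageObservation("image_c", frozenset({"commercial", "social_infrastructure"})),
]

for obs in rank_images(observations):
    print(obs.name, mixed_use_score(obs))
```

A count like this captures only which elements are visible. It says nothing about how intensely a space is used or what regulations apply, which is exactly where the groups' judgments tended to diverge.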




Final Reflections: The Limits of Visual Perception
Through this exercise, we critically engaged with how AI might interpret mixed-use spaces by stepping into its role ourselves. The process revealed that ranking urban images requires more than just recognizing physical elements—it demands an understanding of social, spatial, and regulatory contexts.
While visual patterns provide cues, the true complexity of mixed-use spaces lies in their lived experience. AI can assist in identifying elements, but human expertise remains essential in defining what truly makes a space ‘mixed-use.’