A Real-Time Audio-Visual Stone Symphony
Rock the Rock is an interactive audio-visual installation that identifies and tracks rocks in real time, generating dynamic sound and projection overlays. By leveraging computer vision and finite state machines, it transforms geological forms into a sensory experience.
Concept and Context
This project served as our introduction to Finite State Machines (FSMs) and was developed around a camera-projector feedback setup. We chose to explore this hardware scenario through an immersive installation, using YOLO-based computer vision to track rocks and trigger visual and audio responses.
What Is a Finite State Machine?
A Finite State Machine is a computational model consisting of:
- A finite number of states
- Events that trigger transitions
- Actions associated with each state
It defines how a system transitions between states in response to inputs. In our case, the FSM controlled the behavior of the detection, audio, and projection systems to ensure smooth and coordinated interaction.
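As a concrete illustration, the sketch below expresses this idea in a few lines of Python. The state names, events, and actions are illustrative, not the exact ones used in the installation.

```python
# A minimal finite state machine sketch: states, events that trigger
# transitions, and actions run when a state is entered.

class StateMachine:
    def __init__(self, initial, transitions, actions=None):
        self.state = initial                # current state
        self.transitions = transitions      # (state, event) -> next state
        self.actions = actions or {}        # state -> callable run on entry

    def handle(self, event):
        key = (self.state, event)
        if key in self.transitions:         # ignore events with no transition
            self.state = self.transitions[key]
            action = self.actions.get(self.state)
            if action:
                action()
        return self.state


# Example: an idle/tracking loop driven by detection events.
fsm = StateMachine(
    initial="idle",
    transitions={
        ("idle", "rock_detected"): "tracking",
        ("tracking", "rock_lost"): "idle",
    },
    actions={
        "tracking": lambda: print("start projection + sound"),
        "idle": lambda: print("fade out"),
    },
)

fsm.handle("rock_detected")   # -> "tracking"
```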

Vision and Projection Setup
At the heart of the system is a feedback loop between camera-based detection and projector-based output. The camera captures the scene, detects rocks, and sends this data to a projection engine. The output—both visual and auditory—is mapped onto the physical location of the rocks, enabling a responsive and tactile interaction.
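The write-up does not specify how camera coordinates were aligned with projector space, but a common approach is a planar homography computed from four calibration correspondences. The sketch below assumes OpenCV and uses placeholder corner points.

```python
# Sketch of mapping detected rock positions from camera pixels into projector
# pixels via a homography. The four corner correspondences are placeholders;
# in practice they would come from a calibration step.
import cv2
import numpy as np

camera_pts = np.float32([[0, 0], [1280, 0], [1280, 720], [0, 720]])
projector_pts = np.float32([[35, 20], [1885, 28], [1870, 1058], [40, 1045]])

H = cv2.getPerspectiveTransform(camera_pts, projector_pts)

def to_projector(x, y):
    """Map one camera-space point (e.g. a rock centroid) into projector space."""
    pt = cv2.perspectiveTransform(np.float32([[[x, y]]]), H)
    return float(pt[0, 0, 0]), float(pt[0, 0, 1])

print(to_projector(640, 360))  # centre of the camera frame
```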

Rock Classification and Training
Using a pre-existing dataset on Roboflow, we trained a YOLO model to classify rocks into three categories:
- Igneous
- Metamorphic
- Sedimentary
These classifications served as core input types for the FSM, allowing for distinct behavior based on geological identity.
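A minimal inference sketch follows, assuming the Ultralytics YOLO API, a hypothetical weights file rocks.pt, and an illustrative class order and confidence threshold.

```python
# Sketch of classifying rocks in a camera frame with a trained YOLO model.
# The model path, class order, and threshold are assumptions.
import cv2
from ultralytics import YOLO

model = YOLO("rocks.pt")          # weights trained on the Roboflow rock dataset
CLASSES = ["igneous", "metamorphic", "sedimentary"]

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    results = model(frame, conf=0.5)[0]
    for box in results.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        label = CLASSES[int(box.cls[0])]
        area = (x2 - x1) * (y2 - y1)
        print(label, (x1 + x2) / 2, (y1 + y2) / 2, area)
cap.release()
```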


System Iterations
First System: Distributed Architecture
In our initial setup, the system was split across two machines:
- Device 1 (Python): Handled detection, FSM logic, and data packaging
- Device 2 (TouchDesigner + Ableton): Received data and generated visuals and sound
Data was captured with OpenCV + YOLO, packaged as JSON (coordinates, area, rock type), and sent over a UDP socket to the second machine. Splitting the work this way kept the detection pipeline and the audio-visual rendering from competing for the same resources, so each machine could run at full rate.
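A minimal sketch of the sender side, with a placeholder IP address, port, and field names:

```python
# Sketch of packaging one detection and sending it to the second machine over
# UDP. The target address and JSON keys are placeholders for the actual setup.
import json
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
TARGET = ("192.168.0.12", 7000)   # TouchDesigner / Ableton machine (placeholder)

def send_detection(rock_type, cx, cy, area):
    payload = {
        "type": rock_type,        # "igneous" | "metamorphic" | "sedimentary"
        "x": cx,
        "y": cy,
        "area": area,
    }
    sock.sendto(json.dumps(payload).encode("utf-8"), TARGET)

send_detection("igneous", 412.0, 288.5, 5120.0)
```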


FSM 1: Modular but Complex
The first FSM defined distinct states for detection, projection, and sound playback. Each task was isolated into a controlled state, reducing interference and increasing clarity. However, the complexity introduced more potential points of failure and made debugging more difficult.
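The exact states are not listed here, so the transition table below is only a sketch of how such a modular machine might cycle through detection, projection, and sound playback, with explicit error exits from each state.

```python
# Sketch of the more modular FSM: each task gets its own state. The state and
# event names are illustrative; the actual machine had more transitions.
TRANSITIONS = {
    ("idle",          "rock_detected"): "detecting",
    ("detecting",     "data_sent"):     "projecting",
    ("projecting",    "visuals_ready"): "playing_sound",
    ("playing_sound", "cycle_done"):    "idle",
    # every state also needs an error exit, which is where debugging got hard
    ("detecting",     "error"):         "idle",
    ("projecting",    "error"):         "idle",
    ("playing_sound", "error"):         "idle",
}

def step(state, event):
    return TRANSITIONS.get((state, event), state)

state = "idle"
for event in ["rock_detected", "data_sent", "visuals_ready", "cycle_done"]:
    state = step(state, event)
    print(event, "->", state)
```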


Second System: Centralized Architecture
In the refined version, we consolidated all logic and output into a single machine using Grasshopper as the central platform:
- A Python script captured YOLO vision data and sent it via UDP
- Grasshopper received, interpreted, and used this data to drive both audio and visual output natively
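In the installation, the receiving logic lived inside a Grasshopper Python component; the standalone sketch below shows the equivalent flow in plain Python, with a placeholder port and field names.

```python
# Sketch of the receive side: listen for the JSON packets and unpack the
# fields that drive the audio and visual parameters.
import json
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 7000))

while True:
    data, _ = sock.recvfrom(1024)
    msg = json.loads(data.decode("utf-8"))
    rock_type = msg["type"]
    x, y, area = msg["x"], msg["y"], msg["area"]
    print("drive visuals/sound with:", rock_type, x, y, area)
```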

FSM 2: Lean and Reliable
This FSM had a minimal cycle:
- Start detection and output
- Stop on command or error
Fewer states meant fewer transitions and less to go wrong, which made the system more reliable and quicker to get running within a short development timeline.
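For contrast with the first machine, the whole cycle fits in a two-state table (names illustrative):

```python
# The lean FSM reduces to two states and three transitions.
TRANSITIONS = {
    ("stopped", "start"): "running",   # start detection and output
    ("running", "stop"):  "stopped",   # stop on command
    ("running", "error"): "stopped",   # ...or on error
}

state = "stopped"
state = TRANSITIONS.get((state, "start"), state)   # -> "running"
print(state)
```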


Reflection
While the second system performed more reliably, this was largely due to time constraints rather than inherent superiority. The first system, though more complex, offered greater modularity, more room to scale performance, and the potential to distribute computation across multiple machines and locations.
With more development time, the distributed approach could evolve into a powerful framework for high-resolution, real-time data interpretation and feedback—especially for installations requiring multiple sensory outputs or remote access.
Closing Thoughts
Rock the Rock demonstrates the expressive power of structured systems like FSMs when paired with real-time data inputs and multi-sensory outputs. Whether centralized for rapid prototyping or distributed for complex installations, these systems open a path for responsive, computational design grounded in the physical world.