A System for 3D Visualization of Linguistic Structures

So, what is a room?

Beyond being a mere term, a room is a spatial entity with defined dimensions, enclosed by walls, and interconnected with its surrounding spaces. Understanding a room in this way requires more than just recognizing the word; it involves comprehending its topological and spatial context.

Exploratory – Framework

Our study seeks common ground between language, its geometric representation, and its topological aspects, with the goal of creating a geometric language that enhances 3D prompting. We then examine methodologies for mapping text elements into 3D space.

Research – Framework

At the start of our research, we explored text-to-image systems as an initial approach to spatial mapping. We found that, despite highly specific instructions, these systems consistently struggled to interpret spatial commands accurately.

3D CAD model generation from natural language:

Strengths:

  • Excels at translating prompts into 3D models.
  • Allows for precise customization and rapid prototyping, particularly for architectural designs.

Limitations:

  • May struggle with accurately depicting spatial relationships in complex designs.

Text-to-function call using GPT-4

Strengths:

  • Enhanced Accessibility: Makes advanced design tools usable without extensive technical skills.
  • Intuitive Interaction: Enables natural language commands for easier software manipulation.
  • Real-Time Feedback: Provides immediate updates and adjustments within the Rhino viewport.

Limitations:

  • The system’s effectiveness is contingent upon the accuracy of GPT-4 in generating correct live function calls, which may impact the reliability of the results.

Methodology – Introduction

Prior to developing the pipeline, we experimented with word mapping to gain insight into how a geometric language could be developed. We first identified geometric associations, then the transformations related to them, and finally topological associations. This exercise was limited to the context of architecture.

Classification and Hierarchy

Since one of our goals was to develop a language capable of understanding topological relationships, we integrated TopologicPy to guide the methodological aspect of our research. This approach provided valuable insights into the hierarchical relationships between architectural elements.

We started with a limited number of architectural elements, classified them according to topologic hierarchical categories, and then studied how they relate to each other. We found two main kinds of relationship: parent-child and adjacency.

Once the classification process is complete, we construct a topology dataframe, which organizes the architectural elements and their relationships. This dataframe is structured with rows and columns, as illustrated in this figure. The key components of the dataframe are as follows:

  • Rows: Each row represents an individual architectural element (word) from the dataset.
  • Columns: The columns capture the various topological and geometric relationships associated with each element.
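As a minimal sketch of such a table, the rows and columns could be held in plain Python as shown here. The column names (category, parent, adjacent_to), the example elements, and their assigned categories are illustrative assumptions, not the project's actual dataset.

```python
# Hypothetical topology table: one row per architectural element (word).
# Column names and values are assumptions made for illustration only.
TOPOLOGY_ROWS = [
    {"element": "building", "category": "cellcomplex", "parent": None, "adjacent_to": []},
    {"element": "room", "category": "cell", "parent": "building", "adjacent_to": ["room"]},
    {"element": "wall", "category": "face", "parent": "room", "adjacent_to": ["wall", "floor"]},
    {"element": "floor", "category": "face", "parent": "room", "adjacent_to": ["wall"]},
    {"element": "door", "category": "aperture", "parent": "wall", "adjacent_to": []},
]


def row_for(element):
    """Return the row describing an element, or None if it is unknown."""
    for row in TOPOLOGY_ROWS:
        if row["element"] == element:
            return row
    return None


def children_of(element):
    """List the elements whose parent is the given element (parent-child relation)."""
    return [row["element"] for row in TOPOLOGY_ROWS if row["parent"] == element]


def ancestry(element):
    """Walk up the parent column and return the chain of ancestors, nearest first."""
    chain = []
    row = row_for(element)
    while row is not None and row["parent"] is not None:
        chain.append(row["parent"])
        row = row_for(row["parent"])
    return chain
```

Keeping the parent in one column and adjacency in another mirrors the two kinds of relationship described above, so each can be queried independently.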

Methodology – Workflow

We identified five main steps. Starting from a prompt, we look for ways to process its instructions so that they can be categorized and mapped. Once this is achieved, the spatial instructions are processed by our topologic pseudocode to produce a geometric response.

File Structure

This is how we organized our file structure: a main tokenization script processes prompts, and a second script holds the remaining functions. These range from prompt processing and prompt classification to geometry creation, addition, and transformation.

This is how the files relate to the workflow in a linear way. We envision this as the back end of a web app where the user can enter a prompt and later visualize the result.

Methodology – Part I

First we delve into prompt processing. The words of every prompt are classified, and associations are found for each word. From every prompt we identify commands, objects, and parameters, which later serve as the basis for a geometric response.

Commands are mapped according to their role into three categories: creation, addition, and transformation. Objects are categorized by the topological category they belong to, and parameters are classified according to their function: unit, angle, direction, etc. For example, "create a room" is mapped as "create a cell".
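The classification step can be sketched roughly as follows. This is only an illustration under stated assumptions: the vocabularies, the classify_prompt name, and the regular-expression tokenizer are hypothetical and are not the project's actual tokenization script.

```python
import re

# Hypothetical vocabularies; the word lists of the real system are not shown here.
COMMANDS = {"create": "creation", "add": "addition", "rotate": "transformation"}
OBJECTS = {"room": "cell", "wall": "face"}
PARAMETER_WORDS = {
    "m": "unit",
    "cm": "unit",
    "degrees": "angle",
    "north": "direction",
    "south": "direction",
    "east": "direction",
    "west": "direction",
}


def classify_prompt(prompt):
    """Split a prompt into words and numbers, then sort them into three groups."""
    tokens = re.findall(r"[a-z]+|\d+(?:\.\d+)?", prompt.lower())
    result = {"commands": [], "objects": [], "parameters": []}
    for token in tokens:
        if token in COMMANDS:
            result["commands"].append((token, COMMANDS[token]))
        elif token in OBJECTS:
            # The object word is paired with its topological category.
            result["objects"].append((token, OBJECTS[token]))
        elif token in PARAMETER_WORDS:
            result["parameters"].append((token, PARAMETER_WORDS[token]))
        elif token[0].isdigit():
            result["parameters"].append((float(token), "value"))
        # Any other word (articles, unknown terms) is ignored in this sketch.
    return result


parsed = classify_prompt("Create a room 4 m wide")
```

Here "create" falls into the creation category and "room" is paired with "cell", which corresponds to the "create a cell" mapping described above.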

Methodology – Part II

The second part of our methodology covers all the steps used to reach a topological response after the input is parsed. Briefly, the inputs are classified and each parameter is stored as a dynamic variable to ensure flexibility when it is linked to the topologic functions.

  • During the Input Classification stage, all prompts are stored as a list and given a numerical identifier.
  • Dynamic Variables are then created for each parameter and later used as arguments to complete the topologic function.
  • Each operation in our system has an associated Link to Topologic function, for example create_cell, add_cell, create_wall, add_wall, etc. From each of these functions we obtain a geometry and a dictionary for each of the elements composing it.
  • This information is stored in JSON format, to be used by the following prompt.
  • Finally, Shape Grammar rules are set for linking additional topologies with the existing topologies, based on our exploration of shape grammars.
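The steps above can be sketched as a small dispatch table. The function names create_cell and add_cell come from the list above, but their signatures, the run_prompt helper, and the returned dictionaries are assumptions; the stand-in functions return plain dictionaries and do not call TopologicPy.

```python
import json


def create_cell(width, length, height):
    """Stand-in for a topologic function: returns a dictionary, not real geometry."""
    return {"type": "cell", "width": width, "length": length, "height": height}


def add_cell(width, length, height, attach_to, side):
    """Stand-in for adding a cell that refers to an earlier prompt's result."""
    cell = create_cell(width, length, height)
    cell.update({"attach_to": attach_to, "side": side})
    return cell


# Link between (command, topological category) and the function that handles it.
FUNCTION_LINKS = {
    ("create", "cell"): create_cell,
    ("add", "cell"): add_cell,
}


def run_prompt(prompt_id, command, category, variables, state_json="{}"):
    """Run one classified prompt and return the updated state as a JSON string."""
    state = json.loads(state_json)
    function = FUNCTION_LINKS[(command, category)]
    # The dynamic variables are passed as keyword arguments to the linked function.
    state[str(prompt_id)] = function(**variables)
    return json.dumps(state)


state = run_prompt(1, "create", "cell", {"width": 4, "length": 5, "height": 3})
state = run_prompt(
    2,
    "add",
    "cell",
    {"width": 3, "length": 3, "height": 3, "attach_to": "1", "side": "east"},
    state,
)
```

Passing the state as a JSON string from one call to the next imitates how stored results become available to the following prompt.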

Results – Findings

Prompting

Flexibility Constraints

  • Flexibility of the system was limited by the parameters required for each function.
  • Specific operations demanded predefined inputs, leaving no room for variation or alternative approaches.
  • The system would require a predictive text tool to handle typing errors and ensure more accurate input, further improving user experience and operational efficiency.
  • This limitation could potentially be mitigated by incorporating a fine-tuned LLM capable of generating these variables dynamically, offering greater adaptability and customization.
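As a rough illustration of how typing errors might be tolerated without an LLM, the sketch below uses fuzzy matching from Python's standard difflib module. It is a stand-in for the predictive text tool mentioned above, not part of the current system; the vocabulary, the correct_token name, and the cutoff value are assumptions.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of words the system understands.
KNOWN_WORDS = ["create", "add", "rotate", "room", "wall", "north", "south"]


def correct_token(token, vocabulary=KNOWN_WORDS, cutoff=0.75):
    """Return the closest known word, or the token unchanged if nothing is close."""
    matches = get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


corrected = [correct_token(word) for word in "creat a wal".split()]
```

A higher cutoff makes the correction more conservative, which matters when short words such as "add" could otherwise be matched by accident.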

System

First Text-to-3D Prompt

The initial text-to-3D input was successfully achieved, with the first prompt functioning correctly and able to host variable parameters. However, the current method of information storage may present challenges in the future, potentially leading to inefficiencies or errors as the system scales.

Second Text-to-3D Prompt

Although the system handled the predefined second prompt, it struggled to accommodate prompts beyond the initial configuration. This highlights the need for further flexibility in the system.

Response Chaining

The system’s ability to dynamically chain functions did not perform as expected, resulting in a failure. This indicates a need for further refinement to enable seamless function chaining and improve the overall process flow.

Resulting topologies from a series of prompts, each adding a room with specified parameters and attaching it to a specified location on the rooms created by previous prompts.
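One way to read "attaching to a specified location" is as a placement rule on axis-aligned footprints. The sketch below is an assumption made for illustration: the Room class, the attach_room function, and the four side names are hypothetical and do not reproduce the project's shape grammar rules.

```python
from dataclasses import dataclass


@dataclass
class Room:
    x: float       # minimum x of the footprint
    y: float       # minimum y of the footprint
    width: float   # extent along x
    length: float  # extent along y


def attach_room(existing, width, length, side):
    """Place a new room so that it touches the given side of an existing room."""
    if side == "east":
        return Room(existing.x + existing.width, existing.y, width, length)
    if side == "west":
        return Room(existing.x - width, existing.y, width, length)
    if side == "north":
        return Room(existing.x, existing.y + existing.length, width, length)
    if side == "south":
        return Room(existing.x, existing.y - length, width, length)
    raise ValueError(f"Unknown side: {side}")


first = Room(0, 0, 4, 5)
second = attach_room(first, 3, 3, "east")
third = attach_room(second, 2, 2, "north")
```

Each new room takes its position from the room it attaches to, so a sequence of prompts builds up a connected plan one room at a time.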

Results – Conclusion

Potential: With TopologicPy, the methodology’s output can be used for further analysis, such as energy modeling, graph machine learning, and more.

Portability: The system does not require loading a large language model, which improves portability.

Hierarchy, Connectivity, Preciseness: The resultant topologies have successfully inherited the hierarchy of elements.

  • Precise models, when done well.

Inflexibility: There are multiple options, cases, and variations for shapes, object positions, and orientations.

  • Difficulty in predicting all cases.
  • Requires allowances for each variation / possibility.

Prompting: The prompting system would have worked better with an LLM as its base.

Inputs / Parameters: The dynamic parameter input stage would benefit from further improvement.