Re-Routing Bikeways

Team member(s): Farida Fidvi and Christopher Mina
Modified by Farida Fidvi on July 9, 2024

In Switzerland, biking to work is highly encouraged and widely practiced. This is facilitated by the country’s extensive network of bike lanes, bike-friendly infrastructure, and commitment to sustainability. However, in terms of safety, Switzerland saw 5,287 cyclist fatalities in 2022, up from 3,793 in 2020. Over the past five years, cycling injuries have increased by 14%, and e-biking injuries have surged by 77%. Meanwhile, incidents involving cars, pedestrians, and motorcycles have decreased.

**Figure 1:** Pie chart showing the percentage of road accidents involving different road users

Irony leading to the aim

One would think that as the number of cyclists increases, the number of cyclist-related accidents is bound to increase, but ironically that is not the case. As the graph below shows, when the number of cyclists in a country increases, accidents reduce substantially, which can be attributed to traffic considerations during design as well as increased public awareness on the road. Hence the need to emphasize safety measures and awareness to protect cyclists and other vulnerable road users in urban areas like Zurich, where mixed traffic conditions pose significant risks, by predicting the severity of accidents at different road junctions.

**Figure 2:** Statistical analysis of the number of cyclists vis-à-vis cyclist fatalities in different countries

Aim

Zürich is the largest city in Switzerland and has the highest number of cyclists in the country, with approximately 10% of all trips within the city being made by bike. This translates to over 50,000 daily bicycle trips, highlighting the city’s strong cycling culture and infrastructure. The aim was to predict the severity of accidents as classes, based on different features, using node classification with graph machine learning. Node classification was chosen for two reasons: nodes are where most accidents occur, and routes are decided from one node to the next.

Figure 3 illustrates the accident severity dataset: the locations of police-registered traffic accidents in the urban area of Zurich since 2011, plotted with their severity level.

**Figure 3:** Plotted Dataset of Accident Severity

Graph Machine Learning Aim

The above dataset is added as classes to the nodes, which become the labelled nodes (Figure 4). The aim is to find the classes, i.e. the severity of accidents that could occur, at the unlabeled nodes represented in green.

Class 0: Accident with material damage

Class 1: Accident with light injuries

Class 2: Accident with severe injuries or fatalities

The Unknown Class is represented in green.

Methodology

Feature 01: Road Noise Pollution Emissions sections

Description: The dataset comprises emission sections delineating road noise pollution within the urban landscape of the City of Zurich as line strings. Integrating road noise data with accident records also makes a bicycle rider aware of the street noise along a bikeway, since cycling is often a tranquil activity rather than just a means of transport.
Hypothesis: As per the initial hypothesis, higher street noise is a parameter that can be directly correlated with:

Street Vehicular Traffic
Pedestrian Rush
Street Hawkers
Concentration levels of bicycle riders

That is why this dataset was chosen to help predict the Unknown Class. Figure 5 shows the color-coded line strings representing the noise level, from 5 dB to over 400 dB.

**Figure 5:** Plotted Dataset of Accident Severity Classes

These values are transferred from the centroids of the line strings to the nodes, using an R-tree to find the nearest node. The attributes added were as follows:

Quiet Nodes: < 20 dB
Noticeable Nodes: > 20 dB and < 40 dB
Loud Nodes: ≥ 40 dB and ≤ 60 dB
Very Loud Nodes: > 60 dB
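The transfer of noise values to nodes and the binning into categories can be sketched as follows. This is a minimal illustration, not the project's code: the toy coordinates, the function names, and the brute-force nearest-node search (swapped in for the R-tree query so the sketch has no dependencies) are all assumptions. Averaging several sections that land on one node, and assigning exactly 20 dB to "Noticeable", are also assumptions, since the text does not specify either.

```python
import math

def line_centroid(line):
    """Length-weighted centroid of a line string given as [(x, y), ...].
    Assumes the line has non-zero length."""
    total = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        cx += seg * (x1 + x2) / 2
        cy += seg * (y1 + y2) / 2
        total += seg
    return (cx / total, cy / total)

def nearest_node(point, nodes):
    """Brute-force stand-in for an R-tree nearest-neighbour query."""
    return min(nodes, key=lambda n: math.dist(point, nodes[n]))

def noise_category(db):
    """Bin a noise level using the thresholds listed above.
    Assumption: exactly 20 dB counts as 'Noticeable'."""
    if db < 20:
        return "Quiet"
    if db < 40:
        return "Noticeable"
    if db <= 60:
        return "Loud"
    return "Very Loud"

# Hypothetical toy data: node id -> (x, y), and (line string, dB) sections.
nodes = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (10.0, 10.0)}
noise_sections = [
    ([(0.0, 1.0), (2.0, 1.0)], 35.0),
    ([(9.0, 9.0), (11.0, 9.0)], 72.0),
]

# Collect the noise values that land on each node.
node_noise = {}
for line, db in noise_sections:
    node_id = nearest_node(line_centroid(line), nodes)
    node_noise.setdefault(node_id, []).append(db)

# Assumption: a node hit by several sections takes their mean level.
node_category = {
    n: noise_category(sum(v) / len(v)) for n, v in node_noise.items()
}
```

In this toy setup node "b" receives no noise value, which mirrors the fact that only nodes closest to some section centroid get an attribute; how such nodes were handled in the project is not stated.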

**Figure 6:** Nodes with Street Noise Attributes plotted

Feature 02: Proximity Cluster

Neighborhood clusters usually share similarities, which is why K-means clustering is used to calculate the proximity index of each node, from cluster 0 to 9, which is then used as a node feature. To decide the number of clusters into which the city should be divided, the silhouette score from sklearn.metrics was calculated; it was 0.385 for 10 clusters. With any more or fewer clusters, the score dropped significantly. This is a reasonable result, as Zurich is divided into 10 districts accordingly.
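Choosing the number of clusters by silhouette score can be sketched as below. The project used the silhouette score from sklearn.metrics; this plain-Python version is a re-implementation written only to show what is being compared, and the toy points, the evenly spaced centroid initialisation, and the function names are assumptions.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns a cluster label per point.
    Assumption: centroids start at evenly spaced input points."""
    n = len(points)
    centers = [points[i * n // k] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centre if a cluster empties
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        same = [math.dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance inside own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: three well-separated groups of node coordinates.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 0.0), (10.0, 1.0), (11.0, 0.0), (11.0, 1.0),
    (5.0, 10.0), (5.0, 11.0), (6.0, 10.0), (6.0, 11.0),
]

# Score each candidate k and keep the best one.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

On the real node coordinates the same loop would run over the candidate cluster counts, and the label returned by the clustering for each node would become its proximity cluster index.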

**Figure 7:** K-means Clustering used to plot proximity clusters

**Figure 8:** Correlation Heatmap of Feature 01, Proximity Cluster Index and Classes

Initial Training

Correlation mapping

The above two features were chosen for the initial training. Before that, a correlation heatmap was generated to evaluate the correlation between the features and the classes, as seen in Figure 8. From the map we inferred that the proximity cluster index and the Very Loud noise attribute show a stronger correlation with the classes, as well as with each other, than the other attributes do, but none of them shows a correlation above 0.5.
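A minimal sketch of how each cell of such a heatmap could be computed is given below, assuming Pearson correlation and treating the severity class as a numeric value (0, 1, 2); the text does not state which correlation measure was used. The feature values here are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.
    Assumes neither sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical toy node table: one value per labelled node.
severity = [0, 1, 1, 2, 1, 0]
features = {
    "very_loud": [0, 0, 1, 1, 0, 0],
    "proximity_cluster": [3, 3, 7, 7, 1, 1],
}

# Correlation of each feature with the severity class.
correlations = {name: pearson(vals, severity) for name, vals in features.items()}

# Features whose absolute correlation exceeds the 0.5 mark mentioned above.
strong = [name for name, r in correlations.items() if abs(r) > 0.5]
```

Note that the proximity cluster index is a categorical label, so a linear correlation with it is only a rough indicator.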

Masking of Labelled nodes

Figure 9 below shows the labelled and unlabeled nodes in part ???. For training, the labelled nodes are split 80/10/10 % into train, validation and test nodes, which are represented in part ????, with the last graph representing the nodes to be predicted, i.e. the unlabeled nodes of the Unknown Class.
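The 80/10/10 masking can be sketched as follows. The function name, the fixed seed, and the toy label dictionary are assumptions for illustration; only the split proportions come from the text.

```python
import random

def split_masks(labelled_ids, seed=42, train=0.8, val=0.1):
    """Shuffle labelled node ids and split them into train/val/test sets."""
    ids = list(labelled_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    return (
        set(ids[:n_train]),
        set(ids[n_train:n_train + n_val]),
        set(ids[n_train + n_val:]),
    )

# Hypothetical toy graph: 100 labelled nodes, 20 nodes of the Unknown Class.
labels = {i: i % 3 for i in range(100)}
labels.update({i: None for i in range(100, 120)})

labelled = [n for n, c in labels.items() if c is not None]
unlabelled = [n for n, c in labels.items() if c is None]  # to be predicted

train_ids, val_ids, test_ids = split_masks(labelled)
```

The unlabelled nodes never enter any of the three masks; they stay in the graph so that their features and edges are still available during message passing.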

Training

The hyperparameters for training were: number of hidden layers = 4, learning rate = 0.01 and number of epochs = 100.

False Hope: The training graph gave the illusion that the training went well, as seen in Figure 10.
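The text does not name the model architecture. Purely as an illustration, the sketch below assumes a GCN-style model and shows the forward pass of stacked graph-convolution layers, each computing ReLU(Â H W) with Â = D^(-1/2) (A + I) D^(-1/2). The toy graph, the features and the identity weights are placeholders; in training the weights would be learned with the learning rate and epoch count listed above.

```python
import math

# Hyperparameters as stated in the text.
HYPERPARAMS = {"hidden_layers": 4, "learning_rate": 0.01, "epochs": 100}

def normalized_adjacency(n, edges):
    """Dense D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(a_hat, h, w):
    """One graph-convolution step: aggregate neighbours, project, apply ReLU."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, h), w)]

# Hypothetical toy graph: three junctions on a path 0 - 1 - 2.
a_hat = normalized_adjacency(3, [(0, 1), (1, 2)])
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # made-up node features
w = [[1.0, 0.0], [0.0, 1.0]]               # placeholder (untrained) weights

# Stack the layers; each pass mixes in features from one hop further away.
for _ in range(HYPERPARAMS["hidden_layers"]):
    h = gcn_layer(a_hat, h, w)
```

With four layers, each node's representation draws on junctions up to four hops away, which is one reason the neighbourhood-level features described in this post can influence the prediction at a node.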

Initial Training Conclusion: From the confusion matrix (Figure 11) it became clear that only Class 01 is predicted well while the other classes are hardly predicted at all. The training graph looked good only because the model is probably being trained well for Class 01.
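The gap between a good-looking training curve and a poor confusion matrix can be illustrated with a small sketch. The labels below are made up: they assume a test set dominated by Class 1 and a model that predicts Class 1 for every node, which is the pattern described above.

```python
def confusion_matrix(y_true, y_pred, n_classes=3):
    """Rows are true classes, columns are predicted classes."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def per_class_recall(m):
    """Share of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(m)]

# Hypothetical test labels, and a model that always answers Class 1.
y_true = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2]
y_pred = [1] * len(y_true)

matrix = confusion_matrix(y_true, y_pred)
accuracy = sum(matrix[i][i] for i in range(3)) / len(y_true)
recall = per_class_recall(matrix)
```

Here the overall accuracy is 0.6 even though Classes 0 and 2 are never predicted, so per-class recall is the more telling number when classes are imbalanced.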

This led to the understanding that we were probably overestimating the significance of the correlation between noise and accident severity: although our hypothesis makes sense in practice, in actuality it wasn’t that simple. This led to adding more features that have a first-layer (direct) impact on pedestrian and vehicular traffic and visibility, rather than a second-layer (indirect) one like noise.

Feature 03: Hotspots: Building typologies and route accompaniments

Based on research into likely hotspots of bicycle accidents or undirected crowd mingling, building typologies and road particulars were added from OSM, along with the existence and number of signals, crossings and streetlamps, all of which have a direct impact on the severity of accidents.

The features plotted are shown in Figure 12. The strategy used to transfer this data to the nodes is similar: the centroids of these polygons or points are taken as center points to construct bounding boxes of width 500, and the number of times a node is selected by these boxes is added to that node as an attribute. This is done because these features affect the traffic at all nearby nodes. The values are normalized before being added as attributes.
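The bounding-box counting described above can be sketched as below. Several details are assumptions, since the text leaves them open: the box is taken as a square centred on the hotspot centroid, the width of 500 is assumed to be in the same projected units as the node coordinates, and min-max scaling stands in for the unspecified normalization. Coordinates are made up.

```python
def hotspot_counts(nodes, hotspots, width=500.0):
    """Count, per node, how many hotspot bounding boxes contain it."""
    half = width / 2
    counts = {n: 0 for n in nodes}
    for hx, hy in hotspots:
        for n, (x, y) in nodes.items():
            if abs(x - hx) <= half and abs(y - hy) <= half:
                counts[n] += 1
    return counts

def min_max_normalize(counts):
    """Scale counts to [0, 1]; all zeros if every count is equal."""
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        return {n: 0.0 for n in counts}
    return {n: (c - lo) / (hi - lo) for n, c in counts.items()}

# Hypothetical node positions and hotspot centroids (e.g. crossings).
nodes = {"a": (0.0, 0.0), "b": (200.0, 0.0), "c": (1000.0, 1000.0)}
hotspots = [(100.0, 0.0), (-100.0, 0.0), (150.0, 50.0), (900.0, 900.0)]

counts = hotspot_counts(nodes, hotspots)
crossing_feature = min_max_normalize(counts)
```

The double loop is quadratic in the worst case; on a city-scale graph a spatial index such as the R-tree mentioned earlier would be the natural way to find the nodes inside each box.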

**Figure 12:** Left: Features taken from OSM, Right: Hotspots plotted on Zurich Map

Final Training

Correlation mapping

A correlation matrix was plotted to understand the relationship of all the features with the classes. It shows that signals and crossings, along with the residential, Very Loud and proximity cluster features, have a stronger relationship, followed by commercial, parks and public buildings, and then streetlamps.

**Figure 13:** Correlation Heatmap of all features and the Classes

Training

Keeping the hyperparameters the same, the model was trained once again. As can be seen, although the training graph looked good, the confusion matrix made it clear that only Class 01 was being predicted well.

Adding Class weights

Keeping the hyperparameters the same, class balancing was attempted to see whether the results would improve. As the confusion matrix shows, still only Class 01 is predicted well; the earlier verticality in the graph is converted to a horizontality.
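The text does not say how the class weights were computed. One common choice, assumed here only for illustration, is inverse-frequency weighting, w_c = N / (K · n_c), where N is the number of labelled nodes, K the number of classes and n_c the number of nodes in class c. The labels are made up.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: rarer classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in sorted(counts)}

# Hypothetical training labels dominated by Class 1 (light injuries).
train_labels = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2]

weights = balanced_class_weights(train_labels)
# Each node's loss term would be multiplied by the weight of its true class.
```

Weights of this kind change how much each mistake costs during training, but they cannot add information that the features do not carry, which is consistent with the limited improvement reported above.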

**Figure 15:** Left: Training and Validation Curve, Right: Confusion Matrix

Trained Graph: Testing Ground Truth vs Prediction

**Figure 16:** Comparison between the Ground Truth and the Prediction of the Labelled Nodes

Completing Missing Classes

As can be seen clearly in the figure below, most of the nodes are predicted as Class 01.

**Figure 17:** The Incomplete and the Completed Accident Severity on Zurich Nodes

Conclusion

Class 01, that is, Accidents with Light Injuries, is predicted well in all scenarios. This could be for one of two reasons: either this class is actually dependent on the features used, or this class simply had the largest number of nodes with these attributes.


Re-Routing Bikeways is a project of IAAC, Institute for Advanced Architecture of Catalonia developed in the Master in Advanced Computation for Architecture and Design - 2023-2024 by the student(s) Farida Fidvi and Christopher Mina during the course MaCAD 23/24 Graph Machine Learning with David Andres Leon and Erida Bendo.