Processing Pipeline

From Sensor Data to Real-Time Safety Alerts

Overview

The CycleSafely pipeline processes multi-modal sensor data in real time to detect vehicles, predict their trajectories, and alert cyclists to potential dangers. The pipeline consists of five stages: Data Acquisition and Sensor Fusion, Object Detection, Object Tracking, Trajectory Prediction and Risk Assessment, and Output, Alerts, and Recording.

The system supports two deployment modes: a full desktop/embedded version with comprehensive sensors and processing, and an optimized mobile version for smartphones that prioritizes real-time performance on resource-constrained devices.

Stage 1: Data Acquisition and Sensor Fusion

Sensor Input

The system captures synchronized multi-modal sensor data from cameras, LIDAR, GPS, and IMU sensors to build a comprehensive understanding of the cyclist's environment and surrounding traffic.

  • Desktop/Embedded: Velodyne HDL-64E or Livox Mid-360 LIDAR for high-resolution 3D point clouds, RGB-D camera for high-quality visual data, and Insta360 camera for panoramic capture.
  • Mobile: Smartphone camera (mounted on handlebar) with adaptive resolution based on processing load, optional external sensors via Bluetooth/USB.

Registration and Localization

GPS and IMU data are fused with visual odometry to determine the cyclist's position, orientation, and velocity in real time, enabling accurate coordinate transformations and map alignment.

  • Desktop/Embedded: ORB-SLAM3 for robust visual odometry and registration, point cloud registration and alignment with OpenStreetMap data.
  • Mobile: Simplified visual odometry using feature tracking, compass-based orientation estimation, and GPS+IMU fusion with Kalman filtering.
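
The GPS+IMU fusion mentioned for the mobile variant can be illustrated with a small Kalman filter. The sketch below is only an illustration under stated assumptions, not the project's implementation: it tracks a single axis, treats IMU acceleration as the control input and GPS position as the measurement, and the class name `GpsImuFilter1D` and both noise values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GpsImuFilter1D:
    """1D Kalman filter whose state is (position, velocity) along one axis."""
    pos: float = 0.0
    vel: float = 0.0
    # Entries of the symmetric 2x2 state covariance matrix P.
    p_pp: float = 1.0
    p_pv: float = 0.0
    p_vv: float = 1.0
    accel_noise: float = 0.5  # assumed IMU acceleration std dev (m/s^2)
    gps_noise: float = 3.0    # assumed GPS position std dev (m)

    def predict(self, accel: float, dt: float) -> None:
        """Propagate the state with one IMU acceleration sample."""
        self.pos += self.vel * dt + 0.5 * accel * dt * dt
        self.vel += accel * dt
        # P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        p_pp = self.p_pp + 2 * dt * self.p_pv + dt * dt * self.p_vv
        p_pv = self.p_pv + dt * self.p_vv
        q = self.accel_noise ** 2
        self.p_pp = p_pp + 0.25 * dt ** 4 * q
        self.p_pv = p_pv + 0.5 * dt ** 3 * q
        self.p_vv = self.p_vv + dt * dt * q

    def update(self, gps_pos: float) -> None:
        """Correct the state with one GPS position fix (H = [1, 0])."""
        s = self.p_pp + self.gps_noise ** 2  # innovation variance
        k_p = self.p_pp / s                  # Kalman gain for position
        k_v = self.p_pv / s                  # Kalman gain for velocity
        residual = gps_pos - self.pos
        self.pos += k_p * residual
        self.vel += k_v * residual
        # P = (I - K H) P
        p_pp, p_pv, p_vv = self.p_pp, self.p_pv, self.p_vv
        self.p_pp = (1 - k_p) * p_pp
        self.p_pv = (1 - k_p) * p_pv
        self.p_vv = p_vv - k_v * p_pv
```

In use, `predict` would be called at the IMU rate and `update` whenever a GPS fix arrives, so the position estimate stays smooth between the slower GPS samples.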

Stage 2: Object Detection (SFA3D/YOLO3D)

Vehicle Detection

Deep learning models identify vehicles in the surrounding environment by analyzing sensor data to create 3D bounding boxes, achieving real-time detection rates of over 10 frames per second on both platforms.

Object Detection in Action

Real-time object detection showing bounding boxes around detected vehicles

  • Desktop/Embedded: SFA3D (Super Fast and Accurate 3D Object Detection) using PyTorch processes LIDAR point clouds converted to Bird's-Eye View (BEV) representations, combined with YOLOv4/YOLO3D for RGB image detection and vehicle surface reconstruction.
  • Mobile: Quantized YOLO3D model (INT8) with GPU/NPU acceleration (Metal/CoreML on iOS, TensorFlow Lite on Android), monocular depth estimation using LiteDepth, and frame skipping during low-risk scenarios to conserve battery.
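
To make the Bird's-Eye View conversion more concrete, here is a minimal sketch that rasterizes LIDAR points into a sparse height map. It is a simplification under assumptions: BEV inputs for detectors typically carry more channels than height alone, and the function name `points_to_bev`, the coordinate ranges, and the cell size are all hypothetical values chosen for illustration.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float, float]  # (x forward, y left, z up) in metres


def points_to_bev(
    points: Iterable[Point],
    x_range: Tuple[float, float] = (0.0, 50.0),    # assumed forward extent
    y_range: Tuple[float, float] = (-25.0, 25.0),  # assumed lateral extent
    cell_size: float = 0.5,                        # assumed grid resolution
) -> Dict[Tuple[int, int], float]:
    """Return a sparse BEV height map: (row, col) -> highest z in that cell."""
    bev: Dict[Tuple[int, int], float] = {}
    for x, y, z in points:
        # Discard points outside the region of interest.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        row = int((x - x_range[0]) / cell_size)
        col = int((y - y_range[0]) / cell_size)
        key = (row, col)
        if key not in bev or z > bev[key]:
            bev[key] = z
    return bev
```

A dense image for a convolutional detector could then be filled from this dictionary; the sparse form is used here only to keep the example dependency-free.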

Stage 3: Object Tracking (3D Multi-Object Tracker)

Multi-Vehicle Tracking

Detected vehicles are tracked across consecutive frames using Kalman filtering to maintain consistent identities, handle occlusions, and compute distances, keeping identity switch rates low for reliable monitoring.

  • Desktop/Embedded: DeepSORT for robust multi-object tracking with precise distance computation using LIDAR point clouds.
  • Mobile: Lightweight tracking algorithm (simplified DeepSORT) with approximate distance estimation from bounding box size and temporal coherence.
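
The mobile variant's approximate distance estimation from bounding box size can be sketched with a pinhole camera model, with exponential smoothing standing in for temporal coherence. This is a sketch under assumptions rather than the project's method: the function names, the assumed real-world vehicle height of 1.5 m, and the smoothing factor are hypothetical.

```python
def estimate_distance_m(
    bbox_height_px: float,
    focal_length_px: float,
    assumed_vehicle_height_m: float = 1.5,  # assumption: typical car height
) -> float:
    """Pinhole model: distance = focal length * real height / image height."""
    if bbox_height_px <= 0:
        raise ValueError("bounding box height must be positive")
    return focal_length_px * assumed_vehicle_height_m / bbox_height_px


def smooth_distance(previous_m: float, measured_m: float, alpha: float = 0.3) -> float:
    """Blend the new estimate with the previous one to damp frame-to-frame jitter."""
    return alpha * measured_m + (1.0 - alpha) * previous_m
```

For example, a box 150 px tall seen through a lens with a 1000 px focal length gives an estimate of 10 m under the assumed vehicle height. The estimate degrades for vehicles whose real height differs from the assumption, which is one reason the text calls these distances approximate.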

Trajectory Analysis

The system analyzes vehicle motion by computing speed and acceleration from GPS and IMU data, calculating relative velocities, fitting smooth path curves, and classifying driver behaviors such as passing, turning, or crossing maneuvers.

Trajectory Extraction

Trajectory extraction showing vehicle paths over time
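
As a rough illustration of the trajectory analysis step, the quantities involved (velocity, speed, acceleration, relative velocity) can be derived from successive positions by finite differences. The helper names below are hypothetical, and the sketch assumes 2D positions in metres in a common ground frame sampled `dt` seconds apart.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def velocity(p0: Vec2, p1: Vec2, dt: float) -> Vec2:
    """Finite-difference velocity between two positions (m/s)."""
    return ((p1[0] - p0[0]) / dt, (p1[1] - p0[1]) / dt)


def speed(v: Vec2) -> float:
    """Magnitude of a velocity vector (m/s)."""
    return math.hypot(v[0], v[1])


def acceleration(v0: Vec2, v1: Vec2, dt: float) -> Vec2:
    """Finite-difference acceleration between two velocities (m/s^2)."""
    return ((v1[0] - v0[0]) / dt, (v1[1] - v0[1]) / dt)


def relative_velocity(v_vehicle: Vec2, v_cyclist: Vec2) -> Vec2:
    """Vehicle velocity as seen from the cyclist (m/s)."""
    return (v_vehicle[0] - v_cyclist[0], v_vehicle[1] - v_cyclist[1])
```

Finite differences amplify measurement noise, so in practice these values would usually be computed from filtered track states rather than raw detections.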

Stage 4: Trajectory Prediction and Risk Assessment

PRECOG Trajectory Prediction

The PRECOG (PREdiction Conditioned On Goals) framework forecasts future vehicle trajectories over 1-3 seconds by generating multi-modal predictions that consider vehicle dynamics, driver intentions, environmental constraints, and interactions between multiple agents.

Trajectory Prediction

PRECOG trajectory prediction showing future vehicle paths

  • Desktop/Embedded: Full PRECOG framework using TensorFlow with multi-agent trajectory prediction, goal-conditioned forecasting based on likely destinations, and consideration of road boundaries and traffic rules.
  • Mobile: Compressed PRECOG model or heuristic-based prediction with simplified risk assessment (distance and speed based) focusing on immediate threats with a shorter prediction horizon.
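
The heuristic-based prediction option for mobile is not specified further. One common baseline, shown here purely as an assumed example and not as the project's heuristic, is constant-velocity extrapolation over a short horizon. The function name and default horizon are hypothetical.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]


def predict_constant_velocity(
    position: Vec2,
    velocity: Vec2,
    horizon_s: float = 1.0,  # assumed short mobile prediction horizon
    step_s: float = 0.5,
) -> List[Vec2]:
    """Extrapolate future positions assuming the vehicle keeps its velocity."""
    steps = int(round(horizon_s / step_s))
    return [
        (position[0] + velocity[0] * step_s * k,
         position[1] + velocity[1] * step_s * k)
        for k in range(1, steps + 1)
    ]
```

Unlike PRECOG, this baseline produces a single path per vehicle and ignores interactions, road boundaries, and driver intent, so it is only reasonable for very short horizons.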

Collision Risk Assessment

Predicted trajectories are analyzed against the cyclist's path to calculate time-to-collision (TTC), estimate collision probability, detect safety distance violations (1.5 m threshold for vehicles exceeding 30 km/h), identify traffic law violations, and classify collision severity levels.
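
Two of these checks lend themselves to a short sketch: time-to-collision and the safety distance rule. The code below is a simplified illustration under assumptions. It treats TTC as distance divided by closing speed along the line between vehicle and cyclist, and it interprets the safety rule as "lateral distance under 1.5 m while the vehicle exceeds 30 km/h". The function names are hypothetical.

```python
from typing import Optional


def time_to_collision(distance_m: float, closing_speed_mps: float) -> Optional[float]:
    """Seconds until contact at the current closing speed, or None if not closing."""
    if closing_speed_mps <= 0:
        return None  # vehicle is holding distance or moving away
    return distance_m / closing_speed_mps


def is_close_pass(
    lateral_distance_m: float,
    vehicle_speed_kmh: float,
    min_distance_m: float = 1.5,         # threshold stated in the text
    speed_threshold_kmh: float = 30.0,   # speed limit stated in the text
) -> bool:
    """Flag a safety distance violation for a fast vehicle passing too closely."""
    return (vehicle_speed_kmh > speed_threshold_kmh
            and lateral_distance_m < min_distance_m)
```

For instance, a vehicle 20 m away closing at 10 m/s yields a TTC of 2 s, and a pass at 1.2 m by a vehicle travelling 50 km/h would be flagged.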

Stage 5: Output, Alerts, and Recording

User Interface and Alerts

The system provides multi-modal warnings through visual displays, audible alerts, and haptic feedback to ensure cyclists are immediately aware of imminent dangers while maintaining focus on the road.

  • Desktop/Embedded: Real-time display on tablet showing distance warnings and visualization of predicted vehicle paths.
  • Mobile: Simple visual indicators on phone screen with glanceable display design optimized for cycling safety, plus vibration alerts for close passes.

Data Recording and Analysis

A continuous video buffer retains the last N seconds before an incident. The system also stores localized collision statistics and uploads anonymized data (with faces and license plates blurred) for post-incident reconstruction and safety research.

  • Desktop/Embedded: High-quality continuous video buffer with immediate upload capability for comprehensive analysis.
  • Mobile: Rolling buffer recording limited by storage constraints, with background upload when connected to WiFi to conserve mobile data.
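
A rolling buffer that keeps only the last N seconds can be sketched with a fixed-length deque. This is a minimal sketch with a hypothetical class name; it stores frames in memory as opaque objects, whereas a real recorder would more likely hold encoded video segments.

```python
from collections import deque
from typing import Any, List


class RollingBuffer:
    """Keep only the most recent `seconds` worth of frames."""

    def __init__(self, seconds: float, fps: float) -> None:
        # deque with maxlen drops the oldest item automatically when full.
        self._frames: deque = deque(maxlen=int(seconds * fps))

    def push(self, frame: Any) -> None:
        """Append the newest frame, evicting the oldest if the buffer is full."""
        self._frames.append(frame)

    def snapshot(self) -> List[Any]:
        """Copy the buffered frames, oldest first, e.g. when an incident fires."""
        return list(self._frames)
```

Calling `snapshot()` at the moment an incident is detected yields the pre-incident footage, while `push` keeps running so recording is never interrupted.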

Mobile Optimizations (BA Kozonits)

Key optimizations enabling real-time processing on smartphones:

  • Model Quantization: INT8 quantization reducing inference time by 3-4x
  • Model Compression: Pruning redundant parameters and knowledge distillation
  • Efficient Processing: Selective processing of frames and regions of interest
  • Asynchronous Pipeline: Parallel processing stages for better performance
  • Battery Optimization: Battery-aware operation modes and adaptive resolution
  • Temporal Coherence: Exploiting frame-to-frame consistency
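
To show what INT8 quantization does to model weights, the following sketch applies symmetric per-tensor quantization to a list of floats. It is illustrative only: mobile toolchains perform this step with their own calibration, and the function names here are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the INT8 representation."""
    return [q * scale for q in quantized]
```

Each weight then needs one byte instead of four, and the reconstruction error per value is bounded by half the scale factor. The speedup figure quoted in the list comes from running integer arithmetic on supporting hardware, which this sketch does not model.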

System Integration

Both pipeline variants integrate with external systems and tools:

  • OpenStreetMap: For road geometry, lane markings, and semantic map data
  • Cloud Backend: For anonymized data collection and aggregated statistics
  • Post-Processing Tools: For incident reconstruction and detailed analysis
  • Visualization Platform: For displaying collision statistics, road widths, and dangerous locations (Rerun 3D Viewer)
  • CARLA Simulator: For synthetic data generation and testing

Performance Characteristics

Desktop/Embedded

  • High accuracy with comprehensive sensors
  • Real-time processing at more than 10 FPS
  • Precise distance measurements from LIDAR
  • Full PRECOG multi-agent prediction
  • Requires dedicated computing hardware

Mobile

  • Optimized for battery life and thermal management
  • Acceptable frame rates on mid-range phones
  • Approximate distances from camera
  • Simplified prediction models
  • Runs on standard smartphone hardware