PhD-BPD Doctoral Proposal Presentation: Tiancheng Zhao
Title: Intelligent Chat Bot: The Robot Can See, Reason, and Communicate
Name: Tiancheng Zhao, Ph.D. candidate in Building Performance and Diagnostics (PhD-BPD)
Date: Wednesday, December 17, 2025
Time: 1:00-3:00pm ET
Location: Remote on Zoom
Dissertation Committee:
Erica Cochran Hameen, Ph.D., Assoc. AIA, NOMA, LEED AP
Associate Professor
School of Architecture
Carnegie Mellon University
Bhiksha Ramakrishnan, Ph.D.
Language Technologies Institute
Carnegie Mellon University
Deva Kannan Ramanan, Ph.D.
Robotics Institute
Carnegie Mellon University
Abstract:
This research introduces a unified multimodal AI framework that rethinks how Indoor Environmental Quality (IEQ) is measured, analyzed, and improved within buildings. At its core, the system integrates three traditionally separate modalities (environmental sensor streams, visual context from cameras onboard a mobile robot, and natural language feedback from occupants) into a coherent representation of occupant comfort and indoor environmental performance. By leveraging advances in Large Language Models (LLMs) and Vision Foundation Models (VFMs), the system turns IEQ assessment from a labor-intensive, episodic activity into a continuous, autonomous, expert-level analytic process.
The core concept is to treat a building as a spatiotemporal knowledge space, where each location holds accumulated environmental statistics, visual features, and human comfort cues. A SLAM-generated map serves as a stable coordinate system, enabling the robot to anchor measurements and visual observations to precise locations. Instead of repeatedly recomputing expensive vision features, the system builds a persistent, spatially indexed cache of scene-level and object-level embeddings, enabling rapid retrieval of contextual information during assessment. This turns the robot’s world model into a dynamic memory structure analogous to the key-value (KV) cache in language models, but grounded in physical space.
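As a rough illustration of the spatially indexed cache idea, the following minimal Python sketch stores embeddings by map cell and returns their running mean on lookup. It is not the system's implementation: the class names (CellMemory, SpatialEmbeddingCache), the uniform 2D grid, the cell size, and the choice to average embeddings per cell are all assumptions made for the example.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class CellMemory:
    """Accumulated embeddings for one map cell (hypothetical structure)."""
    embedding_sum: list[float] = field(default_factory=list)
    count: int = 0

    def add(self, embedding: list[float]) -> None:
        # Lazily size the accumulator on the first observation.
        if not self.embedding_sum:
            self.embedding_sum = [0.0] * len(embedding)
        self.embedding_sum = [s + e for s, e in zip(self.embedding_sum, embedding)]
        self.count += 1

    def mean(self) -> list[float]:
        return [s / self.count for s in self.embedding_sum]


class SpatialEmbeddingCache:
    """Grid-indexed cache keyed by map coordinates (assumed to be in meters)."""

    def __init__(self, cell_size_m: float = 0.5) -> None:
        self.cell_size_m = cell_size_m
        self._cells: dict[tuple[int, int], CellMemory] = {}

    def _key(self, x: float, y: float) -> tuple[int, int]:
        # Quantize a map position to the grid cell that contains it.
        return (math.floor(x / self.cell_size_m), math.floor(y / self.cell_size_m))

    def put(self, x: float, y: float, embedding: list[float]) -> None:
        # Store a computed embedding so it need not be recomputed later.
        self._cells.setdefault(self._key(x, y), CellMemory()).add(embedding)

    def get(self, x: float, y: float) -> list[float] | None:
        # Return the mean embedding for the cell, or None on a cache miss.
        cell = self._cells.get(self._key(x, y))
        return cell.mean() if cell else None


cache = SpatialEmbeddingCache(cell_size_m=1.0)
cache.put(0.2, 0.3, [1.0, 0.0])
cache.put(0.8, 0.9, [3.0, 2.0])   # same 1 m cell as the first observation
cached = cache.get(0.5, 0.5)      # served from the cache
missing = cache.get(5.0, 5.0)     # unvisited cell, so a miss
```

A cache miss is the point at which a real system would run the vision model and store the result with put; a dictionary keyed by grid cell keeps that lookup constant-time at the cost of a fixed spatial resolution.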
The second central contribution of this research is adapting open-source multimodal large language models (MLLMs) into specialized IEQ reasoning experts. Through a hierarchy of strategies (in-context learning, supervised fine-tuning, and preference-based alignment via reinforcement learning from human feedback or direct preference optimization), the system distills high-quality domain knowledge from models such as GPT-5 while ensuring privacy, hardware feasibility, and real-time deployability on a mobile platform. This gives the IEQ robot the ability not only to interpret multimodal signals, but also to generate actionable, human-centered suggestions tailored to each occupant’s needs and the building’s physical conditions.
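To make the preference-based alignment step more concrete, here is a minimal sketch of the standard direct preference optimization loss for a single preference pair, written in plain Python. The function name, argument names, and the value of beta are illustrative assumptions. The abstract does not specify how the proposal configures or applies this objective.

```python
import math


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed log-probabilities of each response under the model
    being trained (policy) and under a frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_logp_chosen - ref_logp_chosen
    rejected_gain = policy_logp_rejected - ref_logp_rejected
    z = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(z)) computed as softplus(-z) for numerical stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# With identical policy and reference scores the margin is zero.
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# The policy favors the chosen response more than the reference does.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss falls as the policy raises the chosen response relative to the rejected one, measured against the reference model. In this use case the pair would be two candidate IEQ recommendations, one preferred by an expert or occupant.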
This approach shifts IEQ monitoring from human technicians and expert consultants to an autonomous agent capable of high-resolution sensing, rich contextual interpretation, and personalized recommendation generation. It increases spatial and temporal coverage, reduces operational cost, and makes expert-level IEQ guidance more widely accessible. More broadly, the work demonstrates how robotics, LLMs, and VFMs can be integrated into a coherent computational architecture that bridges the gap between physical sensing, semantic context, and human comfort. This framework lays the foundation for intelligent indoor environments that proactively maintain health, comfort, and sustainability in ways that were previously impractical or impossible through manual assessment alone.