Integrating Machine Learning with Advanced Building Simulation: Optimising Heating Energy, Thermal Comfort and Indoor Air Quality in Naturally Ventilated Homes
Introduction
With the share of the world's population living in urban areas projected to reach 66% by 2050, the energy performance and indoor environmental quality (IEQ) of residential buildings are more critical than ever. Buildings account for a significant portion of global energy consumption and carbon emissions, making optimisation in design and operation essential. However, balancing energy use with occupant comfort and air quality is a complex challenge, particularly in naturally ventilated homes.
Our research introduces a machine learning-driven, multi-objective optimisation framework that streamlines decision-making for heating energy consumption, thermal comfort, and CO2 concentration in naturally ventilated residential dwellings (Figure 1). The methodology combines advanced parametric simulations, machine learning-based metamodels, and multi-objective optimisation to create energy-efficient and occupant-friendly designs. Integrating metamodelling with advanced optimisation techniques significantly reduces computational effort while maintaining high accuracy, providing a scalable solution for sustainable building management. The approach balances three objectives: heating energy consumption (kWh), thermal discomfort (hours), and elevated CO2 levels (hours), addressing the complex trade-offs inherent in building performance optimisation.

Simulation Framework
The core of this study is built on parametric simulations conducted using EnergyPlus. The simulations were tailored for semi-detached residential buildings in a temperate oceanic climate, representing typical Irish homes. The 3D models of the archetypes were developed in DesignBuilder and then exported to EnergyPlus. Input parameters included building envelope characteristics (U-values), operational settings (heating setpoints, window operation schedules), and occupancy-related variables (density, metabolic rates). The simulation outputs focused on three performance metrics:
- Annual heating energy consumption (kWh).
- Thermal discomfort hours, as defined by WHO and CIBSE TM59 standards.
- Hours with indoor CO2 concentrations exceeding 1000 ppm, indicating poor air quality.
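As a minimal sketch of how the two hour-count metrics above could be derived from hourly zone outputs, the snippet below counts hours above the 1000 ppm CO2 limit and hours outside a temperature band. The function names, the sample traces, and the 18 to 26 °C band are illustrative assumptions, not the criteria or data used in the study.

```python
from typing import Sequence

CO2_LIMIT_PPM = 1000.0  # threshold stated in the text


def hours_above(hourly_values: Sequence[float], limit: float) -> int:
    """Count hourly samples strictly above `limit`."""
    return sum(1 for value in hourly_values if value > limit)


def hours_outside_band(hourly_values: Sequence[float],
                       lower: float, upper: float) -> int:
    """Count hourly samples below `lower` or above `upper`."""
    return sum(1 for value in hourly_values if value < lower or value > upper)


# Hypothetical one-day hourly traces for a single zone (illustrative only).
co2_ppm = [600, 650, 700, 900, 1100, 1250, 1300, 1050, 980, 870, 800, 760,
           720, 700, 690, 710, 820, 990, 1010, 1200, 1150, 950, 800, 700]
temp_c = [17.5, 17.0, 16.8, 17.2, 18.5, 19.5, 20.0, 20.5, 21.0, 21.5, 22.0,
          22.5, 23.0, 23.5, 24.0, 25.0, 26.5, 27.0, 26.2, 25.0, 23.0, 21.0,
          19.0, 18.0]

co2_hours = hours_above(co2_ppm, CO2_LIMIT_PPM)
# The comfort band below is a placeholder assumption, not the study's definition.
discomfort_hours = hours_outside_band(temp_c, lower=18.0, upper=26.0)

print(f"Hours above {CO2_LIMIT_PPM:.0f} ppm: {co2_hours}")
print(f"Hours outside comfort band: {discomfort_hours}")
```

In an annual run the same functions would be applied to 8760 hourly values per zone, ideally restricted to occupied hours.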
Zone-based modelling within EnergyPlus used the AirflowNetwork module to simulate airflow dynamics, accounting for natural ventilation and infiltration. Each zone's characteristics (e.g., occupancy schedules, ventilation rates) were individually modelled to provide granular insights into building performance.
Data Generation and Validation
The simulation process generated a dataset of 60,000 instances using Latin Hypercube Sampling (LHS) to ensure robust coverage of the parameter space (Figure 2). The simulation outputs were validated against real-world data collected from energy and IEQ sensors installed in Irish dwellings. The calibrated models showed discrepancies of less than 8% between simulated and measured values for energy consumption, thermal discomfort, and CO2 levels, supporting the reliability of the base case.
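To illustrate the sampling idea, the following is a minimal standard-library sketch of Latin Hypercube Sampling: each parameter range is split into as many equal strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently per parameter. The parameter names and ranges are hypothetical placeholders, not the study's inputs; in practice a library routine (for example SciPy's `scipy.stats.qmc.LatinHypercube`) would likely be used instead.

```python
import random
from typing import Dict, List, Tuple


def latin_hypercube(bounds: Dict[str, Tuple[float, float]],
                    n_samples: int, seed: int = 0) -> List[Dict[str, float]]:
    """Draw `n_samples` points so that every parameter hits each of its
    `n_samples` equal-width strata exactly once."""
    rng = random.Random(seed)
    columns: Dict[str, List[float]] = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each stratum of the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired randomly across parameters.
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    return [{name: columns[name][i] for name in bounds}
            for i in range(n_samples)]


# Hypothetical parameter ranges (assumptions for illustration only).
bounds = {
    "wall_u_value": (0.15, 2.1),        # W/m2K
    "heating_setpoint_c": (18.0, 23.0),  # degrees Celsius
    "occupant_density": (0.01, 0.06),    # people/m2
}

samples = latin_hypercube(bounds, n_samples=10, seed=42)
for sample in samples[:3]:
    print(sample)
```

Each sampled dictionary would then parameterise one simulation run; scaling `n_samples` up gives a dataset of the size described above.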

Development of a Machine Learning-Based Metamodel
To overcome the computational intensity of traditional simulations, the study developed a metamodel using machine learning algorithms, including XGBoost, Random Forest, and Support Vector Machines (SVM) (Figure 3). XGBoost emerged as the best performer, achieving up to 99% accuracy in predicting the three output metrics. The metamodel significantly reduced the computational load, enabling rapid predictions while maintaining high accuracy. Key steps in metamodel development included:
- Feature Selection: Sobol sensitivity analysis and Principal Component Analysis (PCA) identified critical parameters influencing performance metrics.
- Training and Validation: Hyperparameter tuning and cross-validation ensured the robustness of the predictive models.
- Integration: The metamodel was integrated into an optimisation framework for further analysis.
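The feature-selection step can be illustrated with a small sketch of first-order Sobol indices, estimated here with a standard pick-and-freeze (Saltelli-style) Monte Carlo scheme using only the standard library. The `toy_model` function is a hypothetical stand-in for the building simulation, chosen because its indices are known analytically (0.2, 0.8 and 0 for the three inputs); nothing here reproduces the study's actual sensitivity results.

```python
import random
from statistics import fmean
from typing import Callable, List, Sequence


def first_order_sobol(model: Callable[[Sequence[float]], float],
                      n_inputs: int, n_base: int = 20000,
                      seed: int = 0) -> List[float]:
    """Estimate first-order Sobol indices for inputs uniform on [0, 1)."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_base)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_base)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]

    # Total output variance from the pooled base samples.
    pooled = y_a + y_b
    mean_y = fmean(pooled)
    var_y = fmean((y - mean_y) ** 2 for y in pooled)

    indices = []
    for i in range(n_inputs):
        # Matrix A with column i replaced by column i of B.
        y_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # Partial variance attributed to input i alone (centred estimator).
        v_i = fmean((yb - mean_y) * (yab - ya)
                    for yb, yab, ya in zip(y_b, y_ab, y_a))
        indices.append(v_i / var_y)
    return indices


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for the simulator; x[2] has no influence."""
    return x[0] + 2.0 * x[1] + 0.0 * x[2]


sobol = first_order_sobol(toy_model, n_inputs=3)
print([round(s, 3) for s in sobol])
```

Inputs with indices near zero are candidates for removal before the metamodel is trained, which is the role sensitivity analysis plays in the feature-selection step listed above.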

Multi-Objective Optimisation with NSGA-II
The optimisation process employed the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) to identify Pareto-optimal solutions that balance the three competing objectives (Figure 4). Unlike traditional weighted-sum methods, which collapse all objectives into a single score, NSGA-II handled each objective independently, identifying trade-offs between energy efficiency, thermal comfort, and air quality.
Occupancy-related variables, such as metabolic rate and window operation schedules, were included to ensure the robustness of the optimised solutions. The results revealed a significant reduction in CO2 levels (up to 94%) and thermal discomfort without excessive energy consumption, meeting WHO and CIBSE guidelines.
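The notion of Pareto optimality that NSGA-II relies on can be sketched as follows. The code ranks candidate designs into successive non-dominated fronts by simple repeated peeling; this is only one building block of NSGA-II (it omits crowding distance, selection, crossover and mutation, and does not use the algorithm's faster bookkeeping). The candidate values are hypothetical and are not results from the study.

```python
from typing import List, Sequence, Tuple

# (heating energy kWh, discomfort hours, elevated-CO2 hours), all minimised.
Objectives = Tuple[float, float, float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better once."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(points: Sequence[Objectives]) -> List[List[int]]:
    """Group point indices into fronts; front 0 is the Pareto-optimal set."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # Points not dominated by any other remaining point form the next front.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts


# Hypothetical candidate designs (illustrative values only).
candidates: List[Objectives] = [
    (5200.0, 310.0, 420.0),
    (4800.0, 450.0, 600.0),
    (5600.0, 280.0, 150.0),
    (5300.0, 320.0, 430.0),
    (6100.0, 300.0, 200.0),
    (5400.0, 330.0, 440.0),
]

fronts = non_dominated_sort(candidates)
print(fronts)  # [[0, 1, 2], [3, 4], [5]]
```

The first front contains designs where no objective can be improved without worsening another, which is why no weights between energy, comfort and air quality need to be chosen in advance.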

Design Implications
The study provides valuable insights for architects, engineers, and policymakers. By incorporating variables such as occupancy density, window operation times, and heating setpoints, the framework ensures designs are resilient to real-world variations in occupant behaviour (Table 1). The results also emphasise the importance of integrating energy consumption, thermal comfort, and indoor air quality into the design process for naturally ventilated homes. This case study showcases the potential of advanced simulation and optimisation tools to create healthier, more sustainable residential environments. It demonstrates how cutting-edge methodologies can be applied to address the complexities of building performance while meeting the needs of both energy efficiency and occupant well-being.

More details can be found in the following publications related to this work:
Publication 1: https://doi.org/10.1016/j.buildenv.2024.112255
Publication 2: https://doi.org/10.1088/1742-6596/2600/3/032002
Conclusion
By leveraging machine learning for building energy optimisation, this study provides a powerful tool to balance energy efficiency, thermal comfort, and indoor air quality in naturally ventilated residential buildings. The integration of occupancy-related factors makes the framework both practical and scalable, paving the way for smarter, more sustainable building management solutions.