Safe Reinforcement Learning under Uncertainty for Hybrid Separation Processes with Recycles in Chemical Engineering
| Subject area | Chemical and thermal process engineering, Reinforcement learning, Automation, Control systems, Intelligent technical systems, Nonlinear model predictive control, Surrogate modeling, Uncertainty quantification, Hybrid separation processes, Batch distillation and pervaporation |
| Term | since 2025 |
| Funding | Deutsche Forschungsgemeinschaft (DFG) |
Project description
Online methods for the optimal operation of chemical processes, such as nonlinear model predictive control (NMPC), are highly desirable to save resources and reduce costs. They typically rely on rigorous, first-principles dynamic models, but solving the resulting optimization problems in real time is often infeasible when the models capture complex behaviour such as plant start-up, which has so far limited the use of NMPC for many real chemical processes. The growing availability of large amounts of process data opens the door to data-driven operating strategies such as reinforcement learning (RL); however, the rigorous handling of process constraints and the large amount of data required have prevented RL from being widely adopted in chemical engineering.
In the first phase of this project, these challenges were addressed by replacing the rigorous model with a surrogate used inside an NMPC controller, while an RL algorithm interacting with the rigorous model adapted the NMPC problem to achieve optimal performance. The approach was demonstrated on the batch distillation of an ethanol–water mixture, with the column built and modelled at TU Berlin and the NMPC-based RL framework developed at TU Dortmund. The first phase, however, assumed that a perfect rigorous model was available. In practice, models of real chemical processes are uncertain because their parameters are fitted to experimental data; when surrogates are used, the mismatch between the original model and the surrogate must also be accounted for. Moreover, units in chemical plants are rarely operated in isolation: they are typically embedded in flowsheets that include recycles for mass or energy integration.
The second phase of the project tackles these challenges systematically. To improve parameter estimation, the objective of the NMPC-based RL framework is reformulated as an information metric inspired by optimal experimental design. Parametric uncertainty is quantified via Bayesian inversion or, alternatively, bootstrapping, and the mismatch between rigorous and surrogate models is addressed by training a Bayesian last-layer network. The quantified uncertainty is then used to formulate a robust NMPC, which is in turn adapted via RL using the rigorous model. The project also investigates how surrogate models of individual units can be combined by accounting for the resulting larger uncertainties. The proposed framework is applied to the hybrid separation of ethanol and water by batch distillation coupled with pervaporation: the column from the first phase is extended with the membrane process, and a dynamic model of pervaporation is built. The goal is the real-time optimal control of this hybrid process with recycles using the NMPC-based RL framework.
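As an illustration of the bootstrapping option mentioned above, the following minimal sketch estimates parametric uncertainty by resampling data pairs and refitting a model. It is not the project's implementation: the linear model, the synthetic data, and the names `bootstrap_parameters`, `true_theta`, and `n_boot` are assumptions chosen only to keep the example short and self-contained.

```python
import numpy as np


def bootstrap_parameters(x, y, n_boot=500, seed=0):
    """Case-resampling bootstrap for a toy model y = a + b * x.

    Returns an (n_boot, 2) array of refitted parameter estimates.
    The linear model is a stand-in assumption, not the process model.
    """
    rng = np.random.default_rng(seed)
    design = np.column_stack([np.ones_like(x), x])  # regressors [1, x]
    n = len(x)
    estimates = np.empty((n_boot, 2))
    for k in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample data pairs with replacement
        theta, *_ = np.linalg.lstsq(design[idx], y[idx], rcond=None)
        estimates[k] = theta
    return estimates


# Synthetic "experiment" (hypothetical numbers, for illustration only)
rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 40)
true_theta = np.array([1.0, 2.0])
y = true_theta[0] + true_theta[1] * x + rng.normal(0.0, 0.1, size=x.size)

samples = bootstrap_parameters(x, y)
theta_mean = samples.mean(axis=0)            # bootstrap mean of the parameters
theta_cov = np.cov(samples, rowvar=False)    # empirical parameter covariance
lower, upper = np.percentile(samples, [2.5, 97.5], axis=0)  # 95% intervals
```

Parameter samples of this kind could, for example, serve as scenarios or define an uncertainty set in a robust NMPC formulation; how the quantified uncertainty enters the controller in this project is not specified here.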
This project continues the earlier "Safe Reinforcement Learning for Chemical Processes" project as its second phase and is funded by the DFG within Priority Programme SPP 2331 "Machine Learning in Chemical Engineering" (project number 466380688). It is carried out jointly with Prof. Dr.-Ing. Jens-Uwe Repke (TU Berlin).
