Intelligent Systems


2022


The Wheelbot: A Jumping Reaction Wheel Unicycle

Geist, A. R., Fiene, J., Tashiro, N., Jia, Z., Trimpe, S.

IEEE Robotics and Automation Letters, 7(4):9683-9690, IEEE, 2022 (article)

Abstract
Combining off-the-shelf components with 3D printing, the Wheelbot is a symmetric reaction wheel unicycle that can jump onto its wheels from any initial position. With non-holonomic and under-actuated dynamics, as well as two coupled unstable degrees of freedom, the Wheelbot provides a challenging platform for nonlinear and data-driven control research. This letter presents the Wheelbot's mechanical and electrical design, its estimation and control algorithms, as well as experiments demonstrating both self-erection and disturbance rejection while balancing.

link (url) DOI [BibTex]


2021


Learning-enhanced robust controller synthesis with rigorous statistical and control-theoretic guarantees

Fiedler, C., Scherer, C. W., Trimpe, S.

In 60th IEEE Conference on Decision and Control (CDC), IEEE, December 2021 (inproceedings) Accepted

Abstract
The combination of machine learning with control offers many opportunities, in particular for robust control. However, due to strong safety and reliability requirements in many real-world applications, providing rigorous statistical and control-theoretic guarantees is of utmost importance, yet difficult to achieve for learning-based control schemes. We present a general framework for learning-enhanced robust control that allows for systematic integration of prior engineering knowledge, is fully compatible with modern robust control and still comes with rigorous and practically meaningful guarantees. Building on the established Linear Fractional Representation and Integral Quadratic Constraints framework, we integrate Gaussian Process Regression as a learning component and state-of-the-art robust controller synthesis. In a concrete robust control example, our approach is demonstrated to yield improved performance with more data, while guarantees are maintained throughout.

link (url) [BibTex]



Local policy search with Bayesian optimization

Müller, S., von Rohr, A., Trimpe, S.

In Advances in Neural Information Processing Systems 34, 25, pages: 20708-20720, (Editors: Ranzato, M. and Beygelzimer, A. and Dauphin, Y. and Liang, P. S. and Wortman Vaughan, J.), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (inproceedings)

Abstract
Reinforcement learning (RL) aims to find an optimal policy by interaction with an environment. Consequently, learning complex behavior requires a vast number of samples, which can be prohibitive in practice. Nevertheless, instead of systematically reasoning and actively choosing informative samples, policy gradients for local search are often obtained from random perturbations. These random samples yield high variance estimates and hence are sub-optimal in terms of sample complexity. Actively selecting informative samples is at the core of Bayesian optimization, which constructs a probabilistic surrogate of the objective from past samples to reason about informative subsequent ones. In this paper, we propose to join both worlds. We develop an algorithm utilizing a probabilistic model of the objective function and its gradient. Based on the model, the algorithm decides where to query a noisy zeroth-order oracle to improve the gradient estimates. The resulting algorithm is a novel type of policy search method, which we compare to existing black-box algorithms. The comparison reveals improved sample complexity and reduced variance in extensive empirical evaluations on synthetic objectives. Further, we highlight the benefits of active sampling on popular RL benchmarks.
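As a rough illustration of the idea, the sketch below fits a GP surrogate to noisy return evaluations, queries new policy parameters where the surrogate is most uncertain, and steps along the gradient of the GP posterior mean. The kernel, the uncertainty-sampling acquisition, and all names are simplified stand-ins, not the authors' implementation.

```python
import numpy as np

def rbf(A, B, ell=0.5):
    # Squared-exponential kernel between two sets of points.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_alpha(X, y, ell=0.5, noise=1e-2):
    # Solve (K + noise*I) alpha = y; used for the mean and its gradient.
    K = rbf(X, X, ell) + noise * np.eye(len(X))
    return np.linalg.solve(K, y)

def gp_var(Q, X, ell=0.5, noise=1e-2):
    # Posterior variance at query points Q (unit prior variance).
    K = rbf(X, X, ell) + noise * np.eye(len(X))
    kq = rbf(Q, X, ell)
    return 1.0 - np.einsum('ij,ij->i', kq, np.linalg.solve(K, kq.T).T)

def grad_mean(theta, X, alpha, ell=0.5):
    # Analytic gradient of the GP posterior mean at theta.
    k = rbf(theta[None], X, ell)[0]
    return ((-(theta[None] - X) / ell**2) * k[:, None]).T @ alpha

def gp_policy_search(J, theta0, iters=30, lr=0.1, seed=0):
    # J: noisy zeroth-order oracle of the return; theta0: initial policy.
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    X = [theta + 0.1 * rng.standard_normal(theta.shape) for _ in range(3)]
    y = [J(x) for x in X]
    for _ in range(iters):
        # Active query: evaluate the candidate the GP is most uncertain about.
        cands = theta + 0.2 * rng.standard_normal((16, theta.size))
        X.append(cands[np.argmax(gp_var(cands, np.asarray(X)))])
        y.append(J(X[-1]))
        alpha = gp_alpha(np.asarray(X), np.asarray(y))
        theta = theta + lr * grad_mean(theta, np.asarray(X), alpha)  # ascent
    return theta
```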

arXiv GitHub link (url) [BibTex]



Using Physics Knowledge for Learning Rigid-Body Forward Dynamics with Gaussian Process Force Priors

Rath, L., Geist, A. R., Trimpe, S.

In Proceedings of the 5th Conference on Robot Learning, 164, pages: 101-111, Proceedings of Machine Learning Research, (Editors: Faust, Aleksandra and Hsu, David and Neumann, Gerhard), PMLR, 5th Conference on Robot Learning (CoRL 2021), November 2021 (inproceedings)

link (url) [BibTex]



Models for Data-Efficient Reinforcement Learning on Real-World Applications

Doerr, A.

University of Stuttgart, Stuttgart, October 2021 (phdthesis)

DOI [BibTex]



GoSafe: Globally Optimal Safe Robot Learning

Baumann, D., Marco, A., Turchetta, M., Trimpe, S.

In 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), pages: 4452-4458, IEEE, Piscataway, NJ, IEEE International Conference on Robotics and Automation (ICRA 2021), October 2021 (inproceedings)

DOI [BibTex]



Probabilistic robust linear quadratic regulators with Gaussian processes

von Rohr, A., Neumann-Brosig, M., Trimpe, S.

Proceedings of the 3rd Conference on Learning for Dynamics and Control, pages: 324-335, Proceedings of Machine Learning Research (PMLR), Vol. 144, (Editors: Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.), PMLR, Brookline, MA 02446, 3rd Annual Conference on Learning for Dynamics and Control (L4DC), June 2021 (conference)

Abstract
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design. While learning-based control has the potential to yield superior performance in demanding applications, robustness to uncertainty remains an important challenge. Since Bayesian methods quantify uncertainty of the learning results, it is natural to incorporate these uncertainties in a robust design. In contrast to most state-of-the-art approaches that consider worst-case estimates, we leverage the learning methods’ posterior distribution in the controller synthesis. The result is a more informed and thus efficient trade-off between performance and robustness. We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin. The formulation is based on a recently proposed algorithm for linear quadratic control synthesis, which we extend by giving probabilistic robustness guarantees in the form of credibility bounds for the system’s stability. Comparisons to existing methods based on worst-case and certainty-equivalence designs reveal superior performance and robustness properties of the proposed method.
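A minimal sketch of the idea under strong simplifying assumptions (a diagonal Gaussian posterior over the entries of A only, and nominal LQR synthesis in place of the paper's robust synthesis): sample models from the posterior and report a Monte Carlo credibility level for closed-loop stability.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_gain(A, B, Q, R):
    # Discrete-time LQR: K = (R + B'PB)^{-1} B'PA with P from the DARE.
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def stability_credibility(A_mean, A_std, B, K, n_samples=2000, seed=0):
    # Posterior probability (Monte Carlo) that the closed loop is stable.
    rng = np.random.default_rng(seed)
    stable = 0
    for _ in range(n_samples):
        A = A_mean + A_std * rng.standard_normal(A_mean.shape)  # posterior draw
        stable += np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0
    return stable / n_samples

A_mean = np.array([[1.0, 0.1], [0.0, 1.0]])   # hypothetical learned mean
B = np.array([[0.0], [0.1]])
K = lqr_gain(A_mean, B, np.eye(2), np.eye(1))
print(stability_credibility(A_mean, 0.02, B, K))
```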

link (url) DOI [BibTex]



On exploration requirements for learning safety constraints

Massiani, P., Heim, S., Trimpe, S.

In Proceedings of the 3rd Conference on Learning for Dynamics and Control, pages: 905-916, Proceedings of Machine Learning Research (PMLR), Vol. 144, (Editors: Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie), PMLR, 3rd Annual Conference on Learning for Dynamics and Control (L4DC), June 2021 (inproceedings)

Abstract
Enforcing safety for dynamical systems is challenging, since it requires constraint satisfaction along trajectory predictions. Equivalent control constraints can be computed in the form of sets that enforce positive invariance, and can thus guarantee safety in feedback controllers without predictions. However, these constraints are cumbersome to compute from models, and it is not yet well established how to infer constraints from data. In this paper, we shed light on the key objects involved in learning control constraints from data in a model-free setting. In particular, we discuss the family of constraints that enforce safety in the context of a nominal control policy, and expose that these constraints do not need to be accurate everywhere. They only need to correctly exclude a subset of the state-actions that would cause failure, which we call the critical set.

link (url) [BibTex]



Structured learning of rigid-body dynamics: A survey and unified view from a robotics perspective

Geist, A. R., Trimpe, S.

GAMM-Mitteilungen, 44(2):e202100009, Special Issue: Scientific Machine Learning, 2021 (article)

Abstract
Accurate models of mechanical system dynamics are often critical for model-based control and reinforcement learning. Fully data-driven dynamics models promise to ease the process of modeling and analysis, but require considerable amounts of data for training and often do not generalize well to unseen parts of the state space. Combining data-driven modeling with prior analytical knowledge is an attractive alternative as the inclusion of structural knowledge into a regression model improves the model's data efficiency and physical integrity. In this article, we survey supervised regression models that combine rigid-body mechanics with data-driven modeling techniques. We analyze the different latent functions (such as kinetic energy or dissipative forces) and operators (such as differential operators and projection matrices) underlying common descriptions of rigid-body mechanics. Based on this analysis, we provide a unified view on the combination of data-driven regression models, such as neural networks and Gaussian processes, with analytical model priors. Furthermore, we review and discuss key techniques for designing structured models such as automatic differentiation.
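One pattern the survey covers, an analytical prior combined with a data-driven residual, can be sketched as below. The pendulum prior and the ridge-regression residual (a tiny stand-in for a GP or NN) are illustrative assumptions, not a specific model from the article.

```python
import numpy as np

def prior_inverse_dynamics(q, dq, ddq, m=1.0, l=1.0, g=9.81):
    # Ideal pendulum torque: tau = m l^2 ddq + m g l sin(q).
    return m * l**2 * ddq + m * g * l * np.sin(q)

class ResidualModel:
    # Stand-in for a GP/NN trained on tau_measured - tau_prior.
    def fit(self, F, r, lam=1e-3):
        self.w = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ r)
        return self

    def predict(self, F):
        return F @ self.w

def structured_tau(q, dq, ddq, residual):
    # The physics prior captures the bulk; the residual absorbs, e.g., friction.
    F = np.column_stack([q, dq, ddq])
    return prior_inverse_dynamics(q, dq, ddq) + residual.predict(F)
```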

DOI [BibTex]



Practical and Rigorous Uncertainty Bounds for Gaussian Process Regression

Fiedler, C., Scherer, C. W., Trimpe, S.

In The Thirty-Fifth AAAI Conference on Artificial Intelligence, the Thirty-Third Conference on Innovative Applications of Artificial Intelligence, the Eleventh Symposium on Educational Advances in Artificial Intelligence, 8, pages: 7439-7447, AAAI Press, Palo Alto, CA, Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), Thirty-Third Conference on Innovative Applications of Artificial Intelligence (IAAI 2021), Eleventh Symposium on Educational Advances in Artificial Intelligence (EAAI 2021), May 2021 (inproceedings)

Abstract
Gaussian Process regression is a popular nonparametric regression method based on Bayesian principles that provides uncertainty estimates for its predictions. However, these estimates are of a Bayesian nature, whereas for some important applications, like learning-based control with safety guarantees, frequentist uncertainty bounds are required. Although such rigorous bounds are available for Gaussian Processes, they are too conservative to be useful in applications. This often leads practitioners to replace these bounds with heuristics, thus breaking all theoretical guarantees. To address this problem, we introduce new uncertainty bounds that are rigorous, yet practically useful at the same time. In particular, the bounds can be explicitly evaluated and are much less conservative than state-of-the-art results. Furthermore, we show that certain model misspecifications lead to only graceful degradation. We demonstrate these advantages and the usefulness of our results for learning-based control with numerical examples.
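Such bounds are used in the familiar form |f(x) - mu(x)| <= beta * sigma(x), so applying them amounts to evaluating the GP posterior plus a scaling constant. In the sketch below, beta is a placeholder; deriving a rigorous yet practical value for it is precisely the paper's contribution.

```python
import numpy as np

def gp_posterior(Xq, X, y, ell=0.3, noise=1e-2):
    # 1-D GP regression with a squared-exponential kernel.
    k = lambda A, B: np.exp(-0.5 * ((A[:, None] - B[None, :]) / ell) ** 2)
    K = k(X, X) + noise * np.eye(len(X))
    kq = k(Xq, X)
    mu = kq @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ij->i', kq, np.linalg.solve(K, kq.T).T)
    return mu, np.sqrt(np.maximum(var, 0.0))

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, 20)
y = np.sin(3 * X) + 0.1 * rng.standard_normal(20)
mu, sigma = gp_posterior(np.linspace(-1, 1, 5), X, y)
beta = 2.0                      # placeholder, NOT the paper's rigorous constant
lower, upper = mu - beta * sigma, mu + beta * sigma
```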

link (url) [BibTex]



A little damping goes a long way

Heim, S., Millard, M., Le Mouel, C., Badri-Spröwitz, A.

In Integrative and Comparative Biology, 61(Supplement 1):E367-E367, Oxford University Press, Society for Integrative and Comparative Biology Annual Meeting (SICB Annual Meeting 2021), March 2021 (inproceedings)

link (url) DOI [BibTex]



Robot Learning with Crash Constraints

Marco, A., Baumann, D., Khadiv, M., Hennig, P., Righetti, L., Trimpe, S.

IEEE Robotics and Automation Letters, 6(2):1439-1446, IEEE, February 2021 (article)

Abstract
In the past decade, numerous machine learning algorithms have been shown to successfully learn optimal policies to control real robotic systems. However, it is common to encounter failing behaviors as the learning loop progresses. Specifically, in robot applications where failing is undesired but not catastrophic, many algorithms struggle with leveraging data obtained from failures. This is usually caused by (i) the failed experiment ending prematurely, or (ii) the acquired data being scarce or corrupted. Both complicate the design of proper reward functions to penalize failures. In this paper, we propose a framework that addresses those issues. We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints, where no data is obtained upon constraint violation. The no-data case is addressed by a novel GP model (GPCR) for the constraint that combines discrete events (failure/success) with continuous observations (only obtained upon success). We demonstrate the effectiveness of our framework on simulated benchmarks and on a real jumping quadruped, where the constraint threshold is unknown a priori. Experimental data is collected, by means of constrained Bayesian optimization, directly on the real robot. Our results outperform manual tuning, and GPCR proves useful for estimating the constraint threshold.
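The data situation in this setting can be sketched as follows: successful experiments return a continuous cost, while crashes return only the binary event. Random search stands in here for the paper's constrained Bayesian optimization with the GPCR model, and the crashing "system" is entirely hypothetical.

```python
import numpy as np

def run_experiment(theta):
    # Hypothetical plant: crashes when the parameter is too aggressive.
    if theta > 0.8:
        return None               # crash: only the failure event is observed
    return (theta - 0.6) ** 2     # success: continuous cost observation

rng = np.random.default_rng(0)
successes, failures = [], []
for _ in range(30):
    theta = rng.uniform(0, 1)     # stand-in for a BO acquisition step
    cost = run_experiment(theta)
    if cost is None:
        failures.append(theta)    # binary data: feeds the constraint model
    else:
        successes.append((cost, theta))
print("best safe parameter:", min(successes)[1], "crashes:", len(failures))
```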

link (url) DOI [BibTex]



Joint State and Dynamics Estimation With High-Gain Observers and Gaussian Process Models

Buisson-Fenet, M., Morgenthaler, V., Trimpe, S., Di Meglio, F.

IEEE Control Systems Letters, 5(5):1627-1632, 2021 (article)

Abstract
With the rising complexity of dynamical systems generating ever more data, learning dynamics models appears as a promising alternative to physics-based modeling. However, the data available from physical platforms may be noisy and not cover all state variables. Hence, it is necessary to jointly perform state and dynamics estimation. In this letter, we propose interconnecting a high-gain observer and a dynamics learning framework, specifically a Gaussian process state-space model. The observer provides state estimates, which serve as the data for training the dynamics model. The updated model, in turn, is used to improve the observer. Joint convergence of the observer and the dynamics model is proved for high enough gain, up to the measurement and process perturbations. Simultaneous dynamics learning and state estimation are demonstrated on simulations of a mass-spring-mass system.
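A minimal sketch of the interconnection (system, gains, and the learning step are illustrative assumptions; the paper uses a Gaussian process state-space model rather than the least-squares fit below):

```python
import numpy as np

def observer_step(xhat, y_meas, f_model, C, L_gain, dt):
    # High-gain observer: model copy plus output-injection correction.
    return xhat + dt * (f_model(xhat) + L_gain @ (y_meas - C @ xhat))

def fit_dynamics(states, dstates):
    # Stand-in for GP training on (x, xdot) pairs from the observer.
    A, *_ = np.linalg.lstsq(states, dstates, rcond=None)
    return lambda x: A.T @ x      # refined model, fed back to the observer
```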

DOI [BibTex]



Wireless Control for Smart Manufacturing: Recent Approaches and Open Challenges

Baumann, D., Mager, F., Wetzker, U., Thiele, L., Zimmerling, M., Trimpe, S.

Proceedings of the IEEE, 109(4):441-467, 2021 (article)

arXiv DOI [BibTex]



Learning Event-triggered Control from Data through Joint Optimization

Funk, N., Baumann, D., Berenz, V., Trimpe, S.

IFAC Journal of Systems and Control, 16, pages: 100144, 2021 (article)

Abstract
We present a framework for model-free learning of event-triggered control strategies. Event-triggered methods aim to achieve high control performance while only closing the feedback loop when needed. This enables resource savings, e.g., network bandwidth if control commands are sent via communication networks, as in networked control systems. Event-triggered controllers consist of a communication policy, determining when to communicate, and a control policy, deciding what to communicate. It is essential to jointly optimize the two policies since individual optimization does not necessarily yield the overall optimal solution. To address this need for joint optimization, we propose a novel algorithm based on hierarchical reinforcement learning. The resulting algorithm is shown to accomplish high-performance control in line with resource savings and scales seamlessly to nonlinear and high-dimensional systems. The method’s applicability to real-world scenarios is demonstrated through experiments on a six degrees of freedom real-time controlled manipulator. Further, we propose an approach towards evaluating the stability of the learned neural network policies.
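The split into the two policies has a simple interface, sketched below with a hand-designed threshold trigger in place of the learned (hierarchical RL) policies from the paper; all names are illustrative.

```python
import numpy as np

def communication_policy(x, x_last_sent, threshold=0.1):
    # When to communicate: only if the state drifted enough since last send.
    return np.linalg.norm(x - x_last_sent) > threshold

def control_policy(x, K):
    # What to communicate: here a plain linear state-feedback command.
    return -K @ x

def event_triggered_step(x, x_last_sent, u_last, K):
    if communication_policy(x, x_last_sent):
        return control_policy(x, K), x      # close the loop, update memory
    return u_last, x_last_sent              # save bandwidth, hold last input
```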

arXiv link (url) DOI [BibTex]


Event-triggered Learning for Linear Quadratic Control

Schlüter, H., Solowjow, F., Trimpe, S.

IEEE Transactions on Automatic Control, 66(10):4485-4498, 2021 (article)

arXiv DOI [BibTex]



Controller Design via Experimental Exploration With Robustness Guarantees

Holicki, T., Scherer, C. W., Trimpe, S.

IEEE Control Systems Letters, 5(2):641-646, 2021 (article)

DOI [BibTex]



A Learnable Safety Measure

Heim, S., von Rohr, A., Trimpe, S., Badri-Spröwitz, A.

Proceedings of the Conference on Robot Learning, 100, pages: 627-639, Proceedings of Machine Learning Research, (Editors: Kaelbling, Leslie Pack and Kragic, Danica and Sugiura, Komei), PMLR, Conference on Robot Learning, October 2020 (article)

arXiv [BibTex]



A little damping goes a long way: a simulation study of how damping influences task-level stability in running

Heim, S., Millard, M., Le Mouel, C., Badri-Spröwitz, A.

Biology Letters, 16(9):20200467, September 2020 (article)

Abstract
It is currently unclear if damping plays a functional role in legged locomotion, and simple models often do not include damping terms. We present a new model with a damping term that is isolated from other parameters: that is, the damping term can be adjusted without retuning other model parameters for nominal motion. We systematically compare how increased damping affects stability in the face of unexpected ground-height perturbations. Unlike most studies, we focus on task-level stability: instead of observing whether trajectories converge towards a nominal limit-cycle, we quantify the ability to avoid falls using a recently developed mathematical measure. This measure allows trajectories to be compared quantitatively instead of only being separated into a binary classification of 'stable' or 'unstable'. Our simulation study shows that increased damping contributes significantly to task-level stability; however, this benefit quickly plateaus after only a small amount of damping. These results suggest that the low intrinsic damping values observed experimentally may have stability benefits and are not simply minimized for energetic reasons. All Python code and data needed to generate our results are available open source.

link (url) DOI Project Page [BibTex]


Bayesian Optimization in Robot Learning - Automatic Controller Tuning and Sample-Efficient Methods

Marco-Valle, A.

Eberhard Karls Universität Tübingen, Tübingen, July 2020 (phdthesis)

Abstract
The problem of designing controllers to regulate dynamical systems has been studied by engineers for millennia. Ever since, suboptimal performance lingers in many closed loops as an unavoidable side effect of manually tuning the parameters of the controllers. Nowadays, industrial settings remain skeptical about data-driven methods that allow one to automatically learn controller parameters. In the context of robotics, machine learning (ML) keeps growing its influence on increasing autonomy and adaptability, for example to aid automating controller tuning. However, data-hungry ML methods, such as standard reinforcement learning, require a large number of experimental samples, which is prohibitive in robotics, as hardware can deteriorate and break. This brings about the following question: Can manual controller tuning, in robotics, be automated by using data-efficient machine learning techniques? In this thesis, we tackle the question above by exploring Bayesian optimization (BO), a data-efficient ML framework, to buffer the human effort and side effects of manual controller tuning, while retaining a low number of experimental samples. We focus this work on robotic systems, providing thorough theoretical results that aim to increase data-efficiency, as well as demonstrations on real robots. Specifically, we present four main contributions. We first consider using BO to replace manual tuning in robotic platforms. To this end, we parametrize the design weights of a linear quadratic regulator (LQR) and learn its parameters using an information-efficient BO algorithm. Such an algorithm uses Gaussian processes (GPs) to model the unknown performance objective. The GP model is used by BO to suggest controller parameters that are expected to increase the information about the optimal parameters, measured as a gain in entropy. The resulting “automatic LQR tuning” framework is demonstrated on two robotic platforms: a robot arm balancing an inverted pole and a humanoid robot performing a squatting task. In both cases, an existing controller is automatically improved in a handful of experiments without human intervention. BO compensates for data scarcity by means of the GP, which is a probabilistic model that encodes prior assumptions about the unknown performance objective. Usually, incorrect or non-informed assumptions have negative consequences, such as a higher number of robot experiments, poor tuning performance, or reduced sample-efficiency. The second to fourth contributions presented herein attempt to alleviate this issue. The second contribution proposes to include the robot simulator in the learning loop as an additional information source for automatic controller tuning. While doing a real robot experiment generally entails high associated costs (e.g., it requires preparation and takes time), simulations are cheaper to obtain (e.g., they can be computed faster). However, because the simulator is an imperfect model of the robot, its information is biased and could have negative repercussions on the learning performance. To address this problem, we propose “simu-vs-real”, a principled multi-fidelity BO algorithm that trades off cheap but inaccurate information from simulations against expensive and accurate physical experiments in a cost-effective manner. The resulting algorithm is demonstrated on a cart-pole system, where simulations and real experiments are alternated, thus sparing many real evaluations.
The third contribution explores how to adapt the expressiveness of the probabilistic prior to the control problem at hand. To this end, the mathematical structure of LQR controllers is leveraged and embedded into the GP by means of the kernel function. Specifically, we propose two different “LQR kernel” designs that retain the flexibility of Bayesian nonparametric learning. Simulated results indicate that the LQR kernel yields superior performance over non-informed kernel choices when used for controller learning with BO. Finally, the fourth contribution specifically addresses the problem of handling controller failures, which are typically unavoidable in practice while learning from data, especially if non-conservative solutions are expected. Although controller failures are generally problematic (e.g., the robot has to be emergency-stopped), they are also a rich information source about what should be avoided. We propose “failures-aware excursion search”, a novel algorithm for Bayesian optimization under black-box constraints, where failures are limited in number. Our results in numerical benchmarks indicate that by allowing a confined number of failures, better optima are revealed as compared with state-of-the-art methods. The first contribution of this thesis, “automatic LQR tuning”, is among the first to apply BO to real robots. While it demonstrated automatic controller learning from few experimental samples, it also revealed several important challenges, such as the need for higher sample-efficiency, which opened relevant research directions that we addressed through several methodological contributions. Summarizing, we proposed “simu-vs-real”, a novel BO algorithm that includes the simulator as an additional information source, an “LQR kernel” design that learns faster than standard choices, and “failures-aware excursion search”, a new BO algorithm for constrained black-box optimization problems where the number of failures is limited.

Repository (Universitätsbibliothek) - University of Tübingen PDF DOI [BibTex]


Event-triggered Learning

Solowjow, F., Trimpe, S.

Automatica, 117, pages: 109009, Elsevier, July 2020 (article)

arXiv PDF DOI [BibTex]



Learning of sub-optimal gait controllers for magnetic walking soft millirobots

Culha, U., Demir, S. O., Trimpe, S., Sitti, M.

In Robotics: Science and Systems XVI, pages: P070, (Editors: Toussaint, Marc and Bicchi, Antonio and Hermans, Tucker), RSS Foundation, Robotics: Science and Systems 2020 (RSS 2020), 2020 (inproceedings)

Abstract
Untethered small-scale soft robots have promising applications in minimally invasive surgery, targeted drug delivery, and bioengineering applications as they can access confined spaces in the human body. However, due to highly nonlinear soft continuum deformation kinematics, inherent stochastic variability during fabrication at the small scale, and lack of accurate models, the conventional control methods cannot be easily applied. Adaptivity of robot control is additionally crucial for medical operations, as operation environments show large variability, and robot materials may degrade or change over time, which would have deteriorating effects on the robot motion and task performance. Therefore, we propose using a probabilistic learning approach for millimeter-scale magnetic walking soft robots using Bayesian optimization (BO) and Gaussian processes (GPs). Our approach provides a data-efficient learning scheme to find controller parameters while optimizing the stride length performance of the walking soft millirobot within a small number of physical experiments. We demonstrate adaptation to fabrication variabilities in three different robots and to walking surfaces with different roughness. We also show an improvement in the learning performance by transferring the learning results of one robot to the others as prior information.

link (url) DOI Project Page [BibTex]



Actively Learning Gaussian Process Dynamics

Buisson-Fenet, M., Solowjow, F., Trimpe, S.

Proceedings of the 2nd Conference on Learning for Dynamics and Control, 120, pages: 5-15, Proceedings of Machine Learning Research (PMLR), (Editors: Bayen, Alexandre M. and Jadbabaie, Ali and Pappas, George and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire and Zeilinger, Melanie), PMLR, 2nd Annual Conference on Learning for Dynamics and Control (L4DC), June 2020 (conference)

Abstract
Despite the availability of ever more data enabled through modern sensor and computer technology, it still remains an open problem to learn dynamical systems in a sample-efficient way. We propose active learning strategies that leverage information-theoretical properties arising naturally during Gaussian process regression, while respecting constraints on the sampling process imposed by the system dynamics. Sample points are selected in regions with high uncertainty, leading to exploratory behavior and data-efficient training of the model. All results are verified in an extensive numerical benchmark.
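The selection rule can be sketched in a few lines: among admissible next inputs, apply the one whose predicted successor state the GP is most uncertain about. The variance oracle, nominal predictor, and candidate set are illustrative assumptions; the paper additionally respects the constraints the system dynamics impose on sampling.

```python
import numpy as np

def most_informative_input(x, candidates, gp_predict_var, f_nominal):
    # Uncertainty sampling: maximize predictive variance at the next state.
    scores = [gp_predict_var(f_nominal(x, u)) for u in candidates]
    return candidates[int(np.argmax(scores))]
```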

ArXiv link (url) [BibTex]



Learning Constrained Dynamics with Gauss Principle adhering Gaussian Processes

Geist, A. R., Trimpe, S.

In Proceedings of the 2nd Conference on Learning for Dynamics and Control, 120, pages: 225-234, Proceedings of Machine Learning Research (PMLR), (Editors: Bayen, Alexandre M. and Jadbabaie, Ali and Pappas, George and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire and Zeilinger, Melanie), PMLR, 2nd Annual Conference on Learning for Dynamics and Control (L4DC), June 2020 (inproceedings)

Abstract
The identification of the constrained dynamics of mechanical systems is often challenging. Learning methods promise to ease an analytical analysis, but require considerable amounts of data for training. We propose to combine insights from analytical mechanics with Gaussian process regression to improve the model's data efficiency and constraint integrity. The result is a Gaussian process model that incorporates a priori constraint knowledge such that its predictions adhere to Gauss' principle of least constraint. In return, predictions of the system's acceleration naturally respect potentially non-ideal (non-)holonomic equality constraints. As corollary results, our model enables inferring the acceleration of the unconstrained system from data of the constrained system, and it enables knowledge transfer between differing constraint configurations.
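For reference, Gauss' principle selects, among all accelerations consistent with the constraints, the one closest to the unconstrained acceleration in the mass-weighted metric; in the ideal-constraint case this gives the well-known Udwadia-Kalaba form below (notation is ours, stated as background rather than taken from the paper):

```latex
% Dynamics M(q)\ddot{q} = F(q,\dot{q}) with constraints A(q,\dot{q})\,\ddot{q} = b(q,\dot{q}):
\ddot{q}_u = M^{-1} F, \qquad
\ddot{q}_c = \ddot{q}_u + M^{-1/2}\bigl(A M^{-1/2}\bigr)^{+}\bigl(b - A\,\ddot{q}_u\bigr),
% where (\cdot)^{+} denotes the Moore-Penrose pseudoinverse.
```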

Proceedings of Machine Learning Research link (url) [BibTex]



Data-efficient Autotuning with Bayesian Optimization: An Industrial Control Study

Neumann-Brosig, M., Marco, A., Schwarzmann, D., Trimpe, S.

IEEE Transactions on Control Systems Technology, 28(3):730-740, May 2020 (article)

Abstract
Bayesian optimization is proposed for automatic learning of optimal controller parameters from experimental data. A probabilistic description (a Gaussian process) is used to model the unknown function from controller parameters to a user-defined cost. The probabilistic model is updated with data, which is obtained by testing a set of parameters on the physical system and evaluating the cost. In order to learn fast, the Bayesian optimization algorithm selects the next parameters to evaluate in a systematic way, for example, by maximizing information gain about the optimum. The algorithm thus iteratively finds the globally optimal parameters with only a few experiments. Taking throttle valve control as a representative industrial control example, the proposed auto-tuning method is shown to outperform manual calibration: it consistently achieves better performance with a low number of experiments. The proposed auto-tuning framework is flexible and can handle different control structures and objectives.
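The loop itself is compact; the sketch below uses expected improvement as the acquisition in place of the information-gain criterion from the paper, and gp_fit_predict is an assumed oracle returning the GP posterior over a parameter grid.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    # EI for minimization: how much better than `best` do we expect to do?
    z = (best - mu) / np.maximum(sigma, 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def autotune(evaluate_cost, gp_fit_predict, param_grid, n_trials=20, n_init=3):
    rng = np.random.default_rng(0)
    X = list(param_grid[rng.choice(len(param_grid), n_init, replace=False)])
    y = [evaluate_cost(x) for x in X]       # experiments on the physical system
    for _ in range(n_trials - n_init):
        mu, sigma = gp_fit_predict(np.array(X), np.array(y), param_grid)
        x_next = param_grid[np.argmax(expected_improvement(mu, sigma, min(y)))]
        X.append(x_next)
        y.append(evaluate_cost(x_next))
    return X[int(np.argmin(y))]             # best tested controller parameters
```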

arXiv (PDF) DOI Project Page [BibTex]



Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization

Turchetta, M., Krause, A., Trimpe, S.

In 2020 IEEE International Conference on Robotics and Automation (ICRA 2020), pages: 10702-10708, IEEE, Piscataway, NJ, IEEE International Conference on Robotics and Automation (ICRA 2020), May 2020 (inproceedings)

DOI [BibTex]



Sliding Mode Control with Gaussian Process Regression for Underwater Robots

Lima, G. S., Trimpe, S., Bessa, W. M.

Journal of Intelligent & Robotic Systems, 99(3-4):487-498, January 2020 (article)

DOI [BibTex]



Hierarchical Event-triggered Learning for Cyclically Excited Systems with Application to Wireless Sensor Networks

Beuchert, J., Solowjow, F., Raisch, J., Trimpe, S., Seel, T.

IEEE Control Systems Letters, 4(1):103-108, January 2020 (article)

arXiv PDF DOI [BibTex]



Control-guided Communication: Efficient Resource Arbitration and Allocation in Multi-hop Wireless Control Systems

Baumann, D., Mager, F., Zimmerling, M., Trimpe, S.

IEEE Control Systems Letters, 4(1):127-132, January 2020 (article)

arXiv PDF DOI [BibTex]


Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures

Marco, A., von Rohr, A., Baumann, D., Hernández-Lobato, J. M., Trimpe, S.

2020 (proceedings) In revision

Abstract
When learning to ride a bike, a child falls down a number of times before achieving the first success. As falling down usually has only mild consequences, it can be seen as a tolerable failure in exchange for a faster learning process, as it provides rich information about an undesired behavior. In the context of Bayesian optimization under unknown constraints (BOC), typical strategies for safe learning explore conservatively and avoid failures by all means. On the other side of the spectrum, non-conservative BOC algorithms that allow failing may fail an unbounded number of times before reaching the optimum. In this work, we propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures. Empirical validation shows that our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods. In addition, we propose an original algorithm for unconstrained Bayesian optimization inspired by the notion of excursion sets in stochastic processes, upon which the failures-aware algorithm is built.
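The interface of such a decision maker can be sketched with a simple proportional rule: the risk allowed at each iteration shrinks with the remaining failure budget. This is only an illustration of the mechanism; the paper's decision maker is derived from control theory, not this heuristic.

```python
def allowed_risk(failures_so_far, budget, iters_left, base_risk=0.5):
    # More remaining budget (per remaining iteration) permits riskier queries.
    remaining = budget - failures_so_far
    if remaining <= 0:
        return 0.0                # budget exhausted: act fully conservatively
    return min(1.0, base_risk * remaining / max(iters_left, 1))
```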

arXiv code (python) PDF [BibTex]


Online learning with stability guarantees: A memory-based warm starting for real-time MPC

Schwenkel, L., Gharbi, M., Trimpe, S., Ebenbauer, C.

Automatica, 122, pages: 109247, 2020 (article)

DOI [BibTex]


Spatial Scheduling of Informative Meetings for Multi-Agent Persistent Coverage

Haksar, R. N., Trimpe, S., Schwager, M.

IEEE Robotics and Automation Letters, 5(2):3027-3034, 2020 (article)

DOI [BibTex]



Safe and Fast Tracking on a Robot Manipulator: Robust MPC and Neural Network Control

Nubert, J., Koehler, J., Berenz, V., Allgöwer, F., Trimpe, S.

IEEE Robotics and Automation Letters, 5(2):3050-3057, 2020 (article)

Abstract
Fast feedback control and safety guarantees are essential in modern robotics. We present an approach that achieves both by combining novel robust model predictive control (MPC) with function approximation via (deep) neural networks (NNs). The result is a new approach for complex tasks with nonlinear, uncertain, and constrained dynamics as are common in robotics. Specifically, we leverage recent results in MPC research to propose a new robust setpoint tracking MPC algorithm, which achieves reliable and safe tracking of a dynamic setpoint while guaranteeing stability and constraint satisfaction. The presented robust MPC scheme constitutes a one-layer approach that unifies the often separated planning and control layers, by directly computing the control command based on a reference and possibly obstacle positions. As a separate contribution, we show how the computation time of the MPC can be drastically reduced by approximating the MPC law with a NN controller. The NN is trained and validated from offline samples of the MPC, yielding statistical guarantees, and used in lieu thereof at run time. Our experiments on a state-of-the-art robot manipulator are the first to show that both the proposed robust and approximate MPC schemes scale to real-world robotic systems.
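The approximation step follows a standard imitation recipe, sketched below under simplifying assumptions: sample the MPC law offline, fit a network to the samples, and check accuracy on held-out data. The paper turns the last step into proper statistical guarantees; the empirical fraction below is only a stand-in.

```python
import numpy as np

def collect_mpc_dataset(mpc_solve, sample_state, n=10000):
    # Expensive MPC solves, but run offline rather than at control rate.
    X = np.array([sample_state() for _ in range(n)])
    U = np.array([mpc_solve(x) for x in X])
    return X, U

def validate(nn_predict, X_val, U_val, tol):
    # Empirical share of held-out states where the NN matches the MPC law.
    err = np.linalg.norm(nn_predict(X_val) - U_val, axis=1)
    return float(np.mean(err <= tol))
```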

arXiv PDF DOI [BibTex]


2019


Controlling Heterogeneous Stochastic Growth Processes on Lattices with Limited Resources

Haksar, R., Solowjow, F., Trimpe, S., Schwager, M.

In Proceedings of the 58th IEEE International Conference on Decision and Control (CDC) , pages: 1315-1322, 58th IEEE International Conference on Decision and Control (CDC), December 2019 (conference)

PDF [BibTex]



Fast Feedback Control over Multi-hop Wireless Networks with Mode Changes and Stability Guarantees

Baumann, D., Mager, F., Jacob, R., Thiele, L., Zimmerling, M., Trimpe, S.

ACM Transactions on Cyber-Physical Systems, 4(2):18, November 2019 (article)

arXiv PDF DOI [BibTex]



Predictive Triggering for Distributed Control of Resource Constrained Multi-agent Systems

Mastrangelo, J. M., Baumann, D., Trimpe, S.

In Proceedings of the 8th IFAC Workshop on Distributed Estimation and Control in Networked Systems, pages: 79-84, 8th IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), September 2019 (inproceedings)

arXiv PDF DOI [BibTex]



Event-triggered Pulse Control with Model Learning (if Necessary)

Baumann, D., Solowjow, F., Johansson, K. H., Trimpe, S.

In Proceedings of the American Control Conference, pages: 792-797, American Control Conference (ACC), July 2019 (inproceedings)

arXiv PDF [BibTex]



Data-driven inference of passivity properties via Gaussian process optimization

Romer, A., Trimpe, S., Allgöwer, F.

In Proceedings of the European Control Conference, European Control Conference (ECC), June 2019 (inproceedings)

PDF [BibTex]



Trajectory-Based Off-Policy Deep Reinforcement Learning

Doerr, A., Volpp, M., Toussaint, M., Trimpe, S., Daniel, C.

In Proceedings of the International Conference on Machine Learning (ICML), International Conference on Machine Learning (ICML), June 2019 (inproceedings)

Abstract
Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently get stuck in local optima. This work addresses these weaknesses by combining recent improvements in the reuse of off-policy data and exploration in parameter space with deterministic behavioral policies. The resulting objective is amenable to standard neural network optimization strategies like stochastic gradient descent or stochastic gradient Hamiltonian Monte Carlo. Incorporation of previous rollouts via importance sampling greatly improves data-efficiency, whilst stochastic optimization schemes facilitate the escape from local optima. We evaluate the proposed approach on a series of continuous control benchmark tasks. The results show that the proposed algorithm is able to successfully and reliably learn solutions using fewer system interactions than standard policy gradient methods.
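The reuse mechanism can be sketched for the parameter-space-exploration case: rollouts generated under old policy parameters are reweighted by the likelihood ratio of the new versus old search distributions. The Gaussian search distribution and all names are illustrative assumptions, not the paper's full algorithm.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def importance_weights(thetas, mean_new, mean_old, cov):
    # w_i proportional to p_new(theta_i) / p_old(theta_i), self-normalized.
    w = mvn.pdf(thetas, mean_new, cov) / mvn.pdf(thetas, mean_old, cov)
    return w / np.sum(w)

def off_policy_estimate(returns, thetas, mean_new, mean_old, cov):
    # Reuse old rollouts to estimate the objective at the new parameters.
    return np.sum(importance_weights(thetas, mean_new, mean_old, cov) * returns)
```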

arXiv PDF [BibTex]



Resource-aware IoT Control: Saving Communication through Predictive Triggering

Trimpe, S., Baumann, D.

IEEE Internet of Things Journal, 6(3):5013-5028, June 2019 (article)

Abstract
The Internet of Things (IoT) interconnects multiple physical devices in large-scale networks. When the 'things' coordinate decisions and act collectively on shared information, feedback is introduced between them. Multiple feedback loops are thus closed over a shared, general-purpose network. Traditional feedback control is unsuitable for the design of IoT control because it relies on high-rate periodic communication and is ignorant of the shared network resource. Therefore, recent event-based estimation methods are applied herein for resource-aware IoT control, allowing agents to decide online whether communication with other agents is needed or not. While this can reduce network traffic significantly, a severe limitation of typical event-based approaches is the need for instantaneous triggering decisions that leave no time to reallocate freed resources (e.g., communication slots), which hence remain unused. To address this problem, novel predictive and self-triggering protocols are proposed herein. From a unified Bayesian decision framework, two schemes are developed: self-triggers that predict, at the current triggering instant, the next one; and predictive triggers that check at every time step whether communication will be needed at a given prediction horizon. The suitability of these triggers for feedback control is demonstrated in hardware experiments on a cart-pole, and scalability is discussed with a multi-vehicle simulation.
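A predictive trigger in the spirit described above can be sketched for the linear-Gaussian case: propagate the estimation-error covariance over the prediction horizon without measurement updates and announce a communication need once the predicted uncertainty exceeds a bound. The model and threshold rule are illustrative simplifications of the paper's Bayesian decision framework.

```python
import numpy as np

def predictive_trigger(P, A, Q, horizon, threshold):
    # Will communication be needed `horizon` steps from now?
    for _ in range(horizon):
        P = A @ P @ A.T + Q       # error covariance grows without updates
    return np.trace(P) > threshold
```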

PDF arXiv DOI [BibTex]


Feedback Control Goes Wireless: Guaranteed Stability over Low-power Multi-hop Networks

(Best Paper Award)

Mager, F., Baumann, D., Jacob, R., Thiele, L., Trimpe, S., Zimmerling, M.

In Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems, pages: 97-108, 10th ACM/IEEE International Conference on Cyber-Physical Systems, April 2019 (inproceedings)

Abstract
Closing feedback loops fast and over long distances is key to emerging applications; for example, robot motion control and swarm coordination require update intervals below 100 ms. Low-power wireless is preferred for its flexibility, low cost, and small form factor, especially if the devices support multi-hop communication. Thus far, however, closed-loop control over multi-hop low-power wireless has only been demonstrated for update intervals on the order of multiple seconds. This paper presents a wireless embedded system that tames imperfections impairing control performance such as jitter or packet loss, and a control design that exploits the essential properties of this system to provably guarantee closed-loop stability for linear dynamic systems. Using experiments on a testbed with multiple cart-pole systems, we are the first to demonstrate the feasibility and to assess the performance of closed-loop control and coordination over multi-hop low-power wireless for update intervals from 20 ms to 50 ms.

arXiv PDF DOI Project Page [BibTex]



Demo Abstract: Fast Feedback Control and Coordination with Mode Changes for Wireless Cyber-Physical Systems

(Best Demo Award)

Mager, F., Baumann, D., Jacob, R., Thiele, L., Trimpe, S., Zimmerling, M.

Proceedings of the 18th ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN), pages: 340-341, 18th ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN), April 2019 (poster)

arXiv PDF DOI [BibTex]



Fast and Resource-Efficient Control of Wireless Cyber-Physical Systems

Baumann, D.

KTH Royal Institute of Technology, Stockholm, February 2019 (phdthesis)

PDF [BibTex]


2018


Deep Reinforcement Learning for Event-Triggered Control

Baumann, D., Zhu, J., Martius, G., Trimpe, S.

In Proceedings of the 57th IEEE International Conference on Decision and Control (CDC), pages: 943-950, 57th IEEE International Conference on Decision and Control (CDC), December 2018 (inproceedings)

arXiv PDF DOI Project Page [BibTex]
