Tuning Legged Locomotion Controllers via Safe Bayesian Optimization
NEORL
Sim-FSVGD
MaxInfoRL - Boosting exploration in reinforcement learning through information gain maximization
Globally safe model-free exploration
Optimistic Active Exploration of Dynamical Systems
Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
Gradient-based trajectory optimization with learned dynamics