Bhavya Sukhija

OAT Y19, ETH Zürich

Hi!

I am pursuing a PhD in computer science at ETH Zürich, co-supervised by Prof. Andreas Krause and Prof. Stelian Coros. My research focuses on reinforcement learning (RL), including model-based RL, nonepisodic RL, safe RL, meta-RL, continuous-time RL, and (multimodal) exploration in RL, as well as active learning and robotics.

Since December 2024, I have been a research scientist intern at Amazon Web Services in Berlin, where I work on leveraging RL to fine-tune LLM agents to master real-world applications.

From July to December 2024, I was a research visitor at the University of California, Berkeley, in the Berkeley Robot Learning Lab, where I was supervised by Prof. Pieter Abbeel. During the visit, I worked on developing a class of simple, efficient, and scalable algorithms for exploration in RL; see MaxInfoRL.

Prior to my PhD, I completed a BSc in Mechanical Engineering and a master's in Robotics at ETH. I completed my master's thesis at RWTH Aachen University under the supervision of Prof. Dominik Baumann, Prof. Sebastian Trimpe, and Prof. Andreas Krause. I received the ETH medal for my thesis.

Besides research, I enjoy playing football and support Liverpool FC.

News

Dec 1, 2024	Started as a research scientist intern at Amazon Web Services, working on leveraging RL for fine-tuning LLM agents to master real-world applications.
Jul 1, 2024	Started as a visiting researcher at the University of California, Berkeley, in the Berkeley Robot Learning Lab, under the supervision of Dr. Carlo Sferrazza and Prof. Pieter Abbeel.
Jun 30, 2024	Sim-FSVGD was accepted as an oral presentation at IROS 2024.
Feb 29, 2024	My great master's student (now a PhD student at LAS), Jonas Hübotter, was awarded the ETH medal for his master's thesis.

Selected publications

SOMBRL: Scalable and Optimistic Model-Based RL

Bhavya Sukhija, Lenart Treven, Carmelo Sferrazza, and 3 more authors

NeurIPS, 2025

@inproceedings{sukhija2025sombrl,
  title = {SOMBRL: Scalable and Optimistic Model-Based RL},
  author = {Sukhija, Bhavya and Treven, Lenart and Sferrazza, Carmelo and Dörfler, Florian and Abbeel, Pieter and Krause, Andreas},
  booktitle = {NeurIPS},
  year = {2025},
}

MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Bhavya Sukhija, Stelian Coros, Andreas Krause, and 2 more authors

ICLR, 2025

@inproceedings{sukhija2024maxinforl,
  title = {MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization},
  author = {Sukhija, Bhavya and Coros, Stelian and Krause, Andreas and Abbeel, Pieter and Sferrazza, Carmelo},
  booktitle = {ICLR},
  year = {2025}
}

ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

Yarden As*, Bhavya Sukhija*, Lenart Treven, and 3 more authors

arXiv preprint arXiv:2410.09486, 2024

@article{as2024actsafe,
  title = {ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning},
  author = {As*, Yarden and Sukhija*, Bhavya and Treven, Lenart and Sferrazza, Carmelo and Coros, Stelian and Krause, Andreas},
  journal = {arXiv preprint arXiv:2410.09486},
  year = {2024}
}

NeoRL: Efficient Exploration for Nonepisodic RL

Bhavya Sukhija, Lenart Treven, Florian Dörfler, and 2 more authors

Proc. Neural Information Processing Systems (NeurIPS), 2024

Spotlight

@inproceedings{sukhija2024neorl,
  title = {NeoRL: Efficient Exploration for Nonepisodic RL},
  author = {Sukhija, Bhavya and Treven, Lenart and Dörfler, Florian and Coros, Stelian and Krause, Andreas},
  booktitle = {Proc. Neural Information Processing Systems (NeurIPS)},
  year = {2024},
  note = {Spotlight},
}

Bridging the Sim-to-Real Gap with Bayesian Inference

Jonas Rothfuss*, Bhavya Sukhija*, Lenart Treven*, and 3 more authors

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024

Oral presentation

@inproceedings{rothfuss2024bridging,
  title = {Bridging the Sim-to-Real Gap with Bayesian Inference},
  author = {Rothfuss*, Jonas and Sukhija*, Bhavya and Treven*, Lenart and Dörfler, Florian and Coros, Stelian and Krause, Andreas},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2024},
  note = {Oral presentation},
}

Optimistic Active Exploration of Dynamical Systems

Bhavya Sukhija, Lenart Treven, Cansu Sancaktar, and 3 more authors

In Proc. Neural Information Processing Systems (NeurIPS), 2023

@inproceedings{opax,
  title = {Optimistic Active Exploration of Dynamical Systems},
  author = {Sukhija, Bhavya and Treven, Lenart and Sancaktar, Cansu and Blaes, Sebastian and Coros, Stelian and Krause, Andreas},
  booktitle = {Proc. Neural Information Processing Systems (NeurIPS)},
  year = {2023},
  eprint = {2306.12371},
}

GoSafeOpt: Scalable safe exploration for global optimization of dynamical systems

Bhavya Sukhija, Matteo Turchetta, David Lindner, and 3 more authors

Artificial Intelligence Journal (AIJ), 2023

@article{SUKHIJA2023103922,
  title = {GoSafeOpt: Scalable safe exploration for global optimization of dynamical systems},
  journal = {Artificial Intelligence Journal (AIJ)},
  volume = {320},
  pages = {103922},
  year = {2023},
  author = {Sukhija, Bhavya and Turchetta, Matteo and Lindner, David and Krause, Andreas and Trimpe, Sebastian and Baumann, Dominik},
  url = {https://www.sciencedirect.com/science/article/pii/S0004370223000681},
}

Hallucinated Adversarial Control for Conservative Offline Policy Evaluation

Jonas Rothfuss*, Bhavya Sukhija*, Tobias Birchler*, and 2 more authors

In Conference on Uncertainty in Artificial Intelligence (UAI), 2023

@inproceedings{rothfuss2023hallucinated,
  title = {Hallucinated Adversarial Control for Conservative Offline Policy Evaluation},
  author = {Rothfuss*, Jonas and Sukhija*, Bhavya and Birchler*, Tobias and Kassraie, Parnian and Krause, Andreas},
  year = {2023},
  booktitle = {Conference on Uncertainty in Artificial Intelligence (UAI)},
}

Tuning Legged Locomotion Controllers via Safe Bayesian Optimization

Daniel Widmer*, Dongho Kang*, Bhavya Sukhija, and 3 more authors

In Conference on Robot Learning (CoRL), 2023

@inproceedings{widmer2023tuning,
  title = {Tuning Legged Locomotion Controllers via Safe Bayesian Optimization},
  author = {Widmer*, Daniel and Kang*, Dongho and Sukhija, Bhavya and H{\"u}botter, Jonas and Krause, Andreas and Coros, Stelian},
  booktitle = {Conference on Robot Learning (CoRL)},
  year = {2023},
}

Gradient-Based Trajectory Optimization With Learned Dynamics

Bhavya Sukhija, Nathanael Köhler, Miguel Zamora, and 4 more authors

IEEE International Conference on Robotics and Automation (ICRA), 2023

@inproceedings{sukhija2022gradientbased,
  title = {Gradient-Based Trajectory Optimization With Learned Dynamics},
  author = {Sukhija, Bhavya and K{\"o}hler, Nathanael and Zamora, Miguel and Zimmermann, Simon and Curi, Sebastian and Krause, Andreas and Coros, Stelian},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2023},
}