Zhaohan Daniel Guo

Cited by

	All	Since 2019
Citations	9175	9090
h-index	18	18
i10-index	22	22

Citations per year
Year	Citations
2018	46
2019	73
2020	243
2021	1140
2022	2272
2023	2943
2024	2397

Public access (based on funding mandates)

3 articles available
0 articles not available

Co-authors

Emma Brunskill, Associate Professor of Computer Science, Stanford University (verified email at cs.stanford.edu)
Philip Thomas, University of Massachusetts Amherst (verified email at cs.umass.edu)
Shayan Doroudi, Assistant Professor at the University of California, Irvine (verified email at uci.edu)
Yao Liu, Amazon (verified email at stanford.edu)

Zhaohan Daniel Guo

DeepMind

Verified email at google.com

Reinforcement learning


Title	Authors	Venue	Cited by	Year
Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	6465	2020
Agent57: Outperforming the atari human benchmark	AP Badia, B Piot, S Kapturowski, P Sprechmann, A Vitvitskyi, ZD Guo, ...	International conference on machine learning, 507-517, 2020	671	2020
Koray Kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	474	2020
Never give up: Learning directed exploration strategies	AP Badia, P Sprechmann, A Vitvitskyi, D Guo, B Piot, S Kapturowski, ...	arXiv preprint arXiv:2002.06038, 2020	342	2020
Joint semantic utterance classification and slot filling with recursive neural networks	D Guo, G Tur, W Yih, G Zweig	2014 IEEE Spoken Language Technology Workshop (SLT), 554-559, 2014	249	2014
A general theoretical paradigm to understand learning from human preferences	MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello	International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024	187	2024
Bootstrap latent-predictive representations for multitask reinforcement learning	ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar	International Conference on Machine Learning, 3875-3886, 2020	152	2020
Neural predictive belief representations	ZD Guo, MG Azar, B Piot, BA Pires, R Munos	arXiv preprint arXiv:1811.06407, 2018	89	2018
A pac rl algorithm for episodic pomdps	ZD Guo, S Doroudi, E Brunskill	Artificial Intelligence and Statistics, 510-518, 2016	68	2016
Byol-explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in neural information processing systems 35, 31855-31870, 2022	60	2022
Nash learning from human feedback	R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...	arXiv preprint arXiv:2312.00886, 2023	54	2023
Using options and covariance testing for long horizon off-policy policy evaluation	Z Guo, PS Thomas, E Brunskill	Advances in Neural Information Processing Systems 30, 2017	49	2017
Bootstrap your own latent: A new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, PH Richemond, E Buchatskaya, ...	arXiv preprint arXiv:2006.07733, 2020	43	2020
Geometric entropic exploration	ZD Guo, MG Azar, A Saade, S Thakoor, B Piot, BA Pires, M Valko, ...	arXiv preprint arXiv:2101.02055, 2021	40	2021
Concurrent pac rl	Z Guo, E Brunskill	Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015	30	2015
Generalized preference optimization: A unified approach to offline alignment	Y Tang, ZD Guo, Z Zheng, D Calandriello, R Munos, M Rowland, ...	arXiv preprint arXiv:2402.05749, 2024	29	2024
Understanding self-predictive learning for reinforcement learning	Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...	International Conference on Machine Learning, 33632-33656, 2023	28	2023
Pac continuous state online multitask reinforcement learning with identification	Y Liu, Z Guo, E Brunskill	Proceedings of the 2016 International Conference on Autonomous Agents …, 2016	21	2016
Understanding the performance gap between online and offline alignment algorithms	Y Tang, DZ Guo, Z Zheng, D Calandriello, Y Cao, E Tarassov, R Munos, ...	arXiv preprint arXiv:2405.08448, 2024	15	2024
Charline Le Lan, Michal Valko, Tianqi Liu, et al. Human alignment of large language models through online preference optimisation	D Calandriello, D Guo, R Munos, M Rowland, Y Tang, BA Pires, ...	arXiv preprint arXiv:2403.08635, 2024	12	2024
