MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

First published at 13:28 UTC on November 21st, 2019.

Yannic Kilcher

MuZero harnesses the power of AlphaZero, but without relying on an accurate environment model. This opens up planning-based reinforcement learning to entirely new domains where such environment models aren't available. The difference from previous work is that, instead of learning a model that predicts future observations, MuZero predicts latent representations of the future, and thus learns to represent only the things that matter to the task!

Abstract:
Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the MuZero algorithm which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. MuZero learns a model that, when applied iteratively, predicts the quantities most directly relevant to planning: the reward, the action-selection policy, and the value function. When evaluated on 57 different Atari games - the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled - our new algorithm achieved a new state of the art. When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.
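
To make the "applied iteratively" idea concrete, here is a minimal Python sketch of a latent model being unrolled along a sequence of actions. It is an illustration under loud assumptions, not the authors' implementation: the class `ToyModel`, the helper `unroll`, the toy sizes, and the random, untrained linear layers are all hypothetical, and the tree search and training loop are left out entirely. It only shows the shape of the computation: encode an observation once, then step in latent space and read off reward, policy, and value at each step, never reconstructing an observation.

```python
import math
import random

# Hypothetical toy sizes, chosen only for illustration.
LATENT, ACTIONS, OBS = 4, 3, 6


def make_matrix(rows, cols, rng):
    """Random weights standing in for a trained network layer."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class ToyModel:
    """Untrained stand-in for a learned latent model (assumption, not MuZero itself)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_repr = make_matrix(LATENT, OBS, rng)
        self.w_dyn = make_matrix(LATENT, LATENT + ACTIONS, rng)
        self.w_reward = make_matrix(1, LATENT + ACTIONS, rng)
        self.w_policy = make_matrix(ACTIONS, LATENT, rng)
        self.w_value = make_matrix(1, LATENT, rng)

    def represent(self, observation):
        # Observation -> latent state; this is the only place observations are used.
        return [math.tanh(x) for x in matvec(self.w_repr, observation)]

    def dynamics(self, state, action):
        # (latent state, action) -> (next latent state, predicted reward).
        one_hot = [1.0 if i == action else 0.0 for i in range(ACTIONS)]
        inputs = state + one_hot
        next_state = [math.tanh(x) for x in matvec(self.w_dyn, inputs)]
        reward = matvec(self.w_reward, inputs)[0]
        return next_state, reward

    def predict(self, state):
        # Latent state -> (policy over actions, scalar value).
        policy = softmax(matvec(self.w_policy, state))
        value = matvec(self.w_value, state)[0]
        return policy, value


def unroll(model, observation, actions):
    """Apply the model iteratively, collecting reward, policy and value per step."""
    state = model.represent(observation)
    steps = []
    for action in actions:
        state, reward = model.dynamics(state, action)
        policy, value = model.predict(state)
        steps.append((reward, policy, value))
    return steps


if __name__ == "__main__":
    for reward, policy, value in unroll(ToyModel(), [0.1] * OBS, [0, 2, 1]):
        print(round(reward, 3), [round(p, 3) for p in policy], round(value, 3))
```

In a real system the three functions would be trained neural networks and the unrolled predictions would feed a tree search; here the weights are random, so the printed numbers carry no meaning beyond showing the data flow.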

Authors: Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https:..