Reinforcement Learning Progress

The post discusses an OpenAI reinforcement learning result from 2018: a team of five agents trained with Proximal Policy Optimization (PPO) played Dota and defeated semi-professional players. The post presents this as evidence that deep reinforcement learning, through self-training in simulated environments, can tackle complex real-world problems, with implications for machine learning and general intelligence.

Published 2018-06-25 · 1 min read · 212 words · 3 topics

Before you set out

Navigation Severity 3/5 · Cartographic Incident

The route is real, but the signage gets weird enough to deserve a field note.

Excerpt · opening lines pulled from source

Today, OpenAI released a new result. We used PPO (Proximal Policy Optimization), a general reinforcement learning algorithm invented by OpenAI, to train a team of 5 agents to play Dota and beat semi-pros. This is the gam

SamNav stores no post bodies. Only enough to orient you - the rest lives at the source.

Original source

Read the full essay on Sam Altman's blog

blog.samaltman.com/reinforcement-learning-progress

Opens the original in a new tab · last reachable 2026-06-23

Nearby Entries

Within range by shared topic and coordinate - not by an algorithm.

SA-2024-05 · Severity 2 · Poorly Marked

GPT-4o

The post highlights two key aspects of OpenAI's recent announcement: the commitment to providing advanced AI tools for free or at low cost, and the introduction of a new voice and video interface that enhances user interaction with AI. The author expresses excitement about the potential for these innovations to transform how people engage with technology.

2 MIN

↗ shares 3 topics

SA-2025-01 · Severity 2 · Poorly Marked

Reflections

The post reflects on the journey of OpenAI over the past nine years, particularly focusing on the transformative impact of ChatGPT's launch. It discusses the challenges and growth experienced within the company, the importance of effective governance, and the gratitude felt towards colleagues and supporters. The author emphasizes the potential of AI to benefit society and the commitment to advancing safety and alignment in future developments.

9 MIN

↗ shares 3 topics

SA-2026-04 · Severity 3 · Cartographic Incident

Untitled post

A personal post written after an attack at Altman's home, using the incident to reflect on family safety, public rhetoric, and the heightened anxiety around AI. It also lays out his current beliefs about broad access to AI, democratic control, safety, adaptability, and the need to argue about the future without intimidation or violence.

5 MIN

↗ shares 3 topics