Reinforcement Learning Reward Hacking

OpenAI Sounds the Alarm: The Hidden Dangers of Controlling AI Thought Processes

OpenAI has issued a critical warning to AI research labs, emphasizing the dangers of directly manipulating the internal reasoning processes of advanced AI systems. The organization cautions against ...

China's AI Hacking Skills Reportedly on Par with Claude

According to security experts, Zhipu AI's open model GLM-5.2 matches Anthropic's Mythos in bug detection capabilities.

Tech Times

Open-Source Coding Model Ornith-1.0 Writes Its Own Training Scaffold in Reinforcement Learning

Open-source agentic coding model Ornith-1.0, released today under the MIT license, uses a self-improving reinforcement ...

New OpenAI research explores how reinforcement learning can make AI systems more aligned and resilient

OpenAI researchers have published a new study examining whether reinforcement learning (RL) can be used not only to improve model capabilities but also to strengthen alignment and beneficial behavior ...

Forbes

The Rise And Rise Of Reinforcement Learning: AI’s Quiet Revolution

Forbes contributors publish independent expert analyses and insights. Author, Researcher and Speaker on Technology and Business Innovation. Apr 19, 2025, 03:24am EDT Apr 21, 2025, 10:40am EDT ...

Nature

How dopamine neurons devalue delayed rewards

The attractiveness of a reward decreases with delay — a phenomenon known as temporal discounting. Humans and other animals typically devalue short-term rewards more steeply than those further in the ...

The Conversation

What is reinforcement learning? An AI researcher explains a key method of teaching machines – and how it relates to training your dog

Understanding intelligence and creating intelligent machines are grand scientific challenges of our times. The ability to learn from experience is a cornerstone of intelligence for machines and living ...

Medical Xpress

An AI-informed model of human reward-based learning: Hybrid approach could aid studies of mood disorders

Eventually, such hybrid modeling approaches could help to shed new light on the underpinnings of human decision-making, as well as on disorders characterized by disruptions in reward-based learning ...
