Reinforcement learning
The term “reinforcement learning” is often used in a way that excludes supervised learning and other learning problems that fit into a narrower framework. We definitely don’t mean to use it in this narrower sense. We use the term to refer to a very broad category that intentionally subsumes most of modern machine learning.
Reinforcement learning can be coupled with reward engineering: the rewards can be defined by whatever process we like. For example, supervised learning fits into this definition of reinforcement learning, since we can use the data distribution and the loss function to define rewards.
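To make the reduction concrete, here is a minimal sketch of supervised learning cast as one-step reinforcement learning: each episode samples an input from the data distribution, the agent’s “action” is its prediction, and the reward is the negated loss. All names here (`loss`, `reward`, `episode`) are illustrative, not from any library.

```python
# Sketch: supervised learning as one-step RL, assuming reward = -loss.

import random

# Toy regression dataset standing in for the "data distribution":
# inputs x with targets y = 2x.
dataset = [(x, 2.0 * x) for x in [0.0, 1.0, 2.0, 3.0]]

def loss(prediction, target):
    # Ordinary supervised squared-error loss.
    return (prediction - target) ** 2

def reward(prediction, target):
    # The RL reward is just the negated supervised loss.
    return -loss(prediction, target)

def episode(policy, data):
    # One "episode": the environment samples an input (the observation),
    # the agent acts by outputting a prediction, and the environment
    # scores that action using the loss function.
    x, y = random.choice(data)
    return reward(policy(x), y)

print(reward(1.0, 2.0))  # -1.0: one unit of squared error
```

A policy that matches the target function exactly earns reward 0 on every episode; any other policy earns less, so maximizing expected reward coincides with minimizing expected loss.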
Reinforcement learning often refers to sequential decision problems, but I don’t mean to make this restriction. So far, I think these are generalizations that researchers in RL would agree with (though they’d likely consider sequential problems most interesting).
Reinforcement learning implies an interaction between an agent and an environment, but I don’t mean to make any assumptions on the nature of the environment. The “environment” is just whatever process computes the rewards and observations. It could be anything from a SAT checker, to a human reviewer, to a board game, to a rich and realistic environment.
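The claim that the “environment” is just whatever process computes rewards and observations can be illustrated with the SAT-checker case: the observation is a formula, the agent’s action is a candidate assignment, and the reward is 1 if the assignment satisfies the formula. This is a hedged sketch; the class name and clause encoding are illustrative, not a real API.

```python
# Sketch: an "environment" that is nothing but a SAT checker.
# Clauses use the common integer-literal convention: literal k means
# variable |k| must be True if k > 0, False if k < 0.

class SatEnvironment:
    def __init__(self, clauses):
        self.clauses = clauses

    def observe(self):
        # The observation is just the formula itself.
        return self.clauses

    def reward(self, assignment):
        # assignment: dict mapping variable index -> bool.
        # Reward 1.0 if every clause contains a satisfied literal.
        satisfied = all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in self.clauses
        )
        return 1.0 if satisfied else 0.0

# (x1 OR NOT x2) AND (x2 OR x3)
env = SatEnvironment([[1, -2], [2, 3]])
print(env.reward({1: True, 2: False, 3: True}))  # 1.0: satisfying
print(env.reward({1: False, 2: True, 3: False}))  # 0.0: first clause fails
```

Nothing in the agent-environment framing changes if we swap this class for a human reviewer or a board game: the agent only ever sees observations and rewards.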
Reinforcement learning often focuses on choosing actions, but we want to explicitly include cognitive actions. Some of these are clear fits — e.g. allocating memory effectively. Others don’t feel at all like “reinforcement learning” — e.g. learning to form sparse representations. But from a formal perspective a representation is just another kind of output, and the reinforcement learning framework captures these cases as well.