sentence0 at SoMeSci
http://data.gesis.org/somesci/PMC5381785/sentence0
Property | Value
nif:beginIndex | 0 (xsd:integer)
nif:broaderContext | sms:PMC5381785
nif:endIndex | 205 (xsd:integer)
nif:isString | The goal of reinforcement learning is to find the policy π—a set of rules to select an action in each possible state—that would maximize the agent’s accumulated long term reward in a dynamical environment.
rdf:type | nif:Context, nif:OffsetBasedString, nif:Sentence
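
The offset properties listed above can be checked against the string value. The following is a minimal sketch, not part of the SoMeSci tooling: the `record` dict, its key layout, and the helper names `offsets_match` and `substring` are hypothetical, and it assumes offsets count characters the same way Python's `len` does (true here, since every character is in the Basic Multilingual Plane).

```python
# Hypothetical in-memory copy of the record; the keys mirror the NIF
# property names shown above, but the dict layout itself is an assumption.
record = {
    "beginIndex": 0,
    "endIndex": 205,
    # Escapes keep the literal encoding-safe: \u03c0 is the Greek letter pi,
    # \u2014 is the long dash and \u2019 the apostrophe of the original sentence.
    "isString": (
        "The goal of reinforcement learning is to find the policy \u03c0\u2014a set "
        "of rules to select an action in each possible state\u2014that would "
        "maximize the agent\u2019s accumulated long term reward in a dynamical "
        "environment."
    ),
}


def offsets_match(rec: dict) -> bool:
    """Return True when endIndex - beginIndex equals the string length."""
    return rec["endIndex"] - rec["beginIndex"] == len(rec["isString"])


def substring(context: str, begin: int, end: int) -> str:
    """Slice a string with begin-inclusive, end-exclusive offsets."""
    return context[begin:end]


if __name__ == "__main__":
    print(offsets_match(record))
    print(substring(record["isString"], 0, 8))
```

A check of this kind is useful after any re-encoding or whitespace normalization of the text, since a changed character count silently invalidates the stored offsets.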