sentence0 | SoMeSci

Property	Value
nif:beginIndex	0 (xsd:integer)
nif:broaderContext	sms:PMC5381785
nif:endIndex	205 (xsd:integer)
nif:isString	The goal of reinforcement learning is to find the policy π—a set of rules to select an action in each possible state—that would maximize the agent’s accumulated long term reward in a dynamical environment.
rdf:type	nif:Context nif:OffsetBasedString nif:Sentence