sentence10 | SoMeSci

Property	Value
nif:beginIndex	0 (xsd:integer)
nif:broaderContext	sms:PMC5381785
nif:endIndex	180 (xsd:integer)
nif:isString	Hence an optimal policy is easily derived from the optimal values by selecting the highest valued action in each state, and the problem only amounts to obtaining accurate Q-values.
rdf:type	nif:Context nif:OffsetBasedString nif:Sentence
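
Illustration (not part of the annotated record): the sentence in nif:isString describes deriving a policy by picking, in each state, the action with the highest Q-value, i.e. pi(s) = argmax_a Q(s, a). A minimal sketch of that step follows, assuming the Q-values are already available as a nested mapping from states to actions. The function name greedy_policy, the state and action labels, and the numbers are hypothetical and chosen only for the example.

```python
from typing import Dict, Hashable, Mapping


def greedy_policy(
    q_values: Mapping[Hashable, Mapping[Hashable, float]]
) -> Dict[Hashable, Hashable]:
    """Map each state to the action with the highest Q-value.

    Ties are resolved in favour of the action listed first for that state.
    """
    return {
        state: max(action_values, key=action_values.get)
        for state, action_values in q_values.items()
    }


# Hypothetical toy Q-table: two states, two actions each.
q_table = {
    "s0": {"left": 0.1, "right": 0.7},
    "s1": {"left": 0.4, "right": 0.2},
}

policy = greedy_policy(q_table)
print(policy)
```

The sketch covers only the policy-extraction step. Obtaining accurate Q-values, which the sentence identifies as the remaining problem, is not addressed here.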