Property	Value
nif:beginIndex
  • 0 (xsd:integer)
nif:broaderContext
nif:endIndex
  • 180 (xsd:integer)
nif:isString
  • Hence an optimal policy is easily derived from the optimal values by selecting the highest valued action in each state, and the problem only amounts to obtaining accurate Q-values.
rdf:type