Property Value
nif:beginIndex
  • 0 (xsd:integer)
nif:broaderContext
nif:endIndex
  • 189 (xsd:integer)
nif:isString
  • Given state s, action a, reward r and next state s′, it is possible to approximate Q*(s, a) by iteratively solving the Bellman recurrence equation [1]: Qi+1(s, a) = E[r + γmaxa′Qi(s′, a′)].
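The recurrence in the string above, read as Q_{i+1}(s, a) = E[r + γ max_{a′} Q_i(s′, a′)], can be sketched as a tabular iteration. This is a minimal illustration only: the two-state MDP, the names TRANSITIONS, GAMMA, bellman_update and q_iteration, and the assumption that transition probabilities are known are all hypothetical and do not come from the source.

```python
# Hypothetical toy MDP (an assumption for illustration, not from the source):
# (state, action) -> list of (probability, reward, next_state) outcomes.
TRANSITIONS = {
    (0, 0): [(1.0, 0.0, 0)],
    (0, 1): [(0.5, 0.0, 0), (0.5, 1.0, 1)],
    (1, 0): [(1.0, 1.0, 1)],
    (1, 1): [(1.0, 0.0, 0)],
}
STATES = (0, 1)
ACTIONS = (0, 1)
GAMMA = 0.5  # discount factor, chosen arbitrarily for this sketch


def bellman_update(q):
    """One step of Q_{i+1}(s, a) = E[r + gamma * max_a' Q_i(s', a')]."""
    new_q = {}
    for (s, a), outcomes in TRANSITIONS.items():
        # The expectation is taken over the listed (probability, reward, next_state) outcomes.
        new_q[(s, a)] = sum(
            p * (r + GAMMA * max(q[(s2, a2)] for a2 in ACTIONS))
            for p, r, s2 in outcomes
        )
    return new_q


def q_iteration(num_iters=100):
    """Start from Q_0 = 0 and apply the Bellman update num_iters times."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(num_iters):
        q = bellman_update(q)
    return q


q_star = q_iteration()
```

For this particular toy MDP the fixed point can be checked by hand: Q(1, 0) = 1 + 0.5 · 2 = 2 and Q(0, 1) = 4/3, and the iterates approach these values because each update shrinks the error by a factor of GAMMA.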
rdf:type