Property Value
nif:beginIndex
  • 0 (xsd:integer)
nif:broaderContext
nif:endIndex
  • 128 (xsd:integer)
nif:isString
  • The st is the state of the environment, at the action taken by the agent and rt the reward received by the agent at time-step t.
rdf:type