`scml.oneshot.rl.agent`

Module Contents

class scml.oneshot.rl.agent.OneShotRLAgent(*args, models: list[scml.oneshot.rl.common.RLModel] | tuple[scml.oneshot.rl.common.RLModel, Ellipsis] = tuple(), observation_managers: list[scml.oneshot.rl.observation.ObservationManager] | tuple[scml.oneshot.rl.observation.ObservationManager, Ellipsis] = tuple(), action_managers: list[scml.oneshot.rl.action.ActionManager] | tuple[scml.oneshot.rl.action.ActionManager, Ellipsis] | None = None, fallback_type: type[scml.oneshot.agent.OneShotAgent] | None = GreedyOneShotAgent, fallback_params: dict[str, Any] | None = None, dynamic_context_switching: bool = False, randomize_test_order: bool = False, **kwargs)

A oneshot agent that can execute trained RL models in worlds compatible with them. In any other world, it falls back to the given agent type.

Parameters:

models – List of models to choose from.
observation_managers – List of observation managers. Must be the same length as models.
action_managers – List of action managers, of the same length as models, or None to use the default action manager.
fallback_type – A OneShotAgent type to use as a fallback if the current world is not compatible with any of the observation/action managers.
fallback_params – Parameters passed to fallback_type.
dynamic_context_switching – If True, the world is tested at every step (instead of only at init) to find the appropriate model.
randomize_test_order – If True, the order in which the observation/action managers are checked for compatibility with the current world is randomized.
**kwargs – Any other OneShotPolicy parameters.
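
The selection rule described by these parameters can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: `ToyManager`, its `is_compatible` method, the dictionary used as a "world", and `select_context` are all hypothetical names introduced here. The sketch shows the length check between models and managers, the optional randomized test order, and the `None` result that would trigger the fallback agent.

```python
import random
from dataclasses import dataclass
from typing import Any, Dict, Optional, Sequence


@dataclass
class ToyManager:
    """Hypothetical stand-in for an observation or action manager."""

    n_partners: int

    def is_compatible(self, world: Dict[str, Any]) -> bool:
        # Toy rule: a manager fits a world with the partner count it expects.
        return world["n_partners"] == self.n_partners


def select_context(
    world: Dict[str, Any],
    models: Sequence[Any],
    observation_managers: Sequence[ToyManager],
    action_managers: Optional[Sequence[ToyManager]] = None,
    randomize_test_order: bool = False,
    rng: Optional[random.Random] = None,
) -> Optional[int]:
    """Return the index of the first compatible model, or None for fallback."""
    if len(observation_managers) != len(models):
        raise ValueError("observation_managers must be the same length as models")
    if action_managers is not None and len(action_managers) != len(models):
        raise ValueError("action_managers must be the same length as models")

    order = list(range(len(models)))
    if randomize_test_order:
        # Shuffle the order in which candidates are tested.
        (rng or random.Random()).shuffle(order)

    for i in order:
        obs_ok = observation_managers[i].is_compatible(world)
        # None stands for "use the default action manager" in this sketch.
        act_ok = action_managers is None or action_managers[i].is_compatible(world)
        if obs_ok and act_ok:
            return i
    return None  # no match: the caller would use the fallback agent


models = ["model_a", "model_b"]
managers = [ToyManager(n_partners=2), ToyManager(n_partners=4)]
chosen = select_context({"n_partners": 4}, models, managers)
```

With dynamic_context_switching enabled, a function like this would presumably be called again at every step rather than once at initialization.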

init(): Called once after the AWI is set.

encode_state(mechanism_states: dict[str, negmas.sao.common.SAOState]) → scml.oneshot.rl.common.RLState: Called to generate the state that is passed to the act() method. By default, the state is everything available from the AWI, as a OneShotState.

decode_action(action: scml.oneshot.rl.common.RLAction) → dict[str, negmas.sao.common.SAOResponse]: Generates offers to all partners from an encoded action. By default, the action is returned unchanged, on the assumption that it is already a dict[str, SAOResponse].

act(state: scml.oneshot.rl.common.RLState) → scml.oneshot.rl.common.RLAction: The main policy. Generates an action given a state.
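
The three methods above form a pipeline: encode_state() builds the state, act() maps it to an action, and decode_action() turns the action into per-partner responses. The sketch below mirrors that flow with toy types so it runs without the library. `ToyPolicy`, `toy_model`, the `responses` helper, the string verdicts, and the `(quantity, time, unit_price)` offer tuples are assumptions made for illustration; the real methods work with SAOState, RLState, RLAction and SAOResponse objects.

```python
from typing import Any, Callable, Dict, Tuple

Offer = Tuple[int, int, int]  # toy (quantity, time, unit_price)
Response = Tuple[str, Offer]  # toy ("accept" | "reject", offer)


def toy_model(state: Dict[str, Offer]) -> Dict[str, Response]:
    """Hypothetical model: accept unit prices of 10 or more, else counter at 10."""
    action = {}
    for partner, (quantity, time, price) in state.items():
        if price >= 10:
            action[partner] = ("accept", (quantity, time, price))
        else:
            action[partner] = ("reject", (quantity, time, 10))
    return action


class ToyPolicy:
    """Hypothetical stand-in that mirrors the three hooks documented above."""

    def __init__(self, model: Callable[[Dict[str, Offer]], Any]):
        self.model = model

    def encode_state(self, mechanism_states: Dict[str, Offer]) -> Dict[str, Offer]:
        # Default-like behaviour: pass everything through as the state.
        return dict(mechanism_states)

    def act(self, state: Dict[str, Offer]) -> Any:
        # The main policy: delegate to the wrapped model.
        return self.model(state)

    def decode_action(self, action: Any) -> Dict[str, Response]:
        # Default-like behaviour: the action is already a per-partner mapping.
        return action

    def responses(self, mechanism_states: Dict[str, Offer]) -> Dict[str, Response]:
        # encode_state -> act -> decode_action, in that order.
        return self.decode_action(self.act(self.encode_state(mechanism_states)))


policy = ToyPolicy(toy_model)
result = policy.responses({"seller_1": (5, 0, 12), "seller_2": (5, 0, 8)})
```

Overriding only encode_state() and decode_action() lets the same model be reused when the observation or action encoding changes.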

propose(*args, **kwargs) → negmas.outcomes.Outcome | None: Called when the agent is asked to propose in one negotiation.

respond(*args, **kwargs) → negmas.gb.common.ResponseType: Called when the agent is asked to respond to an offer.

before_step(): Called at the BEGINNING of every production step (day).

on_negotiation_failure(*args, **kwargs) → None: Called when a negotiation the agent is a party to ends without agreement.

on_negotiation_success(*args, **kwargs) → None: Called when a negotiation the agent is a party to ends with agreement.