scml.oneshot.rl.agent
=====================

.. py:module:: scml.oneshot.rl.agent


Classes
-------

.. autoapisummary::

   scml.oneshot.rl.agent.OneShotRLAgent


Module Contents
---------------

.. py:class:: OneShotRLAgent(*args, models: list[scml.oneshot.rl.common.RLModel] | tuple[scml.oneshot.rl.common.RLModel, Ellipsis] = tuple(), observation_managers: list[scml.oneshot.rl.observation.ObservationManager] | tuple[scml.oneshot.rl.observation.ObservationManager, Ellipsis] = tuple(), action_managers: list[scml.oneshot.rl.action.ActionManager] | tuple[scml.oneshot.rl.action.ActionManager, Ellipsis] | None = None, fallback_type: type[scml.oneshot.agent.OneShotAgent] | None = GreedyOneShotAgent, fallback_params: dict[str, Any] | None = None, dynamic_context_switching: bool = False, randomize_test_order: bool = False, **kwargs)

   Bases: :py:obj:`scml.oneshot.policy.OneShotPolicy`

   A oneshot agent that can execute trained RL models in appropriate worlds.
   It falls back to the given agent type otherwise.

   :param models: List of models to choose from.
   :param observation_managers: List of observation managers. Must be the same length as `models`.
   :param action_managers: List of action managers of the same length as `models`, or `None` to use the default action manager.
   :param fallback_type: A `OneShotAgent` type to use as a fallback if the current world is not compatible with any of the observation/action managers.
   :param fallback_params: Parameters of the `fallback_type`.
   :param dynamic_context_switching: If `True`, the world is tested at every step (instead of only at init) to find the appropriate model.
   :param randomize_test_order: If `True`, the order in which the observation/action managers are checked for compatibility with the current world is randomized.
   :param \*\*kwargs: Any other `OneShotPolicy` parameters.


   .. py:attribute:: _models
      :value: ()


   .. py:attribute:: _action_managers
      :value: None


   .. py:attribute:: _obs_managers
      :value: ()


   .. py:attribute:: _fallback_type


   .. py:attribute:: _dynamic_context_switching
      :value: False


   .. py:attribute:: _randomize_test_order
      :value: False


   .. py:attribute:: _fallback_params
      :value: None


   .. py:attribute:: _valid_context
      :type:  scml.oneshot.context.Context
      :value: None


   .. py:attribute:: _valid_action_manager
      :type:  scml.oneshot.rl.action.ActionManager
      :value: None


   .. py:attribute:: _valid_obs_manager
      :type:  scml.oneshot.rl.observation.ObservationManager
      :value: None


   .. py:attribute:: _valid_index
      :type:  int
      :value: -1


   .. py:attribute:: _fallback_agent
      :type:  scml.oneshot.agent.OneShotAgent
      :value: None


   .. py:method:: setup_fallback()


   .. py:method:: has_no_valid_model()


   .. py:method:: context_switch()


   .. py:method:: init()

      Called once after the AWI is set.

      Remarks:
          - Use this for any proactive initialization code.


   .. py:method:: encode_state(mechanism_states: dict[str, negmas.sao.common.SAOState]) -> scml.oneshot.rl.common.RLState

      Called to generate a state to be passed to the `act()` method.
      The default is the entire state of the `awi`, of type `OneShotState`.


   .. py:method:: decode_action(action: scml.oneshot.rl.common.RLAction) -> dict[str, negmas.sao.common.SAOResponse]

      Generates offers to all partners from an encoded action.
      The default returns the action unchanged, assuming it is a `dict[str, SAOResponse]`.


   .. py:method:: act(state: scml.oneshot.rl.common.RLState) -> scml.oneshot.rl.common.RLAction

      The main policy. Generates an action given a state.


   .. py:method:: propose(*args, **kwargs) -> negmas.outcomes.Outcome | None

      Called when the agent is asked to propose in one negotiation.


   .. py:method:: respond(*args, **kwargs) -> negmas.gb.common.ResponseType

      Called when the agent is asked to respond to an offer.


   .. py:method:: before_step()

      Called at the BEGINNING of every production step (day).


   .. py:method:: step()

      Called at the END of every production step (day).


   .. py:method:: on_negotiation_failure(*args, **kwargs) -> None

      Called when a negotiation the agent is a party to ends without agreement.


   .. py:method:: on_negotiation_success(*args, **kwargs) -> None

      Called when a negotiation the agent is a party to ends with agreement.
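
The interplay of `fallback_type`, `dynamic_context_switching`, and `randomize_test_order` described for `OneShotRLAgent` may be easier to follow with a small model. The sketch below is not the scml implementation and does not import scml. `StubManager`, `is_compatible`, `select_context`, and `SketchAgent` are hypothetical names introduced only for illustration, and worlds are represented as plain strings. It assumes that an index of `-1` (the documented default of `_valid_index`) means that no model matched the current world.

```python
import random
from dataclasses import dataclass


@dataclass
class StubManager:
    """Hypothetical stand-in for an observation or action manager (not an scml class)."""

    name: str
    supported_worlds: frozenset

    def is_compatible(self, world: str) -> bool:  # hypothetical method name
        return world in self.supported_worlds


def select_context(world, models, obs_managers, action_managers,
                   randomize=False, rng=None):
    """Return the index of the first compatible entry, or -1 if none fits."""
    if not (len(models) == len(obs_managers) == len(action_managers)):
        raise ValueError("models and managers must have the same length")
    order = list(range(len(models)))
    if randomize:
        # Assumed meaning of randomize_test_order: shuffle the test order.
        (rng or random).shuffle(order)
    for i in order:
        if obs_managers[i].is_compatible(world) and action_managers[i].is_compatible(world):
            return i
    return -1


class SketchAgent:
    """Toy agent: uses a matching model if one exists, else a fallback policy."""

    def __init__(self, models, obs_managers, action_managers, fallback,
                 dynamic_context_switching=False, randomize_test_order=False):
        self.models = models
        self.obs_managers = obs_managers
        self.action_managers = action_managers
        self.fallback = fallback
        self.dynamic_context_switching = dynamic_context_switching
        self.randomize_test_order = randomize_test_order
        self.valid_index = -1  # no valid model until the world is tested

    def _test_world(self, world):
        self.valid_index = select_context(
            world, self.models, self.obs_managers, self.action_managers,
            randomize=self.randomize_test_order,
        )

    def init(self, world):
        # The world is always tested once at initialization.
        self._test_world(world)

    def before_step(self, world):
        # With dynamic context switching, the world is re-tested every step.
        if self.dynamic_context_switching:
            self._test_world(world)

    def act(self, state):
        if self.valid_index < 0:
            return self.fallback(state)
        return self.models[self.valid_index](state)


obs = [StubManager("obs-a", frozenset({"small"})),
       StubManager("obs-b", frozenset({"large"}))]
acts = [StubManager("act-a", frozenset({"small"})),
        StubManager("act-b", frozenset({"large"}))]
models = [lambda s: f"model-a:{s}", lambda s: f"model-b:{s}"]

agent = SketchAgent(models, obs, acts, fallback=lambda s: f"fallback:{s}",
                    dynamic_context_switching=True)
agent.init("large")
result_known = agent.act("s0")      # handled by the second model
agent.before_step("unknown")        # no manager pair supports this world
result_unknown = agent.act("s1")    # handled by the fallback policy
```

In this toy version the compatibility check requires both managers of a pair to accept the world, which is one plausible reading of "not compatible with any observation/action managers". Without `dynamic_context_switching`, the choice made in `init` would simply persist for every later step.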