scml.oneshot.rl.agent
=====================

.. py:module:: scml.oneshot.rl.agent


Classes
-------

.. autoapisummary::

   scml.oneshot.rl.agent.OneShotRLAgent


Module Contents
---------------

.. py:class:: OneShotRLAgent(*args, models: list[scml.oneshot.rl.common.RLModel] | tuple[scml.oneshot.rl.common.RLModel, Ellipsis] = tuple(), observation_managers: list[scml.oneshot.rl.observation.ObservationManager] | tuple[scml.oneshot.rl.observation.ObservationManager, Ellipsis] = tuple(), action_managers: list[scml.oneshot.rl.action.ActionManager] | tuple[scml.oneshot.rl.action.ActionManager, Ellipsis] | None = None, fallback_type: type[scml.oneshot.agent.OneShotAgent] | None = GreedyOneShotAgent, fallback_params: dict[str, Any] | None = None, dynamic_context_switching: bool = False, randomize_test_order: bool = False, **kwargs)

   Bases: :py:obj:`scml.oneshot.policy.OneShotPolicy`


   A oneshot agent that executes trained RL models in the worlds they are compatible with. In any other world, it falls back to the given agent type.

   :param models: List of models to choose from.
   :param observation_managers: List of observation managers. Must be the same length as `models`.
   :param action_managers: List of action managers with the same length as `models`, or `None` to use the default action manager.
   :param fallback_type: A `OneShotAgent` type to fall back to when the current world is not compatible with any of the observation/action managers.
   :param fallback_params: Parameters for the `fallback_type`.
   :param dynamic_context_switching: If `True`, the world is tested at every step (instead of only at init) to find the appropriate model.
   :param randomize_test_order: If `True`, the order in which the observation/action managers are checked for compatibility with the current world
                                is randomized.
   :param \*\*kwargs: Any other OneShotPolicy parameters
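
   The selection rule described by these parameters can be pictured with a minimal, self-contained sketch. It is not the library's implementation: `StubManager`, its `is_valid` method, the dictionary-based world, and `select_model_index` are hypothetical stand-ins introduced only for illustration. The sketch returns the index of the first compatible observation/action manager pair, or `-1` (the documented default of `_valid_index`) when none matches, which is the case in which the fallback agent would take over.

   .. code-block:: python

      from __future__ import annotations

      import random
      from dataclasses import dataclass
      from typing import Any, Sequence


      @dataclass
      class StubManager:
          """Hypothetical stand-in for an observation or action manager."""

          supported_lines: int

          def is_valid(self, world: dict[str, Any]) -> bool:
              # Assumption: compatibility depends on a single world property.
              return world["n_lines"] == self.supported_lines


      def select_model_index(
          world: dict[str, Any],
          obs_managers: Sequence[StubManager],
          action_managers: Sequence[StubManager],
          randomize: bool = False,
          rng: random.Random | None = None,
      ) -> int:
          """Return the index of the first compatible manager pair, or -1."""
          order = list(range(len(obs_managers)))
          if randomize:
              # Mirrors the idea behind `randomize_test_order`.
              (rng or random).shuffle(order)
          for i in order:
              if obs_managers[i].is_valid(world) and action_managers[i].is_valid(world):
                  return i
          return -1  # no compatible model: a fallback agent would be used


      obs = [StubManager(5), StubManager(10)]
      acts = [StubManager(5), StubManager(10)]
      chosen = select_model_index({"n_lines": 10}, obs, acts)
      missing = select_model_index({"n_lines": 7}, obs, acts)

   With `dynamic_context_switching` enabled, a check of this kind would be repeated at every step rather than once at initialization.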


   .. py:attribute:: _models
      :value: ()


   .. py:attribute:: _action_managers
      :value: None


   .. py:attribute:: _obs_managers
      :value: ()


   .. py:attribute:: _fallback_type


   .. py:attribute:: _dynamic_context_switching
      :value: False


   .. py:attribute:: _randomize_test_order
      :value: False


   .. py:attribute:: _fallback_params
      :value: None


   .. py:attribute:: _valid_context
      :type:  scml.oneshot.context.Context
      :value: None


   .. py:attribute:: _valid_action_manager
      :type:  scml.oneshot.rl.action.ActionManager
      :value: None


   .. py:attribute:: _valid_obs_manager
      :type:  scml.oneshot.rl.observation.ObservationManager
      :value: None


   .. py:attribute:: _valid_index
      :type:  int
      :value: -1


   .. py:attribute:: _fallback_agent
      :type:  scml.oneshot.agent.OneShotAgent
      :value: None


   .. py:method:: setup_fallback()


   .. py:method:: has_no_valid_model()


   .. py:method:: context_switch()


   .. py:method:: init()

      Called once after the AWI is set.

      Remarks:
          - Use this for any proactive initialization code.


   .. py:method:: encode_state(mechanism_states: dict[str, negmas.sao.common.SAOState]) -> scml.oneshot.rl.common.RLState

      Called to generate the state that is passed to the `act()` method. By default, the state is everything available through `awi`, as a `OneShotState`.


   .. py:method:: decode_action(action: scml.oneshot.rl.common.RLAction) -> dict[str, negmas.sao.common.SAOResponse]

      Generates offers to all partners from an encoded action. By default, the action is returned as is, on the assumption that it is already a `dict[str, SAOResponse]`.


   .. py:method:: act(state: scml.oneshot.rl.common.RLState) -> scml.oneshot.rl.common.RLAction

      The main policy. Generates an action given a state.
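
      The way `encode_state`, `act`, and `decode_action` fit together can be illustrated with a toy policy. This is a sketch under stated assumptions, not the behavior of `OneShotRLAgent`: `ToyPolicy`, its integer states, its integer actions, and the `respond_to_all` helper are all hypothetical, and real states and actions are `SAOState`, `RLState`, `RLAction`, and `SAOResponse` objects.

      .. code-block:: python

         from __future__ import annotations


         class ToyPolicy:
             """Hypothetical policy showing the encode -> act -> decode chain."""

             def __init__(self) -> None:
                 self._partners: list[str] = []

             def encode_state(self, mechanism_states: dict[str, int]) -> tuple[int, ...]:
                 # Fix a partner order so the action can be mapped back later.
                 self._partners = sorted(mechanism_states)
                 return tuple(mechanism_states[p] for p in self._partners)

             def act(self, state: tuple[int, ...]) -> tuple[int, ...]:
                 # Stand-in for a trained model: ask for less as the step grows.
                 return tuple(max(0, 10 - s) for s in state)

             def decode_action(self, action: tuple[int, ...]) -> dict[str, int]:
                 # Map each entry of the encoded action back to its partner.
                 return dict(zip(self._partners, action))

             def respond_to_all(self, mechanism_states: dict[str, int]) -> dict[str, int]:
                 state = self.encode_state(mechanism_states)
                 return self.decode_action(self.act(state))


         responses = ToyPolicy().respond_to_all({"b": 3, "a": 12})

      Keeping the encoding and decoding separate from `act` lets the same trained model be reused with different observation and action representations.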


   .. py:method:: propose(*args, **kwargs) -> negmas.outcomes.Outcome | None

      Called when the agent is asked to propose in one negotiation.


   .. py:method:: respond(*args, **kwargs) -> negmas.gb.common.ResponseType

      Called when the agent is asked to respond to an offer.


   .. py:method:: before_step()

      Called at the BEGINNING of every production step (day).


   .. py:method:: step()

      Called at the END of every production step (day).


   .. py:method:: on_negotiation_failure(*args, **kwargs) -> None

      Called when a negotiation that the agent is a party to ends without agreement.


   .. py:method:: on_negotiation_success(*args, **kwargs) -> None

      Called when a negotiation that the agent is a party to ends with agreement.