scml.std.rl.observation
=======================

.. py:module:: scml.std.rl.observation


Attributes
----------

.. autoapisummary::

   scml.std.rl.observation.DefaultObservationManager


Classes
-------

.. autoapisummary::

   scml.std.rl.observation.FlexibleObservationManager
   scml.std.rl.observation.ObservationManager


Module Contents
---------------

.. py:data:: DefaultObservationManager

   The default observation manager

.. py:class:: FlexibleObservationManager

   Bases: :py:obj:`BaseObservationManager`

   An observation manager that can be used with any SCML world.

   :param capacity_multiplier: A factor multiplied by the number of lines to give the maximum quantity allowed in offers
   :param exogenous_multiplier: A factor multiplied by the maximum production capacity when encoding exogenous quantities
   :param continuous: If given, the observation space will be a Box; otherwise it will be a MultiDiscrete
   :param n_prices: The number of prices to use for encoding the unit price (if not `continuous`)
   :param max_production_cost: The upper limit on production cost. Anything above it is mapped to this maximum
   :param max_group_size: Maximum size used for grouping observations from multiple partners. This is used if the number of partners in the simulation is larger than the number used for training.
   :param n_past_received_offers: Number of past received offers to add to the observation.
   :param n_bins: Number of bins to use for discretization (if not `continuous`)
   :param n_sigmas: The number of sigmas used for limiting the range of randomly distributed variables
   :param extra_checks: If given, extra checks are applied to make sure encoding and decoding make sense

   Remarks: ...

   .. py:attribute:: capacity_multiplier
      :type: int
      :value: 1

   .. py:attribute:: n_prices
      :type: int
      :value: 2

   .. py:attribute:: max_group_size
      :type: int
      :value: 2

   .. py:attribute:: reduce_space_size
      :type: bool
      :value: True

   .. py:attribute:: n_past_received_offers
      :type: int
      :value: 1

   .. py:attribute:: extra_checks
      :type: bool
      :value: False

   .. py:attribute:: n_bins
      :type: int
      :value: 40

   .. py:attribute:: n_sigmas
      :type: int
      :value: 2

   .. py:attribute:: max_production_cost
      :type: int
      :value: 10

   .. py:attribute:: exogenous_multiplier
      :type: int
      :value: 1

   .. py:attribute:: max_quantity
      :type: int

   .. py:attribute:: _chosen_partner_indices
      :type: list[int] | None

   .. py:attribute:: _previous_offers
      :type: collections.deque

   .. py:attribute:: _dims
      :type: list[int] | None

   .. py:method:: __attrs_post_init__()

   .. py:method:: get_dims() -> list[int]

      Get the sizes of all dimensions in the observation space. Used if not continuous.

   .. py:method:: make_space() -> gymnasium.spaces.MultiDiscrete | gymnasium.spaces.Box

      Creates the observation space

   .. py:method:: make_first_observation(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray

      Creates the initial observation (returned from gym's reset())

   .. py:method:: encode(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray

      Encodes the awi as an array

   .. py:method:: extra_obs(awi: scml.oneshot.awi.OneShotAWI) -> list[tuple[float, int] | float]

      The observation values other than offers and previous offers.

      :returns: A list whose items are either tuples or single values. Each tuple holds an observation
                variable as a real number between zero and one and the number of bins to use for
                discretizing this variable. If an item is a single value, the number of bins will be
                ``self.n_bins``

   .. py:method:: get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) -> dict[str, negmas.outcomes.Outcome | None]

      Gets offers from an encoded awi.

.. py:class:: ObservationManager

   Bases: :py:obj:`Protocol`

   Manages the observations of an agent in an RL environment

   .. py:property:: context
      :type: scml.oneshot.context.BaseContext

   .. py:method:: make_space() -> gymnasium.spaces.Space

      Creates the observation space

   .. py:method:: encode(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray

      Encodes an observation from the agent's awi

   .. py:method:: make_first_observation(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray

      Creates the initial observation (returned from gym's reset())

   .. py:method:: get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) -> dict[str, negmas.outcomes.Outcome | None]

      Gets the offers from an encoded awi
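
The return value of ``extra_obs`` mixes ``(value, n_bins)`` tuples with bare values that fall back to a default number of bins. The sketch below illustrates one way such a list could be turned into integer bin indices for a discrete observation space. It is not the library's implementation: ``discretize``, ``discretize_extra_obs`` and ``DEFAULT_N_BINS`` are hypothetical names introduced here, and the equal-width binning rule is an assumption.

```python
from typing import Union

# Hypothetical default, mirroring the documented ``n_bins`` value of 40.
DEFAULT_N_BINS = 40

# One entry of the list described for ``extra_obs``: (value, n_bins) or a bare value.
ObsValue = Union[tuple[float, int], float]


def discretize(value: float, n_bins: int) -> int:
    """Map a value in [0, 1] to a bin index in [0, n_bins - 1] (assumed equal-width bins)."""
    # Clip out-of-range inputs so the index always stays valid.
    clipped = min(max(value, 0.0), 1.0)
    # Scale to [0, n_bins], then cap so that exactly 1.0 lands in the last bin.
    return min(int(clipped * n_bins), n_bins - 1)


def discretize_extra_obs(
    values: list[ObsValue], default_bins: int = DEFAULT_N_BINS
) -> list[int]:
    """Discretize a mixed list of (value, n_bins) tuples and bare values."""
    indices = []
    for item in values:
        if isinstance(item, tuple):
            value, bins = item  # the entry carries its own bin count
        else:
            value, bins = item, default_bins  # bare value: use the default
        indices.append(discretize(value, bins))
    return indices


if __name__ == "__main__":
    # A tuple with its own bin count, followed by a bare value using the default.
    print(discretize_extra_obs([(0.5, 10), 0.25]))
```

Under these assumptions, each entry's bin count would also be the size of the corresponding dimension in a ``MultiDiscrete`` space, which is the kind of per-dimension size that ``get_dims`` is documented to report.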