scml.oneshot.rl.observation

Defines ways to encode and decode observations.

Attributes

DefaultObservationManager

The default observation manager

Classes

ObservationManager

Manages the observations of an agent in an RL environment

FlexibleObservationManager

An observation manager that can be used with any SCML world.

Module Contents

class scml.oneshot.rl.observation.ObservationManager

Bases: Protocol

Manages the observations of an agent in an RL environment

property context: scml.oneshot.context.BaseContext

make_space() → gymnasium.spaces.Space

Creates the observation space

encode(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Encodes an observation from the agent’s awi

make_first_observation(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Creates the initial observation (returned from gym’s reset())

get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) → dict[str, negmas.outcomes.Outcome | None]

Gets the offers from an encoded awi
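
Because ObservationManager is a Protocol, a class can satisfy it structurally, without inheriting from it. The sketch below illustrates this with a simplified stand-in: ObservationManagerLike and ConstantObservationManager are hypothetical names, plain lists replace numpy arrays, Any replaces the AWI and gymnasium types, and the context property is omitted. It is not the library's own definition.

```python
from typing import Any, Dict, List, Optional, Protocol, runtime_checkable


@runtime_checkable
class ObservationManagerLike(Protocol):
    """Simplified stand-in for the protocol (hypothetical name and types)."""

    def make_space(self) -> Any: ...

    def encode(self, awi: Any) -> List[float]: ...

    def make_first_observation(self, awi: Any) -> List[float]: ...

    def get_offers(self, awi: Any, encoded: List[float]) -> Dict[str, Optional[tuple]]: ...


class ConstantObservationManager:
    """Toy manager: it does not inherit from the protocol, yet satisfies it."""

    def make_space(self) -> Any:
        return (1,)  # placeholder for a gymnasium space

    def encode(self, awi: Any) -> List[float]:
        return [0.0]  # a real manager would read the AWI here

    def make_first_observation(self, awi: Any) -> List[float]:
        return self.encode(awi)

    def get_offers(self, awi: Any, encoded: List[float]) -> Dict[str, Optional[tuple]]:
        return {}  # no partners, so no offers


manager = ConstantObservationManager()
# runtime_checkable only checks that the methods exist, not their signatures.
print(isinstance(manager, ObservationManagerLike))
```

Structural typing of this kind lets a custom manager be swapped in anywhere the protocol is expected, as long as it provides the same members.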

class scml.oneshot.rl.observation.FlexibleObservationManager

Bases: BaseObservationManager

An observation manager that can be used with any SCML world.

Parameters:
  • capacity_multiplier – A factor to multiply by the number of lines to give the maximum quantity allowed in offers

  • exogenous_multiplier – A factor to multiply maximum production capacity with when encoding exogenous quantities

  • continuous – If True, the observation space will be a Box; otherwise it will be a MultiDiscrete

  • n_prices – The number of prices to use for encoding the unit price (if not continuous)

  • max_production_cost – The limit for production cost. Anything above that will be mapped to this max

  • max_group_size – Maximum size used for grouping observations from multiple partners. This will be used if the number of partners in the simulation is larger than the number used for training.

  • n_past_received_offers – Number of past received offers to add to the observation.

  • n_bins – The number of bins to use for discretization (if not continuous)

  • n_sigmas – The number of sigmas used for limiting the range of randomly distributed variables

  • extra_checks – If given, extra checks are applied to make sure encoding and decoding make sense
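
As a rough illustration of how some of these parameters act as limits, the sketch below follows the descriptions above using the documented default values. The helper names (max_offer_quantity, clip_production_cost, sigma_range) are hypothetical and are not part of scml; the actual implementation may combine these values differently.

```python
from typing import Tuple


def max_offer_quantity(n_lines: int, capacity_multiplier: int = 1) -> int:
    """Maximum quantity allowed in offers: number of lines times the multiplier."""
    return capacity_multiplier * n_lines


def clip_production_cost(cost: float, max_production_cost: int = 10) -> float:
    """Map any production cost above the limit to the limit itself."""
    return min(cost, max_production_cost)


def sigma_range(mean: float, std: float, n_sigmas: int = 2) -> Tuple[float, float]:
    """Assumed range for a randomly distributed variable: mean +/- n_sigmas * std."""
    return (mean - n_sigmas * std, mean + n_sigmas * std)


print(max_offer_quantity(n_lines=10, capacity_multiplier=2))
print(clip_production_cost(25.0))
print(sigma_range(mean=10.0, std=2.0))
```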

capacity_multiplier: int = 1
n_prices: int = 2
max_group_size: int = 2
reduce_space_size: bool = True
n_past_received_offers: int = 1
extra_checks: bool = False
n_bins: int = 40
n_sigmas: int = 2
max_production_cost: int = 10
exogenous_multiplier: int = 1
max_quantity: int
_chosen_partner_indices: list[int] | None
_previous_offers: collections.deque
_dims: list[int] | None
__attrs_post_init__()
get_dims() → list[int]

Get the sizes of all dimensions in the observation space. Used if not continuous.
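
A list of per-dimension sizes like this is what gymnasium.spaces.MultiDiscrete takes as its nvec argument. The sketch below builds such a list for a purely hypothetical layout: one (quantity, price index) pair per partner, repeated for the current offers and each past received offer, followed by extra binned variables. The function name sketch_dims and the layout itself are assumptions made for illustration; the real ordering used by FlexibleObservationManager may differ.

```python
from typing import List, Sequence


def sketch_dims(
    n_partners: int,
    max_quantity: int,
    n_prices: int = 2,
    n_past_received_offers: int = 1,
    extra_bins: Sequence[int] = (),
) -> List[int]:
    """Sizes of all dimensions for an assumed discrete observation layout."""
    # One offer = a quantity in 0..max_quantity and a price index in 0..n_prices-1.
    offer_dims = [max_quantity + 1, n_prices] * n_partners
    # Current offers plus the remembered past received offers.
    dims = offer_dims * (1 + n_past_received_offers)
    # Remaining variables, each discretized into its own number of bins.
    dims += list(extra_bins)
    return dims


dims = sketch_dims(n_partners=2, max_quantity=10, extra_bins=(40, 40))
print(dims)
```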

make_space() → gymnasium.spaces.MultiDiscrete | gymnasium.spaces.Box

Creates the observation space

make_first_observation(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Creates the initial observation (returned from gym’s reset())

encode(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Encodes the awi as an array

extra_obs(awi: scml.oneshot.awi.OneShotAWI) → list[tuple[float, int] | float]

The observation values other than offers and previous offers.

Returns:

A list of items, each either a tuple or a single float. A tuple holds an observation variable as a real number between zero and one, together with the number of bins to use for discretizing that variable. For a single value, the number of bins is self.n_bins.
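
The return format can be made concrete with a small sketch that turns such a mixed list into bin indices. The helpers normalize_extra_obs and discretize are hypothetical names used only for illustration, and the equal-width binning shown here is an assumption, not necessarily the scheme the library applies.

```python
from typing import List, Sequence, Tuple, Union

ExtraObs = Union[Tuple[float, int], float]


def normalize_extra_obs(values: Sequence[ExtraObs], default_bins: int = 40) -> List[Tuple[float, int]]:
    """Give every entry an explicit bin count, using default_bins for bare floats."""
    pairs = []
    for item in values:
        if isinstance(item, tuple):
            value, bins = item
        else:
            value, bins = item, default_bins
        pairs.append((float(value), int(bins)))
    return pairs


def discretize(value: float, n_bins: int) -> int:
    """Map a value in [0, 1] to one of n_bins equal-width bins (assumed scheme)."""
    clipped = min(max(value, 0.0), 1.0)
    # value == 1.0 would land in bin n_bins, so cap at the last valid index.
    return min(int(clipped * n_bins), n_bins - 1)


obs = [(0.25, 10), 0.5, (1.0, 4)]
pairs = normalize_extra_obs(obs)
indices = [discretize(value, bins) for value, bins in pairs]
print(pairs)
print(indices)
```

Here default_bins plays the role of self.n_bins, whose documented default is 40.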

get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) → dict[str, negmas.outcomes.Outcome | None]

Gets offers from an encoded awi.
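
To show the shape of the returned mapping, the sketch below decodes a flat list into one offer per partner. Everything about the layout is an assumption made for illustration: the function decode_offers is hypothetical, the encoding is taken to start with one (quantity, price index) pair per partner, a zero quantity is taken to mean "no offer", and each offer is written as a (quantity, step, unit price) tuple. The actual decoding in the library may differ.

```python
from typing import Dict, Optional, Sequence, Tuple

Offer = Tuple[int, int, int]  # assumed ordering: (quantity, step, unit price)


def decode_offers(
    encoded: Sequence[int],
    partners: Sequence[str],
    step: int,
    prices: Sequence[int],
) -> Dict[str, Optional[Offer]]:
    """Decode an assumed [q0, p0, q1, p1, ...] prefix into per-partner offers."""
    offers: Dict[str, Optional[Offer]] = {}
    for i, partner in enumerate(partners):
        quantity, price_index = encoded[2 * i], encoded[2 * i + 1]
        if quantity <= 0:
            offers[partner] = None  # nothing offered by this partner
        else:
            offers[partner] = (int(quantity), step, prices[int(price_index)])
    return offers


decoded = decode_offers([3, 1, 0, 0], partners=["a", "b"], step=5, prices=[9, 10])
print(decoded)
```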

scml.oneshot.rl.observation.DefaultObservationManager

The default observation manager