scml.std.rl.observation

Module Contents

Classes

FlexibleObservationManager

An observation manager that can be used with any SCML world.

ObservationManager

Manages the observations of an agent in an RL environment

Attributes

DefaultObservationManager

The default observation manager

scml.std.rl.observation.DefaultObservationManager

The default observation manager

class scml.std.rl.observation.FlexibleObservationManager

Bases: BaseObservationManager

An observation manager that can be used with any SCML world.

Parameters:
  • capacity_multiplier – A factor to multiply by the number of lines to give the maximum quantity allowed in offers

  • exogenous_multiplier – A factor to multiply maximum production capacity with when encoding exogenous quantities

  • continuous – If given, the observation space will be a Box; otherwise it will be a MultiDiscrete

  • n_prices – The number of prices to use for encoding the unit price (if not continuous)

  • max_production_cost – The limit for production cost. Anything above that will be mapped to this max

  • max_group_size – Maximum size used for grouping observations from multiple partners. This is used if the number of partners in the simulation is larger than the number used for training.

  • n_past_received_offers – Number of past received offers to add to the observation.

  • n_bins – Number of bins to use for discretization (if not continuous)

  • n_sigmas – The number of sigmas used for limiting the range of randomly distributed variables

  • extra_checks – If given, extra checks are applied to make sure encoding and decoding make sense
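The roles of max_production_cost, n_sigmas, and n_bins can be illustrated with a minimal sketch. The helper names below (clip_cost, normalize_within_sigmas, to_bin) are hypothetical and are not part of scml. The sketch only assumes that values are capped, rescaled to [0, 1], and then binned, which may differ from what the library actually does.

```python
def clip_cost(cost: float, max_production_cost: int = 10) -> float:
    """Map any cost above the limit to the limit itself (hypothetical helper)."""
    return min(cost, max_production_cost)


def normalize_within_sigmas(
    x: float, mean: float, std: float, n_sigmas: int = 2
) -> float:
    """Clamp x to mean +/- n_sigmas * std, then rescale to [0, 1]."""
    lo, hi = mean - n_sigmas * std, mean + n_sigmas * std
    if hi == lo:  # degenerate case: no spread to normalize over
        return 0.5
    clamped = max(lo, min(hi, x))
    return (clamped - lo) / (hi - lo)


def to_bin(value: float, n_bins: int = 40) -> int:
    """Discretize a value in [0, 1] into one of n_bins integer bins."""
    value = max(0.0, min(1.0, value))
    # The top edge (value == 1.0) is folded into the last bin.
    return min(int(value * n_bins), n_bins - 1)
```

With the defaults listed for this class (n_bins = 40, n_sigmas = 2, max_production_cost = 10), a value at the centre of its range would land in bin 20 under this sketch.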

Remarks:

capacity_multiplier: int = 1
n_prices: int = 2
max_group_size: int = 2
reduce_space_size: bool = True
n_past_received_offers: int = 1
extra_checks: bool = False
n_bins: int = 40
n_sigmas: int = 2
max_production_cost: int = 10
exogenous_multiplier: int = 1
max_quantity: int
_chosen_partner_indices: list[int] | None
_previous_offers: collections.deque
_dims: list[int] | None
__attrs_post_init__()
get_dims() → list[int]

Get the sizes of all dimensions in the observation space. Used if not continuous.

make_space() → gymnasium.spaces.MultiDiscrete | gymnasium.spaces.Box

Creates the observation space

make_first_observation(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Creates the initial observation (returned from gym’s reset())

encode(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Encodes the awi as an array

extra_obs(awi: scml.oneshot.awi.OneShotAWI) → list[tuple[float, int] | float]

The observation values other than offers and previous offers.

Returns:

A list of tuples. Each tuple holds an observation variable as a real number between zero and one, together with the number of bins to use for discretizing that variable. If an entry is a single value rather than a tuple, the number of bins will be self.n_bins.
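A consumer of this return value has to handle both entry shapes. The following is a minimal sketch of that normalization, assuming a hypothetical helper with_bins (not part of scml) and the default n_bins of 40 listed above:

```python
from __future__ import annotations


def with_bins(
    values: list[tuple[float, int] | float], n_bins: int = 40
) -> list[tuple[float, int]]:
    """Give every entry an explicit bin count (hypothetical helper).

    Tuples are kept as-is; bare floats receive the default n_bins.
    """
    result: list[tuple[float, int]] = []
    for v in values:
        if isinstance(v, tuple):
            result.append(v)
        else:
            result.append((float(v), n_bins))
    return result
```

For example, with_bins([0.25, (0.5, 10)]) pairs the bare value 0.25 with 40 bins and leaves the second entry at 10 bins.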

get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) → dict[str, negmas.outcomes.Outcome | None]

Gets offers from an encoded awi.
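The relationship between encoding and decoding offers, and the kind of round-trip check that extra_checks suggests, can be sketched with a toy layout. Everything here is an assumption made for illustration: the flat [quantity, price, ...] layout, the use of quantity 0 to mean "no offer", and the names toy_encode and toy_decode are hypothetical and do not describe the actual scml encoding. The real method also takes the awi, which this sketch omits.

```python
from __future__ import annotations

# Simplified stand-in for negmas.outcomes.Outcome: (quantity, unit_price).
Offer = tuple[int, int]


def toy_encode(offers: dict[str, Offer | None], partners: list[str]) -> list[int]:
    """Flatten per-partner offers into [q0, p0, q1, p1, ...] (assumed layout)."""
    flat: list[int] = []
    for partner in partners:
        offer = offers.get(partner)
        quantity, price = offer if offer is not None else (0, 0)
        flat.extend([quantity, price])
    return flat


def toy_decode(encoded: list[int], partners: list[str]) -> dict[str, Offer | None]:
    """Recover per-partner offers; quantity 0 is treated as 'no offer'."""
    decoded: dict[str, Offer | None] = {}
    for i, partner in enumerate(partners):
        quantity, price = encoded[2 * i], encoded[2 * i + 1]
        decoded[partner] = (quantity, price) if quantity > 0 else None
    return decoded


# Round-trip check: decoding an encoding should return the original offers.
partners = ["a", "b"]
offers = {"a": (3, 7), "b": None}
assert toy_decode(toy_encode(offers, partners), partners) == offers
```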

class scml.std.rl.observation.ObservationManager

Bases: Protocol

Manages the observations of an agent in an RL environment

property context: scml.oneshot.context.BaseContext
make_space() → gymnasium.spaces.Space

Creates the observation space

encode(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Encodes an observation from the agent’s awi

make_first_observation(awi: scml.oneshot.awi.OneShotAWI) → numpy.ndarray

Creates the initial observation (returned from gym’s reset())

get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) → dict[str, negmas.outcomes.Outcome | None]

Gets the offers from an encoded awi
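Because ObservationManager is a Protocol, a class can satisfy it structurally, without inheriting from it. The sketch below shows the idea with a simplified, hypothetical protocol (ToyObservationManager) that mirrors three of the method names above but replaces the scml, gymnasium, and numpy types with plain Python ones, so it runs without those packages installed.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToyObservationManager(Protocol):
    """Simplified stand-in for the protocol documented above."""

    def make_space(self) -> object: ...
    def encode(self, awi: object) -> list[float]: ...
    def make_first_observation(self, awi: object) -> list[float]: ...


class ConstantManager:
    """Satisfies the toy protocol without inheriting from it."""

    def make_space(self) -> object:
        # A (low, high) pair stands in for a real observation space.
        return (0.0, 1.0)

    def encode(self, awi: object) -> list[float]:
        # Ignores the awi and always returns the same observation.
        return [0.0]

    def make_first_observation(self, awi: object) -> list[float]:
        return self.encode(awi)


manager = ConstantManager()
# runtime_checkable only verifies that the methods exist, not their signatures.
assert isinstance(manager, ToyObservationManager)
```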