scml.std.rl.observation
Module Contents
Classes
FlexibleObservationManager – An observation manager that can be used with any SCML world.
ObservationManager – Manages the observations of an agent in an RL environment
Attributes
The default observation manager
- class scml.std.rl.observation.FlexibleObservationManager
Bases: BaseObservationManager
An observation manager that can be used with any SCML world.
- Parameters:
capacity_multiplier – A factor to multiply by the number of lines to give the maximum quantity allowed in offers
exogenous_multiplier – A factor to multiply maximum production capacity with when encoding exogenous quantities
continuous – If set, the observation space will be a Box; otherwise it will be a MultiDiscrete
n_prices – The number of prices to use for encoding the unit price (if not continuous)
max_production_cost – The limit for production cost. Anything above that will be mapped to this max
max_group_size – Maximum size used for grouping observations from multiple partners. This will be used if the number of partners in the simulation is larger than the number used for training.
n_past_received_offers – Number of past received offers to add to the observation.
n_bins – Number of bins to use for discretization (if not continuous)
n_sigmas – The number of sigmas used for limiting the range of randomly distributed variables
extra_checks – If given, extra checks are applied to make sure encoding and decoding make sense
- Remarks:
…
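As a rough illustration of how two of the limits above behave, the following sketch computes the largest quantity allowed in an offer from the number of lines and `capacity_multiplier`, and maps production costs above `max_production_cost` to that maximum. It is not the library's implementation: the helper names `max_offer_quantity` and `clip_cost` are hypothetical, and truncating the product to an integer is an assumption.

```python
def max_offer_quantity(n_lines: int, capacity_multiplier: float) -> int:
    """Largest quantity allowed in an offer (hypothetical helper).

    Assumption: the product is truncated to an integer.
    """
    return int(n_lines * capacity_multiplier)


def clip_cost(cost: float, max_production_cost: float) -> float:
    """Map any cost above the limit to the limit (hypothetical helper)."""
    return min(cost, max_production_cost)


quantity_limit = max_offer_quantity(n_lines=10, capacity_multiplier=1.5)
clipped = clip_cost(12.0, max_production_cost=10.0)
print(quantity_limit, clipped)
```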
- _previous_offers: collections.deque
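A bounded `collections.deque` is a natural fit for keeping the last `n_past_received_offers` sets of offers. The sketch below shows that pattern in isolation; whether the manager sets `maxlen` this way, and the shape of each stored entry, are assumptions made for illustration.

```python
from collections import deque

n_past_received_offers = 2  # illustrative value

# Hypothetical buffer: each entry is the dict of offers received in one step.
previous_offers: deque = deque(maxlen=n_past_received_offers)

for step_offers in ({"a": (1, 10)}, {"a": (2, 11)}, {"a": (3, 12)}):
    # Once the deque is full, appending silently drops the oldest entry.
    previous_offers.append(step_offers)

history = list(previous_offers)
print(history)
```

With `maxlen` set, no explicit trimming is needed when the history grows past the configured size.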
- get_dims() -> list[int]
Get the sizes of all dimensions in the observation space. Used if not continuous.
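To make the idea of per-dimension sizes concrete, here is a minimal sketch under an assumed layout: every partner contributes one quantity dimension and one price dimension, followed by a few extra variables with `n_bins` levels each. The layout and the function name `sketch_dims` are hypothetical and are not taken from the library.

```python
def sketch_dims(
    n_partners: int, max_quantity: int, n_prices: int, n_extra: int, n_bins: int
) -> list[int]:
    """Return one size per discrete observation dimension (assumed layout)."""
    dims: list[int] = []
    for _ in range(n_partners):
        dims.append(max_quantity + 1)  # quantities 0..max_quantity
        dims.append(n_prices)  # one level per encoded unit price
    dims.extend([n_bins] * n_extra)  # extra variables share the same bin count
    return dims


dims = sketch_dims(n_partners=2, max_quantity=10, n_prices=2, n_extra=3, n_bins=5)
print(dims)
```

A list like this is the kind of value that can be passed as `nvec` to `gymnasium.spaces.MultiDiscrete`.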
- make_space() -> gymnasium.spaces.MultiDiscrete | gymnasium.spaces.Box
Creates the observation space (a Box if continuous, otherwise a MultiDiscrete)
- make_first_observation(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray
Creates the initial observation (returned from gym’s reset())
- encode(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray
Encodes the awi as an array
- extra_obs(awi: scml.oneshot.awi.OneShotAWI) -> list[tuple[float, int] | float]
The observation values other than offers and previous offers.
- Returns:
A list of observation variables. Each entry is either a tuple of a real number between zero and one and the number of bins to use for discretizing it, or a single value, in which case the number of bins is self.n_bins
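The return format can be illustrated with a small discretization sketch. It accepts the same mix of `(value, n_bins)` tuples and bare floats and maps each value to a bin index. The function name `discretize`, the clamping to [0, 1], and the exact binning rule are assumptions for illustration, not the library's code.

```python
from typing import Union

ObsValue = Union[tuple[float, int], float]


def discretize(values: list[ObsValue], default_bins: int) -> list[int]:
    """Map each value in [0, 1] to a bin index in 0..n_bins-1 (assumed rule)."""
    result: list[int] = []
    for item in values:
        # A bare float falls back to the default number of bins.
        value, n_bins = item if isinstance(item, tuple) else (item, default_bins)
        value = min(max(value, 0.0), 1.0)  # clamp to the documented range
        result.append(min(int(value * n_bins), n_bins - 1))
    return result


bins = discretize([(0.5, 4), 0.75, (1.0, 3)], default_bins=10)
print(bins)
```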
- get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) -> dict[str, negmas.outcomes.Outcome | None]
Gets offers from an encoded awi.
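The relationship between encoding and recovering offers can be sketched as a round trip. The example assumes a simplified layout in which each partner occupies two consecutive slots, quantity then unit price, and a zero quantity stands for "no offer". The layout, the two-element `Offer` type, and both function names are hypothetical; real outcomes and the real encoding may differ.

```python
from typing import Optional

Offer = tuple[int, int]  # (quantity, unit_price); hypothetical simplified outcome


def encode_offers(offers: dict[str, Optional[Offer]], partners: list[str]) -> list[int]:
    """Flatten offers into [q0, p0, q1, p1, ...]; a missing offer becomes (0, 0)."""
    encoded: list[int] = []
    for partner in partners:
        quantity, price = offers.get(partner) or (0, 0)
        encoded.extend([quantity, price])
    return encoded


def decode_offers(encoded: list[int], partners: list[str]) -> dict[str, Optional[Offer]]:
    """Invert encode_offers; a zero quantity is read back as 'no offer'."""
    decoded: dict[str, Optional[Offer]] = {}
    for i, partner in enumerate(partners):
        quantity, price = encoded[2 * i], encoded[2 * i + 1]
        decoded[partner] = (quantity, price) if quantity > 0 else None
    return decoded


partners = ["seller_a", "seller_b"]
offers = {"seller_a": (3, 12), "seller_b": None}
encoded = encode_offers(offers, partners)
round_trip = decode_offers(encoded, partners)
print(encoded, round_trip)
```

A consistency check of this kind, comparing the decoded result with the original offers, is one way to verify that encoding and decoding agree.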
- class scml.std.rl.observation.ObservationManager
Bases: Protocol
Manages the observations of an agent in an RL environment
- property context: scml.oneshot.context.BaseContext
- encode(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray
Encodes an observation from the agent’s awi
- make_first_observation(awi: scml.oneshot.awi.OneShotAWI) -> numpy.ndarray
Creates the initial observation (returned from gym’s reset())
- get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) -> dict[str, negmas.outcomes.Outcome | None]
Gets the offers from an encoded awi
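Because the class is a `Protocol`, any object that provides the listed methods can serve as an observation manager without inheriting from it. The sketch below demonstrates this structural typing with a stand-in protocol, `ObservationManagerLike`, and a toy class, `ConstantObservationManager`. Both names are hypothetical, the `context` property is omitted, and `Any` replaces the real AWI and array types to keep the example self-contained.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ObservationManagerLike(Protocol):
    """Hypothetical stand-in mirroring the documented method names."""

    def encode(self, awi: Any) -> Any: ...

    def make_first_observation(self, awi: Any) -> Any: ...

    def get_offers(self, awi: Any, encoded: Any) -> dict: ...


class ConstantObservationManager:
    """Toy manager that ignores the AWI and always returns the same observation."""

    def encode(self, awi: Any) -> list[float]:
        return [0.0, 0.0]

    def make_first_observation(self, awi: Any) -> list[float]:
        # The first observation is simply an encoding of the initial state.
        return self.encode(awi)

    def get_offers(self, awi: Any, encoded: Any) -> dict:
        return {}


manager = ConstantObservationManager()
# isinstance only checks that the methods exist, not their signatures.
is_manager = isinstance(manager, ObservationManagerLike)
print(is_manager)
```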