scml.std
Submodules
Attributes
Index of quantity in negotiation issues |
|
Index of unit price in negotiation issues |
|
Index of time in negotiation issues |
|
A constant indicating an invalid cost for lines incapable of running some process |
|
ID of the system buyer agent |
|
ID of the system seller agent |
|
The default action manager |
|
We assume that RL states are numpy arrays |
|
We assume that RL actions are numpy arrays |
|
A policy is a callable that receives a state and returns an action |
|
The default observation manager |
|
Classes
Base class for all agents in the standard game. |
|
Base class for agents that negotiate synchronously by receiving all offers at once then responding to all of them at once. |
|
Base class for all SAO negotiators. Implemented by implementing propose() and respond() methods. |
|
Uses a time-based strategy to accept a single agreement from the set |
|
A greedy agent based on StdAgent |
|
A greedy agent based on OneShotSyncAgent |
|
A greedy agent based on OneShotAgent |
|
An agent that always raises an exception if called to negotiate. It is useful as a placeholder (for example for RL and MARL exposition) |
|
An agent that distributes its needs over its partners randomly. |
|
An agent that distributes its needs over its partners randomly. |
|
A naive random agent |
|
The agent world interface for the one-shot game. |
|
State of a one-shot agent |
|
Exogenous contract information |
|
Defines all private information of a factory |
|
A report published periodically by the system showing the financial standing of an agent |
|
A context that generates std worlds with agents of the given types with predetermined structure and settings
|
A context that generates std worlds with agents of the given types with predetermined structure and settings
|
Generates a world limiting the range of the agent level, production capacity |
|
Generates a oneshot world limiting the range of the agent level, production capacity |
|
Generates a oneshot world with no constraints except compatibility with a specific ANAC competition year. |
|
A world context that can generate any world compatible with the observation manager |
|
A supplier with many consumers relative to competitors
|
A supplier with almost the same number of consumers as competitors
|
A supplier with few consumers relative to competitors |
|
A world context that can generate any world compatible with the observation manager |
|
A supplier with many consumers relative to competitors
|
A supplier with almost the same number of consumers as competitors
|
A supplier with few consumers relative to competitors |
|
A world context that can generate any world compatible with the observation manager |
|
A consumer with many suppliers relative to competitors
|
A consumer with almost the same number of suppliers as competitors
|
A consumer with few suppliers relative to competitors |
|
A basic context fixing stationary world config parameters |
|
Encapsulates one or more configs and switches between them when asked to generate or make something. |
|
A std agent structured in three components, state encoder, policy (action) and action decoder. |
|
Manages actions of an agent in an RL environment.
|
An action manager that matches any context. |
|
The main Gymnasium class for implementing Reinforcement Learning Agents environments. |
|
Manages the observations of an agent in an RL environment |
|
An observation manager that can be used with any SCML world. |
|
Represents a reward function. |
|
The default reward function of SCML |
|
The base class of all agents running in Std based on StdAgent. |
|
Implements an agent for handling system operations |
|
Calculates the utility function of a list of contracts or offers. |
|
The world representing the base standard simulation (starting SCML 2024) |
|
The SCML-standard simulation as used in [SCML 2024](https://scml.cs.brown.edu) |
Functions
|
Checks whether an agent is a system agent or not |
|
Wraps a stable_baselines3 model as an RL model |
|
Samples a random action from the action space of the |
|
Ends the negotiation or accepts with a predefined probability or samples a random response. |
|
A simple greedy policy. |
|
Returns all built-in agents. |
Package Contents
- class scml.std.StdAgent(owner=None, ufun: scml.oneshot.OneShotUFun | None = None, name=None)[source]
Bases:
scml.oneshot.agent.OneShotAgent
Base class for all agents in the standard game.
- Remarks:
You can access all of the negotiators associated with the agent using self.negotiators, which is a dictionary mapping the negotiator_id to a tuple of two values: the SAONegotiator object and a key-value context dictionary.
The negotiator_id associated with a negotiation with some partner will be the same as the agent ID of that partner. This means that all negotiators engaged with some partner over all simulation steps will have the same ID, which is useful if you are keeping information about past negotiations and partner behavior.
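Because the negotiator ID equals the partner's agent ID, per-partner history can be kept in a plain dictionary keyed by that ID. The following is a minimal sketch that does not import scml; the PartnerHistory class and the offer tuples are hypothetical and only illustrate the bookkeeping.

```python
from collections import defaultdict


class PartnerHistory:
    """Hypothetical helper: remembers offers per partner across simulation steps."""

    def __init__(self):
        # negotiator_id (same as the partner's agent ID) -> list of (step, offer)
        self._offers = defaultdict(list)

    def record(self, negotiator_id, step, offer):
        """Store an offer received from the given partner at the given step."""
        self._offers[negotiator_id].append((step, offer))

    def offers_from(self, negotiator_id):
        """Return a copy of everything recorded for this partner so far."""
        return list(self._offers[negotiator_id])


history = PartnerHistory()
# The offer tuples below are arbitrary illustrative values.
history.record("partner_a", step=0, offer=(5, 0, 12))
history.record("partner_a", step=1, offer=(4, 1, 11))
history.record("partner_b", step=1, offer=(2, 1, 15))
```

Since the key is stable across steps, an agent could call such a helper from its respond or counter_all logic without tracking individual negotiator objects.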
- class scml.std.StdSyncAgent(*args, **kwargs)[source]
Bases:
scml.oneshot.agent.OneShotSyncAgent
,StdAgent
Base class for agents that negotiate synchronously by receiving all offers at once then responding to all of them at once.
- class scml.std.EndingNegotiator(preferences: negmas.preferences.preferences.Preferences | None = None, ufun: negmas.preferences.base_ufun.BaseUtilityFunction | None = None, name: str | None = None, parent: negmas.negotiators.Controller | None = None, owner: negmas.situated.Agent | None = None, id: str | None = None, type_name: str | None = None, can_propose: bool = True, **kwargs)[source]
Bases:
negmas.sao.SAONegotiator
,negmas.ControlledNegotiator
Base class for all SAO negotiators. Implemented by implementing propose() and respond() methods.
- Parameters:
name – Negotiator name
parent – Parent controller if any
preferences – The preferences of the negotiator
ufun – The utility function of the negotiator (overrides preferences if given)
owner – The
Agent
that owns the negotiator.
- Remarks:
The only method that must be implemented by any SAONegotiator is propose.
The default respond method accepts offers with a utility value no less than whatever propose returns with the same mechanism state.
A default implementation of respond() is provided which simply accepts any offer better than the last offer I gave or the next one I would have given in the current state.
See also
SAOCallNegotiator
- propose(state)[source]
Propose an offer or None to refuse.
- Parameters:
state – GBState giving the current state of the negotiation.
- Returns:
The outcome being proposed or None to refuse to propose
- Remarks:
This function guarantees that no agents can propose something with a utility value
- respond(state, source=None)[source]
Called to respond to an offer. This is the method that should be overridden to provide an acceptance strategy.
- Parameters:
state – a SAOState giving the current state of the negotiation.
source – The ID of the negotiator that gave this offer
- Returns:
The response to the offer
- Return type:
ResponseType
- Remarks:
The default implementation never ends the negotiation.
The default implementation asks the negotiator to propose() and accepts the offer if its utility is at least as good as the offer that it would have proposed (and above the reserved value).
The current offer to respond to can be accessed through state.current_offer
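The default acceptance rule described in the remarks can be sketched in plain Python. This is an illustration only, not the library's code: the Response enum stands in for ResponseType, the utilities are passed in as numbers, and treating "above the reserved value" as a strict inequality is an assumption.

```python
from enum import Enum


class Response(Enum):
    """Hypothetical stand-in for the library's ResponseType."""

    ACCEPT = "accept"
    REJECT = "reject"


def default_response(offer_utility, my_proposal_utility, reserved_value):
    """Accept when the offer is at least as good as what I would propose now
    and is above the reserved value; otherwise reject (never end)."""
    if offer_utility >= my_proposal_utility and offer_utility > reserved_value:
        return Response.ACCEPT
    return Response.REJECT
```

Note that the sketch never returns an "end negotiation" response, matching the remark that the default implementation never ends the negotiation.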
- class scml.std.SingleAgreementAspirationAgent(*args, **kwargs)[source]
Bases:
scml.oneshot.agent.OneShotSyncAgent
Uses a time-based strategy to accept a single agreement from the set it is considering.
- before_step()[source]
Called at the beginning of every step.
- Remarks:
Use this for any proactive code that needs to be done every simulation step.
- counter_all(offers, states)[source]
Calculate a response to all offers from all negotiators (negotiator ID is the key).
- Parameters:
offers – Maps negotiator IDs to offers
states – Maps negotiator IDs to the negotiation states at the time the offers were made.
- Returns:
A dictionary mapping negotiator ID to an SAOResponse. The response per agent consists of a tuple. In case of acceptance or ending the negotiation, the second item of the tuple should be None. In case of rejection, the second item should be the counter offer.
- Remarks:
The response type CANNOT be WAIT.
If the system determines that a loop is formed, the agent may
receive this call for a subset of negotiations not all of them.
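The shape of the value returned by counter_all can be illustrated without scml. In the sketch below, Resp and Reply are hypothetical stand-ins for the library's response type and SAOResponse, offers are tuples whose first item is assumed to be the quantity, and the decision rule itself is invented purely to produce all three kinds of reply.

```python
from enum import Enum
from typing import NamedTuple, Optional, Tuple


class Resp(Enum):
    """Hypothetical stand-in for the response type (WAIT is deliberately absent)."""

    ACCEPT = 0
    REJECT = 1
    END = 2


class Reply(NamedTuple):
    """Hypothetical stand-in for SAOResponse: a (response, outcome) pair."""

    response: Resp
    outcome: Optional[Tuple[int, ...]]


def counter_all_sketch(offers, needed):
    """Accept offers while they fit the remaining need, counter the others
    with the remaining quantity, and end negotiations once nothing is needed."""
    replies = {}
    remaining = needed
    for negotiator_id, offer in offers.items():
        quantity = offer[0]
        if 0 < quantity <= remaining:
            # Acceptance: the second item must be None.
            replies[negotiator_id] = Reply(Resp.ACCEPT, None)
            remaining -= quantity
        elif remaining > 0:
            # Rejection: the second item carries the counter offer.
            counter = (remaining,) + tuple(offer[1:])
            replies[negotiator_id] = Reply(Resp.REJECT, counter)
        else:
            # Ending: the second item must be None.
            replies[negotiator_id] = Reply(Resp.END, None)
    return replies
```

Every key in the returned dictionary is a negotiator ID taken from the offers argument, so the function also behaves sensibly when it is called with only a subset of the running negotiations.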
- class scml.std.GreedyStdAgent(*args, concession_exponent=None, acc_price_slack=float('inf'), step_price_slack=None, opp_price_slack=None, opp_acc_price_slack=None, range_slack=None, future_threshold=0.9, production_target=0.75, **kwargs)[source]
Bases:
scml.std.agent.StdAgent
A greedy agent based on StdAgent
- Parameters:
concession_exponent – A real number controlling how fast the agent concedes on price.
acc_price_slack – The allowed slack in price limits compared with the best prices I got so far
step_price_slack – The allowed slack in price limits compared with the best prices I got this step
opp_price_slack – The allowed slack in price limits compared with the best prices I got from a given opponent in this step
opp_acc_price_slack – The allowed slack in price limits compared with the best prices I got from a given opponent so far
range_slack – Always consider prices above (1 - range_slack) of the best possible prices good enough.
production_target – Fraction of production capacity to be secured in advance
- Remarks:
A concession_exponent greater than one makes the agent concede super-linearly and vice versa
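The effect of the concession exponent can be seen with a small time-based aspiration curve. This page does not give the formula the agent actually uses, so the (1 - t) ** exponent shape below is an assumption, and the function names are hypothetical.

```python
def aspiration(t, exponent):
    """Fraction of the price range still demanded at relative time t in [0, 1].

    Assumed shape: starts at 1 (no concession) and falls to 0 at the deadline.
    """
    return (1.0 - t) ** exponent


def acceptable_price(t, exponent, worst, best):
    """Lowest selling price considered acceptable at relative time t."""
    return worst + (best - worst) * aspiration(t, exponent)
```

Under this assumed shape, halfway through the negotiation an exponent of 2 leaves only a quarter of the range still demanded (faster than linear concession), while an exponent of 0.5 leaves about 71% (slower than linear).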
- _e = None
- _acc_price_slack
- _step_price_slack = None
- _opp_price_slack = None
- _opp_acc_price_slack = None
- _range_slack = None
- _production_target = 0.75
- _future_threshold = 0.9
- propose(negotiator_id: str, state, source=None) negmas.Outcome | None [source]
Proposes an offer to one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
an outcome to offer.
- respond(negotiator_id, state, source=None) negmas.ResponseType [source]
Responds to an offer from one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
A response type which can either be reject, accept, or end negotiation.
- Remarks:
default behavior is to accept only if the current offer is the same or has a higher utility compared with what the agent would have proposed in the given state and reject otherwise
- class scml.std.GreedySyncAgent(*args, threshold=None, **kwargs)[source]
Bases:
scml.oneshot.agent.OneShotSyncAgent
,GreedyOneShotAgent
A greedy agent based on OneShotSyncAgent
- _threshold = None
- ufun: scml.oneshot.ufun.OneShotUFun
Returns the preferences if it is a BaseUtilityFunction, else None
- before_step()[source]
Called at the beginning of every step.
- Remarks:
Use this for any proactive code that needs to be done every simulation step.
- first_proposals()[source]
Decide a first proposal on every negotiation. Returning None for a negotiation means ending it.
- counter_all(offers, states) dict [source]
Respond to a set of offers given the negotiation state of each.
- propose(negotiator_id, state)[source]
Proposes an offer to one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
an outcome to offer.
- respond(negotiator_id, state, source='')[source]
Responds to an offer from one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
A response type which can either be reject, accept, or end negotiation.
- Remarks:
default behavior is to accept only if the current offer is the same or has a higher utility compared with what the agent would have proposed in the given state and reject otherwise
- class scml.std.GreedyOneShotAgent(*args, concession_exponent=None, acc_price_slack=float('inf'), step_price_slack=None, opp_price_slack=None, opp_acc_price_slack=None, range_slack=None, **kwargs)[source]
Bases:
scml.oneshot.agent.OneShotAgent
A greedy agent based on OneShotAgent
- Parameters:
concession_exponent – A real number controlling how fast the agent concedes on price.
acc_price_slack – The allowed slack in price limits compared with the best prices I got so far
step_price_slack – The allowed slack in price limits compared with the best prices I got this step
opp_price_slack – The allowed slack in price limits compared with the best prices I got from a given opponent in this step
opp_acc_price_slack – The allowed slack in price limits compared with the best prices I got from a given opponent so far
range_slack – Always consider prices above (1 - range_slack) of the best possible prices good enough.
- Remarks:
A concession_exponent greater than one makes the agent concede super-linearly and vice versa
- _e = None
- _acc_price_slack
- _step_price_slack = None
- _opp_price_slack = None
- _opp_acc_price_slack = None
- _range_slack = None
- propose(negotiator_id: str, state, source=None) negmas.Outcome | None [source]
Proposes an offer to one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
an outcome to offer.
- respond(negotiator_id, state, source=None) negmas.ResponseType [source]
Responds to an offer from one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
A response type which can either be reject, accept, or end negotiation.
- Remarks:
default behavior is to accept only if the current offer is the same or has a higher utility compared with what the agent would have proposed in the given state and reject otherwise
- class scml.std.StdPlaceholder(*args, **kwargs)[source]
Bases:
scml.std.policy.StdPolicy
An agent that always raises an exception if called to negotiate. It is useful as a placeholder (for example for RL and MARL exposition)
- class scml.std.SyncRandomStdAgent(*args, today_target_productivity=0.3, future_target_productivity=0.3, today_concentration=0.25, future_concentration=0.75, today_concession_exp=2.0, future_concession_exp=4.0, future_min_price=0.25, prioritize_near_future: bool = False, prioritize_far_future: bool = False, pfuture=0.15, **kwargs)[source]
Bases:
scml.std.agent.StdSyncAgent
An agent that distributes its needs over its partners randomly.
- ptoday = 0.85
- today_exp = 2.0
- future_exp = 4.0
- fmin = 0.25
- today_productivity = 0.3
- future_productivity = 0.3
- near = False
- far = False
- future_concentration = 0.75
- today_concentration = 0.25
- first_proposals()[source]
Gets a set of proposals to use for initializing the negotiation.
- Returns:
A dictionary mapping each negotiator (in self.negotiators dict) to an outcome to be used as the first proposal if the agent is to start a negotiation.
- counter_all(offers, states)[source]
Calculate a response to all offers from all negotiators (negotiator ID is the key).
- Parameters:
offers – Maps negotiator IDs to offers
states – Maps negotiator IDs to the negotiation states at the time the offers were made.
- Returns:
A dictionary mapping negotiator ID to an SAOResponse. The response per agent consists of a tuple. In case of acceptance or ending the negotiation, the second item of the tuple should be None. In case of rejection, the second item should be the counter offer.
- Remarks:
The response type CANNOT be WAIT.
If the system determines that a loop is formed, the agent may
receive this call for a subset of negotiations not all of them.
- distribute_todays_needs(partners=None) dict[str, int] [source]
Distributes my needs randomly over all my partners
- distribute_future_offers(partners: list[str]) dict[str, negmas.Outcome | None] [source]
Distribute future offers over the given partners
- buy_price(t: float, mn: float, mx: float, today: bool) float [source]
Return a good price to buy at
- class scml.std.SyncRandomOneShotAgent(*args, equal: bool = False, overordering_max: float = 0.2, overordering_min: float = 0.0, overordering_exp: float = 0.4, mismatch_exp: float = 4.0, mismatch_max: float = 0.3, **kwargs)[source]
Bases:
scml.oneshot.agent.OneShotSyncAgent
An agent that distributes its needs over its partners randomly.
- Parameters:
equal – If given, it tries to equally distribute its needs over as many of its suppliers/consumers as possible
overordering_max – Maximum fraction of needs to over-order. For example, if the agent needs 5 items and this is 0.2, it will order 6 in the first negotiation step.
overordering_min – Minimum fraction of needs to over-order. Used in the last negotiation step.
overordering_exp – Controls how fast the over-ordering quantity goes from max to min.
mismatch_exp – Controls how fast the agent concedes on matching its needs exactly.
mismatch_max – Maximum mismatch in quantity allowed between needs and accepted offers. If a fraction, it will be this fraction of the production capacity (n_lines).
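The over-ordering parameters can be read as a schedule that starts at overordering_max in the first negotiation step and ends at overordering_min in the last one. The interpolation below is an assumed shape, not the agent's actual code, and both function names are hypothetical; it reproduces the example of ordering 6 items when 5 are needed.

```python
def overordering_fraction(t, mx=0.2, mn=0.0, exp=0.4):
    """Fraction of needs to over-order at relative negotiation time t in [0, 1].

    Assumed shape: equals mx at t = 0 and mn at t = 1; exp sets the curvature.
    """
    return mn + (mx - mn) * (1.0 - t) ** exp


def quantity_to_order(needs, t, **kwargs):
    """Needs inflated by the over-ordering fraction, rounded to whole items."""
    return round(needs * (1.0 + overordering_fraction(t, **kwargs)))
```

With the defaults shown in the class signature, an agent needing 5 items would ask for 6 at the start of the negotiation and exactly 5 at the end.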
- equal_distribution = False
- overordering_max = 0.2
- overordering_min = 0.0
- overordering_exp = 0.4
- mismatch_exp = 4.0
- mismatch_max = 0.3
- init()[source]
Called once after the AWI is set.
- Remarks:
Use this for any proactive initialization code.
- distribute_needs(t: float) dict[str, int] [source]
Distributes my needs randomly over all my partners
- first_proposals()[source]
Gets a set of proposals to use for initializing the negotiation.
- Returns:
A dictionary mapping each negotiator (in self.negotiators dict) to an outcome to be used as the first proposal if the agent is to start a negotiation.
- counter_all(offers, states)[source]
Calculate a response to all offers from all negotiators (negotiator ID is the key).
- Parameters:
offers – Maps negotiator IDs to offers
states – Maps negotiator IDs to the negotiation states at the time the offers were made.
- Returns:
A dictionary mapping negotiator ID to an SAOResponse. The response per agent consists of a tuple. In case of acceptance or ending the negotiation, the second item of the tuple should be None. In case of rejection, the second item should be the counter offer.
- Remarks:
The response type CANNOT be WAIT.
If the system determines that a loop is formed, the agent may
receive this call for a subset of negotiations not all of them.
- class scml.std.RandomStdAgent(owner=None, ufun=None, name=None, p_accept=PROB_ACCEPTANCE, p_end=PROB_END)[source]
Bases:
scml.std.agent.StdAgent
A naive random agent
- propose(negotiator_id: str, state: negmas.sao.SAOState) negmas.Outcome | None [source]
Proposes an offer to one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
an outcome to offer.
- respond(negotiator_id, state, source=None)[source]
Responds to an offer from one of the partners.
- Parameters:
negotiator_id – ID of the negotiator (and partner)
state – Mechanism state including current step
- Returns:
A response type which can either be reject, accept, or end negotiation.
- Remarks:
default behavior is to accept only if the current offer is the same or has a higher utility compared with what the agent would have proposed in the given state and reject otherwise
- class scml.std.StdAWI(world: scml.oneshot.world.SCMLBaseWorld, agent: scml.oneshot.agent.OneShotAgent)[source]
Bases:
scml.oneshot.awi.OneShotAWI
The agent world interface for the one-shot game.
This class contains all the methods needed to access the simulation to extract information which are divided into 4 groups:
- Static World Information:
Information about the world and the agent that does not change over time. These include:
Market Information:
n_products: Number of products in the production chain.
n_processes: Number of processes in the production chain.
n_competitors: Number of other factories on the same production level.
all_suppliers: A list of all suppliers by product.
all_consumers: A list of all consumers by product.
production_capacities: The total production capacity (i.e. number of lines) for each production level (i.e. manufacturing process).
is_system: Is the given system ID corresponding to a system agent?
is_bankrupt: Is the given agent bankrupt? None asks about self
catalog_prices: A list of the catalog prices (by product).
price_multiplier: The multiplier multiplied by the trading/catalog price when the negotiation agendas are created to decide the maximum and lower quantities.
is_exogenous_forced: Are exogenous contracts always forced or can the agent decide not to sign them.
current_step: Current simulation step (inherited from negmas.situated.AgentWorldInterface).
n_steps: Number of simulation steps (inherited from negmas.situated.AgentWorldInterface).
relative_time: Fraction of the simulation completed (inherited from negmas.situated.AgentWorldInterface).
state: The full state of the agent (OneShotState).
settings: The system settings (inherited from negmas.situated.AgentWorldInterface).
quantity_range: The maximum quantity in all negotiation agendas (new in 0.6.1)
price_range: The maximum number of different prices in any negotiation agenda (new in 0.6.1)
horizon: The negotiation horizon for delivery dates. A value greater than zero indicates that you can get agreements about future deliveries.
Agent Information:
profile: Gives the agent profile including its production cost, number of production lines, input product index, mean of its delivery penalties, mean of its disposal costs, standard deviation of its shortfall penalties and standard deviation of its disposal costs. See OneShotProfile for full description. This information is private information and no other agent knows it.
n_lines: The number of production lines in the factory (private information).
is_first_level: Is the agent in the first production level (i.e. it is an input agent that buys the raw material).
is_last_level: Is the agent in the last production level (i.e. it is an output agent that sells the final product).
is_middle_level: Is the agent neither a first level nor a last level agent
my_input_product: The input product to the factory controlled by the agent.
my_output_product: The output product from the factory controlled by the agent.
level: The production level which is numerically the same as the input product.
my_suppliers: A list of IDs for all suppliers to the agent (i.e. agents that can sell the input product of the agent).
my_consumers: A list of IDs for all consumers to the agent (i.e. agents that can buy the output product of the agent).
penalties_scale: The scale at which to calculate disposal cost/delivery penalties. “trading” and “catalog” mean trading and catalog prices. “unit” means the contract’s unit price while “none” means that disposal cost/shortfall penalty are absolute.
n_input_negotiations: Number of negotiations with suppliers.
n_output_negotiations: Number of negotiations with consumers.
- Dynamic World Information:
Information about the world and the agent that changes over time.
Market Information:
trading_prices: The trading prices of all products. This information is only available if publish_trading_prices is set in the world.
exogenous_contract_summary: A list of n_products tuples each giving the total quantity and average price of exogenous contracts for a product. This information is only available if publish_exogenous_summary is set in the world.
is_perishable: Are all products perishable?
Other Agents’ Information:
reports_of_agent: Gives all past financial reports of a given agent. See FinancialReport for details.
reports_at_step: Gives all reports of all agents at a given step. See FinancialReport for details.
Current Negotiations Information:
current_input_outcome_space: The current outcome-space for all negotiations to buy the input product of the agent. If the agent is at level zero, this will have no issues.
current_output_outcome_space: The current outcome-space for all negotiations to sell the output product of the agent. If the agent is at level n_products - 1, this will have no issues.
current_negotiation_details: Details on all current negotiations separated into “buy” and “sell” dictionaries.
Useful helpers about current negotiations:
current_input_issues: The current issues for all negotiations to buy the input product of the agent. If the agent is at level zero, this will be empty. This is exactly the same as current_input_outcome_space.issues
current_output_issues: The current issues for all negotiations to sell the output product of the agent. If the agent is at level n_products - 1, this will be empty. This is exactly the same as current_output_outcome_space.issues
current_buy_nmis: All NMIs for current buy negotiations.
current_sell_nmis: All NMIs for current sell negotiations.
current_nmis: All NMIs for current negotiations.
current_buy_states: All states for current buy negotiations.
current_sell_states: All states for current sell negotiations.
current_states: All states for current negotiations.
current_buy_offers: All offers for current buy negotiations.
current_sell_offers: All offers for current sell negotiations.
current_offers: All offers for current negotiations.
running_buy_nmis: All NMIs for running buy negotiations.
running_sell_nmis: All NMIs for running sell negotiations.
running_nmis: All NMIs for running negotiations.
running_buy_states: All states for running buy negotiations.
running_sell_states: All states for running sell negotiations.
running_states: All states for running negotiations.
Agent Information:
current_exogenous_input_quantity: The total quantity the agent has in its input exogenous contract.
current_exogenous_input_price: The total price of the agent’s input exogenous contract.
current_exogenous_output_quantity: The total quantity the agent has in its output exogenous contract.
current_exogenous_output_price: The total price of the agent’s output exogenous contract
current_disposal_cost: The disposal cost per unit item in the current step.
current_shortfall_penalty: The shortfall penalty per unit item in the current step.
current_balance: The current balance of the agent
current_score: The current score (balance / initial balance) of the agent
current_inventory_input: The total quantity remaining in the inventory of the input product
current_inventory_output: The total quantity remaining in the inventory of the output product
current_inventory: The total quantity remaining in the inventory of the input and output product
Sales and Supplies (quantities) for today:
sales: Today’s sales per customer so far.
supplies: Today’s supplies per supplier so far.
total_sales: Today’s total sales so far.
total_supplies: Today’s total supplies so far.
needed_sales: Today’s needed sales as of now (exogenous input + total supplies - exogenous output - total sales so far).
needed_supplies: Today’s needed supplies as of now (exogenous output + total sales - exogenous input - total supplies so far).
future_sales: Future quantity of the output product in standing contracts not executed nor nullified.
future_supplies: Future quantity of the input product in standing contracts not executed nor nullified.
total_future_sales: Total future quantity of the output product in standing contracts not executed nor nullified.
total_future_supplies: Total future quantity of the input product in standing contracts not executed nor nullified.
total_future_sales_between: Total future sale quantities between the given two simulated days (non-exogenous).
total_future_supplies_between: Total future supply quantities between the given two simulated days (non-exogenous).
total_future_sales_until: Total future sale quantities between tomorrow and the given day (non-exogenous).
total_future_supplies_until: Total future supply quantities between tomorrow and the given day (non-exogenous).
total_future_sales_at: Total future sale quantities at the given day (non-exogenous).
total_future_supplies_at: Total future supply quantities at the given day (non-exogenous).
future_sales_cost: Future total_cost of the output product in standing contracts not executed nor nullified.
future_supplies_cost: Future total cost of the input product in standing contracts not executed nor nullified.
- Services (All inherited from negmas.situated.AgentWorldInterface):
logdebug/loginfo/logwarning/logerror: Logs to the world log at the given log level.
logdebug_agent/loginfo_agent/…: Logs to the agent specific log at the given log level.
bb_query: Queries the bulletin-board.
bb_read: Read a section of the bulletin-board.
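The needed_sales and needed_supplies entries listed for StdAWI are defined by simple arithmetic over today's quantities. The sketch below restates those two formulas as standalone functions; the function names mirror the listed attributes, but the functions are illustrative and are not part of the AWI.

```python
def needed_sales(exogenous_input, total_supplies, exogenous_output, total_sales):
    """Exogenous input + total supplies - exogenous output - total sales so far."""
    return exogenous_input + total_supplies - exogenous_output - total_sales


def needed_supplies(exogenous_input, total_supplies, exogenous_output, total_sales):
    """Exogenous output + total sales - exogenous input - total supplies so far."""
    return exogenous_output + total_sales - exogenous_input - total_supplies
```

By these definitions the two quantities are always the negative of each other, so a positive needed_sales implies a negative needed_supplies of the same magnitude.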
- class scml.std.StdState[source]
Bases:
scml.oneshot.common.OneShotState
State of a one-shot agent
- class scml.std.StdExogenousContract[source]
Bases:
scml.oneshot.common.OneShotExogenousContract
Exogenous contract information
- class scml.std.StdProfile[source]
Bases:
scml.oneshot.common.OneShotProfile
Defines all private information of a factory
- class scml.std.FinancialReport[source]
A report published periodically by the system showing the financial standing of an agent
- __slots__ = ['agent_id', 'step', 'cash', 'assets', 'breach_prob', 'breach_level', 'is_bankrupt', 'agent_name']
- breach_prob: float
Number of times the agent breached a contract over the total number of contracts it signed.
- breach_level: float
Sum of the agent’s breach levels so far divided by the number of contracts it signed.
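The two breach statistics are both normalized by the number of contracts the agent signed. A minimal sketch of that arithmetic follows; the function name is hypothetical, and returning zeros for an agent with no signed contracts is an assumption rather than documented behavior.

```python
def breach_stats(breach_levels, n_signed):
    """Return (breach_prob, breach_level) for an agent.

    breach_levels: one entry per breached contract giving its breach level.
    n_signed: total number of contracts the agent signed.
    """
    if n_signed == 0:
        # Assumption: report zeros when the agent has signed nothing yet.
        return 0.0, 0.0
    breach_prob = len(breach_levels) / n_signed
    breach_level = sum(breach_levels) / n_signed
    return breach_prob, breach_level
```

For example, an agent that signed 4 contracts and breached two of them at levels 0.5 and 1.0 has a breach_prob of 0.5 and a breach_level of 0.375.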
- scml.std.is_system_agent(aid: str) bool [source]
Checks whether an agent is a system agent or not
- Parameters:
aid – Agent ID
- Returns:
True if the ID is for a system agent.
- scml.std.INFINITE_COST = 4611686018427387903[source]
A constant indicating an invalid cost for lines incapable of running some process
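The value shown for INFINITE_COST equals 2**62 - 1, which makes it usable as a sentinel in integer cost tables. The sketch below copies the value literally instead of importing it, and runnable_lines is a hypothetical helper showing how such a sentinel can be filtered out.

```python
# Literal copy of the documented value; it equals 2**62 - 1.
INFINITE_COST = 4611686018427387903


def runnable_lines(costs):
    """Return indices of lines whose cost is a real cost, not the sentinel."""
    return [i for i, cost in enumerate(costs) if cost != INFINITE_COST]
```

For a cost list such as [3, INFINITE_COST, 7], only lines 0 and 2 are capable of running the process.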
- class scml.std.BaseStdContext[source]
Bases:
scml.oneshot.context.BaseContext
A context that generates std worlds with agents of the given types with predetermined structure and settings
- world_type: type[scml.oneshot.world.SCMLBaseWorld]
- non_competitors: tuple[str | type[scml.oneshot.agent.OneShotAgent], Ellipsis]
- class scml.std.GeneralStdContext[source]
Bases:
scml.oneshot.context.GeneralContext
A context that generates std worlds with agents of the given types with predetermined structure and settings
- world_type: type[scml.oneshot.world.SCMLBaseWorld]
- non_competitors: tuple[str | type[scml.oneshot.agent.OneShotAgent], Ellipsis]
- class scml.std.FixedPartnerNumbersStdContext[source]
Bases:
scml.oneshot.context.FixedPartnerNumbersContext
Generates a world limiting the range of the agent level, production capacity and the number of suppliers, consumers, and optionally same-level competitors.
- world_type: type[scml.oneshot.world.SCMLBaseWorld]
- non_competitors: tuple[str | type[scml.oneshot.agent.OneShotAgent], Ellipsis]
- class scml.std.LimitedPartnerNumbersStdContext[source]
Bases:
scml.oneshot.context.LimitedPartnerNumbersOneShotContext
Generates a oneshot world limiting the range of the agent level, production capacity and the number of suppliers, consumers, and optionally same-level competitors.
- class scml.std.ANACStdContext[source]
Bases:
GeneralStdContext
Generates a oneshot world with no constraints except compatibility with a specific ANAC competition year.
- class scml.std.SupplierStdContext(*args, **kwargs)[source]
Bases:
scml.oneshot.context.SupplierContext
A world context that can generate any world compatible with the observation manager
- world_type: type[scml.oneshot.world.SCMLBaseWorld]
- non_competitors: tuple[str | type[scml.oneshot.agent.OneShotAgent], Ellipsis]
- class scml.std.StrongSupplierStdContext(*args, **kwargs)[source]
Bases:
SupplierStdContext
A supplier with many consumers relative to competitors
- class scml.std.BalancedSupplierStdContext(*args, **kwargs)[source]
Bases:
SupplierStdContext
A supplier with almost the same number of consumers as competitors
- class scml.std.WeakSupplierStdContext(*args, **kwargs)[source]
Bases:
SupplierStdContext
A supplier with few consumers relative to competitors
- class scml.std.MiddleManStdContext(*args, **kwargs)[source]
Bases:
scml.oneshot.context.LimitedPartnerNumbersOneShotContext
A world context that can generate any world compatible with the observation manager
- class scml.std.StrongMiddleManStdContext(*args, **kwargs)[source]
Bases:
MiddleManStdContext
A supplier with many consumers relative to competitors
- class scml.std.BalancedMiddleManStdContext(*args, **kwargs)[source]
Bases:
MiddleManStdContext
A supplier with almost the same number of consumers as competitors
- class scml.std.WeakMiddleManStdContext(*args, **kwargs)[source]
Bases:
MiddleManStdContext
A supplier with few consumers relative to competitors
- class scml.std.ConsumerStdContext(*args, **kwargs)[source]
Bases:
scml.oneshot.context.ConsumerContext
A world context that can generate any world compatible with the observation manager
- world_type: type[scml.oneshot.world.SCMLBaseWorld]
- non_competitors: tuple[str | type[scml.oneshot.agent.OneShotAgent], Ellipsis]
- class scml.std.StrongConsumerStdContext(*args, **kwargs)[source]
Bases:
ConsumerStdContext
A consumer with many suppliers relative to competitors
- class scml.std.BalancedConsumerStdContext(*args, **kwargs)[source]
Bases:
ConsumerStdContext
A consumer with almost the same number of suppliers as competitors
- class scml.std.WeakConsumerStdContext(*args, **kwargs)[source]
Bases:
ConsumerStdContext
A consumer with few suppliers relative to competitors
- class scml.std.StdContext[source]
Bases:
GeneralStdContext
A basic context fixing stationary world config parameters
- class scml.std.RepeatingStdContext[source]
Bases:
scml.oneshot.context.RepeatingContext
Encapsulates one or more configs and switches between them when asked to generate or make something.
- world_type: type[scml.oneshot.world.SCMLBaseWorld]
- non_competitors: tuple[str | type[scml.oneshot.agent.OneShotAgent], Ellipsis]
- class scml.std.StdPolicy(*args, **kwargs)[source]
Bases:
scml.oneshot.policy.OneShotPolicy
A std agent structured in three components, state encoder, policy (action) and action decoder.
The agent is divided into three components:
State encoder (encode_state()) which takes the current state of all negotiation mechanisms, accesses the AWI as needed, and generates a state which can be of any type to be passed to the next component.
Policy (act()) which takes the state generated from the state encoder and returns an action which may be encoded as any type to be passed to the next component. The policy (i.e. the act() method) is not supposed to access the AWI or any other members of the class. It is preferred to be a pure function. This makes it easy to test the policy under predefined conditions (i.e. states) without having to construct a simulation.
Action decoder (decode_action()) which takes the action generated from the policy and generates the appropriate set of responses to all partners.
- Remarks:
The simplest form of state encoder, which is implemented by default, is to return the state member of the AWI.
The simplest form of action encoding is to simply return the responses as a dict[str, SAOResponse] from act(), which is then passed as it is by decode_action(). This is the default implementation of decode_action().
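The three-component split described above can be tried out without building a simulation. The following is only a minimal sketch of the idea, not the library's implementation: ToyPolicy, its string responses, and the dictionary state are hypothetical stand-ins for a StdPolicy subclass, SAOResponse objects, and the AWI state.

```python
from typing import Any


class ToyPolicy:
    """Hypothetical stand-in mirroring the encode_state / act / decode_action split."""

    def encode_state(self, offers: dict[str, tuple[int, int]], needed: int) -> dict[str, Any]:
        # Gather everything the policy needs so that act() never touches the world.
        return {"offers": offers, "needed": needed}

    def act(self, state: dict[str, Any]) -> dict[str, str]:
        # Pure function of the state: accept (quantity, price) offers that still
        # fit within the remaining need and reject the others.
        remaining = state["needed"]
        action = {}
        for partner, (quantity, _price) in sorted(state["offers"].items()):
            if quantity <= remaining:
                action[partner] = "accept"
                remaining -= quantity
            else:
                action[partner] = "reject"
        return action

    def decode_action(self, action: dict[str, str]) -> dict[str, str]:
        # Simplest decoder: the action already maps partner IDs to responses.
        return action

    def respond(self, offers: dict[str, tuple[int, int]], needed: int) -> dict[str, str]:
        return self.decode_action(self.act(self.encode_state(offers, needed)))


policy = ToyPolicy()
responses = policy.respond({"a": (4, 10), "b": (3, 12), "c": (5, 9)}, needed=8)
```

Because act() depends only on its argument, it can be checked on hand-written states, which is the testing benefit the description refers to.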
- class scml.std.ActionManager[source]
Bases:
abc.ABC
Manages the actions of an agent in an RL environment.
- context: scml.oneshot.context.BaseContext
- abstract decode(awi: scml.oneshot.awi.OneShotAWI, action: numpy.ndarray) dict[str, negmas.sao.common.SAOResponse] [source]
Decodes an action from an array to a PurchaseOrder and a CounterMessage.
- encode(awi: scml.oneshot.awi.OneShotAWI, responses: dict[str, negmas.sao.common.SAOResponse]) numpy.ndarray [source]
Encodes an action as an array. This is only used for testing so it is optional
- class scml.std.FlexibleActionManager[source]
Bases:
ActionManager
An action manager that matches any context.
- Parameters:
n_prices – Number of distinct prices allowed in the action.
max_quantity – Maximum allowed quantity to offer in any negotiation. The number of quantities is one plus that because zero is allowed to model ending negotiation.
n_partners – Maximum number of partners allowed in the action.
- Remarks:
This action manager will always generate offers that are within the price and quantity limits given in its parameters. When decoding them, it will scale them up so that the maximum corresponds to the actual value in the world it finds itself in. For example, if n_prices is 10 and the world has only two prices currently in the price issue, it will use any value less than 5 as the minimum price and any value above 5 as the maximum price. If, on the other hand, the current price issue has 20 values, then it will scale by multiplying the number given in the encoded action (ranging from 0 to 9) by 19/9, which makes it range from 0 to 19, which is what is expected by the world.
This action manager will adjust offers for different numbers of partners as follows:
If the true number of partners is larger than n_partners used by this action manager, it will simply use n_partners of them and always end negotiations with the rest of them.
If the true number of partners is smaller than n_partners, it will use the first n_partners values in the encoded action and increase the quantities of any counter offers (i.e. ones in which the response is REJECT_OFFER) by the amount missing from the ignored partners in the encoded action up to the maximum quantities allowed by the current negotiation context. For example, if n_partners is 4 and we have only 2 partners in reality, and the received quantities from partners were [4, 3] while the maximum quantity allowed is 10 and the encoded action was [2, *, 3, *, 2, *, 1, *] (where we ignored prices), then the encoded action will be converted to [(Reject, 5, *), (Accept, 3, *)] where the 3 extra units that were supposed to be offered to the last two partners are moved to the first partner. If the maximum quantity allowed was 4 in that example, the result will be [(Reject, 4, *), (Accept, 3, *)].
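The two adjustments in the remarks above can be reproduced with a short sketch. It is an illustration under stated assumptions, not the manager's actual code: scale_index and redistribute are hypothetical helpers, rounding to the nearest index is assumed, prices are ignored as in the example, and a response counts as an acceptance when the encoded quantity equals the received quantity.

```python
def scale_index(value: int, n_encoded: int, n_actual: int) -> int:
    """Map an index in [0, n_encoded - 1] linearly onto [0, n_actual - 1]."""
    if n_encoded <= 1:
        return 0
    # With n_encoded=10 and n_actual=20 this multiplies by 19/9.
    return int(round(value * (n_actual - 1) / (n_encoded - 1)))


def redistribute(encoded: list[int], received: list[int], max_quantity: int) -> list[tuple[str, int]]:
    """Move quantities encoded for missing partners onto the counter offers."""
    n_real = len(received)
    extra = sum(encoded[n_real:])  # units meant for partners that do not exist
    responses = []
    for quantity, offered in zip(encoded[:n_real], received):
        if quantity == offered:
            # Assumed acceptance rule: same quantity as the received offer.
            responses.append(("accept", quantity))
            continue
        added = max(0, min(extra, max_quantity - quantity))
        extra -= added
        responses.append(("reject", quantity + added))
    return responses


wide = redistribute([2, 3, 2, 1], received=[4, 3], max_quantity=10)
tight = redistribute([2, 3, 2, 1], received=[4, 3], max_quantity=4)
```

Under these assumptions wide and tight match the two results given in the example, and scale_index(9, 10, 20) maps the largest encoded price index to 19.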
- make_space() gymnasium.spaces.MultiDiscrete | gymnasium.spaces.Box [source]
Creates the action space
- decode(awi: scml.oneshot.awi.OneShotAWI, action: numpy.ndarray) dict[str, negmas.sao.common.SAOResponse] [source]
Generates offers to all partners from an encoded action. Default is to return the action as it is assuming it is a dict[str, SAOResponse].
- encode(awi: scml.oneshot.awi.OneShotAWI, responses: dict[str, negmas.sao.common.SAOResponse]) numpy.ndarray [source]
Receives offers for all partners and generates the corresponding action. Used mostly for debugging and testing.
- scml.std.model_wrapper(model, deterministic: bool = False) RLModel [source]
Wraps a stable_baselines3 model as an RL model
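One plausible way to read this wrapper is as a closure that turns a model's predict() call into a policy, i.e. a callable from observation to action. The sketch below is an assumption about the general shape, not the actual body of model_wrapper: wrap_model and StubModel are hypothetical, and the only stable_baselines3 behaviour relied on is that predict() returns an (action, state) pair.

```python
from typing import Callable, Sequence


class StubModel:
    """Hypothetical stand-in for a trained stable_baselines3 model."""

    def predict(self, observation, deterministic: bool = False):
        # Like stable_baselines3 models, return an (action, state) pair.
        action = [0 for _ in observation]
        return action, None


def wrap_model(model, deterministic: bool = False) -> Callable[[Sequence[float]], list]:
    """Turn an object with a predict() method into a policy: observation -> action."""

    def policy(obs):
        action, _state = model.predict(obs, deterministic=deterministic)
        return action

    return policy


policy = wrap_model(StubModel(), deterministic=True)
action = policy([0.1, 0.5, 0.9])
```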
- class scml.std.StdEnv(action_manager: scml.oneshot.rl.action.ActionManager, observation_manager: scml.oneshot.rl.observation.ObservationManager, reward_function: scml.oneshot.rl.reward.RewardFunction = DefaultRewardFunction(), render_mode=None, context: scml.oneshot.context.GeneralContext = FixedPartnerNumbersStdContext(), agent_type: type[scml.std.agent.StdAgent] = StdPlaceholder, agent_params: dict[str, Any] | None = None, extra_checks: bool = True, skip_after_negotiations: bool = True)[source]
Bases:
scml.oneshot.rl.env.OneShotEnv
The main Gymnasium class for implementing Reinforcement Learning Agents environments.
The class encapsulates an environment with arbitrary behind-the-scenes dynamics through the step() and reset() functions. An environment can be partially or fully observed by single agents. For multi-agent environments, see PettingZoo.
The main API methods that users of this class need to know are:
step() - Updates an environment with actions, returning the next agent observation, the reward for taking those actions, whether the environment has terminated or truncated due to the latest action, and information from the environment about the step, i.e. metrics, debug info.
reset() - Resets the environment to an initial state, required before calling step. Returns the first agent observation for an episode and information, i.e. metrics, debug info.
render() - Renders the environment to help visualise what the agent sees; example modes are “human”, “rgb_array”, and “ansi” for text.
close() - Closes the environment, important when external software is used, i.e. pygame for rendering, databases.
Environments have additional attributes for users to understand the implementation:
action_space - The Space object corresponding to valid actions, all valid actions should be contained within the space.
observation_space - The Space object corresponding to valid observations, all valid observations should be contained within the space.
spec - An environment spec that contains the information used to initialize the environment from gymnasium.make()
metadata - The metadata of the environment, e.g. {"render_modes": ["rgb_array", "human"], "render_fps": 30}. For Jax or Torch, this can be indicated to users with "jax"=True or "torch"=True.
np_random - The random number generator for the environment. This is automatically assigned during super().reset(seed=seed) and when accessing np_random.
See also
For modifying or extending environments use the gymnasium.Wrapper class.
Note
To get reproducible sampling of actions, a seed can be set with env.action_space.seed(123).
Note
For strict type checking (e.g. mypy or pyright), Env is a generic class with two parameterized types: ObsType and ActType. The ObsType and ActType are the expected types of the observations and actions used in reset() and step(). The environment’s observation_space and action_space should have type Space[ObsType] and Space[ActType]; see a space’s implementation to find its parameterized type.
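The reset()/step() contract described above leads to the usual interaction loop. The sketch below uses ToyEnv, a hypothetical stand-in written only for this illustration so that the loop runs without gymnasium or scml installed; it ignores the seed and does not model negotiations.

```python
class ToyEnv:
    """Hypothetical environment following the reset()/step() return conventions."""

    def __init__(self, n_steps: int = 3):
        self.n_steps = n_steps
        self._t = 0

    def reset(self, seed=None):
        # Returns (observation, info); the seed is ignored in this toy example.
        self._t = 0
        return [0.0], {}

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info).
        self._t += 1
        observation = [float(self._t)]
        reward = 1.0 if action == 1 else 0.0
        terminated = self._t >= self.n_steps
        truncated = False
        return observation, reward, terminated, truncated, {"t": self._t}


env = ToyEnv(n_steps=3)
obs, info = env.reset(seed=0)
total_reward, done = 0.0, False
while not done:
    action = 1  # a real agent would compute this from obs
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
```

With a real StdEnv the same loop shape applies, with actions and observations being the arrays produced by the action and observation managers.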
- class scml.std.ObservationManager[source]
Bases:
Protocol
Manages the observations of an agent in an RL environment
- property context: scml.oneshot.context.BaseContext
- encode(awi: scml.oneshot.awi.OneShotAWI) numpy.ndarray [source]
Encodes an observation from the agent’s awi
- make_first_observation(awi: scml.oneshot.awi.OneShotAWI) numpy.ndarray [source]
Creates the initial observation (returned from gym’s reset())
- get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) dict[str, negmas.outcomes.Outcome | None] [source]
Gets the offers from an encoded awi
- class scml.std.FlexibleObservationManager[source]
Bases:
BaseObservationManager
An observation manager that can be used with any SCML world.
- Parameters:
capacity_multiplier – A factor to multiply by the number of lines to give the maximum quantity allowed in offers
exogenous_multiplier – A factor to multiply maximum production capacity with when encoding exogenous quantities
continuous – If given the observation space will be a Box otherwise it will be a MultiDiscrete
n_prices – The number of prices to use for encoding the unit price (if not continuous)
max_production_cost – The limit for production cost. Anything above that will be mapped to this max
max_group_size – Maximum size used for grouping observations from multiple partners. This will be used if the number of partners in the simulation is larger than the number used for training.
n_past_received_offers – Number of past received offers to add to the observation.
n_bins – Number of bins to use for discretization (if not continuous)
n_sigmas – The number of sigmas used for limiting the range of randomly distributed variables
extra_checks – If given, extra checks are applied to make sure encoding and decoding make sense
- Remarks:
…
- _previous_offers: collections.deque
- get_dims() list[int] [source]
Get the sizes of all dimensions in the observation space. Used if not continuous.
- make_space() gymnasium.spaces.MultiDiscrete | gymnasium.spaces.Box [source]
Creates the observation space
- make_first_observation(awi: scml.oneshot.awi.OneShotAWI) numpy.ndarray [source]
Creates the initial observation (returned from gym’s reset())
- encode(awi: scml.oneshot.awi.OneShotAWI) numpy.ndarray [source]
Encodes the awi as an array
- extra_obs(awi: scml.oneshot.awi.OneShotAWI) list[tuple[float, int] | float] [source]
The observation values other than offers and previous offers.
- Returns:
A list of tuples. Each is some observation variable as a real number between zero and one and a number of bins to use for discretizing this variable. If a single value, the number of bins will be self.n_bins
- get_offers(awi: scml.oneshot.awi.OneShotAWI, encoded: numpy.ndarray) dict[str, negmas.outcomes.Outcome | None] [source]
Gets offers from an encoded awi.
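The return convention of extra_obs() (a value in [0, 1], optionally paired with its own number of bins) suggests a simple discretization step when the space is not continuous. The sketch below is an assumed, simplified version of that step: to_bin and discretize are hypothetical helpers, and equal-width bins are an assumption.

```python
def to_bin(value: float, n_bins: int) -> int:
    """Map a value in [0, 1] to a bin index in [0, n_bins - 1] (equal-width bins)."""
    clipped = min(max(value, 0.0), 1.0)
    # The value 1.0 would land in bin n_bins, so clamp it into the last bin.
    return min(int(clipped * n_bins), n_bins - 1)


def discretize(extra, default_bins: int = 10) -> list[int]:
    """Each item is either (value, n_bins) or a bare value that uses default_bins."""
    result = []
    for item in extra:
        value, n_bins = item if isinstance(item, tuple) else (item, default_bins)
        result.append(to_bin(value, n_bins))
    return result


bins = discretize([(0.5, 4), 0.75, (1.0, 3)])
```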
- scml.std.random_action(obs: numpy.ndarray, env: scml.oneshot.rl.env.OneShotEnv) numpy.ndarray [source]
Samples a random action from the action space of the
- scml.std.random_policy(obs: numpy.ndarray, env: scml.oneshot.rl.env.OneShotEnv, pend: float = 0.05, paccept: float = 0.15) numpy.ndarray [source]
Ends the negotiation or accepts with a predefined probability or samples a random response.
- scml.std.greedy_policy(obs: numpy.ndarray, awi: scml.oneshot.awi.OneShotAWI, obs_manager: scml.oneshot.rl.observation.ObservationManager, action_manager: scml.oneshot.rl.action.ActionManager = FlexibleActionManager(ANACOneShotContext()), debug=False, distributor: Callable[[int, int], list[int]] = all_but_concentrated) numpy.ndarray [source]
A simple greedy policy.
- Parameters:
obs – The current observation
awi – The AWI of the agent running the policy
obs_manager – The observation manager used to encode the observation
action_manager – The action manager to be used to encode the action
debug – If True, extra assertions are tested
distributor – A callable that receives a total quantity to be distributed over n partners and returns a list of n values that sum to this total quantity
- Remarks:
Accepts the subset of offers with maximum total quantity under current needs.
The remaining quantity is distributed over the remaining partners using the distributor function
Prices are set to the worst for the agent if the price range is small else they are set randomly
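The first two remarks can be illustrated with a small sketch: pick the subset of offers with the largest total quantity that does not exceed the current needs, then spread what is left over the remaining partners. Both helpers are hypothetical; even_distributor only satisfies the stated contract of a distributor (n values summing to the total) and is not the library's default all_but_concentrated.

```python
from itertools import combinations


def even_distributor(total: int, n: int) -> list[int]:
    """Split total over n partners as evenly as possible; the values sum to total."""
    if n <= 0:
        return []
    base, remainder = divmod(total, n)
    return [base + (1 if i < remainder else 0) for i in range(n)]


def best_subset(offered: dict[str, int], needs: int) -> tuple[str, ...]:
    """Partners whose offered quantities have the largest total not exceeding needs."""
    best, best_total = (), 0
    partners = sorted(offered)
    for size in range(1, len(partners) + 1):
        for subset in combinations(partners, size):
            total = sum(offered[p] for p in subset)
            if best_total < total <= needs:
                best, best_total = subset, total
    return best


offers = {"a": 4, "b": 3, "c": 6}
accepted = best_subset(offers, needs=8)
remaining = 8 - sum(offers[p] for p in accepted)
counter_quantities = even_distributor(remaining, len(offers) - len(accepted))
```

The exhaustive search is exponential in the number of partners, which is acceptable only for the handful of partners assumed in this illustration.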
- class scml.std.RewardFunction[source]
Bases:
Protocol
Represents a reward function.
- Remarks:
before_action is called before the action is executed for initialization and should return info to be passed to the call
__call__ is called with the awi (to get the state), action and info and should return the reward
- before_action(awi: scml.oneshot.awi.OneShotAWI) Any [source]
Called before executing the action from the RL agent to save any required information for calculating the reward in its return
- Remarks:
The returned value will be passed as info to __call__() when it is time to calculate the reward.
- __call__(awi: scml.oneshot.awi.OneShotAWI, action: dict[str, negmas.SAOResponse], info: Any) float [source]
Called to calculate the reward to be given to the agent at the end of a step.
- Parameters:
awi – OneShotAWI to access the agent’s state
action – The action (decoded) as a mapping from partner ID to responses to their last offer.
info – Information generated from before_action(). You can use this to store baselines for calculating the reward
- Returns:
The reward (a number) to be given to the agent at the end of the step.
- class scml.std.DefaultRewardFunction[source]
Bases:
RewardFunction
The default reward function of SCML
- Remarks:
The reward is the difference between the balance before the action and after it.
- before_action(awi: scml.oneshot.awi.OneShotAWI) float [source]
Called before executing the action from the RL agent to save any required information for calculating the reward in its return
- Remarks:
The returned value will be passed as info to __call__() when it is time to calculate the reward.
- __call__(awi: scml.oneshot.awi.OneShotAWI, action: dict[str, negmas.SAOResponse], info: float)[source]
Called to calculate the reward to be given to the agent at the end of a step.
- Parameters:
awi – OneShotAWI to access the agent’s state
action – The action (decoded) as a mapping from partner ID to responses to their last offer.
info – Information generated from before_action(). You can use this to store baselines for calculating the reward
- Returns:
The reward (a number) to be given to the agent at the end of the step.
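The before_action()/__call__() handshake can be sketched with a balance-difference reward like the one described in the remarks. This is an illustration rather than the library's code: StubAWI is a hypothetical stand-in exposing only a balance, and treating a balance increase as a positive reward is an assumption about the sign.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StubAWI:
    """Hypothetical stand-in exposing only the agent's balance."""

    current_balance: float


class BalanceChangeReward:
    def before_action(self, awi: StubAWI) -> float:
        # Save the baseline that __call__ later receives as `info`.
        return awi.current_balance

    def __call__(self, awi: StubAWI, action: dict[str, Any], info: float) -> float:
        # Positive when the balance grew during the step (sign is an assumption).
        return awi.current_balance - info


reward_function = BalanceChangeReward()
awi = StubAWI(current_balance=100.0)
baseline = reward_function.before_action(awi)
awi.current_balance = 112.5  # the step changes the balance
reward = reward_function(awi, action={}, info=baseline)
```

A custom reward can follow the same two-method shape, storing whatever baseline it needs in the value returned by before_action().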
- class scml.std.DefaultStdAdapter(*args, **kwargs)[source]
Bases:
scml.oneshot.sysagents.DefaultOneShotAdapter
The base class of all agents running in Std based on StdAgent.
Remarks:
It inherits from Adapter, allowing it to just pass any calls not defined explicitly in it to the internal _obj object representing the StdAgent.
- class scml.std._StdSystemAgent(*args, role, **kwargs)[source]
Bases:
DefaultOneShotAdapter
Implements an agent for handling system operations
- id
The unique ID of this entity
- name
A convenient name of the entity (intended primarily for printing/logging/debugging).
- profile = None
- property type_name
Returns a short name of the type of this entity
- property short_type_name
Returns a short name of the type of this entity
- respond_to_negotiation_request(initiator: str, issues: list[negmas.Issue], annotation: dict[str, Any], mechanism: negmas.NegotiatorMechanismInterface) negmas.Negotiator | None [source]
- on_negotiation_failure(partners: list[str], annotation: dict[str, Any], mechanism: negmas.NegotiatorMechanismInterface, state: negmas.MechanismState) None [source]
Called whenever a negotiation ends without agreement
- class scml.std.StdUFun(ex_pin: int, ex_qin: int, ex_pout: int, ex_qout: int, input_product: int, input_agent: bool, output_agent: bool, production_cost: float, disposal_cost: float, storage_cost: float, shortfall_penalty: float, input_penalty_scale: float | None, output_penalty_scale: float | None, storage_penalty_scale: float | None, n_input_negs: int, n_output_negs: int, current_step: int, agent_id: str | None, time_range: tuple[int, int], inventory_in: int = 0, inventory_out: int = 0, input_qrange: tuple[int, int] = (0, 0), input_prange: tuple[int, int] = (0, 0), output_qrange: tuple[int, int] = (0, 0), output_prange: tuple[int, int] = (0, 0), force_exogenous: bool = True, n_lines: int = 10, normalized: bool = False, current_balance: int | float = float('inf'), suppliers: set[str] = set(), consumers: set[str] = set(), perishable=True, **kwargs)[source]
Bases:
scml.oneshot.ufun.OneShotUFun
Calculates the utility function of a list of contracts or offers.
- Parameters:
force_exogenous – Is the agent forced to accept exogenous contracts given through ex_* arguments?
ex_pin – total price of exogenous inputs for this agent
ex_qin – total quantity of exogenous inputs for this agent
ex_pout – total price of exogenous outputs for this agent
ex_qout – total quantity of exogenous outputs for this agent.
production_cost – production cost of the agent.
disposal_cost – disposal cost per unit of input/output.
shortfall_penalty – penalty for failure to deliver one unit of output.
input_agent – Is the agent an input agent which means that its input product is the raw material
output_agent – Is the agent an output agent which means that its output product is the final product
n_lines – Number of production lines. If None, will be read through the AWI.
input_product – Index of the input product. If None, will be read through the AWI
input_qrange – A 2-int tuple giving the range of input quantities negotiated. If not given will be read through the AWI
input_prange – A 2-int tuple giving the range of input unit prices negotiated. If not given will be read through the AWI
output_qrange – A 2-int tuple giving the range of output quantities negotiated. If not given will be read through the AWI
output_prange – A 2-int tuple giving the range of output unit prices negotiated. If not given will be read through the AWI
n_input_negs – How many input negotiations are allowed. If not given, it will be the number of suppliers as given by the AWI
n_output_negs – How many output negotiations are allowed. If not given, it will be the number of consumers as given by the AWI
current_step – Current simulation step. Needed only for ufun_range when returning best outcomes
normalized – If given the values returned by from_*, utility_range and __call__ will all be normalized between zero and one.
- Remarks:
The utility function assumes that the agent will have to pay for all its input products but will receive money only for the output products it could generate and sell.
The utility function respects production capacity (n. lines). The agent cannot produce more than the number of lines it has.
Disposal cost is paid only for items that are bought but not produced. Items consumed in production (i.e. sold) are not counted.
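The three remarks can be made concrete with a deliberately simplified profit calculation. It is a sketch under strong assumptions and not the StdUFun formula: simple_profit is a hypothetical helper that ignores inventories, exogenous contracts, storage costs, and penalty scaling, and it charges flat per-unit disposal and shortfall costs.

```python
def simple_profit(
    q_in: int,
    p_in: float,
    q_out: int,
    p_out: float,
    n_lines: int,
    production_cost: float,
    disposal_cost: float,
    shortfall_penalty: float,
) -> float:
    """Simplified single-step profit following the three remarks above."""
    # Production is capped by inputs bought, outputs contracted, and capacity.
    produced = min(q_in, q_out, n_lines)
    paid = q_in * p_in  # all inputs are paid for
    received = produced * p_out  # only produced-and-sold outputs earn money
    unused_inputs = q_in - produced  # bought but not produced: disposal cost
    shortfall = q_out - produced  # promised but not delivered: penalty
    return (
        received
        - paid
        - produced * production_cost
        - unused_inputs * disposal_cost
        - shortfall * shortfall_penalty
    )


profit = simple_profit(
    q_in=10, p_in=2.0, q_out=8, p_out=6.0, n_lines=6,
    production_cost=1.0, disposal_cost=0.5, shortfall_penalty=1.0,
)
```

Here capacity limits production to 6 units, so 4 bought units incur the disposal cost and 2 contracted units incur the shortfall penalty.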
- class scml.std.UFunLimit[source]
Bases:
tuple
- utility
- input_quantity
- input_price
- output_quantity
- output_price
- exogenous_input_quantity
- exogenous_input_price
- exogenous_output_quantity
- exogenous_output_price
- inventory_input
- inventory_output
- producible
- class scml.std.StdWorld(*args, horizon=STD_DEFAULT_PARAMS['horizon'], price_range_fraction=STD_DEFAULT_PARAMS['price_range_fraction'], price_multiplier=STD_DEFAULT_PARAMS['price_multiplier'], wide_price_range=STD_DEFAULT_PARAMS['wide_price_range'], one_time_per_negotiation=STD_DEFAULT_PARAMS['one_time_per_negotiation'], perishable=STD_DEFAULT_PARAMS['perishable'], quantity_multiplier=STD_DEFAULT_PARAMS['quantity_multiplier'], **kwargs)[source]
Bases:
scml.oneshot.world.SCMLBaseWorld
The world representing the base standard simulation (starting SCML 2024)
- classmethod generate(*args, n_processes=STD_DEFAULT_PARAMS['n_processes'], disposal_cost=STD_DEFAULT_PARAMS['disposal_cost'], disposal_cost_dev=STD_DEFAULT_PARAMS['disposal_cost_dev'], storage_cost=STD_DEFAULT_PARAMS['storage_cost'], storage_cost_dev=STD_DEFAULT_PARAMS['storage_cost_dev'], perishable=STD_DEFAULT_PARAMS['perishable'], max_productivity=STD_DEFAULT_PARAMS['max_productivity'], max_supply=STD_DEFAULT_PARAMS['max_supply'], exogenous_supply_predictability=STD_DEFAULT_PARAMS['exogenous_supply_predictability'], exogenous_sales_predictability=STD_DEFAULT_PARAMS['exogenous_sales_predictability'], cap_exogenous_quantities=STD_DEFAULT_PARAMS['cap_exogenous_quantities'], **kwargs) dict[str, Any] [source]
Generates the configuration for a world
- Remarks:
This method just sets the defaults differently to create a std instead of a oneshot world.
- class scml.std.SCML2024StdWorld(*args, horizon=STD_DEFAULT_PARAMS['horizon'], price_range_fraction=STD_DEFAULT_PARAMS['price_range_fraction'], price_multiplier=STD_DEFAULT_PARAMS['price_multiplier'], wide_price_range=STD_DEFAULT_PARAMS['wide_price_range'], one_time_per_negotiation=STD_DEFAULT_PARAMS['one_time_per_negotiation'], perishable=STD_DEFAULT_PARAMS['perishable'], quantity_multiplier=STD_DEFAULT_PARAMS['quantity_multiplier'], **kwargs)[source]
Bases:
StdWorld
The SCML-standard simulation as used in [SCML 2024](https://scml.cs.brown.edu)