Spaces#

class gym.spaces.Space(shape: Optional[Sequence[int]] = None, dtype: Optional[Union[Type, str, dtype]] = None, seed: Optional[Union[int, RandomNumberGenerator]] = None)#

Superclass that is used to define observation and action spaces.

Spaces are central to Gym: they define the format of valid actions and observations. They serve several purposes:

  • They clearly define how to interact with environments, i.e. they specify what actions need to look like and what observations will look like.

  • They allow us to work with highly structured data (e.g. in the form of elements of Dict spaces) and painlessly transform them into flat arrays that can be used in learning code.

  • They provide a method to sample random elements. This is especially useful for exploration and debugging.

Different spaces can be combined hierarchically via container spaces (Tuple and Dict) to build a more expressive space.

Warning

Custom observation & action spaces can inherit from the Space class. However, most use-cases should be covered by the existing space classes (e.g. Box, Discrete, etc.) and container classes (Tuple & Dict). Note that parametrized probability distributions (through the Space.sample() method) and batching functions (in gym.vector.VectorEnv) are only well-defined for instances of spaces provided in gym by default. Moreover, some implementations of Reinforcement Learning algorithms might not handle custom spaces properly. Use custom spaces with care.

General Functions#

Each space implements the following functions:

gym.spaces.Space.sample(self) T_cov#

Randomly sample an element of this space. Sampling may be uniform or non-uniform, depending on whether the space is bounded.

gym.spaces.Space.contains(self, x) bool#

Return boolean specifying if x is a valid member of this space.

property Space.shape: Optional[Tuple[int, ...]]#

Return the shape of the space as an immutable property.

property gym.spaces.Space.dtype#

Return the data type of this space.

gym.spaces.Space.seed(self, seed: Optional[int] = None) list#

Seed the PRNG of this space and possibly the PRNGs of subspaces.

gym.spaces.Space.to_jsonable(self, sample_n: Sequence[T_cov]) list#

Convert a batch of samples from this space to a JSONable data type.

gym.spaces.Space.from_jsonable(self, sample_n: list) List[T_cov]#

Convert a JSONable data type to a batch of samples from this space.
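The interface above can be illustrated with a small stand-in class. This is only a sketch: `ToyDiscrete` is a hypothetical name, it does not subclass gym.spaces.Space, and it uses Python's `random` module rather than Gym's own RNG. It shows how sample, contains, seed, to_jsonable and from_jsonable fit together.

```python
import random
from typing import List, Optional, Sequence


class ToyDiscrete:
    """Hypothetical stand-in for a space over {0, ..., n-1}; not Gym's Discrete."""

    def __init__(self, n: int, seed: Optional[int] = None):
        self.n = n
        self._rng = random.Random(seed)

    def seed(self, seed: Optional[int] = None) -> List[Optional[int]]:
        # Re-seed the private PRNG so that later samples are reproducible.
        self._rng = random.Random(seed)
        return [seed]

    def sample(self) -> int:
        # The space is bounded, so uniform sampling is well defined.
        return self._rng.randrange(self.n)

    def contains(self, x) -> bool:
        # A valid member is an integer in the half-open range [0, n).
        return isinstance(x, int) and 0 <= x < self.n

    def to_jsonable(self, sample_n: Sequence[int]) -> list:
        # A batch of plain ints is already JSON-compatible.
        return [int(x) for x in sample_n]

    def from_jsonable(self, sample_n: list) -> List[int]:
        return [int(x) for x in sample_n]


space = ToyDiscrete(4)
space.seed(123)
first = [space.sample() for _ in range(5)]
space.seed(123)
second = [space.sample() for _ in range(5)]  # same seed, same sequence
```

Seeding twice with the same value yields the same batch, and a batch survives a to_jsonable / from_jsonable round trip unchanged.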

Box#

class gym.spaces.Box(low: Union[SupportsFloat, ndarray], high: Union[SupportsFloat, ndarray], shape: Optional[Sequence[int]] = None, dtype: Type = numpy.float32, seed: Optional[Union[int, RandomNumberGenerator]] = None)#

A (possibly unbounded) box in \(\mathbb{R}^n\).

Specifically, a Box represents the Cartesian product of n closed intervals. Each interval has the form of one of \([a, b]\), \((-\infty, b]\), \([a, \infty)\), or \((-\infty, \infty)\).

There are two common use cases:

  • Identical bound for each dimension:

    >>> Box(low=-1.0, high=2.0, shape=(3, 4), dtype=np.float32)
    Box(3, 4)
    
  • Independent bound for each dimension:

    >>> Box(low=np.array([-1.0, -2.0]), high=np.array([2.0, 4.0]), dtype=np.float32)
    Box(2,)
    
__init__(low: Union[SupportsFloat, ndarray], high: Union[SupportsFloat, ndarray], shape: Optional[Sequence[int]] = None, dtype: Type = numpy.float32, seed: Optional[Union[int, RandomNumberGenerator]] = None)#

Constructor of Box.

The argument low specifies the lower bound of each dimension and high specifies the upper bounds. I.e., the space that is constructed will be the product of the intervals \([\text{low}[i], \text{high}[i]]\).

If low (or high) is a scalar, the lower bound (or upper bound, respectively) will be assumed to be this value across all dimensions.
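The scalar-versus-array rule for bounds can be sketched without NumPy. The helpers below are hypothetical (`broadcast_bound` and `box_contains` are not Gym functions) and handle only one-dimensional boxes given as plain Python sequences. They show how a scalar bound is repeated across all dimensions and how membership reduces to a per-coordinate interval check.

```python
import math
from typing import List, Sequence, Union

Bound = Union[float, Sequence[float]]


def broadcast_bound(bound: Bound, n: int) -> List[float]:
    """Repeat a scalar bound n times; pass a per-dimension sequence through."""
    if isinstance(bound, (int, float)):
        return [float(bound)] * n
    values = [float(b) for b in bound]
    if len(values) != n:
        raise ValueError(f"expected {n} bounds, got {len(values)}")
    return values


def box_contains(x: Sequence[float], low: Bound, high: Bound) -> bool:
    """Check low[i] <= x[i] <= high[i] for every coordinate i."""
    lows = broadcast_bound(low, len(x))
    highs = broadcast_bound(high, len(x))
    return all(lo <= xi <= hi for xi, lo, hi in zip(x, lows, highs))


# Identical bound for each dimension (scalar low and high).
inside = box_contains([0.5, -1.0, 2.0], low=-1.0, high=2.0)
# Independent bound for each dimension; the second coordinate exceeds 4.0.
outside = box_contains([0.5, 5.0], low=[-1.0, -2.0], high=[2.0, 4.0])
# An interval of the form [a, inf) accepts arbitrarily large values.
unbounded = box_contains([1e9], low=0.0, high=math.inf)
```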

Parameters:
  • low (Union[SupportsFloat, np.ndarray]) – Lower bounds of the intervals.

  • high (Union[SupportsFloat, np.ndarray]) – Upper bounds of the intervals.

  • shape (Optional[Sequence[int]]) – This only needs to be specified if both low and high are scalars and determines the shape of the space. Otherwise, the shape is inferred from the shape of low or high.

  • dtype – The dtype of the elements of the space. If this is an integer type, the Box is essentially a discrete space.

  • seed – Optionally, you can use this argument to seed the RNG that is used to sample from the space.

Raises:

ValueError – If no shape information is provided (shape is None, low is None and high is None).

is_bounded(manner: str = 'both') bool#

Checks whether the box is bounded in some sense.

Parameters:

manner (str) – One of "both", "below", "above".

Returns:

Whether the space is bounded in the given manner

Raises:

ValueError – If manner is not one of "both", "below" or "above"
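A minimal sketch of this check follows. It assumes that a box counts as bounded below (above) when every lower (upper) bound is finite. `toy_is_bounded` is a hypothetical helper operating on plain sequences, not the Box method itself.

```python
import math
from typing import Sequence


def toy_is_bounded(low: Sequence[float], high: Sequence[float], manner: str = "both") -> bool:
    """Report whether all lower bounds, all upper bounds, or both are finite."""
    below = all(math.isfinite(lo) for lo in low)
    above = all(math.isfinite(hi) for hi in high)
    if manner == "both":
        return below and above
    if manner == "below":
        return below
    if manner == "above":
        return above
    raise ValueError(f"manner must be 'both', 'below' or 'above', got {manner!r}")


# Second dimension is the interval [0, inf): bounded below, not above.
low, high = [0.0, 0.0], [1.0, math.inf]
results = {m: toy_is_bounded(low, high, m) for m in ("both", "below", "above")}
```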

sample() ndarray#

Generates a single random sample inside the Box.

In creating a sample of the box, each coordinate is sampled (independently) from a distribution that is chosen according to the form of the interval:

  • \([a, b]\) : uniform distribution

  • \([a, \infty)\) : shifted exponential distribution

  • \((-\infty, b]\) : shifted negative exponential distribution

  • \((-\infty, \infty)\) : normal distribution

Returns:

A sampled value from the Box
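The per-coordinate rule listed above can be sketched with the standard library. The rate of the exponential (1.0) and the parameters of the normal distribution (mean 0, standard deviation 1) are assumptions made for illustration; the text above only names the distribution families. `toy_sample_box` is a hypothetical helper, not Gym code.

```python
import math
import random
from typing import List, Sequence


def toy_sample_box(low: Sequence[float], high: Sequence[float], rng: random.Random) -> List[float]:
    """Sample each coordinate independently according to the form of its interval."""
    point = []
    for lo, hi in zip(low, high):
        lo_finite, hi_finite = math.isfinite(lo), math.isfinite(hi)
        if lo_finite and hi_finite:
            value = rng.uniform(lo, hi)        # [a, b]: uniform
        elif lo_finite:
            value = lo + rng.expovariate(1.0)  # [a, inf): shifted exponential
        elif hi_finite:
            value = hi - rng.expovariate(1.0)  # (-inf, b]: shifted negative exponential
        else:
            value = rng.gauss(0.0, 1.0)        # (-inf, inf): normal
        point.append(value)
    return point


rng = random.Random(0)
lows = [0.0, 1.0, -math.inf, -math.inf]
highs = [1.0, math.inf, 2.0, math.inf]
point = toy_sample_box(lows, highs, rng)
```

Each coordinate of `point` respects its own interval: the first lies in [0, 1], the second is at least 1, the third is at most 2, and the fourth is unconstrained.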

Discrete#

class gym.spaces.Discrete(n: int, seed: Optional[Union[int, RandomNumberGenerator]] = None, start: int = 0)#

A space consisting of finitely many elements.

This class represents a finite subset of integers, more specifically a set of the form \(\{ a, a+1, \dots, a+n-1 \}\).

Example:

>>> Discrete(2)            # {0, 1}
>>> Discrete(3, start=-1)  # {-1, 0, 1}
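The set described above can be enumerated with a plain range. `discrete_elements` is a hypothetical helper, and the requirement that n be positive is an assumption added here for the sketch.

```python
def discrete_elements(n: int, start: int = 0) -> range:
    """Return the integers {start, start + 1, ..., start + n - 1}."""
    if n <= 0:
        # Assumed constraint: an empty space is not useful to sample from.
        raise ValueError("n must be positive")
    return range(start, start + n)


default_start = list(discrete_elements(2))          # [0, 1]
shifted_start = list(discrete_elements(3, start=-1))  # [-1, 0, 1]
```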

MultiBinary#

class gym.spaces.MultiBinary(n: Union[ndarray, Sequence[int], int], seed: Optional[Union[int, RandomNumberGenerator]] = None)#

An n-shape binary space.

Elements of this space are binary arrays of a shape that is fixed during construction.

Example Usage:

>>> observation_space = MultiBinary(5)
>>> observation_space.sample()
array([0, 1, 0, 1, 0], dtype=int8)
>>> observation_space = MultiBinary([3, 2])
>>> observation_space.sample()
array([[0, 0],
       [0, 1],
       [1, 1]], dtype=int8)

MultiDiscrete#

class gym.spaces.MultiDiscrete(nvec: Union[ndarray, List[int]], dtype=numpy.int64, seed: Optional[Union[int, RandomNumberGenerator]] = None)#

This represents the cartesian product of arbitrary Discrete spaces.

It is useful to represent game controllers or keyboards where each key can be represented as a discrete action space.

Note

Some environment wrappers assume a value of 0 always represents the NOOP action.

e.g. Nintendo Game Controller - Can be conceptualized as 3 discrete action spaces:

  1. Arrow Keys: Discrete 5 - NOOP[0], UP[1], RIGHT[2], DOWN[3], LEFT[4] - params: min: 0, max: 4

  2. Button A: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1

  3. Button B: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1

It can be initialized as MultiDiscrete([ 5, 2, 2 ])
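The controller example can be sketched as a list of independent categorical choices. The helpers below are hypothetical and work on plain lists; they only illustrate that component i takes values in {0, ..., nvec[i] - 1}.

```python
import random
from typing import List, Sequence


def multidiscrete_sample(nvec: Sequence[int], rng: random.Random) -> List[int]:
    """Draw one value per component, each uniformly from range(n)."""
    return [rng.randrange(n) for n in nvec]


def multidiscrete_contains(nvec: Sequence[int], x: Sequence[int]) -> bool:
    """Check that x has one in-range entry per component."""
    return len(x) == len(nvec) and all(0 <= xi < n for xi, n in zip(x, nvec))


controller = [5, 2, 2]  # arrow keys, button A, button B
rng = random.Random(0)
action = multidiscrete_sample(controller, rng)
noop = [0, 0, 0]  # all-NOOP action under the convention noted above
```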

__init__(nvec: Union[ndarray, List[int]], dtype=numpy.int64, seed: Optional[Union[int, RandomNumberGenerator]] = None)#

Constructor of MultiDiscrete space.

The argument nvec will determine the number of values each categorical variable can take.

Although this feature is rarely used, MultiDiscrete spaces may also have several axes if nvec has several axes:

Example:

>>> d = MultiDiscrete(np.array([[1, 2], [3, 4]]))
>>> d.sample()
array([[0, 0],
       [2, 3]])
Parameters:
  • nvec – vector of counts of each categorical variable. This will usually be a list of integers. However, you may also pass a more complicated numpy array if you’d like the space to have several axes.

  • dtype – This should be some kind of integer type.

  • seed – Optionally, you can use this argument to seed the RNG that is used to sample from the space.

Dict#

class gym.spaces.Dict(spaces: Optional[Dict[str, Space]] = None, seed: Optional[Union[dict, int, RandomNumberGenerator]] = None, **spaces_kwargs: Space)#

A dictionary of Space instances.

Elements of this space are (ordered) dictionaries of elements from the constituent spaces.

Example usage:

>>> from gym.spaces import Dict, Discrete
>>> observation_space = Dict({"position": Discrete(2), "velocity": Discrete(3)})
>>> observation_space.sample()
OrderedDict([('position', 1), ('velocity', 2)])

Example usage [nested]:

>>> from gym.spaces import Box, Dict, Discrete, MultiBinary, MultiDiscrete
>>> Dict(
...     {
...         "ext_controller": MultiDiscrete([5, 2, 2]),
...         "inner_state": Dict(
...             {
...                 "charge": Discrete(100),
...                 "system_checks": MultiBinary(10),
...                 "job_status": Dict(
...                     {
...                         "task": Discrete(5),
...                         "progress": Box(low=0, high=100, shape=()),
...                     }
...                 ),
...             }
...         ),
...     }
... )

It can be convenient to use Dict spaces if you want to make complex observations or actions more human-readable. Usually, it will not be possible to use elements of this space directly in learning code. However, you can easily convert Dict observations to flat arrays by using a gym.wrappers.FlattenObservation wrapper. Similar wrappers can be implemented to deal with Dict actions.

__init__(spaces: Optional[Dict[str, Space]] = None, seed: Optional[Union[dict, int, RandomNumberGenerator]] = None, **spaces_kwargs: Space)#

Constructor of Dict space.

This space can be instantiated in one of two ways: either pass a dictionary of spaces to __init__() via the spaces argument, or pass the spaces as separate keyword arguments (in which case the keys spaces and seed must be avoided).

Example:

>>> from gym.spaces import Box, Dict, Discrete
>>> Dict({"position": Box(-1, 1, shape=(2,)), "color": Discrete(3)})
Dict(color:Discrete(3), position:Box(-1.0, 1.0, (2,), float32))
>>> Dict(position=Box(-1, 1, shape=(2,)), color=Discrete(3))
Dict(color:Discrete(3), position:Box(-1.0, 1.0, (2,), float32))
Parameters:
  • spaces – A dictionary of spaces. This specifies the structure of the Dict space

  • seed – Optionally, you can use this argument to seed the RNGs of the spaces that make up the Dict space.

  • **spaces_kwargs – If spaces is None, you need to pass the constituent spaces as keyword arguments, as described above.

Tuple#

class gym.spaces.Tuple(spaces: Iterable[Space], seed: Optional[Union[int, List[int], RandomNumberGenerator]] = None)#

A tuple (more precisely: the cartesian product) of Space instances.

Elements of this space are tuples of elements of the constituent spaces.

Example usage:

>>> from gym.spaces import Box, Discrete, Tuple
>>> observation_space = Tuple((Discrete(2), Box(-1, 1, shape=(2,))))
>>> observation_space.sample()
(0, array([0.03633198, 0.42370757], dtype=float32))
__init__(spaces: Iterable[Space], seed: Optional[Union[int, List[int], RandomNumberGenerator]] = None)#

Constructor of Tuple space.

The generated instance will represent the cartesian product \(\text{spaces}[0] \times ... \times \text{spaces}[-1]\).

Parameters:
  • spaces (Iterable[Space]) – The spaces that are involved in the cartesian product.

  • seed – Optionally, you can use this argument to seed the RNGs of the spaces to ensure reproducible sampling.
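For finite components, the cartesian product can be enumerated directly. The sketch below uses hypothetical helpers over plain ranges rather than Space instances. It shows that an element of the product is a tuple with one entry per component, and that sampling the product amounts to sampling each component independently.

```python
import itertools
import random
from typing import List, Sequence, Tuple


def product_elements(components: Sequence[Sequence[int]]) -> List[Tuple[int, ...]]:
    """List every tuple in components[0] x ... x components[-1]."""
    return list(itertools.product(*components))


def sample_product(components: Sequence[Sequence[int]], rng: random.Random) -> Tuple[int, ...]:
    """Sample one entry from each component independently."""
    return tuple(rng.choice(list(c)) for c in components)


components = [range(2), range(3)]  # stand-ins for two finite spaces
elements = product_elements(components)
sample = sample_product(components, random.Random(0))
```

The product of a 2-element and a 3-element component has 2 * 3 = 6 elements, and every sample is one of them.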

Utility Functions#

gym.spaces.utils.flatdim(space: Space) int#
gym.spaces.utils.flatdim(space: Union[Box, MultiBinary]) int
gym.spaces.utils.flatdim(space: Discrete) int
gym.spaces.utils.flatdim(space: MultiDiscrete) int
gym.spaces.utils.flatdim(space: Tuple) int
gym.spaces.utils.flatdim(space: Dict) int

Return the number of dimensions a flattened equivalent of this space would have.

Example usage:

>>> from gym.spaces import Dict, Discrete
>>> from gym.spaces.utils import flatdim
>>> space = Dict({"position": Discrete(2), "velocity": Discrete(3)})
>>> flatdim(space)
5
Parameters:

space – The space whose flattened dimensionality is returned

Returns:

The number of dimensions for the flattened spaces

Raises:

NotImplementedError – if the space is not defined in gym.spaces.
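The result of 5 in the example above follows from a simple recursive count. The sketch below assumes that a discrete component contributes one slot per element (a one-hot encoding), that a box contributes the product of its shape, and that containers add up their parts. Spaces are written as hypothetical `(kind, argument)` pairs instead of Gym objects.

```python
import math


def toy_flatdim(spec) -> int:
    """Count the entries of the flat array for a nested toy space description."""
    kind, arg = spec
    if kind == "discrete":
        return arg                 # one slot per element (one-hot)
    if kind == "box":
        return math.prod(arg)      # product of the shape entries
    if kind == "tuple":
        return sum(toy_flatdim(s) for s in arg)
    if kind == "dict":
        return sum(toy_flatdim(s) for s in arg.values())
    raise NotImplementedError(f"unknown kind: {kind}")


# Mirrors Dict({"position": Discrete(2), "velocity": Discrete(3)}): 2 + 3.
two_discretes = ("dict", {"position": ("discrete", 2), "velocity": ("discrete", 3)})
# Mirrors Dict({"position": Discrete(2), "velocity": Box(0, 1, shape=(2, 2))}): 2 + 4.
discrete_and_box = ("dict", {"position": ("discrete", 2), "velocity": ("box", (2, 2))})

dims = (toy_flatdim(two_discretes), toy_flatdim(discrete_and_box))
```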

gym.spaces.utils.flatten_space(space: Space) Box#
gym.spaces.utils.flatten_space(space: Box) Box
gym.spaces.utils.flatten_space(space: Union[Discrete, MultiBinary, MultiDiscrete]) Box
gym.spaces.utils.flatten_space(space: Tuple) Box
gym.spaces.utils.flatten_space(space: Dict) Box
gym.spaces.utils.flatten_space(space: Graph) Graph

Flatten a space into a single Box.

This is equivalent to flatten(), but operates on the space itself. For non-graph spaces, the result is always a Box with flat boundaries, and that box has exactly flatdim() dimensions. For graph spaces, the result is always a Graph whose node_space is a Box with flat boundaries and whose edge_space is either a Box with flat boundaries or None. Flattening a sample of the original space has the same effect as taking a sample of the flattened space.

Example:

>>> box = Box(0.0, 1.0, shape=(3, 4, 5))
>>> box
Box(3, 4, 5)
>>> flatten_space(box)
Box(60,)
>>> flatten(box, box.sample()) in flatten_space(box)
True

Example that flattens a discrete space:

>>> discrete = Discrete(5)
>>> flatten_space(discrete)
Box(5,)
>>> flatten(discrete, discrete.sample()) in flatten_space(discrete)
True

Example that recursively flattens a dict:

>>> space = Dict({"position": Discrete(2), "velocity": Box(0, 1, shape=(2, 2))})
>>> flatten_space(space)
Box(6,)
>>> flatten(space, space.sample()) in flatten_space(space)
True

Example that flattens a graph:

>>> space = Graph(node_space=Box(low=-100, high=100, shape=(3, 4)), edge_space=Discrete(5))
>>> flatten_space(space)
Graph(Box(-100.0, 100.0, (12,), float32), Box(0, 1, (5,), int64))
>>> flatten(space, space.sample()) in flatten_space(space)
True
Parameters:

space – The space to flatten

Returns:

A flattened Box (or, for graph spaces, a Graph whose node and edge spaces are flattened)

Raises:

NotImplementedError – if the space is not defined in gym.spaces.

gym.spaces.utils.flatten(space: Space[T], x: T) ndarray#
gym.spaces.utils.flatten(space: MultiBinary, x) ndarray
gym.spaces.utils.flatten(space: Box, x) ndarray
gym.spaces.utils.flatten(space: Discrete, x) ndarray
gym.spaces.utils.flatten(space: MultiDiscrete, x) ndarray
gym.spaces.utils.flatten(space: Tuple, x) ndarray
gym.spaces.utils.flatten(space: Dict, x) ndarray
gym.spaces.utils.flatten(space: Graph, x) ndarray

Flatten a data point from a space.

This is useful when e.g. points from spaces must be passed to a neural network, which only understands flat arrays of floats.

Parameters:
  • space – The space that x is flattened by

  • x – The value to flatten

Returns:
  • The flattened x; this is always a 1D array for non-graph spaces.

  • For graph spaces, a GraphInstance where

    • nodes are n x k arrays

    • edges are either:
      • m x k arrays

      • None

    • edge_links are either:
      • m x 2 arrays

      • None

Raises:

NotImplementedError – If the space is not defined in gym.spaces.

gym.spaces.utils.unflatten(space: Space[T], x: ndarray) T#
gym.spaces.utils.unflatten(space: Union[Box, MultiBinary], x: ndarray) ndarray
gym.spaces.utils.unflatten(space: Discrete, x: ndarray) int
gym.spaces.utils.unflatten(space: MultiDiscrete, x: ndarray) ndarray
gym.spaces.utils.unflatten(space: Tuple, x: ndarray) tuple
gym.spaces.utils.unflatten(space: Dict, x: ndarray) dict
gym.spaces.utils.unflatten(space: Graph, x: GraphInstance) GraphInstance

Unflatten a data point from a space.

This reverses the transformation applied by flatten(). You must ensure that the space argument is the same as for the flatten() call.

Parameters:
  • space – The space used to unflatten x

  • x – The array to unflatten

Returns:

A point with a structure that matches the space.

Raises:

NotImplementedError – if the space is not defined in gym.spaces.
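The round trip between flatten() and unflatten() can be sketched for discrete components. The sketch assumes a one-hot encoding, which is consistent with Discrete(5) flattening to a 5-dimensional box in the examples above. All function names here are hypothetical and operate on plain lists. The component sizes play the role of the space argument: unflattening needs the same sizes that were used for flattening in order to know where each slice ends.

```python
from typing import List, Sequence, Tuple


def flatten_discrete(n: int, x: int) -> List[int]:
    """One-hot encode x as a list of length n."""
    onehot = [0] * n
    onehot[x] = 1
    return onehot


def unflatten_discrete(flat: Sequence[int]) -> int:
    """Recover the index of the single 1 in a one-hot list."""
    return list(flat).index(1)


def flatten_product(sizes: Sequence[int], values: Sequence[int]) -> List[int]:
    """Concatenate the one-hot encodings of each component."""
    flat: List[int] = []
    for n, x in zip(sizes, values):
        flat.extend(flatten_discrete(n, x))
    return flat


def unflatten_product(sizes: Sequence[int], flat: Sequence[int]) -> Tuple[int, ...]:
    """Slice the flat list by component size and decode each slice."""
    values, offset = [], 0
    for n in sizes:
        values.append(unflatten_discrete(flat[offset:offset + n]))
        offset += n
    return tuple(values)


sizes = (2, 3)      # stand-ins for two discrete components
original = (1, 2)
flat = flatten_product(sizes, original)      # [0, 1, 0, 0, 1]
restored = unflatten_product(sizes, flat)    # (1, 2)
```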