control.reinforcement_learning.Environments package

Submodules

control.reinforcement_learning.Environments.FakeEnv module

class control.reinforcement_learning.Environments.FakeEnv.FakeEnv(nbJoint=1)

Bases: object

Fake environment for testing purposes

render(debug=False)

Print the current state

Parameters: debug (bool) – Whether to print the state or not
Returns: None

reset()

Reset the environment to the initial state (random)

Parameters: None
Returns: state (np.array) – list of joint angles and velocities

step(action)

Take a step in the environment (random)

Parameters: action (float) – The action to take (it is not used)
Returns:
state (np.array) – list of joint angles and velocities
reward (float) – The reward for the action taken (it is not used)
done (bool) – Whether the episode is done or not
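The interface above can be exercised without hardware or a simulator. The following is a minimal stand-in sketch, not the package's FakeEnv: the class name StubEnv, the seed argument, the value ranges, the placeholder reward, and the use of plain Python lists instead of np.array are all assumptions made for illustration. Only the method names and the (state, reward, done) return shape come from the documentation above.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics the documented FakeEnv interface."""

    def __init__(self, nbJoint=1, seed=None):
        self.nbJoint = nbJoint
        self._rng = random.Random(seed)  # seed is an assumption, for repeatable runs
        self.state = self._random_state()

    def _random_state(self):
        # Assumed layout: nbJoint angles followed by nbJoint velocities.
        angles = [self._rng.uniform(-3.14, 3.14) for _ in range(self.nbJoint)]
        velocities = [self._rng.uniform(-1.0, 1.0) for _ in range(self.nbJoint)]
        return angles + velocities

    def reset(self):
        self.state = self._random_state()
        return self.state

    def step(self, action):
        # The action is ignored, as in the documented FakeEnv.
        self.state = self._random_state()
        reward = 0.0  # placeholder value (assumption)
        done = False  # placeholder value (assumption)
        return self.state, reward, done

    def render(self, debug=False):
        # Print the state only when debug is enabled.
        if debug:
            print(self.state)


env = StubEnv(nbJoint=2, seed=0)
state = env.reset()
state, reward, done = env.step(0.0)
```

A stub of this kind is mainly useful for checking that an agent's training loop handles shapes and return values correctly before it is connected to PyBullet or the real pendulum.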

control.reinforcement_learning.Environments.PyBulletPendulumEnv module

class control.reinforcement_learning.Environments.PyBulletPendulumEnv.PyBulletPendulumEnv(render_mode='human')

Bases: Env

PyBullet Rotary Pendulum

action_space: spaces.Space[ActType]

calculate_reward(state)

Calculate the reward for the current state

Parameters: state (np.array) – [bar angle, bar angular velocity]
Returns: reward (float) – Reward for the current state
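The documentation does not state the reward formula. As an illustration only, a reward for a [bar angle, bar angular velocity] state could penalize the distance from a target angle and the angular velocity. In the sketch below, the function name sketch_reward, the target angle, the velocity weight, and the assumption that the angle is in radians are all hypothetical and are not taken from the package.

```python
import math


def sketch_reward(state, target_angle=0.0, velocity_weight=0.1):
    """Hypothetical reward: negative squared angle error plus a velocity penalty."""
    bar_angle, bar_angular_velocity = state
    # Wrap the angle error into [-pi, pi] so that 0 and 2*pi count as the same pose.
    error = math.atan2(
        math.sin(bar_angle - target_angle),
        math.cos(bar_angle - target_angle),
    )
    # The reward is 0 at the target with zero velocity and negative elsewhere.
    return -(error ** 2 + velocity_weight * bar_angular_velocity ** 2)


near_target = sketch_reward([0.1, 0.0])
far_from_target = sketch_reward([math.pi, 0.0])
```

With this shape, a state close to the target angle scores higher than one far from it, which is the property a balancing task generally needs.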

close()

Close the PyBullet connection

Parameters: None
Returns: None

get_state()

Read the state from the pendulum, simulating a fake serial connection

Parameters: None
Returns:
state (np.array) – [bar angle, bar angular velocity]
motor_angle (float) – Motor angle in degrees
done (bool) – Episode done flag

load_pendulum_urdf()

Load the pendulum URDF into the environment.

Parameters: None
Returns: None

metadata: dict[str, Any] = {'render_modes': ['human']}

observation_space: spaces.Space[ObsType]

render(fps=240.0)

Render the pendulum in PyBullet

Parameters: fps (float, optional) – Number of frames per second. Defaults to 240.0.
Returns: None

reset(seed=None, options=None)

Reset the environment to a random state

Parameters: None
Returns: state (np.array) – [bar_angle, bar_angular_velocity]

reset_policy(reset_count=200)

Policy to reset the environment

Parameters: reset_count (int, optional) – Number of iterations to wait before resetting the system. Defaults to 200.
Returns: None

reset_robot(mode='random')

Reset the robot state

Parameters: mode (str, optional) – Mode to reset the robot. Defaults to “random”.
Returns: state (np.array) – [bar angle, bar angular velocity]

send_fake_serial(command)

Send a command to the pendulum, simulating a fake serial connection

Parameters: command (list) – [motor speed percentage, episode done flag]
Returns: None

step(action)

Take a step in the environment

Parameters: action (float) – Motor speed percentage [-100, 100]
Returns: state (np.array) – [bar angle, bar angular velocity]

control.reinforcement_learning.Environments.RealPendulumEnv module

class control.reinforcement_learning.Environments.RealPendulumEnv.RealPendulumEnv(port, baudrate, render_mode='human')

Bases: Env

Real rotary pendulum with ESP32

action_space: spaces.Space[ActType]

calculate_reward(state)

Calculate the reward for the current state

Parameters: state (np.array) – [bar angle, bar angular velocity]
Returns: reward (float) – Reward for the current state

close()

Close the serial connection

Parameters: None
Returns: None

metadata: dict[str, Any] = {'render_modes': ['human']}

observation_space: spaces.Space[ObsType]

render(camera=False): Render the state (optional), e.g. display the video stream

reset(seed=None, options=None)

Reset the environment to the initial state.

Parameters: None
Returns:
state (np.array) – [bar angle, bar angular velocity]
info (dict) – Episode information

reset_policy(reset_count=200)

Policy to reset the environment

Parameters: reset_count (int, optional) – Number of iterations to wait before resetting the system. Defaults to 200.
Returns: None

send_serial(command)

Send a command to the pendulum over serial

Parameters: command (str) – [motor speed percentage, reset flag]
Returns: None

step(action)

Take a step in the environment

Parameters: action (float) – Motor speed percentage [-100, 100]
Returns:
state (np.array) – [bar angle, bar angular velocity]
reward (float) – Reward for the current state
terminated (bool) – Whether the episode is done or not
truncated (bool) – Whether the episode is truncated or not
info (dict) – Episode information
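The reset() and step() signatures documented for RealPendulumEnv suggest an episode loop along the following lines. This is a sketch under stated assumptions: clamp_action, run_episode, and the stub class _CountingEnv are hypothetical names introduced here so the loop can run without a serial port, and clamping the action on the caller's side is a design choice of this example, not documented behavior of the environment.

```python
def clamp_action(action, low=-100.0, high=100.0):
    """Keep the motor speed percentage inside the documented [-100, 100] range."""
    return max(low, min(high, action))


def run_episode(env, policy, max_steps=500):
    """Run one episode and return the accumulated reward."""
    state, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = clamp_action(policy(state))
        state, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        # Stop as soon as the environment reports the end of the episode.
        if terminated or truncated:
            break
    return total_reward


class _CountingEnv:
    """Hypothetical stub used only to exercise run_episode without hardware."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.steps = 0
        self.last_action = None

    def reset(self, seed=None, options=None):
        self.steps = 0
        return [0.0, 0.0], {}

    def step(self, action):
        self.steps += 1
        self.last_action = action
        # Truncate the episode once the step budget is used up.
        truncated = self.steps >= self.horizon
        return [0.0, 0.0], 1.0, False, truncated, {}


stub = _CountingEnv(horizon=3)
total = run_episode(stub, policy=lambda state: 250.0)
```

Because run_episode relies only on the reset() and step() return shapes, the same loop should work with any environment object that follows them, which makes it easy to rehearse against a stub before driving the motor.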

control.reinforcement_learning.Environments.SerialReader module

class control.reinforcement_learning.Environments.SerialReader.SerialReader(ser, simulation=False)

Bases: Thread

get_state()

read_serial()

run()

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
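SerialReader's methods are undocumented, but the names (a Thread subclass with read_serial() and get_state()) suggest a background reader that keeps the most recent sample available to the caller. A minimal sketch of that general pattern follows. The class name LatestLineReader, the comma-separated line format, and the use of a plain iterable in place of a serial port are assumptions for illustration, not the package's implementation.

```python
import threading


class LatestLineReader(threading.Thread):
    """Hypothetical reader: keeps only the most recent line from a source."""

    def __init__(self, source):
        super().__init__(daemon=True)
        self._source = source  # any iterable of text lines (assumption)
        self._lock = threading.Lock()
        self._latest = None

    def run(self):
        # Consume lines until the source is exhausted, overwriting older values.
        for line in self._source:
            parsed = [float(x) for x in line.strip().split(",")]
            with self._lock:
                self._latest = parsed

    def get_state(self):
        # Return the most recent parsed sample, or None if nothing was read yet.
        with self._lock:
            return self._latest


reader = LatestLineReader(["0.1,0.0\n", "0.2,-0.5\n"])
reader.start()
reader.join()
latest = reader.get_state()
```

Holding a lock around the shared value keeps get_state() from observing a half-updated sample when it is called from the environment's thread while run() is still reading.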
