control.reinforcement_learning.Environments package

Submodules

control.reinforcement_learning.Environments.FakeEnv module

class control.reinforcement_learning.Environments.FakeEnv.FakeEnv(nbJoint=1)

Bases: object

Fake environment for testing purposes

render(debug=False)

Print the current state

Parameters:

debug (bool) – Whether to print the state or not

Returns:

None

reset()

Reset the environment to a random initial state

Parameters:

None

Returns:

state (np.array) – List of joint angles and velocities

step(action)

Take a step in the environment (the returned state is random)

Parameters:

action (float) – The action to take (it is not used)

Returns:

state (np.array) – List of joint angles and velocities
reward (float) – The reward for the action taken (it is not used)
done (bool) – Whether the episode is done or not
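The interaction pattern described above (a reset, then repeated step calls that each return a state, a reward, and a done flag) can be sketched as follows. This is a minimal illustration, not the package's code: `FakeEnvStandIn` and `run_episode` are hypothetical names, the state layout (one angle and one velocity per joint) is an assumption, and plain lists stand in for NumPy arrays so the sketch runs with the standard library alone.

```python
import random


class FakeEnvStandIn:
    """Hypothetical stand-in that mirrors the documented FakeEnv interface."""

    def __init__(self, nbJoint=1):
        self.nbJoint = nbJoint

    def _random_state(self):
        # Assumed layout: one angle and one velocity per joint.
        return [random.uniform(-1.0, 1.0) for _ in range(2 * self.nbJoint)]

    def reset(self):
        # Start from a random state, as the documentation describes.
        return self._random_state()

    def step(self, action):
        # The action is ignored, matching the note on FakeEnv.step.
        state = self._random_state()
        reward = 0.0
        done = False
        return state, reward, done


def run_episode(env, policy, max_steps=10):
    """Roll out one episode and return the list of visited states."""
    states = [env.reset()]
    for _ in range(max_steps):
        state, reward, done = env.step(policy(states[-1]))
        states.append(state)
        if done:
            break
    return states
```

Because the stand-in never sets the done flag, `run_episode` always stops at `max_steps`; an agent's training loop can be smoke-tested the same way before it is pointed at a simulated or real pendulum.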

control.reinforcement_learning.Environments.PyBulletPendulumEnv module

class control.reinforcement_learning.Environments.PyBulletPendulumEnv.PyBulletPendulumEnv(render_mode='human')

Bases: Env

PyBullet Rotary Pendulum

action_space: spaces.Space[ActType]

calculate_reward(state)

Calculate the reward for the current state

Parameters:

state (np.array) – [bar angle, bar angular velocity]

Returns:

reward (float) – Reward for the current state

close()

Close the PyBullet connection

Parameters:

None

Returns:

None

get_state()

Read the state from the pendulum, simulating a fake serial connection

Parameters:

None

Returns:

state (np.array) – [bar angle, bar angular velocity]
motor_angle (float) – Motor angle in degrees
done (bool) – Episode done flag

load_pendulum_urdf()

Load the pendulum URDF into the environment.

Parameters:

None

Returns:

None

metadata: dict[str, Any] = {'render_modes': ['human']}

observation_space: spaces.Space[ObsType]

render(fps=240.0)

Render the pendulum in PyBullet

Parameters:

fps (float, optional) – Number of frames per second. Defaults to 240.0.

Returns:

None

reset(seed=None, options=None)

Reset the environment to a random state

Parameters:

seed (optional) – Defaults to None.
options (optional) – Defaults to None.

Returns:

state (np.array) – [bar angle, bar angular velocity]

reset_policy(reset_count=200)

Policy to reset the environment

Parameters:

reset_count (int, optional) – Number of iterations to wait before resetting the system. Defaults to 200.

Returns:

None

reset_robot(mode='random')

Reset the robot state

Parameters:

mode (str, optional) – Mode to reset the robot. Defaults to “random”.

Returns:

state (np.array) – [bar angle, bar angular velocity]

send_fake_serial(command)

Send a command to the pendulum, simulating a fake serial connection

Parameters:

command (list) – [motor speed percentage, episode done flag]

Returns:

None

step(action)

Take a step in the environment

Parameters:

action (float) – Motor speed percentage [-100, 100]

Returns:

state (np.array) – [bar angle, bar angular velocity]
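Since the action is a motor speed percentage in [-100, 100], a caller may want to clamp policy outputs before passing them to step. The helper below is a small sketch of that idea; `clip_motor_speed` is a hypothetical name, and whether the environment performs its own clipping is not stated here.

```python
def clip_motor_speed(action, low=-100.0, high=100.0):
    """Clamp a motor speed percentage to the documented [-100, 100] range."""
    # Hypothetical helper: not part of the documented environment API.
    return max(low, min(high, float(action)))
```

For example, `clip_motor_speed(150)` yields `100.0`, while an in-range value such as `42` is returned unchanged as a float.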

control.reinforcement_learning.Environments.RealPendulumEnv module

class control.reinforcement_learning.Environments.RealPendulumEnv.RealPendulumEnv(port, baudrate, render_mode='human')

Bases: Env

Real rotary pendulum with ESP32

action_space: spaces.Space[ActType]

calculate_reward(state)

Calculate the reward for the current state

Parameters:

state (np.array) – [bar angle, bar angular velocity]

Returns:

reward (float) – Reward for the current state

close()

Close the serial connection

Parameters:

None

Returns:

None

metadata: dict[str, Any] = {'render_modes': ['human']}

observation_space: spaces.Space[ObsType]

render(camera=False)

Render the state (optional), e.g. display the video stream

reset(seed=None, options=None)

Reset the environment to the initial state.

Parameters:

seed (optional) – Defaults to None.
options (optional) – Defaults to None.

Returns:

state (np.array) – [bar angle, bar angular velocity]
info (dict) – Episode information

reset_policy(reset_count=200)

Policy to reset the environment

Parameters:

reset_count (int, optional) – Number of iterations to wait before resetting the system. Defaults to 200.

Returns:

None

send_serial(command)

Send a command to the pendulum over serial

Parameters:

command (str) – [motor speed percentage, reset flag]

Returns:

None

step(action)

Take a step in the environment

Parameters:

action (float) – Motor speed percentage [-100, 100]

Returns:

state (np.array) – [bar angle, bar angular velocity]
reward (float) – Reward for the current state
terminated (bool) – Whether the episode is done or not
truncated (bool) – Whether the episode is truncated or not
info (dict) – Episode information
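The five return values of step, together with the state and info returned by reset, suggest an episode loop along the following lines. This is a sketch under stated assumptions: `FiveTupleEnvStandIn` is a hypothetical stand-in with a fixed state and zero reward, used only so the loop can run without hardware, and `rollout` is an illustrative helper rather than part of the package.

```python
class FiveTupleEnvStandIn:
    """Hypothetical stand-in returning the documented reset/step shapes."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self, seed=None, options=None):
        self.steps = 0
        # State is [bar angle, bar angular velocity]; values are placeholders.
        return [0.0, 0.0], {}

    def step(self, action):
        self.steps += 1
        state = [0.0, 0.0]
        reward = 0.0
        terminated = False
        # Truncate once the step budget of the stand-in is used up.
        truncated = self.steps >= self.max_steps
        return state, reward, terminated, truncated, {}


def rollout(env, policy):
    """Run one episode; return the total reward and the number of steps."""
    state, info = env.reset()
    total_reward, n_steps = 0.0, 0
    while True:
        state, reward, terminated, truncated, info = env.step(policy(state))
        total_reward += reward
        n_steps += 1
        # An episode ends when either flag is set.
        if terminated or truncated:
            break
    return total_reward, n_steps
```

Checking both flags keeps the loop correct whether the episode ends because of the task itself or because of a step limit.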

control.reinforcement_learning.Environments.SerialReader module

class control.reinforcement_learning.Environments.SerialReader.SerialReader(ser, simulation=False)

Bases: Thread

get_state()

read_serial()

run()

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
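To illustrate how a Thread subclass with these three methods might fit together, here is a minimal sketch. It is not the actual SerialReader implementation: the class name `LineReaderThread`, the comma-separated line format, and the use of a lock are all assumptions, and an in-memory byte stream stands in for the serial port.

```python
import io
import threading


class LineReaderThread(threading.Thread):
    """Hypothetical reader thread; not the actual SerialReader code."""

    def __init__(self, ser):
        super().__init__(daemon=True)
        self.ser = ser
        self._lock = threading.Lock()
        self._state = None

    def read_serial(self):
        # Read one line from the stream; empty bytes mean nothing was read.
        return self.ser.readline()

    def run(self):
        while True:
            line = self.read_serial()
            if not line:
                # End of the in-memory stream. With a real port an empty
                # read can also be a timeout, which would need other handling.
                break
            # Assumed wire format: comma-separated floats, one state per line.
            values = [float(x) for x in line.decode().strip().split(",")]
            with self._lock:
                self._state = values

    def get_state(self):
        # Return the most recently parsed state, or None if none arrived yet.
        with self._lock:
            return self._state
```

A quick check with a fake port: after `reader = LineReaderThread(io.BytesIO(b"1.0,2.0\n3.5,-4.0\n"))`, `reader.start()`, and `reader.join()`, calling `reader.get_state()` returns the last parsed line.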

Module contents