rte-france / l2rpn-baselines
L2RPN Baselines: a repository to host baselines for l2rpn competitions.
Home Page: https://l2rpn-baselines.readthedocs.io/en/stable/
License: Mozilla Public License 2.0
grid2op version: 1.8.1
l2rpn-baselines version: 0.6.0.post1
System: mac osx, ubuntu16.04, ...
Baseline: PPO_RLLIB
When I evaluate a trained PPO_RLLIB agent, the total score for each chronic is printed as 0. Even if the PPO_RLLIB agent was not trained properly, the total score should still be non-zero.
The output I am getting is:
Evaluation summary:
chronics at: 0000 total score: 0.000000 time steps: 1091/8064
chronics at: 0001 total score: 0.000000 time steps: 807/8064
chronics at: 0002 total score: 0.000000 time steps: 3001/8064
chronics at: 0003 total score: 0.000000 time steps: 3/8064
chronics at: 0004 total score: 0.000000 time steps: 804/8064
Evaluation summary for Do Nothing Agent:
chronics at: 0000 total score: 622.306925 time steps: 1091/8064
chronics at: 0001 total score: 464.387165 time steps: 807/8064
chronics at: 0002 total score: 1759.294096 time steps: 3001/8064
chronics at: 0003 total score: 1.020729 time steps: 3/8064
chronics at: 0004 total score: 479.332989 time steps: 804/8064
The training script I used:
import grid2op
from grid2op.gym_compat import GymEnv, BoxGymObsSpace, BoxGymActSpace
from grid2op.Backend import PandaPowerBackend
from lightsim2grid import LightSimBackend
from l2rpn_baselines.PPO_RLLIB import PPO_RLLIB, train
from l2rpn_baselines.PPO_RLLIB.rllibagent import RLLIBAgent
from grid2op.Reward import LinesCapacityReward # or any other rewards
from grid2op.Chronics import MultifolderWithCache # highly recommended
import copy
import re
import ray
env_name = "l2rpn_case14_sandbox" # or any other name
obs_attr_to_keep = ["day_of_week", "hour_of_day", "minute_of_hour", "prod_p", "prod_v", "load_p", "load_q",
"actual_dispatch", "target_dispatch", "topo_vect", "time_before_cooldown_line",
"time_before_cooldown_sub", "rho", "timestep_overflow", "line_status",
"storage_power", "storage_charge"]
act_attr_to_keep = ["change_line_status", "change_bus", "redispatch"]
env = grid2op.make(env_name, backend=LightSimBackend())
ray.init()
train(env,
      iterations=100,  # any number of iterations you want
      learning_rate=1e-4,  # set learning rate
      save_path="./saved_model/PPO_RLLIB3",  # where the NN weights will be saved
      # load_path="./saved_model/PPO_RLLIB/test",  # resuming from previous saved training
      name="test",  # name of the baseline
      net_arch=[100, 100, 100],  # architecture of the NN
      save_every_xxx_steps=10,  # save the NN every 10 training steps
      env_kwargs={"reward_class": LinesCapacityReward,
                  "chronics_class": MultifolderWithCache,  # highly recommended
                  "data_feeding_kwargs": {
                      'filter_func': lambda x: re.match(".*00$", x) is not None  # use one over 100 chronics to train (for speed)
                  }
                  },
      obs_attr_to_keep=copy.deepcopy(obs_attr_to_keep),
      act_attr_to_keep=copy.deepcopy(act_attr_to_keep),
      verbose=True)
env.close()
ray.shutdown()
The evaluation script used:
import grid2op
from grid2op.Reward import LinesCapacityReward # or any other rewards
from lightsim2grid import LightSimBackend # highly recommended !
from l2rpn_baselines.PPO_RLLIB import evaluate
from grid2op.Runner import Runner
nb_episode = 5
nb_process = 1
verbose = True
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name,
reward_class=LinesCapacityReward,
backend=LightSimBackend()
)
try:
    evaluate(env,
             nb_episode=nb_episode,
             load_path="./saved_model/PPO_RLLIB3",  # should be the same as what has been called in the train function !
             name="test",  # should be the same as what has been called in the train function !
             logs_path="./logs/PPO_RLLIB3/",
             nb_process=1,
             verbose=verbose,
             )
    # you can also compare your agent with the do nothing agent relatively
    # easily
    runner_params = env.get_params_for_runner()
    runner = Runner(**runner_params)
    res = runner.run(nb_episode=nb_episode,
                     nb_process=nb_process
                     )
    # Print summary
    if verbose:
        print("Evaluation summary for Do Nothing Agent:")
        for _, chron_name, cum_reward, nb_time_step, max_ts in res:
            msg_tmp = "chronics at: {}".format(chron_name)
            msg_tmp += "\ttotal score: {:.6f}".format(cum_reward)
            msg_tmp += "\ttime steps: {:.0f}/{:.0f}".format(nb_time_step, max_ts)
            print(msg_tmp)
finally:
    env.close()
Line 79 of l2rpn_baselines\DoubleDuelingDQN\train.py makes it mandatory to have a GPU.
This should be fixed in order to allow participants to use their CPU instead.
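For reference, a minimal sketch of a CPU-friendly guard, assuming the hard GPU requirement comes from unconditionally configuring the first visible GPU (the exact content of line 79 is not reproduced here):
import tensorflow as tf

# hedged sketch: only configure a GPU when one is actually visible,
# otherwise fall back to the CPU without raising
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)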
Hello! I am trying to run the baselines by importing the train function, but I keep getting the following error. Could someone please advise me on what I need to change?
The code:
from l2rpn_baselines.DoubleDuelingRDQN import train
env = grid2op.make()
train(env)
The error:
in _save_hyperparameters(self, logpath, env, steps)
99
100 def _save_hyperparameters(self, logpath, env, steps):
--> 101 r_instance = env.reward_helper.template_reward
102 hp = {
103 "lr": cfg.LR,
AttributeError: 'Environment_rte_case14_realistic' object has no attribute 'reward_helper'
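A minimal compatibility sketch, assuming the attribute was simply renamed to the private _reward_helper in more recent grid2op versions (untested):
# hedged sketch: look the reward helper up under both the old public name
# and the newer private name before reading its template_reward
r_helper = getattr(env, "reward_helper", None)
if r_helper is None:
    r_helper = getattr(env, "_reward_helper", None)
r_instance = r_helper.template_reward if r_helper is not None else None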
grid2op version: 1.8.1
l2rpn-baselines version: 0.6.0.post1
System: mac osx
stable-baselines3 version: 1.7.0
Baseline: PPO_SB3
When I try to resume training from a saved agent, I get some errors. The saved agent, however, works properly with the evaluate function.
import re
import copy
import grid2op
from grid2op.Reward import LinesCapacityReward # or any other rewards
from grid2op.Chronics import MultifolderWithCache # highly recommended
from lightsim2grid import LightSimBackend # highly recommended for training !
from l2rpn_baselines.PPO_SB3 import train, evaluate
env_name = "l2rpn_case14_sandbox"
obs_attr_to_keep = ["day_of_week", "hour_of_day", "minute_of_hour", "prod_p", "prod_v", "load_p", "load_q",
"actual_dispatch", "target_dispatch", "topo_vect", "time_before_cooldown_line",
"time_before_cooldown_sub", "rho", "timestep_overflow", "line_status",
"storage_power", "storage_charge"]
act_attr_to_keep = ["redispatch"]
env = grid2op.make(env_name,
reward_class=LinesCapacityReward,
backend=LightSimBackend(),
chronics_class=MultifolderWithCache)
env.chronics_handler.real_data.set_filter(lambda x: re.match(".*00$", x) is not None)
env.chronics_handler.real_data.reset()
train(env,
      iterations=1000,  # any number of iterations you want
      logs_dir="./logs/PPO_SB3_test",  # where the tensorboard logs will be put
      save_path="./saved_model/PPO_SB3_test",  # where the NN weights will be saved
      name="Reload_test",  # name of the baseline
      net_arch=[200, 200, 200],  # architecture of the NN
      obs_attr_to_keep=copy.deepcopy(obs_attr_to_keep),
      act_attr_to_keep=copy.deepcopy(act_attr_to_keep),
      normalize_obs=True,
      )
evaluate(env,
         nb_episode=3,
         load_path="./saved_model/PPO_SB3_test/",  # should be the same as what has been called in the train function !
         name="Reload_test",  # should be the same as what has been called in the train function !
         logs_path="./logs/PPO_SB3/",
         nb_process=1,
         verbose=True,
         )
train(env,
      iterations=1000,  # any number of iterations you want
      logs_dir="./logs/PPO_SB3_test",  # where the tensorboard logs will be put
      load_path="./saved_model/PPO_SB3_test/Reload_test",
      save_path="./saved_model/PPO_SB3_test",  # where the NN weights will be saved
      name="Reload_test.zip",  # name of the baseline
      obs_attr_to_keep=copy.deepcopy(obs_attr_to_keep),
      act_attr_to_keep=copy.deepcopy(act_attr_to_keep),
      normalize_obs=True,
      )
I am getting the following error message:
/Users/paula/Desktop/Projects/venvs/L2PRN/lib/python3.10/site-packages/grid2op/gym_compat/box_gym_obsspace.py:765: UserWarning: The normalization of attribute "[False False False False False False]" cannot be performed entirely as there are some non finite value, or `high == `low` for some components.
warnings.warn(f"The normalization of attribute \"{both_finite}\" cannot be performed entirely as "
/Users/paula/Desktop/Projects/venvs/L2PRN/lib/python3.10/site-packages/grid2op/gym_compat/box_gym_obsspace.py:765: UserWarning: The normalization of attribute "[False False False False False False False False False False False]" cannot be performed entirely as there are some non finite value, or `high == `low` for some components.
warnings.warn(f"The normalization of attribute \"{both_finite}\" cannot be performed entirely as "
/Users/paula/Desktop/Projects/venvs/L2PRN/lib/python3.10/site-packages/grid2op/gym_compat/box_gym_obsspace.py:765: UserWarning: The normalization of attribute "[False False False False False False False False False False False False
False False False False False False False False]" cannot be performed entirely as there are some non finite value, or `high == `low` for some components.
warnings.warn(f"The normalization of attribute \"{both_finite}\" cannot be performed entirely as "
Traceback (most recent call last):
File "/Users/paula/Desktop/Projects/RL Practice/L2RPN Aspen/Demo Notebooks/PPO_SB3_train_reload.py", line 45, in <module>
train(env,
File "/............../L2PRN/lib/python3.10/site-packages/l2rpn_baselines/PPO_SB3/train.py", line 305, in train
agent.nn_model.learn(total_timesteps=iterations,
File "/............../L2PRN/lib/python3.10/site-packages/stable_baselines3/ppo/ppo.py", line 307, in learn
return super().learn(
File "/............../L2PRN/lib/python3.10/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 236, in learn
total_timesteps, callback = self._setup_learn(
File "/............../L2PRN/lib/python3.10/site-packages/stable_baselines3/common/base_class.py", line 408, in _setup_learn
self._last_obs = self.env.reset() # pytype: disable=annotation-type-mismatch
AttributeError: 'NoneType' object has no attribute 'reset'
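A possible workaround sketch, assuming the reloaded stable-baselines3 model simply has no environment attached after loading (which is what the traceback suggests); agent and env_gym are hypothetical names for the reloaded baseline and a gym-wrapped grid2op environment built the same way as in the first training run:
# hedged sketch: re-attach an environment to the reloaded SB3 model before
# resuming training, so that _setup_learn() has an env to reset
# (set_env is a standard stable-baselines3 BaseAlgorithm method)
agent.nn_model.set_env(env_gym)
agent.nn_model.learn(total_timesteps=1000)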
Hi,
In this line, shouldn't it be
next_a = np.argmax(target_next, axis=-1)
instead of
next_a = np.argmax(fut_action, axis=-1)
?
My understanding is that we should pick the action that maximizes the action value from target_next, but it looks like we pick the action from fut_action and then get its action value from target_next. Why is that? Or does it not matter?
Thanks for reading
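For reference, selecting the action with the online network ("fut_action") and evaluating it with the target network ("target_next") is the Double DQN update, whereas vanilla DQN selects and evaluates with the target network. A small self-contained sketch of both variants, with illustrative array names (not the baseline's actual variables):
import numpy as np

rng = np.random.default_rng(0)
q_online_next = rng.normal(size=(4, 3))  # plays the role of "fut_action":  Q_online(s', .)
q_target_next = rng.normal(size=(4, 3))  # plays the role of "target_next": Q_target(s', .)

# vanilla DQN: the target network both selects and evaluates the next action
a_vanilla = np.argmax(q_target_next, axis=-1)
q_vanilla = q_target_next[np.arange(4), a_vanilla]

# Double DQN: the online network selects, the target network evaluates
a_double = np.argmax(q_online_next, axis=-1)
q_double = q_target_next[np.arange(4), a_double]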
link to acme: https://github.com/deepmind/acme
Today, if you use "nb_env > 1" in the train function of all agents inheriting from DeepQAgent, it is not clear at all that the environment provided should be an instance of MultiEnvironment and not an instance of the regular Environment itself.
This is confusing and should be clarified in this version, and the redundancy should be removed in a future major release: an Environment is passed as the env argument of the train function, while a MultiEnvironment is actually expected as soon as nb_env > 1.
This should solve the issue, and will be properly documented in the doc (l2rpn-baselines + grid2op) and in the getting_started notebook of grid2op.
Hello! I am trying to implement another reinforcement learning method for the l2rpn problem, but I find that my results cannot match the performance of the DQN implementation in l2rpn-baselines.
Even when I use another library to implement DQN, if I change the hyperparameters a little I cannot get a reasonable result.
So I want to know how you fine-tuned the hyperparameters of DQN for the l2rpn problem. Are there any tricks?
When copy-pasting the documentation of the SAC train function (https://l2rpn-baselines.readthedocs.io/en/stable/SAC.html#l2rpn_baselines.SAC.train), the program does not work.
The documentation should be adapted for SAC as follows:
import grid2op
from grid2op.Reward import L2RPNReward
from l2rpn_baselines.utils import TrainingParam
from l2rpn_baselines.SAC import train
from l2rpn_baselines.utils import NNParam
# define the environment
env = grid2op.make("l2rpn_case14_sandbox",
reward_class=L2RPNReward)
# use the default training parameters
tp = TrainingParam()
# this will be the list of what part of the observation I want to keep
# more information on https://grid2op.readthedocs.io/en/latest/observation.html#main-observation-attributes
li_attr_obs_X = ["day_of_week", "hour_of_day", "minute_of_hour", "prod_p", "prod_v", "load_p", "load_q",
"actual_dispatch", "target_dispatch", "topo_vect", "time_before_cooldown_line",
"time_before_cooldown_sub", "rho", "timestep_overflow", "line_status"]
# neural network architecture
observation_size = NNParam.get_obs_size(env, li_attr_obs_X)
sizes_q = [800, 800, 800, 494, 494, 494] # sizes of each hidden layers
sizes_v = [800, 800] # sizes of each hidden layers
sizes_pol = [800, 800, 800, 494, 494, 494] # sizes of each hidden layers
kwargs_archi = {'observation_size': observation_size,
'sizes': sizes_q,
'activs': ["relu" for _ in range(len(sizes_q))],
"list_attr_obs": li_attr_obs_X,
"sizes_value": sizes_v,
"activs_value": ["relu" for _ in range(len(sizes_v))],
"sizes_policy": sizes_pol,
"activs_policy": ["relu" for _ in range(len(sizes_pol))]
}
# select some part of the action
# more information at https://grid2op.readthedocs.io/en/latest/converter.html#grid2op.Converter.IdToAct.init_converter
kwargs_converters = {"all_actions": None,
"set_line_status": False,
"change_bus_vect": True,
"set_topo_vect": False
}
# define the name of the model
nm_ = "AnneOnymous"
save_path="/WHERE/I/SAVED/THE/MODEL"
logs_dir="/WHERE/I/SAVED/THE/LOGS"
try:
    train(env,
          name=nm_,
          iterations=10000,
          save_path=save_path,
          load_path=None,
          logs_dir=logs_dir,
          nb_env=1,
          training_param=tp,
          kwargs_converters=kwargs_converters,
          kwargs_archi=kwargs_archi)
finally:
    env.close()
Hi,
I tried to install the package via pip on a windows station and got an error for torch.
Collecting l2rpn_baselines
Using cached l2rpn_baselines-0.5.0.tar.gz (145 kB)
Requirement already satisfied: grid2op[optional]>=0.9.1.post1 in d:\projects\rte-grid2viz\grid2op (from l2rpn_baselines) (1.2.3)
Collecting tensorflow>=2.2.0
Using cached tensorflow-2.3.1-cp38-cp38-win_amd64.whl (342.5 MB)
Collecting Keras>=2.3.1
Using cached Keras-2.4.3-py2.py3-none-any.whl (36 kB)
ERROR: Could not find a version that satisfies the requirement torch>=1.4.0 (from l2rpn_baselines) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
ERROR: No matching distribution found for torch>=1.4.0 (from l2rpn_baselines)
I'm on a Windows 10 machine [version 10.0.19041.508] with Python 3.8.6 running inside a virtualenv.
Cheers
GymEnvWithHeuristics: fix_action is missing the "observation" argument! Otherwise it cannot be used, for example to "limit_curtail_storage"... which was its main use case...
When I run the following code:
import grid2op
from l2rpn_baselines.DoubleDuelingDQN import train
env = grid2op.make()
train(env, save_path='../checkpoints', iterations=1000, verbose=True)
I get the following error:
Traceback (most recent call last):
File "/home/alwin/PycharmProjects/l2rpn-challenge/l2rpn_baseline/train.py", line 36, in <module>
main()
File "/home/alwin/PycharmProjects/l2rpn-challenge/l2rpn_baseline/train.py", line 32, in main
train(env, save_path=args.save_path, iterations=args.iterations, verbose=True)
File "/home/alwin/miniconda3/envs/l2rpn_challenge/lib/python3.7/site-packages/l2rpn_baselines/DoubleDuelingDQN/train.py", line 96, in train
logs_path)
File "/home/alwin/miniconda3/envs/l2rpn_challenge/lib/python3.7/site-packages/l2rpn_baselines/DoubleDuelingDQN/DoubleDuelingDQN.py", line 274, in train
self._batch_train(training_step, step)
File "/home/alwin/miniconda3/envs/l2rpn_challenge/lib/python3.7/site-packages/l2rpn_baselines/DoubleDuelingDQN/DoubleDuelingDQN.py", line 355, in _batch_train
loss = self.Qmain.train_on_batch(input_t, Q, w_batch)
File "/home/alwin/miniconda3/envs/l2rpn_challenge/lib/python3.7/site-packages/l2rpn_baselines/DoubleDuelingDQN/DoubleDuelingDQN_NN.py", line 83, in train_on_batch
batch_loss = self._batch_loss(y_true, y_pred)
File "/home/alwin/miniconda3/envs/l2rpn_challenge/lib/python3.7/site-packages/l2rpn_baselines/DoubleDuelingDQN/DoubleDuelingDQN_NN.py", line 113, in _batch_loss
self.batch_sq_error = batch_sq_error.numpy()
AttributeError: 'Tensor' object has no attribute 'numpy'
The error appears after approximately 15 seconds of training. I also have this problem with DoubleDuelingRDQN. I am using version 0.4.4 installed via pip.
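A hedged workaround sketch (not the baseline's official fix): Tensor.numpy() only works in eager mode, so forcing TensorFlow functions to run eagerly can unblock training at the cost of speed; the exact API name depends on the TensorFlow 2.x version installed:
import tensorflow as tf

# force eager execution of tf.function-decorated code so Tensor.numpy() works
try:
    tf.config.run_functions_eagerly(True)               # TensorFlow >= 2.3
except AttributeError:
    tf.config.experimental_run_functions_eagerly(True)  # older TensorFlow 2.x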
Inspired by lightsim2grid and grid2op, to make it easier for people to submit consistent and easy-to-fix issues.
An option should be added to access or modify the variables of objects in the grid (powerlines / loads / generators) from the names of the objects instead of their IDs.
For example, instead of the following:
change_status = action_space.get_change_line_status_vect()
change_status[0] = True
it should be possible to do:
change_status = action_space.get_change_line_status_vect()
change_status["0_3_0"] = True
Maybe with something like:
class CallableVector(np.ndarray):
    def __getitem__(self, x):
        try:
            return super().__getitem__(x)
        except (KeyError, IndexError):  # x is a string
            idx = np.argmax(env.name_line == x)  # env has to be defined in order to get the names
            return super().__getitem__(idx)
And this wouldn't add anything to the computation time.
The same goes for the vectors in the Observation objects.
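As a stop-gap with the current API (not the proposed feature), the index can already be looked up from the name; a small sketch reusing the line name from the example above:
import numpy as np
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
# map the line name to its integer id, then index the vector as usual;
# "0_3_0" is the name used in the example above, adapt it to a name actually
# present in env.name_line for the environment you use
line_id = int(np.where(env.name_line == "0_3_0")[0][0])
change_status = env.action_space.get_change_line_status_vect()
change_status[line_id] = True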
In the code of the BaseDeepQ class, the optimizer model attribute is private (_optimizer_model), whereas the DeepQAgent class incorrectly accesses it as optimizer_model.
To fix this issue, just update this line https://github.com/rte-france/l2rpn-baselines/blob/master/l2rpn_baselines/utils/DeepQAgent.py#L591
with:
self._train_lr = self.deep_q._optimizer_model._decayed_lr('float32').numpy()
This also shows that the tests do not attempt to train the baselines at all; it would be nice to fix them as well.
When I run the code train_it, this error occurs. Please tell me how to solve it. Thank you very much.
AttributeError: module 'gym.spaces' has no attribute 'dict'
grid2op version: 1.10.0
System: Windows 11
Python version: 3.11.3
When using the LightSim2Grid backend, the attribute '_missing_two_busbars_support_info' is False since LightSim2Grid currently does not support more than 2 busbars per substation. Normally, this only results in a warning. However, when using the train() method from l2rpn_baselines, it seems this attribute is lost somewhere. As a result, when backend.assert_grid_correct() is run, an AttributeError is thrown:
'LightSimBackend_rte_case14_realistic_train' object has no attribute '_missing_two_busbars_support_info'
import grid2op
import lightsim2grid
import l2rpn_baselines.PPO_SB3 as PPO_SB3
ENV_NAME = "rte_case14_realistic_train"
env = grid2op.make(ENV_NAME, backend=lightsim2grid.LightSimBackend())
obs = env.reset()
agent = PPO_SB3.train(env, iterations=1, logs_dir=None, save_path=None, net_arch=[100,100,100])
AttributeError Traceback (most recent call last)
Cell In[3], line 24
22 env = grid2op.make(ENV_NAME, backend=lightsim2grid.LightSimBackend())
23 obs = env.reset()
---> 24 agent = PPO_SB3.train(env, iterations=1, logs_dir=None, save_path=None, verbose=True, net_arch=[100,100,100])
File ...\.venv\Lib\site-packages\l2rpn_baselines\PPO_SB3\train.py:306, in train(env, name, iterations, save_path, load_path, net_arch, logs_dir, learning_rate, checkpoint_callback, save_every_xxx_steps, model_policy, obs_attr_to_keep, obs_space_kwargs, act_attr_to_keep, act_space_kwargs, policy_kwargs, normalize_obs, normalize_act, gymenv_class, gymenv_kwargs, verbose, seed, eval_env, **kwargs)
299 agent = SB3Agent(env.action_space,
300 env_gym.action_space,
301 env_gym.observation_space,
302 nn_path=os.path.join(load_path, name)
303 )
305 # train it
--> 306 agent.nn_model.learn(total_timesteps=iterations,
307 callback=checkpoint_callback,
308 # eval_env=eval_env # TODO
309 )
311 # save it
312 if save_path is not None:
File ...\.venv\Lib\site-packages\stable_baselines3\ppo\ppo.py:315, in PPO.learn(self, total_timesteps, callback, log_interval, tb_log_name, reset_num_timesteps, progress_bar)
306 def learn(
307 self: SelfPPO,
308 total_timesteps: int,
(...)
313 progress_bar: bool = False,
314 ) -> SelfPPO:
--> 315 return super().learn(
316 total_timesteps=total_timesteps,
317 callback=callback,
318 log_interval=log_interval,
319 tb_log_name=tb_log_name,
320 reset_num_timesteps=reset_num_timesteps,
321 progress_bar=progress_bar,
322 )
File ...\.venv\Lib\site-packages\stable_baselines3\common\on_policy_algorithm.py:264, in OnPolicyAlgorithm.learn(self, total_timesteps, callback, log_interval, tb_log_name, reset_num_timesteps, progress_bar)
253 def learn(
254 self: SelfOnPolicyAlgorithm,
255 total_timesteps: int,
(...)
260 progress_bar: bool = False,
261 ) -> SelfOnPolicyAlgorithm:
262 iteration = 0
--> 264 total_timesteps, callback = self._setup_learn(
265 total_timesteps,
266 callback,
267 reset_num_timesteps,
268 tb_log_name,
269 progress_bar,
270 )
272 callback.on_training_start(locals(), globals())
274 assert self.env is not None
File ...\.venv\Lib\site-packages\stable_baselines3\common\base_class.py:423, in BaseAlgorithm._setup_learn(self, total_timesteps, callback, reset_num_timesteps, tb_log_name, progress_bar)
421 if reset_num_timesteps or self._last_obs is None:
422 assert self.env is not None
--> 423 self._last_obs = self.env.reset() # type: ignore[assignment]
424 self._last_episode_starts = np.ones((self.env.num_envs,), dtype=bool)
425 # Retrieve unnormalized observation for saving into the buffer
File ...\.venv\Lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py:77, in DummyVecEnv.reset(self)
75 for env_idx in range(self.num_envs):
76 maybe_options = {"options": self._options[env_idx]} if self._options[env_idx] else {}
---> 77 obs, self.reset_infos[env_idx] = self.envs[env_idx].reset(seed=self._seeds[env_idx], **maybe_options)
78 self._save_obs(env_idx, obs)
79 # Seeds and options are only used once
File ...\.venv\Lib\site-packages\stable_baselines3\common\monitor.py:83, in Monitor.reset(self, **kwargs)
81 raise ValueError(f"Expected you to pass keyword argument {key} into reset")
82 self.current_reset_info[key] = value
---> 83 return self.env.reset(**kwargs)
File ...\.venv\Lib\site-packages\grid2op\gym_compat\gymenv.py:303, in GymnasiumEnv.reset(self, seed, options)
296 def reset(self,
297 *,
298 seed: Optional[int]=None,
(...)
301 RESET_INFO_GYM_TYPING
302 ]:
--> 303 return self._aux_reset_new(seed, options)
File ...\.venv\Lib\site-packages\grid2op\gym_compat\gymenv.py:184, in __AuxGymEnv._aux_reset_new(self, seed, options)
180 seed, next_seed, underlying_env_seeds = self._aux_seed_g2op(seed)
182 # we don't seed grid2op with reset as it is done
183 # earlier
--> 184 g2op_obs = self.init_env.reset(seed=None, options=options)
185 gym_obs = self.observation_space.to_gym(g2op_obs)
187 chron_id = self.init_env.chronics_handler.get_id()
File ...\.venv\Lib\site-packages\grid2op\Environment\environment.py:988, in Environment.reset(self, seed, options)
986 self._reset_redispatching()
987 self._reset_vectors_and_timings() # it need to be done BEFORE to prevent cascading failure when there has been
--> 988 self.reset_grid()
989 if self.viewer_fig is not None:
990 del self.viewer_fig
File ...\.venv\Lib\site-packages\grid2op\Environment\environment.py:868, in Environment.reset_grid(self)
852 """
853 INTERNAL
854
(...)
863
864 """
865 self.backend.reset(
866 self._init_grid_path
867 ) # the real powergrid of the environment
--> 868 self.backend.assert_grid_correct()
870 if self._thermal_limit_a is not None:
871 self.backend.set_thermal_limit(self._thermal_limit_a.astype(dt_float))
File ...\.venv\Lib\site-packages\grid2op\Backend\backend.py:1947, in Backend.assert_grid_correct(self)
1944 from grid2op.Action import CompleteAction
1945 from grid2op.Action._backendAction import _BackendAction
-> 1947 if self._missing_two_busbars_support_info:
1948 warnings.warn("The backend implementation you are using is probably too old to take advantage of the "
1949 "new feature added in grid2op 1.10.0: the possibility "
1950 "to have more than 2 busbars per substations (or not). "
(...)
1958 "handle more than 2 busbars per substation, then change it :-)\n"
1959 "Your backend will behave as if it did not support it.")
1960 self._missing_two_busbars_support_info = False
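An untested workaround sketch, assuming the attribute is simply missing on the backend instances re-created inside train(): exposing the flag at the class level lets assert_grid_correct() fall back to the warning path shown in the grid2op source above instead of raising:
import grid2op
import lightsim2grid
import l2rpn_baselines.PPO_SB3 as PPO_SB3

ENV_NAME = "rte_case14_realistic_train"
env = grid2op.make(ENV_NAME, backend=lightsim2grid.LightSimBackend())

# hedged workaround: True only means "the backend did not declare its busbar
# support", which triggers the warning branch (and not the AttributeError)
type(env.backend)._missing_two_busbars_support_info = True

agent = PPO_SB3.train(env, iterations=1, logs_dir=None, save_path=None,
                      net_arch=[100, 100, 100])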
Link to maze: https://github.com/enlite-ai/maze
Hi Benjamin,
Could you please check if GEIRINA baseline is ready to be included into this repo?
The link is
Thanks,
Jiajun
Hi, when I try to train the "DoubleDuelingDQN" baseline using the example provided in the README, I encounter the following error (in the DoubleDuelingDQN.py file):
AttributeError: 'Environment_rte_case14_realistic' object has no attribute 'reward_helper'
I can't find the attribute "reward_helper" anywhere. The only related one is "_reward_helper".
Could you help me understand this? Thank you very much!
grid2op version: 1.9.5
l2rpn-baselines version: 0.8.0
System: osx
Baseline: PPO_RLLIB
The PPO_RLLIB code has been updated but there are a couple of issues:
The following line is missing: self.env_glop.chronics_handler.reset()
after
The environment seems to be created twice. The first one is built only to convert the environment observation and action spaces into gym format, and is then passed into the RLLIBAgent class where the environment is built again through the rllib library. If I understand correctly this takes memory for two environments, and rewriting the code to create only one would help with memory.
The l2rpn_neurips_2020_track1_small environment takes a very long time to do 100 iterations with a train_batch_size of 20,000 added to env_config_ppo. These two parameters may even need to be higher to get good results. If something can be done to speed up the training, that would be helpful for scaling to bigger networks.
Execute the train and eval scripts here
The train script should run without any issues, with lower memory requirements and faster training.
When calling l2rpn_baselines.utils.zip_for_codalab, the following error is raised:
AttributeError: module 'grid2op' has no attribute 'make_new'
I think this should be grid2op.make.
grid2op: 1.0.0 & 1.1.0
l2rpn_baselines: 0.4.3
Currently they are global in the DDDQN baseline (and possibly others).
grid2op version: 1.9.6
l2rpn-baselines version: 0.8.0
System: windows 11
Baseline: CurriculumAgent
Grid2OpException: "Impossible to use the RedispReward reward with an environment without generators cost. Please make sure env.redispatching_unit_commitment_availble is available."
Run the CurriculumAgent evaluate.py with the l2rpn_wcci_2022 environment using the train_full_pipeline
Line 105 of evaluate.py
### Code snippet (if any)
#!/usr/bin/env python3
# Copyright (c) 2020, RTE (https://www.rte-france.com)
# See AUTHORS.txt
# This Source Code Form is subject to the terms of the Mozilla Public License, version 2.0.
# If a copy of the Mozilla Public License, version 2.0 was not distributed with this file,
# you can obtain one at http://mozilla.org/MPL/2.0/.
# SPDX-License-Identifier: MPL-2.0
# This file is part of L2RPN Baselines, L2RPN Baselines a repository to host baselines for l2rpn competitions.
import logging
from pathlib import Path
from typing import Union, Optional
import grid2op
from grid2op.Reward import RedispReward
from grid2op.Runner import Runner
from l2rpn_baselines.utils.save_log_gif import save_log_gif
from curriculumagent.baseline.baseline import CurriculumAgent
def evaluate(
    env: grid2op.Environment.BaseEnv,
    load_path: Union[str, Path] = "C:\\Users\\mariana.souza\\data_grid2op\\l2rpn_wcci_2022",
    logs_path: Optional[Union[str, Path]] = "C:\\Users\\mariana.souza\\data_grid2op",
    nb_episode: int = 1,
    nb_process: int = 1,
    max_steps: int = -1,
    verbose: Union[bool, int] = True,
    save_gif: bool = True,
    **kwargs,
) -> Runner:
    """This is the evaluate method for the Curriculum Agent.

    Args:
        env: The environment on which the baseline will be evaluated. The default is the IEEE14 Case. For other
            environments please retrain the agent in advance.
        load_path: The path where the model is stored. This is used by the agent when calling "agent.load()"
        logs_path: The path where the agents results will be stored.
        nb_episode: Number of episodes to run for the assessment of the performance. By default, it equals 1.
        nb_process: Number of processes to be used for the assessment of the performance. Should be an integer greater
            than 1. By default, it equals 1.
        max_steps: Maximum number of timesteps each episode can last. It should be a positive integer or -1.
            -1 means that the entire episode is run (until the chronics is out of data or until a game over).
            By default, it equals -1.
        verbose: Verbosity of the output.
        save_gif: Whether to save a gif into each episode folder corresponding to the representation of the said
            episode. Note that depending on the environment (and the performance of your agent) this creation of the gif
            might take quite a lot of time!
        **kwargs:

    Returns:
        The experiment file consisting of the data.
    """
    runner_params = env.get_params_for_runner()
    runner_params["verbose"] = verbose
    # Create the agent (this piece of code can change)
    agent = CurriculumAgent(
        action_space=env.action_space,
        observation_space=env.observation_space,
        name="Evaluation"
    )
    # Load weights from file (for example)
    agent.load(load_path)
    # Build runner
    runner = Runner(**runner_params, agentClass=None, agentInstance=agent)
    # you can do stuff with your model here
    # start the runner
    if nb_process > 1:
        logging.warning(
            f"Parallel execution is not yet available for keras model. Therefore, the number of processes is computed with "
            f"only one process."
        )
        nb_process = 1
    res = runner.run(path_save=logs_path, nb_episode=nb_episode, nb_process=nb_process, max_iter=max_steps, pbar=False)
    # Print summary
    logging.info("Evaluation summary:")
    for _, chron_name, cum_reward, nb_time_step, max_ts in res:
        msg_tmp = "\tFor chronics located at {}\n".format(chron_name)
        msg_tmp += "\t\t - cumulative reward: {:.6f}\n".format(cum_reward)
        msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
        logging.info(msg_tmp)
    if save_gif:
        save_log_gif(logs_path, res)
    return res
if __name__ == "__main__":
    """
    This is a possible implementation of the eval script.
    """
    from lightsim2grid import LightSimBackend
    import grid2op

    logging.basicConfig(level=logging.INFO)
    env = grid2op.make('l2rpn_wcci_2022', backend=LightSimBackend())
    env.redispatching_unit_commitment_availble = True
    obs = env.reset()
    path_of_model = Path("C:\\Users\\mariana.souza\\data_grid2op\\l2rpn_wcci_2022")
    myagent = CurriculumAgent(
        action_space=env.action_space,
        observation_space=env.observation_space,
        model_path=path_of_model,
        path_to_data=path_of_model,
        name="Test",
    )
    env = grid2op.make('l2rpn_wcci_2022')
    out = evaluate(
        env,
        load_path=path_of_model,
        logs_path=Path(__file__).parent / "logs",
        nb_episode=10,
        nb_process=1,
        max_steps=-1,
        verbose=0,
        save_gif=True,
    )
... # Some code
Grid2OpException(
grid2op.Exceptions.Grid2OpException.Grid2OpException: Grid2OpException "Impossible to use the RedispReward reward with an environment without generators cost. Please make sure env.redispatching_unit_commitment_availble is available."
The evaluate function of the agent
GymEnvWithReco, GymEnvWithRecoWithDN, etc.
grid2op version: 1.8.1
l2rpn-baselines version: 0.6.0.post1
System: osx
Baseline: PPO_SB3
stable-baselines3 version: 1.7.0
After training with the train script with normalize_obs=True and normalize_act=True, using the trained agent for evaluation leads to incorrect results.
The train script used:
import re
import grid2op
from grid2op.Reward import LinesCapacityReward # or any other rewards
from grid2op.Chronics import MultifolderWithCache # highly recommended
from lightsim2grid import LightSimBackend # highly recommended for training !
from l2rpn_baselines.PPO_SB3 import train
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name,
reward_class=LinesCapacityReward,
backend=LightSimBackend(),
chronics_class=MultifolderWithCache)
env.chronics_handler.real_data.set_filter(lambda x: re.match(".*00$", x) is not None)
env.chronics_handler.real_data.reset()
try:
    trained_agent = train(
        env,
        iterations=10_000,  # any number of iterations you want
        logs_dir="./logs",  # where the tensorboard logs will be put
        save_path="./saved_model",  # where the NN weights will be saved
        name="test",  # name of the baseline
        net_arch=[100, 100, 100],  # architecture of the NN
        normalize_act=True,
        normalize_obs=True,
    )
finally:
    env.close()
Evaluation script:
import grid2op
from grid2op.Reward import LinesCapacityReward  # or any other rewards
from grid2op.Runner import Runner  # needed for the do-nothing comparison below
from lightsim2grid import LightSimBackend  # highly recommended !
from l2rpn_baselines.PPO_SB3 import evaluate
nb_episode = 7
nb_process = 1
verbose = True
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name,
reward_class=LinesCapacityReward,
backend=LightSimBackend()
)
try:
    evaluate(env,
             nb_episode=nb_episode,
             load_path="./saved_model",  # should be the same as what has been called in the train function !
             name="test",  # should be the same as what has been called in the train function !
             nb_process=1,
             verbose=verbose,
             )
    runner_params = env.get_params_for_runner()
    runner = Runner(**runner_params)
    res = runner.run(nb_episode=nb_episode,
                     nb_process=nb_process
                     )
    # Print summary
    if verbose:
        print("Evaluation summary for DN:")
        for _, chron_name, cum_reward, nb_time_step, max_ts in res:
            msg_tmp = "chronics at: {}".format(chron_name)
            msg_tmp += "\ttotal score: {:.6f}".format(cum_reward)
            msg_tmp += "\ttime steps: {:.0f}/{:.0f}".format(nb_time_step, max_ts)
            print(msg_tmp)
finally:
    env.close()
The results are very similar to the Do Nothing agent, which does not happen if normalize_obs and normalize_act are set to False during training.
The issue is happening because load_path is used instead of my_path in the following two lines.
Making this change resolved the issue for my case.