diff --git a/README.md b/README.md index b91fc42f..ea9b3da6 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,8 @@ -
- -
+[comment]: <> (
) + +[comment]: <> () + +[comment]: <> (
)

MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library

@@ -10,10 +12,9 @@ [![GitHub issues](https://img.shields.io/github/issues/Replicable-MARL/MARLlib)](https://github.com/Replicable-MARL/MARLlib/issues) [![PyPI version](https://badge.fury.io/py/marllib.svg)](https://badge.fury.io/py/marllib) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Replicable-MARL/MARLlib/blob/sy_dev/marllib.ipynb) -[![Awesome](https://awesome.re/badge.svg)](https://marllib.readthedocs.io/en/latest/resources/awesome.html) [![Organization](https://img.shields.io/badge/Organization-ReLER_RL-blue.svg)](https://github.com/Replicable-MARL/MARLlib) [![Organization](https://img.shields.io/badge/Organization-PKU_MARL-blue.svg)](https://github.com/Replicable-MARL/MARLlib) - +[![Awesome](https://awesome.re/badge.svg)](https://marllib.readthedocs.io/en/latest/resources/awesome.html) > __News__: > We are excited to announce that a major update has just been released. For detailed version information, please refer to the [version info](https://github.com/Replicable-MARL/MARLlib/releases/tag/1.0.2). @@ -55,7 +56,7 @@ Here we provide a table for the comparison of MARLlib and existing work. 
| [MAPPO Benchmark](https://github.com/marlbenchmark/on-policy) | 4 cooperative | 1 | share + separate | MLP + GRU | :x: | | [MAlib](https://github.com/sjtu-marl/malib) | 4 self-play | 10 | share + group + separate | MLP + LSTM | [![Documentation Status](https://readthedocs.org/projects/malib/badge/?version=latest)](https://malib.readthedocs.io/en/latest/?badge=latest) | [EPyMARL](https://github.com/uoe-agents/epymarl)| 4 cooperative | 9 | share + separate | GRU | :x: | -| **[MARLlib](https://github.com/Replicable-MARL/MARLlib)** | 11 **no task mode restriction** | 18 | share + group + separate + **customizable** | MLP + CNN + GRU + LSTM | [![Documentation Status](https://readthedocs.org/projects/marllib/badge/?version=latest)](https://marllib.readthedocs.io/en/latest/) | +| **[MARLlib](https://github.com/Replicable-MARL/MARLlib)** | 12 **no task mode restriction** | 18 | share + group + separate + **customizable** | MLP + CNN + GRU + LSTM | [![Documentation Status](https://readthedocs.org/projects/marllib/badge/?version=latest)](https://marllib.readthedocs.io/en/latest/) | | Library | Github Stars | Documentation | Issues Open | Activity | Last Update |:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:| @@ -108,7 +109,7 @@ First, install MARLlib dependencies to guarantee basic usage. following [this guide](https://marllib.readthedocs.io/en/latest/handbook/env.html), finally install patches for RLlib. 
```bash -$ conda create -n marllib python=3.8 +$ conda create -n marllib python=3.8 # or 3.9 $ conda activate marllib $ git clone https://github.com/Replicable-MARL/MARLlib.git && cd MARLlib $ pip install -r requirements.txt @@ -185,6 +186,7 @@ Most of the popular environments in MARL research are supported by MARLlib: | **[GRF](https://github.com/google-research/football)** | collaborative + mixed | Full | Discrete | 2D | | **[Hanabi](https://github.com/deepmind/hanabi-learning-environment)** | cooperative | Partial | Discrete | 1D | | **[MATE](https://github.com/XuehaiPan/mate)** | cooperative + mixed | Partial | Both | 1D | +| **[GoBigger](https://github.com/opendilab/GoBigger)** | cooperative + mixed | Both | Continuous | 1D | Each environment has a readme file, standing as the instruction for this task, including env settings, installation, and important notes. @@ -320,7 +322,11 @@ More tutorial documentations are available [here](https://marllib.readthedocs.io ## Awesome List -A collection of research and review papers of multi-agent reinforcement learning (MARL) is available [here](https://marllib.readthedocs.io/en/latest/resources/awesome.html). The papers have been organized based on their publication date and their evaluation of the corresponding environments. +A collection of research and review papers of multi-agent reinforcement learning (MARL) is available. The papers have been organized based on their publication date and their evaluation of the corresponding environments. + +Algorithms: [![Awesome](https://awesome.re/badge.svg)](https://marllib.readthedocs.io/en/latest/resources/awesome.html) +Environments: [![Awesome](https://awesome.re/badge.svg)](https://marllib.readthedocs.io/en/latest/handbook/env.html) + ## Community diff --git a/ROADMAP.md b/ROADMAP.md index 3ead2415..eed984ae 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -11,7 +11,8 @@ This list describes the planned features including breaking changes. 
- [ ] manual training, refer to issue: https://github.com/Replicable-MARL/MARLlib/issues/86#issuecomment-1468188682 - [ ] new environments - [x] MATE: https://github.com/UnrealTracking/mate - - [ ] Go-Bigger: https://github.com/opendilab/GoBigger + - [x] Go-Bigger: https://github.com/opendilab/GoBigger - [ ] Voltage Control: https://github.com/Future-Power-Networks/MAPDN - [ ] Overcooked: https://github.com/HumanCompatibleAI/overcooked_ai -- [ ] Support Transformer architecture + - [ ] CloseAirCombat: https://github.com/liuqh16/CloseAirCombat +- [ ] Support Transformers diff --git a/docs/source/handbook/env.rst b/docs/source/handbook/env.rst index f3dcfd6c..baec9148 100644 --- a/docs/source/handbook/env.rst +++ b/docs/source/handbook/env.rst @@ -594,4 +594,52 @@ Installation .. code-block:: shell - pip3 install git+https://github.com/XuehaiPan/mate.git#egg=mate \ No newline at end of file + pip3 install git+https://github.com/XuehaiPan/mate.git#egg=mate + + +.. _GoBigger: + +GoBigger +============== +.. only:: html + + .. figure:: images/env_gobigger.gif + :width: 320 + :align: center + + +GoBigger is a game engine that offers an efficient and easy-to-use platform for agar-like game development. It provides a variety of interfaces specifically designed for game AI development. The game mechanics of GoBigger are similar to those of Agar, a popular massive multiplayer online action game developed by Matheus Valadares of Brazil. The objective of GoBigger is for players to navigate one or more circular balls across a map, consuming Food Balls and smaller balls to increase their size while avoiding larger balls that can consume them. Each player starts with a single ball, but can divide it into two when it reaches a certain size, giving them control over multiple balls. +Official Link: https://github.com/opendilab/GoBigger + +.. 
list-table:: + :widths: 25 25 + :header-rows: 0 + + * - ``Original Learning Mode`` + - Cooperative + Mixed + * - ``MARLlib Learning Mode`` + - Cooperative + Mixed + * - ``Observability`` + - Partial + Full + * - ``Action Space`` + - Continuous + * - ``Observation Space Dim`` + - 1D + * - ``Action Mask`` + - No + * - ``Global State`` + - No + * - ``Global State Space Dim`` + - / + * - ``Reward`` + - Dense + * - ``Agent-Env Interact Mode`` + - Simultaneous + + +Installation +----------------- + +.. code-block:: shell + + conda install -c opendilab gobigger \ No newline at end of file diff --git a/docs/source/images/env_gobigger.gif b/docs/source/images/env_gobigger.gif new file mode 100644 index 00000000..918f74fa Binary files /dev/null and b/docs/source/images/env_gobigger.gif differ diff --git a/marllib/envs/base_env/__init__.py b/marllib/envs/base_env/__init__.py index c06d11be..c917fff3 100644 --- a/marllib/envs/base_env/__init__.py +++ b/marllib/envs/base_env/__init__.py @@ -88,3 +88,9 @@ except Exception as e: ENV_REGISTRY["mate"] = str(e) +try: + from marllib.envs.base_env.gobigger import RLlibGoBigger + ENV_REGISTRY["gobigger"] = RLlibGoBigger +except Exception as e: + ENV_REGISTRY["gobigger"] = str(e) + diff --git a/marllib/envs/base_env/config/gobigger.yaml b/marllib/envs/base_env/config/gobigger.yaml new file mode 100644 index 00000000..aee7b3c5 --- /dev/null +++ b/marllib/envs/base_env/config/gobigger.yaml @@ -0,0 +1,33 @@ +# MIT License + +# Copyright (c) 2023 Replicable-MARL + +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice 
and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. + +env: gobigger + +env_args: + map_name: "st_t1p2" # st(andard)_t(eam)1p(layer)2 + #num_teams: 1 + #num_agents: 2 + frame_limit: 1600 +mask_flag: False +global_state_flag: False +opp_action_in_cc: True +fixed_batch_timesteps: 3200 # optional, all scenarios will use this batch size, only valid for on-policy algorithms diff --git a/marllib/envs/base_env/gobigger.py b/marllib/envs/base_env/gobigger.py new file mode 100644 index 00000000..dbbb003a --- /dev/null +++ b/marllib/envs/base_env/gobigger.py @@ -0,0 +1,202 @@ +# MIT License + +# Copyright (c) 2023 Replicable-MARL + +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. + +import copy + +from gobigger.envs import create_env_custom +from gym.spaces import Dict as GymDict, Box +from ray.rllib.env.multi_agent_env import MultiAgentEnv +import numpy as np + + +policy_mapping_dict = { + "all_scenario": { + "description": "mixed scenarios with t >= 2 (num_teams > 1)", + "team_prefix": ("team0_", "team1_"), + "all_agents_one_policy": True, + "one_agent_one_policy": True, + }, +} + + +class RLlibGoBigger(MultiAgentEnv): + + def __init__(self, env_config): + + map_name = env_config["map_name"] + + env_config.pop("map_name", None) + self.num_agents_per_team = int(map_name.split("p")[-1][0]) + self.num_teams = int(map_name.split("_t")[1][0]) + if self.num_teams == 1: + policy_mapping_dict["all_scenario"]["team_prefix"] = ("team0_",) + self.num_agents = self.num_agents_per_team * self.num_teams + self.max_steps = env_config["frame_limit"] + self.env = create_env_custom(type='st', cfg=dict( + team_num=self.num_teams, + player_num_per_team=self.num_agents_per_team, + frame_limit=self.max_steps + )) + + self.action_space = Box(low=-1, + high=1, + shape=(2,), + dtype=float) + + self.rectangle_dim = 4 + self.food_dim = self.num_agents * 100 + self.thorns_dim = self.num_agents * 6 + self.clone_dim = self.num_agents * 10 + self.team_name_dim = 1 + self.score_dim = 1 + + self.obs_dim = self.rectangle_dim + self.food_dim + self.thorns_dim + \ + self.clone_dim + self.team_name_dim + self.score_dim + + self.observation_space = GymDict({"obs": Box( + low=-1e6, + high=1e6, + shape=(self.obs_dim,), + dtype=float)}) + + self.agents = [] + for team_index in range(self.num_teams): + for agent_index in range(self.num_agents_per_team): + self.agents.append("team{}_{}".format(team_index, agent_index))
+ + env_config["map_name"] = map_name + self.env_config = env_config + + def reset(self): + original_obs = self.env.reset() + obs = {} + for agent_index, agent_name in enumerate(self.agents): + + rectangle = list(original_obs[1][agent_index]["rectangle"]) + + overlap_dict = original_obs[1][agent_index]["overlap"] + + food = overlap_dict["food"] + if 4 * len(food) > self.food_dim: + food = food[:self.food_dim // 4] + else: + padding = [0] * (self.food_dim - 4 * len(food)) + food.append(padding) + food = [item for sublist in food for item in sublist] + + thorns = overlap_dict["thorns"] + if 6 * len(thorns) > self.thorns_dim: + thorns = thorns[:self.thorns_dim // 6] + else: + padding = [0] * (self.thorns_dim - 6 * len(thorns)) + thorns.append(padding) + thorns = [item for sublist in thorns for item in sublist] + + clone = overlap_dict["clone"] + if 10 * len(clone) > self.clone_dim: + clone = clone[:self.clone_dim // 10] + else: + padding = [0] * (self.clone_dim - 10 * len(clone)) + clone.append(padding) + clone = [item for sublist in clone for item in sublist] + + team = original_obs[1][agent_index]["team_name"] + score = original_obs[1][agent_index]["score"] + + all_elements = rectangle + food + thorns + clone + [team] + [score] + all_elements = np.array(all_elements, dtype=float) + + obs[agent_name] = { + "obs": all_elements + } + + return obs + + def step(self, action_dict): + actions = {} + for i, agent_name in enumerate(self.agents): + actions[i] = list(action_dict[agent_name]) + actions[i].append(-1) + + original_obs, team_rewards, done, info = self.env.step(actions) + + rewards = {} + obs = {} + infos = {} + + for agent_index, agent_name in enumerate(self.agents): + + rectangle = list(original_obs[1][agent_index]["rectangle"]) + + overlap_dict = original_obs[1][agent_index]["overlap"] + + food = overlap_dict["food"] + if 4 * len(food) > self.food_dim: + food = food[:self.food_dim // 4] + else: + padding = [0] * (self.food_dim - 4 * len(food)) + 
food.append(padding) + food = [item for sublist in food for item in sublist] + + thorns = overlap_dict["thorns"] + if 6 * len(thorns) > self.thorns_dim: + thorns = thorns[:self.thorns_dim // 6] + else: + padding = [0] * (self.thorns_dim - 6 * len(thorns)) + thorns.append(padding) + thorns = [item for sublist in thorns for item in sublist] + + clone = overlap_dict["clone"] + if 10 * len(clone) > self.clone_dim: + clone = clone[:self.clone_dim // 10] + else: + padding = [0] * (self.clone_dim - 10 * len(clone)) + clone.append(padding) + clone = [item for sublist in clone for item in sublist] + + team = original_obs[1][agent_index]["team_name"] + score = original_obs[1][agent_index]["score"] + + all_elements = rectangle + food + thorns + clone + [team] + [score] + all_elements = np.array(all_elements, dtype=float) + + obs[agent_name] = { + "obs": all_elements + } + + rewards[agent_name] = team_rewards[team] + + dones = {"__all__": done} + return obs, rewards, dones, infos + + def get_env_info(self): + env_info = { + "space_obs": self.observation_space, + "space_act": self.action_space, + "num_agents": self.num_agents, + "episode_limit": self.max_steps, + "policy_mapping_info": policy_mapping_dict + } + return env_info + + def close(self): + self.env.close() diff --git a/marllib/envs/global_reward_env/__init__.py b/marllib/envs/global_reward_env/__init__.py index d5088910..8ab46a8d 100644 --- a/marllib/envs/global_reward_env/__init__.py +++ b/marllib/envs/global_reward_env/__init__.py @@ -24,56 +24,70 @@ try: from marllib.envs.global_reward_env.mpe_fcoop import RLlibMPE_FCOOP + COOP_ENV_REGISTRY["mpe"] = RLlibMPE_FCOOP except Exception as e: COOP_ENV_REGISTRY["mpe"] = str(e) try: from marllib.envs.global_reward_env.magent_fcoop import RLlibMAgent_FCOOP + COOP_ENV_REGISTRY["magent"] = RLlibMAgent_FCOOP except Exception as e: COOP_ENV_REGISTRY["magent"] = str(e) try: from marllib.envs.global_reward_env.mamujoco_fcoop import RLlibMAMujoco_FCOOP + 
COOP_ENV_REGISTRY["mamujoco"] = RLlibMAMujoco_FCOOP except Exception as e: COOP_ENV_REGISTRY["mamujoco"] = str(e) try: from marllib.envs.global_reward_env.smac_fcoop import RLlibSMAC_FCOOP + COOP_ENV_REGISTRY["smac"] = RLlibSMAC_FCOOP except Exception as e: COOP_ENV_REGISTRY["smac"] = str(e) try: from marllib.envs.global_reward_env.football_fcoop import RLlibGFootball_FCOOP + COOP_ENV_REGISTRY["football"] = RLlibGFootball_FCOOP except Exception as e: COOP_ENV_REGISTRY["football"] = str(e) try: from marllib.envs.global_reward_env.rware_fcoop import RLlibRWARE_FCOOP + COOP_ENV_REGISTRY["rware"] = RLlibRWARE_FCOOP except Exception as e: COOP_ENV_REGISTRY["rware"] = str(e) try: from marllib.envs.global_reward_env.lbf_fcoop import RLlibLBF_FCOOP + COOP_ENV_REGISTRY["lbf"] = RLlibLBF_FCOOP except Exception as e: COOP_ENV_REGISTRY["lbf"] = str(e) try: from marllib.envs.global_reward_env.pommerman_fcoop import RLlibPommerman_FCOOP + COOP_ENV_REGISTRY["pommerman"] = RLlibPommerman_FCOOP except Exception as e: COOP_ENV_REGISTRY["pommerman"] = str(e) - try: from marllib.envs.global_reward_env.mate_fcoop import RLlibMATE_FCOOP + COOP_ENV_REGISTRY["mate"] = RLlibMATE_FCOOP except Exception as e: COOP_ENV_REGISTRY["mate"] = str(e) +try: + from marllib.envs.global_reward_env.gobigger_fcoop import RLlibGoBigger_FCOOP + + COOP_ENV_REGISTRY["gobigger"] = RLlibGoBigger_FCOOP +except Exception as e: + COOP_ENV_REGISTRY["gobigger"] = str(e) diff --git a/marllib/envs/global_reward_env/gobigger_fcoop.py b/marllib/envs/global_reward_env/gobigger_fcoop.py new file mode 100644 index 00000000..455314c8 --- /dev/null +++ b/marllib/envs/global_reward_env/gobigger_fcoop.py @@ -0,0 +1,207 @@ +# MIT License + +# Copyright (c) 2023 Replicable-MARL + +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, 
copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. + +import copy + +from gobigger.envs import create_env_custom +from gym.spaces import Dict as GymDict, Box +from ray.rllib.env.multi_agent_env import MultiAgentEnv +import numpy as np + +policy_mapping_dict = { + "all_scenario": { + "description": "cooperative scenarios with t = 1 (num_teams = 1)", + "team_prefix": ("team0_",), + "all_agents_one_policy": True, + "one_agent_one_policy": True, + }, +} + + +class RLlibGoBigger_FCOOP(MultiAgentEnv): + + def __init__(self, env_config): + + map_name = env_config["map_name"] + + env_config.pop("map_name", None) + self.num_agents_per_team = int(map_name.split("p")[-1][0]) + self.num_teams = 1 + self.num_agents = self.num_agents_per_team * self.num_teams + self.max_steps = env_config["frame_limit"] + self.env = create_env_custom(type='st', cfg=dict( + team_num=self.num_teams, + player_num_per_team=self.num_agents_per_team, + frame_limit=self.max_steps + )) + + self.action_space = Box(low=-1, + high=1, + shape=(2,), + dtype=float) + + self.rectangle_dim = 4 + self.food_dim = self.num_agents * 100 + self.thorns_dim = self.num_agents * 6 + self.clone_dim = self.num_agents * 10 + self.team_name_dim = 1 + self.score_dim = 1 + + 
self.obs_dim = self.rectangle_dim + self.food_dim + self.thorns_dim + \ + self.clone_dim + self.team_name_dim + self.score_dim + + self.observation_space = GymDict({"obs": Box( + low=-1e6, + high=1e6, + shape=(self.obs_dim,), + dtype=float)}) + + self.agents = [] + for team_index in range(self.num_teams): + for agent_index in range(self.num_agents_per_team): + self.agents.append("team{}_{}".format(team_index, agent_index)) + + env_config["map_name"] = map_name + self.env_config = env_config + + def reset(self): + original_obs = self.env.reset() + obs = {} + for agent_index, agent_name in enumerate(self.agents): + + rectangle = list(original_obs[1][agent_index]["rectangle"]) + + overlap_dict = original_obs[1][agent_index]["overlap"] + + food = overlap_dict["food"] + if 4 * len(food) > self.food_dim: + food = food[:self.food_dim // 4] + else: + padding = [0] * (self.food_dim - 4 * len(food)) + food.append(padding) + food = [item for sublist in food for item in sublist] + + thorns = overlap_dict["thorns"] + if 6 * len(thorns) > self.thorns_dim: + thorns = thorns[:self.thorns_dim // 6] + else: + padding = [0] * (self.thorns_dim - 6 * len(thorns)) + thorns.append(padding) + thorns = [item for sublist in thorns for item in sublist] + + clone = overlap_dict["clone"] + if 10 * len(clone) > self.clone_dim: + clone = clone[:self.clone_dim // 10] + else: + padding = [0] * (self.clone_dim - 10 * len(clone)) + clone.append(padding) + clone = [item for sublist in clone for item in sublist] + + team = original_obs[1][agent_index]["team_name"] + score = original_obs[1][agent_index]["score"] + + all_elements = rectangle + food + thorns + clone + [team] + [score] + + assert len(all_elements) == self.obs_dim, \ + "unexpected observation length" + + all_elements = np.array(all_elements, dtype=float) + + obs[agent_name] = { + "obs": all_elements + } + + return obs + + def step(self, action_dict): + actions = {} + for i, agent_name in enumerate(self.agents): + actions[i] = list(action_dict[agent_name]) + 
actions[i].append(-1) + + original_obs, team_rewards, done, info = self.env.step(actions) + + rewards = {} + obs = {} + infos = {} + + for agent_index, agent_name in enumerate(self.agents): + + rectangle = list(original_obs[1][agent_index]["rectangle"]) + + overlap_dict = original_obs[1][agent_index]["overlap"] + + food = overlap_dict["food"] + if 4 * len(food) > self.food_dim: + food = food[:self.food_dim // 4] + else: + padding = [0] * (self.food_dim - 4 * len(food)) + food.append(padding) + food = [item for sublist in food for item in sublist] + + thorns = overlap_dict["thorns"] + if 6 * len(thorns) > self.thorns_dim: + thorns = thorns[:self.thorns_dim // 6] + else: + padding = [0] * (self.thorns_dim - 6 * len(thorns)) + thorns.append(padding) + thorns = [item for sublist in thorns for item in sublist] + + clone = overlap_dict["clone"] + if 10 * len(clone) > self.clone_dim: + clone = clone[:self.clone_dim // 10] + else: + padding = [0] * (self.clone_dim - 10 * len(clone)) + clone.append(padding) + clone = [item for sublist in clone for item in sublist] + + team = original_obs[1][agent_index]["team_name"] + score = original_obs[1][agent_index]["score"] + + all_elements = rectangle + food + thorns + clone + [team] + [score] + + assert len(all_elements) == self.obs_dim, \ + "unexpected observation length" + + all_elements = np.array(all_elements, dtype=float) + + obs[agent_name] = { + "obs": all_elements + } + + rewards[agent_name] = team_rewards[team] + + dones = {"__all__": done} + return obs, rewards, dones, infos + + def get_env_info(self): + env_info = { + "space_obs": self.observation_space, + "space_act": self.action_space, + "num_agents": self.num_agents, + "episode_limit": self.max_steps, + "policy_mapping_info": policy_mapping_dict + } + return env_info + + def close(self): + self.env.close() diff --git a/marllib/marl/algos/README.md b/marllib/marl/algos/README.md deleted file mode 100644 index dcd00f6c..00000000 --- a/marllib/marl/algos/README.md +++ /dev/null @@ -1,37 +0,0 @@ -10
environments are available for Independent Learning - -- Football -- MPE -- SMAC -- mamujoco -- RWARE -- LBF -- Pommerman -- Magent -- MetaDrive -- Hanabi - - -7 environments are available for Value Decomposition - -- Football -- MPE -- SMAC -- mamujoco -- RWARE -- LBF -- Pommerman - -9 environments are available for Centralized Critic - -- Football -- MPE -- SMAC -- mamujoco -- RWARE -- LBF -- Pommerman -- Magent -- Hanabi - - diff --git a/marllib/marl/algos/hyperparams/common/coma.yaml b/marllib/marl/algos/hyperparams/common/coma.yaml index b7a455b0..44274589 100644 --- a/marllib/marl/algos/hyperparams/common/coma.yaml +++ b/marllib/marl/algos/hyperparams/common/coma.yaml @@ -29,6 +29,6 @@ algo_args: lambda: 1.0 vf_loss_coeff: 1.0 batch_episode: 10 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" lr: 0.0005 entropy_coeff: 0.01 diff --git a/marllib/marl/algos/hyperparams/common/facmac.yaml b/marllib/marl/algos/hyperparams/common/facmac.yaml index ad22a4d4..8ffbe6d9 100644 --- a/marllib/marl/algos/hyperparams/common/facmac.yaml +++ b/marllib/marl/algos/hyperparams/common/facmac.yaml @@ -36,6 +36,6 @@ algo_args: buffer_size_episode: 1000 target_network_update_freq_episode: 1 tau: 0.002 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" mixer: "qmix" # qmix or vdn diff --git a/marllib/marl/algos/hyperparams/common/happo.yaml b/marllib/marl/algos/hyperparams/common/happo.yaml index 42b71349..1388564b 100644 --- a/marllib/marl/algos/hyperparams/common/happo.yaml +++ b/marllib/marl/algos/hyperparams/common/happo.yaml @@ -38,4 +38,4 @@ algo_args: entropy_coeff: 0.01 vf_clip_param: 10.0 min_lr_schedule: 1e-11 - batch_mode: "complete_episodes" \ No newline at end of file + batch_mode: "truncate_episodes" \ No newline at end of file diff --git a/marllib/marl/algos/hyperparams/common/hatrpo.yaml b/marllib/marl/algos/hyperparams/common/hatrpo.yaml index 85fc6f5f..ddf4e9c5 100644 --- a/marllib/marl/algos/hyperparams/common/hatrpo.yaml +++ 
b/marllib/marl/algos/hyperparams/common/hatrpo.yaml @@ -34,7 +34,7 @@ algo_args: vf_loss_coeff: 1.0 entropy_coeff: 0.01 vf_clip_param: 10.0 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" kl_threshold: 0.00001 accept_ratio: 0.5 critic_lr: 0.00005 diff --git a/marllib/marl/algos/hyperparams/common/ia2c.yaml b/marllib/marl/algos/hyperparams/common/ia2c.yaml index 2b2c4fa6..76af2158 100644 --- a/marllib/marl/algos/hyperparams/common/ia2c.yaml +++ b/marllib/marl/algos/hyperparams/common/ia2c.yaml @@ -29,6 +29,6 @@ algo_args: lambda: 1.0 vf_loss_coeff: 1.0 batch_episode: 10 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" lr: 0.0005 entropy_coeff: 0.01 diff --git a/marllib/marl/algos/hyperparams/common/iddpg.yaml b/marllib/marl/algos/hyperparams/common/iddpg.yaml index cfbe62aa..c4971a4a 100644 --- a/marllib/marl/algos/hyperparams/common/iddpg.yaml +++ b/marllib/marl/algos/hyperparams/common/iddpg.yaml @@ -36,5 +36,5 @@ algo_args: buffer_size_episode: 1000 target_network_update_freq_episode: 1 tau: 0.002 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" diff --git a/marllib/marl/algos/hyperparams/common/ippo.yaml b/marllib/marl/algos/hyperparams/common/ippo.yaml index dad13578..8df638d1 100644 --- a/marllib/marl/algos/hyperparams/common/ippo.yaml +++ b/marllib/marl/algos/hyperparams/common/ippo.yaml @@ -35,5 +35,5 @@ algo_args: entropy_coeff: 0.01 clip_param: 0.3 vf_clip_param: 10.0 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" diff --git a/marllib/marl/algos/hyperparams/common/itrpo.yaml b/marllib/marl/algos/hyperparams/common/itrpo.yaml index 1b0ad894..66d1e072 100644 --- a/marllib/marl/algos/hyperparams/common/itrpo.yaml +++ b/marllib/marl/algos/hyperparams/common/itrpo.yaml @@ -34,7 +34,7 @@ algo_args: vf_loss_coeff: 1.0 entropy_coeff: 0.01 vf_clip_param: 10.0 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" kl_threshold: 0.00001 accept_ratio: 0.5 critic_lr: 0.00005 
diff --git a/marllib/marl/algos/hyperparams/common/maa2c.yaml b/marllib/marl/algos/hyperparams/common/maa2c.yaml index 449462d6..df3b0abb 100644 --- a/marllib/marl/algos/hyperparams/common/maa2c.yaml +++ b/marllib/marl/algos/hyperparams/common/maa2c.yaml @@ -29,6 +29,6 @@ algo_args: lambda: 1.0 vf_loss_coeff: 1.0 batch_episode: 10 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" lr: 0.0005 entropy_coeff: 0.01 diff --git a/marllib/marl/algos/hyperparams/common/maddpg.yaml b/marllib/marl/algos/hyperparams/common/maddpg.yaml index 20d42498..5c957a8d 100644 --- a/marllib/marl/algos/hyperparams/common/maddpg.yaml +++ b/marllib/marl/algos/hyperparams/common/maddpg.yaml @@ -36,5 +36,5 @@ algo_args: buffer_size_episode: 1000 target_network_update_freq_episode: 1 tau: 0.002 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" diff --git a/marllib/marl/algos/hyperparams/common/mappo.yaml b/marllib/marl/algos/hyperparams/common/mappo.yaml index c03dcb26..efcbb7f2 100644 --- a/marllib/marl/algos/hyperparams/common/mappo.yaml +++ b/marllib/marl/algos/hyperparams/common/mappo.yaml @@ -35,6 +35,6 @@ algo_args: entropy_coeff: 0.01 clip_param: 0.3 vf_clip_param: 10.0 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" diff --git a/marllib/marl/algos/hyperparams/common/matrpo.yaml b/marllib/marl/algos/hyperparams/common/matrpo.yaml index 4d44a416..76a86a8d 100644 --- a/marllib/marl/algos/hyperparams/common/matrpo.yaml +++ b/marllib/marl/algos/hyperparams/common/matrpo.yaml @@ -34,7 +34,7 @@ algo_args: vf_loss_coeff: 1.0 entropy_coeff: 0.01 vf_clip_param: 10.0 - batch_mode: "complete_episodes" + batch_mode: "truncate_episodes" kl_threshold: 0.00001 accept_ratio: 0.5 critic_lr: 0.00005 diff --git a/marllib/marl/algos/hyperparams/common/vda2c.yaml b/marllib/marl/algos/hyperparams/common/vda2c.yaml index f2d0e24d..95c03bb6 100644 --- a/marllib/marl/algos/hyperparams/common/vda2c.yaml +++ b/marllib/marl/algos/hyperparams/common/vda2c.yaml 
@@ -29,7 +29,7 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 10
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
   mixer: "qmix" # vdn
diff --git a/marllib/marl/algos/hyperparams/common/vdppo.yaml b/marllib/marl/algos/hyperparams/common/vdppo.yaml
index 04e90420..3adf7000 100644
--- a/marllib/marl/algos/hyperparams/common/vdppo.yaml
+++ b/marllib/marl/algos/hyperparams/common/vdppo.yaml
@@ -35,5 +35,5 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   mixer: "qmix" # qmix or vdn
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/facmac.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/facmac.yaml
index 62b186af..1d7f09ce 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/facmac.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/facmac.yaml
@@ -36,6 +36,6 @@ algo_args:
   buffer_size_episode: 1000
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   mixer: "qmix" # qmix or vdn
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/happo.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/happo.yaml
index 1b2707dd..1451a6f9 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/happo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/happo.yaml
@@ -36,6 +36,6 @@ algo_args:
   lr: 0.0001
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   min_lr_schedule: 1e-11
   gain: 0.01
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/hatrpo.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/hatrpo.yaml
index fd289dc4..45436c7c 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/hatrpo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/hatrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.0005
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/ia2c.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/ia2c.yaml
index 2b2c4fa6..76af2158 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/ia2c.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/ia2c.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 10
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/iddpg.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/iddpg.yaml
index babce71a..e6e84b55 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/iddpg.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/iddpg.yaml
@@ -36,5 +36,5 @@ algo_args:
   buffer_size_episode: 1000
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/ippo.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/ippo.yaml
index 6dde1d9d..c25d964c 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/ippo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/ippo.yaml
@@ -35,5 +35,5 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/itrpo.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/itrpo.yaml
index 2578927b..e38d6d2c 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/itrpo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/itrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.0005
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/maa2c.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/maa2c.yaml
index 449462d6..df3b0abb 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/maa2c.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/maa2c.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 10
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/maddpg.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/maddpg.yaml
index 476e6a96..9dc602d9 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/maddpg.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/maddpg.yaml
@@ -36,5 +36,5 @@ algo_args:
   buffer_size_episode: 1000
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/mappo.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/mappo.yaml
index eef75ff0..802aed8f 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/mappo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/mappo.yaml
@@ -35,6 +35,6 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/matrpo.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/matrpo.yaml
index 5942888f..9770b9a2 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/matrpo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/matrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.0005
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/vda2c.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/vda2c.yaml
index f2d0e24d..95c03bb6 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/vda2c.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/vda2c.yaml
@@ -29,7 +29,7 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 10
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
   mixer: "qmix" # vdn
diff --git a/marllib/marl/algos/hyperparams/finetuned/mamujoco/vdppo.yaml b/marllib/marl/algos/hyperparams/finetuned/mamujoco/vdppo.yaml
index fe3f1bd4..d1b53e56 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mamujoco/vdppo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mamujoco/vdppo.yaml
@@ -35,5 +35,5 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   mixer: "qmix" # qmix or vdn
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/coma.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/coma.yaml
index 26ad593f..3b54ae3b 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/coma.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/coma.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 128
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/facmac.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/facmac.yaml
index 2c8d62b7..f42ce4ec 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/facmac.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/facmac.yaml
@@ -36,6 +36,6 @@ algo_args:
   buffer_size_episode: 1000
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   mixer: "qmix" # qmix or vdn
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/happo.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/happo.yaml
index 4ab06ad1..afef9151 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/happo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/happo.yaml
@@ -38,4 +38,4 @@ algo_args:
   entropy_coeff: 0.01
   vf_clip_param: 10.0
   min_lr_schedule: 1e-11
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/hatrpo.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/hatrpo.yaml
index 588d1ed3..a0d81929 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/hatrpo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/hatrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.0005
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/ia2c.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/ia2c.yaml
index 2b2c4fa6..76af2158 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/ia2c.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/ia2c.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 10
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/iddpg.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/iddpg.yaml
index 94ba33ef..621beb54 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/iddpg.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/iddpg.yaml
@@ -36,5 +36,5 @@ algo_args:
   buffer_size_episode: 1000
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/ippo.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/ippo.yaml
index aa8d522d..8c6c08b4 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/ippo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/ippo.yaml
@@ -35,5 +35,5 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 20.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/itrpo.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/itrpo.yaml
index 3e8cc247..f41374a8 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/itrpo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/itrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.0005
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/maa2c.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/maa2c.yaml
index a5201c1f..74dccc18 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/maa2c.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/maa2c.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 128
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/maddpg.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/maddpg.yaml
index 61ec7e6c..2faf2b4e 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/maddpg.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/maddpg.yaml
@@ -36,5 +36,5 @@ algo_args:
   buffer_size_episode: 10000
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/mappo.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/mappo.yaml
index e5f13fc5..823705a1 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/mappo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/mappo.yaml
@@ -35,6 +35,6 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 20.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/matrpo.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/matrpo.yaml
index 3a3da10f..6ded245c 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/matrpo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/matrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.0005
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/vda2c.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/vda2c.yaml
index e11990b1..7053131f 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/vda2c.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/vda2c.yaml
@@ -29,7 +29,7 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 128
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
   mixer: "qmix" # vdn
diff --git a/marllib/marl/algos/hyperparams/finetuned/mpe/vdppo.yaml b/marllib/marl/algos/hyperparams/finetuned/mpe/vdppo.yaml
index dc45d4cb..5df3d881 100644
--- a/marllib/marl/algos/hyperparams/finetuned/mpe/vdppo.yaml
+++ b/marllib/marl/algos/hyperparams/finetuned/mpe/vdppo.yaml
@@ -35,5 +35,5 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 20.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   mixer: "qmix" # qmix or vdn
diff --git a/marllib/marl/algos/hyperparams/test/coma.yaml b/marllib/marl/algos/hyperparams/test/coma.yaml
index f320a3f1..e3019d38 100644
--- a/marllib/marl/algos/hyperparams/test/coma.yaml
+++ b/marllib/marl/algos/hyperparams/test/coma.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 2
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/test/facmac.yaml b/marllib/marl/algos/hyperparams/test/facmac.yaml
index 40c0d4df..e41bbc68 100644
--- a/marllib/marl/algos/hyperparams/test/facmac.yaml
+++ b/marllib/marl/algos/hyperparams/test/facmac.yaml
@@ -36,6 +36,6 @@ algo_args:
   buffer_size_episode: 10
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   mixer: "qmix" # qmix or vdn
diff --git a/marllib/marl/algos/hyperparams/test/happo.yaml b/marllib/marl/algos/hyperparams/test/happo.yaml
index 85ed5d79..dfbbc47d 100644
--- a/marllib/marl/algos/hyperparams/test/happo.yaml
+++ b/marllib/marl/algos/hyperparams/test/happo.yaml
@@ -38,4 +38,4 @@ algo_args:
   entropy_coeff: 0.01
   vf_clip_param: 10.0
   min_lr_schedule: 1e-11
-  batch_mode: "complete_episodes"
\ No newline at end of file
+  batch_mode: "truncate_episodes"
\ No newline at end of file
diff --git a/marllib/marl/algos/hyperparams/test/hatrpo.yaml b/marllib/marl/algos/hyperparams/test/hatrpo.yaml
index 3b74bca1..33af497c 100644
--- a/marllib/marl/algos/hyperparams/test/hatrpo.yaml
+++ b/marllib/marl/algos/hyperparams/test/hatrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.00005
diff --git a/marllib/marl/algos/hyperparams/test/ia2c.yaml b/marllib/marl/algos/hyperparams/test/ia2c.yaml
index faed5009..5d830e6a 100644
--- a/marllib/marl/algos/hyperparams/test/ia2c.yaml
+++ b/marllib/marl/algos/hyperparams/test/ia2c.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 2
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/test/iddpg.yaml b/marllib/marl/algos/hyperparams/test/iddpg.yaml
index d52f814d..a1f237f3 100644
--- a/marllib/marl/algos/hyperparams/test/iddpg.yaml
+++ b/marllib/marl/algos/hyperparams/test/iddpg.yaml
@@ -36,5 +36,5 @@ algo_args:
   buffer_size_episode: 10
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/test/ippo.yaml b/marllib/marl/algos/hyperparams/test/ippo.yaml
index c40456a9..e13de22e 100644
--- a/marllib/marl/algos/hyperparams/test/ippo.yaml
+++ b/marllib/marl/algos/hyperparams/test/ippo.yaml
@@ -35,5 +35,5 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/test/itrpo.yaml b/marllib/marl/algos/hyperparams/test/itrpo.yaml
index ed85d536..ce0093d6 100644
--- a/marllib/marl/algos/hyperparams/test/itrpo.yaml
+++ b/marllib/marl/algos/hyperparams/test/itrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.00005
diff --git a/marllib/marl/algos/hyperparams/test/maa2c.yaml b/marllib/marl/algos/hyperparams/test/maa2c.yaml
index 1a199a75..cca3b1e3 100644
--- a/marllib/marl/algos/hyperparams/test/maa2c.yaml
+++ b/marllib/marl/algos/hyperparams/test/maa2c.yaml
@@ -29,6 +29,6 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 2
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
diff --git a/marllib/marl/algos/hyperparams/test/maddpg.yaml b/marllib/marl/algos/hyperparams/test/maddpg.yaml
index efe7c914..a4f3197c 100644
--- a/marllib/marl/algos/hyperparams/test/maddpg.yaml
+++ b/marllib/marl/algos/hyperparams/test/maddpg.yaml
@@ -36,5 +36,5 @@ algo_args:
   buffer_size_episode: 10
   target_network_update_freq_episode: 1
   tau: 0.002
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/test/mappo.yaml b/marllib/marl/algos/hyperparams/test/mappo.yaml
index f13392e2..c96c5f9a 100644
--- a/marllib/marl/algos/hyperparams/test/mappo.yaml
+++ b/marllib/marl/algos/hyperparams/test/mappo.yaml
@@ -35,6 +35,6 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
diff --git a/marllib/marl/algos/hyperparams/test/matrpo.yaml b/marllib/marl/algos/hyperparams/test/matrpo.yaml
index 915e843d..29972443 100644
--- a/marllib/marl/algos/hyperparams/test/matrpo.yaml
+++ b/marllib/marl/algos/hyperparams/test/matrpo.yaml
@@ -34,7 +34,7 @@ algo_args:
   vf_loss_coeff: 1.0
   entropy_coeff: 0.01
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   kl_threshold: 0.00001
   accept_ratio: 0.5
   critic_lr: 0.00005
diff --git a/marllib/marl/algos/hyperparams/test/vda2c.yaml b/marllib/marl/algos/hyperparams/test/vda2c.yaml
index 3f0bd5c4..c3889033 100644
--- a/marllib/marl/algos/hyperparams/test/vda2c.yaml
+++ b/marllib/marl/algos/hyperparams/test/vda2c.yaml
@@ -29,7 +29,7 @@ algo_args:
   lambda: 1.0
   vf_loss_coeff: 1.0
   batch_episode: 2
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   lr: 0.0005
   entropy_coeff: 0.01
   mixer: "qmix" # vdn
diff --git a/marllib/marl/algos/hyperparams/test/vdppo.yaml b/marllib/marl/algos/hyperparams/test/vdppo.yaml
index b2b4c59e..0e792bec 100644
--- a/marllib/marl/algos/hyperparams/test/vdppo.yaml
+++ b/marllib/marl/algos/hyperparams/test/vdppo.yaml
@@ -35,5 +35,5 @@ algo_args:
   entropy_coeff: 0.01
   clip_param: 0.3
   vf_clip_param: 10.0
-  batch_mode: "complete_episodes"
+  batch_mode: "truncate_episodes"
   mixer: "qmix" # qmix or vdn
diff --git a/marllib/marl/ray/ray.yaml b/marllib/marl/ray/ray.yaml
index 9ba4d281..b58285be 100644
--- a/marllib/marl/ray/ray.yaml
+++ b/marllib/marl/ray/ray.yaml
@@ -24,7 +24,7 @@
 local_mode: False # True for debug mode only
 share_policy: "group" # individual(separate) / group(division) / all(share)
-evaluation_interval: 10 # evaluate model every 10 training iterations
+evaluation_interval: 50 # evaluate model every 50 training iterations
 framework: "torch"
 num_workers: 1 # thread number
 num_gpus: 1 # gpu to use
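The recurring change above flips RLlib's `batch_mode` from `"complete_episodes"` to `"truncate_episodes"`. In RLlib, `"complete_episodes"` makes each rollout worker keep stepping until every started episode terminates, so sample batches can grow far beyond the configured fragment length; `"truncate_episodes"` cuts batches at a fixed step count and resumes episodes on the next sample call. The following sketch illustrates the difference; it is not RLlib source code, and the episode lengths and fragment size are hypothetical example values:

```python
# Illustrative sketch (not RLlib internals) of the two batch modes.

def collect_batch(episode_lengths, rollout_fragment_length, batch_mode):
    """Return how many env steps a worker gathers in one sample call."""
    if batch_mode == "truncate_episodes":
        # Episodes may be cut mid-way: the worker returns exactly
        # `rollout_fragment_length` steps and continues the episode later.
        return rollout_fragment_length
    if batch_mode == "complete_episodes":
        # The worker only stops at episode boundaries, so the batch is at
        # least `rollout_fragment_length` steps and can be much larger.
        steps = 0
        for length in episode_lengths:
            steps += length
            if steps >= rollout_fragment_length:
                break
        return steps
    raise ValueError(f"unknown batch_mode: {batch_mode}")

episodes = [200, 200, 200]  # hypothetical episode lengths
print(collect_batch(episodes, 128, "truncate_episodes"))   # 128
print(collect_batch(episodes, 128, "complete_episodes"))   # 200
```

With long-horizon environments such as MAMuJoCo, `"complete_episodes"` inflates batch sizes and memory use, which is one plausible motivation for switching the defaults to `"truncate_episodes"`.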