Course Project: Reinforcement Learning
- Multi-agent, collision-free path-planning algorithm for an unknown, deterministic environment.
- Built a long short-term memory (LSTM) network with a self-attention mechanism to encode the other agents’ states.
- Fed the agent’s own state and the other agents’ states into two multilayer perceptrons (MLPs) that estimate the optimal value function and the optimal policy.
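
The architecture in the bullets above could look roughly like the following NumPy sketch of a single forward pass. It is an illustration under assumptions, not the project's code: the state dimensions, the discrete action set, the mean pooling after attention, the untrained random weights, and all names (`Encoder`, `MLP`, `forward`) are hypothetical, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def init(shape):
    # Small random weights; a real model would learn these.
    return rng.normal(0.0, 0.1, size=shape)

class Encoder:
    """LSTM over the other agents' states, then self-attention pooling."""

    def __init__(self, in_dim, hid):
        self.hid = hid
        self.W = init((4 * hid, in_dim + hid))  # stacked gate weights
        self.b = np.zeros(4 * hid)
        self.Wq, self.Wk, self.Wv = (init((hid, hid)) for _ in range(3))

    def __call__(self, others):
        # others: array of shape (n_agents, in_dim), one row per other agent.
        h, c = np.zeros(self.hid), np.zeros(self.hid)
        hidden = []
        for x in others:
            gates = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(gates, 4)  # input, forget, cell, output
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
            hidden.append(h)
        H = np.stack(hidden)  # (n_agents, hid)
        # Scaled dot-product self-attention across the LSTM hidden states.
        q, k, v = H @ self.Wq, H @ self.Wk, H @ self.Wv
        attn = softmax(q @ k.T / np.sqrt(self.hid))  # (n_agents, n_agents)
        # Assumption: mean-pool the attended states into one fixed-size vector.
        return (attn @ v).mean(axis=0)

class MLP:
    """Fully connected network with ReLU on every layer except the last."""

    def __init__(self, sizes):
        self.layers = [(init((n_out, n_in)), np.zeros(n_out))
                       for n_in, n_out in zip(sizes[:-1], sizes[1:])]

    def __call__(self, x):
        for idx, (W, b) in enumerate(self.layers):
            x = W @ x + b
            if idx < len(self.layers) - 1:
                x = np.maximum(x, 0.0)
        return x

def forward(own, others, enc, value_mlp, policy_mlp):
    # Join the agent's own state with the encoding of the other agents.
    joint = np.concatenate([own, enc(others)])
    value = value_mlp(joint)[0]          # scalar state-value estimate
    probs = softmax(policy_mlp(joint))   # distribution over discrete actions
    return value, probs

# Hypothetical sizes chosen only for the demonstration.
OWN_DIM, OTHER_DIM, HID, N_ACTIONS = 4, 4, 16, 5
enc = Encoder(OTHER_DIM, HID)
value_mlp = MLP([OWN_DIM + HID, 32, 1])
policy_mlp = MLP([OWN_DIM + HID, 32, N_ACTIONS])

own = rng.normal(size=OWN_DIM)
others = rng.normal(size=(3, OTHER_DIM))  # three other agents
value, probs = forward(own, others, enc, value_mlp, policy_mlp)
print("value estimate:", value)
print("action probabilities:", probs)
```

Because the encoder reduces any number of other agents to one fixed-size vector, the same two MLP heads can be reused when the number of nearby agents changes.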