Policy-based methods learn the policy function $\pi$ directly and use it to select actions.
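As a rough illustration of what "learning $\pi$ directly" means, the sketch below parameterizes a policy as a softmax over linear action preferences and samples an action from it. The names `theta`, `softmax_policy`, and `select_action`, as well as the linear-softmax form itself, are assumptions made for this example; the text does not prescribe a particular parameterization.

```python
import math
import random


def softmax_policy(theta, state):
    """Return pi(a | state) for every action.

    theta is a list of per-action weight vectors (hypothetical parameters).
    """
    # Linear preference for each action: h(s, a) = theta[a] . state
    prefs = [sum(w * x for w, x in zip(weights, state)) for weights in theta]
    # Subtract the largest preference before exponentiating for numerical stability
    largest = max(prefs)
    exps = [math.exp(p - largest) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(theta, state, rng=random):
    """Sample an action index from the distribution given by the policy."""
    probs = softmax_policy(theta, state)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Two actions, two state features (illustrative values only)
theta = [[0.5, -0.2], [0.1, 0.3]]
state = [1.0, 2.0]

probs = softmax_policy(theta, state)   # action probabilities under pi
action = select_action(theta, state)   # action drawn from pi(. | state)
```

Here the policy itself is the object being parameterized: training would adjust `theta` so that actions leading to higher return become more probable, and action selection is just sampling from `probs`.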