- Implemented `policy_utils.py` with helper functions for action selection, including epsilon-greedy support.

- Updated `requirements.txt` to relax PyTorch version constraint for better GPU compatibility.
- Added detailed GPU setup instructions, new device fallback options, and command examples to `README.md`.
- Added a new script `plot_model_max_x_trend.py` that visualizes training trends and generates HTML/Markdown reports.
2026-02-13 16:11:38 +08:00
parent 71008dfb72
commit 2960ac1df5
11 changed files with 1294 additions and 32 deletions

1
.gitignore vendored

@@ -20,6 +20,7 @@ env/
# Logs / test / tooling
*.log
.cache/
.pytest_cache/
.mypy_cache/
.ruff_cache/