Afterstate
11 Nov 2025
Temporal Difference (TD) Control Algorithms Comparison: SARSA, Expected SARSA, and Q-learning
Comparative analysis of the major one-step Temporal Difference (TD) control algorithms SARSA, Expected SARSA, and Q-learning, focusing on whether each is on-policy or off-policy and how each constructs its update target.
15 Sep 2025
Reinforcement Learning for Outfit Compatibility
Modeling the outfit compatibility problem as a Markov Decision Process (MDP), defining the state space, action space, and afterstate formulation for sequential item selection.
1 Sep 2024
Formalization of the afterstate concept in Reinforcement Learning, including value functions and Dynamic Programming / Temporal Difference algorithms.