Discover the model-free algorithm that learns the quality of actions, guiding agents on what to do in different circumstances.