• howrar
    link
    fedilink
    arrow-up
    1
    ·
    3 months ago

    Counterexample: There exists an optimal deterministic policy for any MDP.