ANONYMOUS wrote:
> With policy iteration, my understanding is that:
> - We compute the utilities of a policy
> - Then we compute a new policy according to the utilities we just calculated
> - We repeat this until the policy converges
>
> What I don't understand is why the policy improves. If we start with policy P and determine the set of utilities U for each state, then, using U, wouldn't we just get back the same policy P?
Hi Anon,
With policy iteration,
- We use value determination to compute the utility of each state, assuming the agent follows the current policy.
- We then use action determination to pick, in each state, the action with the highest expected utility given those utilities; this gives us a new policy.
- We then switch to this new policy and repeat the process until the policy converges.
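To make the value-determination step concrete, here's a rough Python sketch. Everything about the MDP is made up for illustration: the two states, the actions "left" and "right", the transition table P, the rewards R, and the discount GAMMA are all hypothetical, and I approximate the utilities with repeated Bellman sweeps rather than solving the linear system exactly.

```python
GAMMA = 0.9  # hypothetical discount factor

# Hypothetical 2-state MDP: P[s][a] is a list of (probability, next_state)
# pairs, and R[s] is the reward for being in state s.
P = {
    0: {"left": [(1.0, 0)], "right": [(0.8, 1), (0.2, 0)]},
    1: {"left": [(1.0, 0)], "right": [(1.0, 1)]},
}
R = {0: 0.0, 1: 1.0}


def value_determination(policy, sweeps=100):
    """Utility of each state if the agent follows `policy` forever."""
    U = {s: 0.0 for s in P}
    for _ in range(sweeps):
        # Bellman update, restricted to the single action the policy prescribes.
        U = {s: R[s] + GAMMA * sum(p * U[s2] for p, s2 in P[s][policy[s]])
             for s in P}
    return U


# Utilities of an arbitrary initial policy that always moves "left".
print(value_determination({0: "left", 1: "left"}))  # roughly {0: 0.0, 1: 1.0}
```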
The reason the policy improves is that the initial policy is chosen arbitrarily, without any knowledge of utilities. Once we compute the utilities of following that initial policy, action determination can check, for each state, whether some other action has a higher expected utility than the action the current policy prescribes. If it does, the new policy switches to that action, so the new policy is at least as good as the old one and usually strictly better. We only get the exact same policy back when no state can be improved, and at that point the policy is already optimal, which is precisely the convergence condition. Otherwise, we adopt the new policy and repeat the process to see if we can improve it further.
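Here's a sketch of the whole loop on the same made-up MDP (repeated so the snippet runs on its own; `action_determination` is just an illustrative name for the greedy step). After evaluating the arbitrary "always left" policy, action determination already prefers "right" in both states, because the utilities reveal that state 1 is worth reaching, so we don't get the same policy back. We only get the same policy back once it can't be improved, and that's exactly when the loop stops.

```python
GAMMA = 0.9  # same hypothetical MDP as in the previous snippet
P = {
    0: {"left": [(1.0, 0)], "right": [(0.8, 1), (0.2, 0)]},
    1: {"left": [(1.0, 0)], "right": [(1.0, 1)]},
}
R = {0: 0.0, 1: 1.0}


def value_determination(policy, sweeps=100):
    U = {s: 0.0 for s in P}
    for _ in range(sweeps):
        U = {s: R[s] + GAMMA * sum(p * U[s2] for p, s2 in P[s][policy[s]])
             for s in P}
    return U


def action_determination(U):
    # For each state, pick the action with the highest expected utility under U.
    return {s: max(P[s], key=lambda a: sum(p * U[s2] for p, s2 in P[s][a]))
            for s in P}


policy = {0: "left", 1: "left"}          # arbitrary starting policy
while True:
    U = value_determination(policy)
    new_policy = action_determination(U)
    print(policy, "->", new_policy)      # watch the policy change, then settle
    if new_policy == policy:             # same policy back => converged, optimal
        break
    policy = new_policy
```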
I hope that helps.