What is the difference between exploration and exploitation in machine learning?

Exploitation and exploration are two different approaches for determining the basis on which an intelligent agent chooses an action. An agent explores the environment when it chooses a random action, and it exploits when it chooses the action with the highest value in its Q-table for the current state. Exploring lets the agent find new paths that lead to greater rewards; exploiting lets it take the best action it has learned so far.

During every episode, the agent decides whether to explore the environment or exploit its Q-table. It chooses between the two options by using an epsilon value that controls the ratio of exploration to exploitation.

Code block 1 # Example of exploration vs exploitation

The ideal proportion of exploitation and exploration in Treasure Hunt

The intelligent agent learns the optimal path to the treasure by using these two techniques. In the Treasure Hunt game, the agent uses an epsilon value to decide whether to explore the maze or exploit the Q-table.

Code block 2 # use epsilon value to determine next action
IF (epsilonRatio > randomNumber) THEN explore
ELSE exploit Q-Table
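The epsilon rule described above can be sketched in Python. This is a minimal illustration, not the Treasure Hunt game's actual code: the function name `choose_action`, the layout of `q_values` as a plain list with one entry per action, and the sample numbers are all assumptions made for the example.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Pick an action index with an epsilon-greedy rule.

    q_values: Q-values for the current state, one per action
              (hypothetical layout, assumed for this sketch).
    epsilon:  probability of exploring, between 0 and 1.
    rng:      source of randomness; defaults to the random module.
    """
    if epsilon > rng.random():
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-values for four moves in the current maze cell.
q_row = [0.1, 0.5, -0.2, 0.0]

greedy = choose_action(q_row, epsilon=0.0)   # never explores, so this is index 1
mostly_greedy = choose_action(q_row, epsilon=0.1)  # explores about 10% of the time
```

With `epsilon` set to 0 the agent always exploits, and with `epsilon` set to 1 it always explores. Values in between trade off the two, which is the ratio the epsilon value controls.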