ANONYMOUS wrote:
> I was wondering how the testing would be done with our agents. Currently, I am testing my agent against a pool consisting of: my agent, the SatisfactoryAgent, the BasicAgent, and the RandomAgent. I ask because I've noticed that my win rate changes depending on which agents I play against (which makes sense), but I was wondering if we could know a little more about which agents our agent plays against.
>
> For example, excluding the RandomAgent from the pool changes the win-rate percentages.
I am not intending to provide any fixed rules about which combinations will be tested. Note that assessing your own agent's performance is explicitly part of this project. Since your agent receives no different treatment from the other agents, the SatisfactoryAgent and yours are in the same situation regardless of which pool of other agents is in play. Yes, this may affect your win rate, but if it causes your win rate to fall below the SatisfactoryAgent's, then surely that means you are not outperforming the benchmark?
I encourage you to explore different testing methodologies and scenarios and use these to justify the effectiveness of your agent in your report.
If nothing else: if a benchmark agent and your agent are dropped into identical scenarios, then for your agent to outperform the benchmark means it is more likely to win each game, and so, over many samples, it will have a higher win rate, yes?
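To make that concrete, here is a minimal sketch of the idea: run both agents through many identical trials and compare estimated win rates. The `play_game` function and the win probabilities are purely hypothetical stand-ins (not part of the actual project framework); in practice each trial would be a full game against your chosen pool of opponents.

```python
import random

def play_game(p_win: float, rng: random.Random) -> bool:
    # Hypothetical stand-in for one full game against the opponent pool:
    # the agent wins with probability p_win. Replace with a real game loop.
    return rng.random() < p_win

def win_rate(p_win: float, n_games: int = 10_000, seed: int = 0) -> float:
    # Estimate the win rate over many games from the same starting conditions.
    rng = random.Random(seed)
    wins = sum(play_game(p_win, rng) for _ in range(n_games))
    return wins / n_games

# Identical scenarios: same pool, same number of games, same seeding policy.
benchmark = win_rate(p_win=0.25)  # e.g. SatisfactoryAgent (assumed strength)
mine = win_rate(p_win=0.35)       # e.g. your agent (assumed strength)
print(f"benchmark={benchmark:.3f}  mine={mine:.3f}")
```

The point is not the particular numbers but the methodology: the comparison is only meaningful if both agents face the same pool under the same conditions, and the sample size is large enough that the difference in win rates is not just noise.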
Hope that helps.
Cheers,
Gozz