I guess I should’ve clarified: in reinforcement learning, “I was wrong in numerous ways” almost always translates to “unpublishable, try not to be wrong next time”. Nobody cares if a reinforcement learning hypothesis didn’t work; it’s only worth publishing if it worked well.
This is such an awesome answer; exactly what I was looking for. Simple, general, and something I can actually try. Thanks for replying.