I am using a reward-based approach for training in Bonsai. However, every time I train a brain, I get the following message in the output when training ends:
Training of concept "optimize" has ended without completing the assigned curriculum. The final scenario reached was default. Consider changing your LessonSuccessThreshold (Currently: 0.9). If you are using rewards, consider setting or changing your LessonRewardThreshold (Currently: None).
What are the possible reasons for this message, and could you please suggest some ways to solve the problem?
Hi @Maythah. I added that warning because some users were confused about why their training had ended before their curriculum was complete, and wanted to know what they could do about it.
If your training runs for more than NoProgressIterationLimit (NPIL) iterations (state-action exchanges with your simulators) without producing a new high-water mark for the policy, training stops. If one or more lessons in your curriculum have not been completed when the NPIL is hit, the message you copied is emitted. It names the two main parameters you can adjust to calibrate when a lesson is deemed complete, and thus when the policy is ready to move on to the next lesson in the curriculum: LessonSuccessThreshold and LessonRewardThreshold.
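To make the stopping rule concrete, here is a rough Python sketch of the logic described above. It is only a toy model, not Bonsai's actual implementation: the function names, the per-iteration `scores` list, and the lesson counters are all hypothetical and exist purely for illustration.

```python
from typing import Iterable, Optional


def stop_iteration(scores: Iterable[float], no_progress_limit: int) -> Optional[int]:
    """Return the 1-based iteration at which training would stop, or None.

    Toy model: training stops once `no_progress_limit` consecutive
    iterations pass without a new high-water mark in `scores`.
    """
    best = float("-inf")
    since_best = 0  # iterations since the last new high-water mark
    for i, score in enumerate(scores, start=1):
        if score > best:
            best = score
            since_best = 0
        else:
            since_best += 1
            if since_best >= no_progress_limit:
                return i
    return None


def warning_expected(stopped_at: Optional[int], lessons_done: int, lessons_total: int) -> bool:
    """The warning applies only if training stopped with lessons left over."""
    return stopped_at is not None and lessons_done < lessons_total


# Hypothetical run: the policy plateaus at 0.2 and never improves again.
stopped = stop_iteration([0.1, 0.2, 0.2, 0.2, 0.2], no_progress_limit=3)
print(stopped, warning_expected(stopped, lessons_done=0, lessons_total=1))
```

In this model, the warning shows up whenever the plateau arrives before the lesson's completion criterion is met, which is why adjusting the thresholds named in the message (or allowing a larger NoProgressIterationLimit) changes the outcome.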