As the COVID-19 outbreak continues to pose a serious worldwide threat, many governments have imposed lock-downs to reduce disease transmission. However, imposing the strictest possible lock-down at all times has dire economic consequences, especially in areas with widespread poverty. In fact, many countries and regions have started charting paths to ease lock-down measures. Thus, planning efficient ways to tighten and relax lock-downs is a crucial and urgent problem. We develop a reinforcement-learning-based approach that (1) is robust to a range of parameter settings, and (2) optimizes multiple objectives related to different aspects of public health and the economy, such as hospital capacity and delay of the disease. The absence of a vaccine or a cure for COVID-19 to date implies that the infected population cannot be reduced through pharmaceutical interventions. However, non-pharmaceutical interventions such as lock-downs can slow disease spread and keep it manageable. This work focuses on how to manage the disease spread without severe economic consequences.