Falsification-Driven Reinforcement Learning for Maritime Motion Planning