In contrast to the conventional role of evolution in evolutionary computation (EC) as an optimization algorithm, a new class of evolutionary algorithms has emerged in recent years that instead aims to accumulate as diverse a collection of discoveries as possible, while ensuring that each variant in the collection is as fit as it can be. Often applied in both neuroevolution and morphological evolution, these new quality diversity (QD) algorithms are particularly well-suited to evolution's inherent strengths, thereby offering a promising niche for EC within the broader field of machine learning. However, because QD algorithms are so new, no comprehensive study has yet attempted to systematically elucidate their relative strengths and weaknesses under different conditions. Taking a first step in this direction, this paper introduces a new benchmark domain designed specifically to compare and contrast QD algorithms. It then shows how the degree of alignment between the measure of quality and the behavior characterization (an essential component of all QD algorithms to date) impacts the ultimate performance of different QD algorithms. The hope is that this initial study will help to stimulate interest in QD and begin to unify the disparate ideas in the area.