Recently, the inverted generational distance (IGD) measure has frequently been used to evaluate the performance of evolutionary multi-objective optimization (EMO) algorithms on many-objective problems. When the IGD measure is used to evaluate a solution set obtained for a many-objective problem, we have to specify a set of reference points as an approximation of the Pareto front. The IGD measure is calculated as the average distance from each reference point to the nearest solution in the solution set, which can be viewed as an approximate distance from the Pareto front to the solution set in the objective space. Thus, IGD-based performance evaluation depends entirely on the specification of reference points. In this paper, we illustrate difficulties in specifying reference points. First, we discuss the number of reference points required to approximate the entire Pareto front of a many-objective problem. Next, we show some simple examples where uniform sampling of reference points on the known Pareto front leads to counter-intuitive results. Then we discuss how to specify reference points when the Pareto front is unknown. In this case, a set of reference points is usually constructed from the solutions obtained by the EMO algorithms being evaluated. We show that the choice of EMO algorithms used to construct the reference points has a large effect on the evaluated performance of each algorithm.
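
The IGD calculation described above can be sketched in a few lines. This is a minimal illustration only, assuming Euclidean distance in the objective space; the function name `igd` and the two-objective data are hypothetical and are not taken from the paper.

```python
import math


def igd(reference_points, solutions):
    """Average Euclidean distance from each reference point to its nearest solution.

    Both arguments are sequences of objective vectors of equal dimension.
    Euclidean distance is an assumption made for this sketch.
    """
    if not reference_points or not solutions:
        raise ValueError("both sets must be non-empty")
    total = 0.0
    for r in reference_points:
        # Distance from this reference point to the closest obtained solution.
        total += min(math.dist(r, s) for s in solutions)
    return total / len(reference_points)


# Hypothetical reference points on the linear front f1 + f2 = 1.
reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]

# Hypothetical solution sets: one covers every reference point, one only the center.
set_a = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
set_b = [(0.5, 0.5)]

igd_a = igd(reference, set_a)  # every reference point coincides with a solution
igd_b = igd(reference, set_b)  # the two extreme reference points are uncovered

print(f"IGD(set_a) = {igd_a:.4f}")
print(f"IGD(set_b) = {igd_b:.4f}")
```

Because the value is an average over the reference points, the same solution set can receive a different IGD value when the reference set changes, which is why the specification of reference points matters for the evaluation.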