Many argue that to evolve artificial intelligence that rivals that of natural animals, we need to evolve neural networks that are structurally organized, meaning that they exhibit modularity, regularity, and hierarchy. It was recently shown that a cost for network connections, which encourages the evolution of modularity, can be combined with an indirect encoding, which encourages the evolution of regularity, to evolve networks that are both modular and regular. However, the bias towards regularity from indirect encodings may prevent evolution from independently optimizing different modules to perform different functions, unless modularity in the phenotype is aligned with modularity in the genotype. We test this hypothesis on two multi-modal problems (a pattern recognition task and a robotics task) that each require different phenotypic modules. In general, we find that performance improves only when genotypic and phenotypic modularity are encouraged simultaneously, though the role of alignment remains unclear. In addition, on the more challenging robotics problem, intuitive manual decompositions fail to provide the performance benefits of automatic methods, which emphasizes the importance of automatic rather than manual decomposition. These results suggest that encouraging modularity in both the genotype and phenotype is an important step towards solving large-scale multi-modal problems, but they also indicate that more research is required before we can evolve structurally organized networks to solve tasks that require multiple, different neural modules.