This thesis addresses the multidisciplinary task of combining the friction stir welding (FSW) manufacturing process with mathematical optimization methods in the search for optimal process parameters. The goals (objectives, in optimization parlance) are process related in the sense that they express when the process works optimally or yields final parts that are optimal in some specified respect. These expressions are termed objective functions, and the optimization algorithm searches for a set of the investigated process parameters (design variables, in optimization terms) that either minimizes or maximizes the objective functions, depending on the problem at hand. FSW, the process optimized in this study, is a relatively new welding process invented in 1991 by The Welding Institute (TWI), UK. In short, the process is solid-state, that is, no melting takes place, so many of the disadvantages normally associated with traditional fusion welding processes can be avoided. In FSW, a rotating tool is plunged into the two workpieces and, owing to frictional and plastic dissipation, the temperature rises to the extent that the material is sufficiently softened to be stirred together, thereby forming a weld. The process is characterized by multiphysics involving solid material flow, heat transfer, thermal softening, recrystallization and the formation of residual stresses. In the present work, several models of the FSW process have been applied. The thermal models were addressed first, since they in essence constitute the basis of all other FSW models, be they microstructural, flow or residual stress models. 
Both analytical and numerical models were used and combined with the gradient-based Sequential Quadratic Programming (SQP) optimization algorithm in order to find the welding speed and the heat input that yield a prescribed average temperature close to the solidus temperature under the tool, a condition that is favourable for the process. Following this, several thermomechanical models of FSW were developed in both ABAQUS and ANSYS. They were used to analyse the transient temperature and stress evolution during welding and subsequent cooling, which eventually lead to the residual stress state and to reduced mechanical properties due to thermal softening. In one case, the subsequent loading of a real FSW structure was also taken into account, thus making way for an integrated analysis of the welding process and the in-service loading of the welded part. Another case combined the predicted stresses with a subsequent uniaxial loading situation, in which a damage evolution analysis was carried out in order to predict the load-carrying capacity of the final weld when subjected to tension perpendicular to the weld line. The thermomechanical models predicting residual stresses were also combined with an evolutionary optimization algorithm (NSGA-II) in the search for the optimal combination of the process parameters that one essentially controls in practice, namely the welding speed and the rotational speed. The objectives were to minimize the residual stresses and to maximize the welding speed divided by the rotational speed (known as advancement per revolution, a desired feature of the process). Finally, some more theoretical investigations of several well-known unconstrained and constrained multi-objective optimization (MOO) benchmark problems were carried out, in order to examine some of the difficulties that a multi-objective evolutionary algorithm may have to tackle. 
Specifically, three elitist algorithms were employed for this purpose: the MOGA-II and the NSGA-II (the versions implemented in modeFRONTIER) and the cNSGA-II (a custom NSGA-II implementation written by the author in MATLAB, in two versions, one of which adds an extra Pareto-optimal set archive strategy for post-processing purposes). The results, especially for the constrained case, indicate that the cNSGA-II performs well, yielding both a converged and a well-spread Pareto-optimal set at lower computational cost.
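
To make the notion of a Pareto-optimal set concrete, the following is a minimal Python sketch of a non-dominated filter for a two-objective minimization problem. It is an illustration only and not the cNSGA-II or any modeFRONTIER implementation: the function names `dominates` and `pareto_front` are hypothetical, and the candidate values are invented for the example rather than taken from the thesis. The second objective is written as the negative of the advancement per revolution, so that maximizing it becomes a minimization.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b, with every objective minimized.

    a dominates b when a is no worse in all objectives and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of the non-dominated points (brute force, O(n^2))."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical candidates, invented for illustration:
# (residual stress measure, -advancement per revolution)
candidates = [
    (120.0, -0.20),
    (150.0, -0.35),
    (130.0, -0.15),  # dominated by the first candidate
    (180.0, -0.50),
    (160.0, -0.30),  # dominated by the second candidate
]

front = pareto_front(candidates)
print(front)  # [0, 1, 3]
```

An elitist algorithm such as NSGA-II applies this kind of dominance comparison repeatedly to rank a population into successive fronts and adds a diversity measure to keep the front well spread; the brute-force filter above shows only the dominance test itself.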