In the era of big data, large-scale database applications are of increasing importance. A central task in this field is to reduce the amount of data that has to be processed. One way to approach this task is to capture the structure of the given data. This thesis deals with two particular data structuring techniques.

The first part of this thesis is about cluster analysis. More precisely, we consider the diameter k-clustering problem. We provide the first analysis of the agglomerative complete linkage clustering algorithm. Assuming that the dimension d is a constant, we show that for any k the solution computed by this algorithm is an O(log k)-approximation to the diameter k-clustering problem. Our analysis holds not only for the Euclidean distance but for any metric that is based on a norm. Furthermore, we analyze the closely related k-center and discrete k-center problems. For the corresponding agglomerative algorithms, we deduce an approximation factor of O(log k) as well.

The second part of this thesis deals with the parameter estimation problem for general and Gaussian mixture distributions. We analyze a probabilistic variant of the Expectation-Maximization (EM) algorithm, known as the Stochastic EM or SEM algorithm. Unlike the original work, we focus on the analysis of a single run of the algorithm. For Gaussian mixture models, we show that with high probability the updates of the EM algorithm and its probabilistic variant are almost the same, given that the data set is sufficiently large. A series of experiments confirms that this still holds for a large number of successive update steps. Furthermore, we explain why the SEM algorithm is considerably faster than the classical EM algorithm. In particular, we show that the probabilistic variant establishes a constant factor speedup for Gaussian mixture models.
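To illustrate the greedy scheme underlying the agglomerative complete linkage algorithm, the following is a minimal didactic sketch: starting from singleton clusters, it repeatedly merges the pair of clusters whose union has the smallest diameter until k clusters remain. It is a naive O(n^3)-style implementation for small inputs, not the analyzed algorithm from the thesis; the function name and the use of the Euclidean metric are my own choices (any norm-induced metric could be substituted for `dist`).

```python
import math

def complete_linkage(points, k):
    """Agglomerative complete linkage clustering (didactic sketch).

    Repeatedly merges the pair of clusters whose union has the smallest
    diameter (maximum pairwise distance) until only k clusters remain.
    """
    def dist(p, q):
        # Euclidean distance; any metric based on a norm works as well.
        return math.dist(p, q)

    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (diameter of merged cluster, index i, index j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] + clusters[j]
                diam = max(dist(p, q) for p in merged for q in merged)
                if best is None or diam < best[0]:
                    best = (diam, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

For example, on two well-separated pairs of points, the algorithm first merges the two nearby pairs and returns them as the two clusters.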
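The relationship between the EM update and its stochastic variant can be sketched for a one-dimensional Gaussian mixture as follows. The EM step computes soft posterior responsibilities and updates the parameters from fractional counts; the SEM step instead draws a single hard component label per point from the same posterior and updates from the resulting integer counts. This is a simplified illustration under my own naming and the assumption that no component ends up empty, not the thesis's analysis; on a sufficiently large, well-separated data set the two updates come out almost identical, matching the high-probability statement above.

```python
import math
import random

def gauss(x, mu, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_step(data, weights, means, variances):
    """One classical EM update: soft responsibilities, fractional counts."""
    n, k = len(data), len(means)
    resp = []
    for x in data:
        p = [weights[j] * gauss(x, means[j], variances[j]) for j in range(k)]
        s = sum(p)
        resp.append([pj / s for pj in p])
    new_w, new_mu, new_var = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        new_w.append(nj / n)
        mu = sum(r[j] * x for r, x in zip(resp, data)) / nj
        new_mu.append(mu)
        new_var.append(sum(r[j] * (x - mu) ** 2 for r, x in zip(resp, data)) / nj)
    return new_w, new_mu, new_var

def sem_step(data, weights, means, variances, rng):
    """One SEM update: sample a hard label per point from the posterior."""
    n, k = len(data), len(means)
    labels = []
    for x in data:
        p = [weights[j] * gauss(x, means[j], variances[j]) for j in range(k)]
        s = sum(p)
        labels.append(rng.choices(range(k), weights=[pj / s for pj in p])[0])
    new_w, new_mu, new_var = [], [], []
    for j in range(k):
        pts = [x for x, z in zip(data, labels) if z == j]
        new_w.append(len(pts) / n)  # assumes no component went empty
        mu = sum(pts) / len(pts)
        new_mu.append(mu)
        new_var.append(sum((x - mu) ** 2 for x in pts) / len(pts))
    return new_w, new_mu, new_var
```

The SEM M-step works on hard partitions of the data rather than on weighted sums over all points, which is the source of its speed advantage in practice.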