Pipeline evaluation of clustering algorithms aimed at clinical data
Duarte Dyck, David Absalón
Understanding a disease is key to designing effective treatments and diagnostic tools. A central part of this understanding is grouping patients according to their phenotypes: patterns in the characteristics of certain members of a population that are correlated with a particular illness. This grouping may reveal associations between disease risk, treatment responses, and other key clinical outcomes. Once these associations are found, it is easier to design tailored diagnostic tools and effective personalized treatments. Grouping patients requires data, and recent advances in digital technology have made it possible to capture hundreds or thousands of clinical data points that may be used to group patients into different disease phenotypes. To handle hundreds of patients with hundreds of features, clinical researchers use clustering algorithms that automatically find hidden associations between subjects. These algorithms are very useful once the researcher selects the correct clustering algorithm and configures it for the specific research task. Selecting the correct algorithm is time-consuming, and setting its parameters may take several trial-and-error sessions. On the other hand, computer scientists have developed several clustering metrics that evaluate how well a clustering algorithm fits the data, and computing power has increased, allowing the automated testing and evaluation of clustering algorithms on a specific data set. The objective of this proposal was to develop an automated computer pipeline that evaluates several clustering algorithms, providing metrics on important properties such as clustering stability (Jaccard index) and clustering relevance (ANOVA test). Furthermore, the pipeline returns the number of natural clusters that may be useful for the given dataset (Dunn index).
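As an illustration of the Dunn index mentioned above, the following is a minimal sketch in Python using only the standard library. It is not taken from the clustest package: the function name `dunn_index` and the choice of Euclidean distance are assumptions made for this example. The index is the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, so higher values indicate compact, well-separated clusters.

```python
import math

def dunn_index(points, labels):
    """Dunn index: minimum between-cluster distance / maximum cluster diameter.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    groups = list(clusters.values())
    if len(groups) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest distance between two points that share a cluster.
    max_diameter = max(math.dist(a, b) for g in groups for a in g for b in g)

    # Smallest distance between two points in different clusters.
    min_separation = min(
        math.dist(a, b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    if max_diameter == 0:
        return math.inf  # every cluster is a single repeated point
    return min_separation / max_diameter

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = dunn_index(points, [0, 0, 1, 1])  # tight, well-separated clusters
bad = dunn_index(points, [0, 1, 0, 1])   # clusters that span the gap
```

On this toy data the first labelling scores 10.0 and the second 0.1. One common way to use the index, assumed here rather than stated in the source, is to compute it for several candidate numbers of clusters and prefer the number that scores highest.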
The pipeline was set up to evaluate the classical clustering algorithms k-means, fuzzy C-means, and hierarchical clustering, but it can also be used to test a user-provided clustering method. The evaluation consisted of bootstrapping the data and computing the Dunn and Jaccard clustering indexes on the resampled data. Furthermore, the clinical relevance of the final clusters was evaluated using an ANOVA test, which provided indications of disease phenotypes. All test results are plotted so that the user can visually evaluate the performance of the different clustering methods on their data. The resulting pipeline was implemented in R (github.com/majordave/clustest). The utility of the pipeline was tested on synthetic data sets and on two radiomics datasets, one associated with the development of osteoarthritis (OA) and one with the presence of breast cancer in mammograms. Furthermore, we contrasted the clustering approach with supervised learning on a large dataset of the association of nutrition with OA symptoms. Hence, the present work established that automated, robust evaluation of the utility of clustering algorithms on clinical data is feasible, and provided a publicly available software tool that any clinical researcher can use to select the best clustering algorithm for their data.
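The bootstrap-based stability evaluation described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R implementation in clustest. All names (`jaccard`, `as_groups`, `bootstrap_stability`, `threshold_clusterer`) are hypothetical. The matching rule, which compares each reference cluster with its best-matching cluster in every resample, is one common way to measure stability and is assumed here rather than taken from the source. The clustering method is passed in as a function, mirroring the idea of a user-provided method.

```python
import random

def jaccard(a, b):
    """Jaccard index of two sets: size of intersection / size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def as_groups(indices, labels):
    """Turn parallel index and label sequences into a list of index sets."""
    out = {}
    for i, label in zip(indices, labels):
        out.setdefault(label, set()).add(i)
    return list(out.values())

def bootstrap_stability(points, cluster_fn, n_boot=100, seed=0):
    """Mean best-match Jaccard index of each reference cluster over resamples.

    cluster_fn is any function mapping a list of points to a list of labels.
    """
    rng = random.Random(seed)
    n = len(points)
    reference = as_groups(range(n), cluster_fn(points))
    totals = [0.0] * len(reference)
    counts = [0] * len(reference)
    for _ in range(n_boot):
        # Resample subjects with replacement and cluster the resample.
        drawn = [rng.randrange(n) for _ in range(n)]
        boot = as_groups(drawn, cluster_fn([points[i] for i in drawn]))
        present = set(drawn)
        for k, ref in enumerate(reference):
            # Compare only the reference members that were actually drawn.
            kept = ref & present
            if kept:
                totals[k] += max(jaccard(kept, g) for g in boot)
                counts[k] += 1
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]

def threshold_clusterer(points):
    """Toy stand-in for a clustering method: split 1-D values at 5.0."""
    return [0 if x < 5.0 else 1 for x in points]

values = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0]
stability = bootstrap_stability(values, threshold_clusterer, n_boot=200)
```

Values close to 1 mean a cluster reappears almost unchanged across resamples, while low values suggest it depends on which subjects happened to be sampled. The toy clusterer only keeps the example self-contained; any clustering function with the same input and output shape could be passed instead.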