Visual Pattern Analytics for Event Sequences

W. Jentner

2023

Pattern mining plays an essential role in unsupervised machine learning because it allows structured data to be clustered without a distance measure, relying purely on the definition of containment. Because it is unsupervised, it is well suited to exploratory analysis, and visual analytics offers a holistic perspective that thoroughly involves the data, the task, and especially the user in the decision-making process of designing tools for exploratory analysis. Pattern mining can easily generate millions of patterns since the search spaces are exponential. Additionally, the structures are often large and complex, which thwarts the user’s sense-making efforts. This dissertation explains how visual analytics can be leveraged to allow the effective exploration of sequentially structured data using pattern mining algorithms. The first focus is on interestingness measures, a concept known from data mining that is intended to quantify interestingness. Because interestingness is subjective and depends heavily on the task and the user, this work argues for understanding interestingness measures as features that quantify different properties of the patterns and the clusters they represent. It further presents an alternative taxonomy of the features available in pattern mining and discusses their importance and limitations. Secondly, this work surveys visualization techniques for structured data patterns, including their features, and highlights the differences between the structured data used as input for the mining and the patterns themselves. Furthermore, it discusses the limitations of the visualization techniques, especially concerning scalability and the number of features. Finally, well-known visual analytics concepts such as interactive visualizations, progressive visual analytics, and concepts from visual text analytics are transferred to pattern mining and the exploration of patterns.
This work explains and discusses how these concepts can be exploited and implemented to mitigate the effects of the exponential search spaces and the complexity of the patterns, easing the user’s burden during the exploration process. Even though this work focuses on event sequences and sequential patterns, all aspects can be transferred to other data structures and pattern mining algorithms. Therefore, this dissertation provides a foundation for the exploratory analysis of structured data using pattern mining, with many possible extensions to inspire future research.