Understanding Internal Cluster Variability Through Subcluster Metric Analysis in a Geophysical Context
Clustering algorithms are commonly used for inspecting the behavior of clouds in both model and satellite data sets. Often overlooked in cluster analysis is the variability that occurs within the generated clusters. This is particularly important in geophysics, where clusters are often generated with a focus on interpretability over mathematical optimization. Two metrics, the Davies-Bouldin index and the subsom entropy, are used to identify clusters with large internal variability. These metrics are applied to an example set of clusters from prior research that were generated using cloud top pressure-cloud optical thickness joint histograms from the Moderate Resolution Imaging Spectroradiometer (MODIS) data set. Applying these metrics to the clusters identifies one cluster in particular as a major outlier. Examining the calculations behind these metrics in more detail provides further information about the internal variability of the clusters. The clusters are also examined over several geographic regions and show mostly consistent behavior. There are, however, some large anomalies, such as the behavior of the clear sky cluster and the behavior of several different clusters over the Arctic Ocean. To aid our interpretation of these results, two clusters are chosen for a detailed analysis of their subclusters. The geographic distributions and radiative properties of these subclusters are examined and clearly show that the subclusters have physically distinct behavior. This result illustrates that these metrics are capable of determining when a cluster contains physically distinct subclusters, and it demonstrates their potential utility for other geophysical data sets.
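
For readers unfamiliar with the first of the two metrics, the following is a minimal sketch of the textbook Davies-Bouldin index, not the study's own implementation. It assumes Euclidean distance and mean-distance scatter, and the function name `davies_bouldin` and the toy data are hypothetical. One plausible use, again an assumption, is to treat the subclusters of a single cluster as the groups being scored, with each flattened joint histogram as one point.

```python
import math
from collections import defaultdict


def davies_bouldin(points, labels):
    """Textbook Davies-Bouldin index (lower means compact, well-separated groups).

    points: sequence of equal-length numeric vectors (e.g. flattened histograms).
    labels: group identifier for each point (hypothetically, a subcluster id).
    """
    # Collect the points belonging to each group.
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(tuple(point))
    if len(groups) < 2:
        raise ValueError("the index needs at least two groups")

    # Centroid of each group: the coordinate-wise mean of its members.
    centroids = {
        k: tuple(sum(col) / len(members) for col in zip(*members))
        for k, members in groups.items()
    }
    # Scatter of each group: mean Euclidean distance of members to the centroid.
    scatter = {
        k: sum(math.dist(p, centroids[k]) for p in members) / len(members)
        for k, members in groups.items()
    }

    # For each group, keep its worst (largest) similarity ratio to any other
    # group. Coincident centroids raise ZeroDivisionError in this sketch.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in groups
            if j != i
        )
        for i in groups
    ]
    # The index is the average of those worst-case ratios.
    return sum(worst) / len(worst)


# Toy data: two tight groups far apart, so the index is small.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_labels = [0, 0, 1, 1]
toy_index = davies_bouldin(toy_points, toy_labels)
```

In the toy data each group has a scatter of 1 and the centroids are 10 apart, so the index is (1 + 1) / 10 = 0.2. Groups that are more spread out or closer together give larger values.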