Cluster Studio

Clustering is a statistical technique that groups a set of items so that items in the same group (cluster) are more similar to each other than to items in other clusters. In practice, it helps solve many problems by identifying related items and separating out unrelated ones.

Cluster Studio not only groups items into related clusters automatically (auto-generation of clusters), but also lets users create custom scenarios by merging multiple clusters or splitting a large cluster into several smaller ones, in order to conduct deep-dive analysis and take the necessary action.

Business Problem

Software quality and support teams often work in time- and budget-constrained environments. Moreover, a big release or major upgrade may cause a spike in incident/ticket volume, and scaling up the team immediately to handle that spike is a challenge. Automating the process of identifying critical areas, grouping related incidents, triaging them, and routing them to the relevant team not only streamlines the process but also helps address the spike quickly.

Solution

This solution has three components. The “Data Extractor” is a connector to a ticketing tool such as Azure DevOps (ADO, or VSO). It extracts incident/item information such as issue title, description, bridge log, priority, and other attributes, and sends the data to a SQL Server database.
“Cluster Generator” is the “intelligence” of the solution. It auto-generates the clusters using multiple unsupervised machine learning algorithms, such as KMeans clustering, Agglomerative clustering, Gaussian Mixture Models, and Topic Modeling. Having multiple models lets us experiment with different ML techniques and pick the most suitable clustering algorithm.
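
To make the auto-generation step concrete, here is a minimal sketch of KMeans-style clustering over incident titles. It is not the Cluster Generator's actual implementation: it uses only the Python standard library, a simple bag-of-words representation, and plain Lloyd's algorithm seeded with the first k items so the result is repeatable. The function names and the sample titles are hypothetical; a real pipeline would more likely rely on an ML library and richer text features.

```python
from collections import Counter
import math

def vectorize(titles):
    """Turn each title into a bag-of-words count vector over a shared vocabulary."""
    vocab = sorted({w for t in titles for w in t.lower().split()})
    vectors = []
    for t in titles:
        counts = Counter(t.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors

def kmeans(vectors, k, iterations=10):
    """Plain k-means (Lloyd's algorithm), seeded with the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical incident titles, for illustration only.
titles = [
    "login page timeout",
    "payment gateway error",
    "login timeout again",
    "payment error on checkout",
]
labels = kmeans(vectorize(titles), k=2)
for title, label in zip(titles, labels):
    print(label, title)
```

Swapping in a different algorithm (agglomerative clustering, a Gaussian mixture, or a topic model) would change only the `kmeans` step, which is what makes side-by-side experimentation practical.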
“Cluster Canvas” is the UI that lets users merge or split the auto-generated clusters to create custom scenarios. When a new issue comes in, users can find the most closely related cluster to understand which SOPs apply to the issue being investigated.
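
The merge, split, and "most related cluster" operations described for Cluster Canvas can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: clusters are modeled as a plain dictionary from cluster id to incident titles, relatedness is approximated by Jaccard word overlap, and all names (`merge_clusters`, `split_cluster`, `most_related_cluster`) and sample data are hypothetical.

```python
# Hypothetical auto-generated clusters: cluster id -> incident titles.
clusters = {
    0: ["login page timeout", "login fails after upgrade"],
    1: ["payment gateway error", "payment declined unexpectedly"],
    2: ["login token expired"],
}

def merge_clusters(clusters, ids, new_id):
    """Return a new mapping in which the clusters in `ids` are combined under `new_id`."""
    merged = [title for cid in ids for title in clusters[cid]]
    result = {cid: items for cid, items in clusters.items() if cid not in ids}
    result[new_id] = merged
    return result

def split_cluster(clusters, cid, predicate, new_id):
    """Move the items of cluster `cid` that satisfy `predicate` into a new cluster."""
    result = dict(clusters)
    result[cid] = [t for t in clusters[cid] if not predicate(t)]
    result[new_id] = [t for t in clusters[cid] if predicate(t)]
    return result

def most_related_cluster(clusters, new_title):
    """Pick the cluster whose vocabulary overlaps most with the new issue (Jaccard)."""
    new_tokens = set(new_title.lower().split())

    def score(items):
        vocab = {w for t in items for w in t.lower().split()}
        union = new_tokens | vocab
        return len(new_tokens & vocab) / len(union) if union else 0.0

    return max(clusters, key=lambda cid: score(clusters[cid]))

# Build a custom scenario by merging the two login-related clusters.
scenario = merge_clusters(clusters, [0, 2], new_id="login-issues")
print(scenario)

# Look up the cluster closest to a newly reported issue.
print(most_related_cluster(scenario, "payment error on checkout"))
```

Both operations return a new mapping rather than modifying the input, so the auto-generated clusters stay available as a baseline while several custom scenarios are explored.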

Why is the solution unique?

  • Multiple pre-built machine learning algorithms to support diverse scenarios in clustering
  • Many tools and algorithms can identify clusters, but the ability to combine or split clusters into scenarios makes it practical to manage the clusters and take the necessary actions
  • Azure Search, a cognitive AI service, is leveraged for natural language processing (NLP)

Benefits

In the context of incident analysis, the tool offers the following benefits:

  • Incident grouping and triaging – Identify the different clusters or groups of incidents automatically. Based on the top keywords appearing in each cluster, we can assign a name or category to the cluster. Subsequently, each group of similar incidents can be assigned to a particular team in a large project, saving the manual effort otherwise required.
  • Incident data analysis – The solution analyzes incident data, such as the number of defects in different groups or scenarios and the areas/features that are more critical and need to be looked at with the highest priority, so that corrective and preventive actions can be taken.
  • Other use cases offer similar benefits in their own context: saving time and avoiding manual effort.
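
As an illustration of the grouping-and-triaging benefit, the sketch below names a cluster by its most frequent keywords and routes it to a team. It is only a rough approximation under assumptions: the stop-word list, the keyword-to-team table, the team names, and the function names are all hypothetical, and simple word counts stand in for whatever keyword extraction the tool actually uses.

```python
from collections import Counter

# Hypothetical stop-word list and keyword-to-team routing table.
STOPWORDS = {"a", "the", "on", "in", "after", "to", "of", "is"}
TEAM_BY_KEYWORD = {"login": "identity-team", "payment": "billing-team"}

def top_keywords(titles, k=3):
    """Return the k most frequent non-stop-words across a cluster's titles."""
    counts = Counter(
        w for t in titles for w in t.lower().split() if w not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(k)]

def route(titles, default="triage-queue"):
    """Assign the cluster to the team owning its highest-ranked known keyword."""
    for keyword in top_keywords(titles):
        if keyword in TEAM_BY_KEYWORD:
            return TEAM_BY_KEYWORD[keyword]
    return default

# Hypothetical cluster of incident titles.
cluster = ["login page timeout", "login fails after upgrade", "login timeout again"]
print(top_keywords(cluster, k=2))  # candidate name for the cluster
print(route(cluster))              # team that would receive the cluster
```

Clusters whose keywords match no known team fall back to a default queue, so nothing is silently dropped during a ticket spike.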




Cluster Studio tool screenshot.