COLLECT

With the Selector AI data pipeline, familiar data sources are pre-integrated. It can ingest data from sources such as SNMP, gNMI/gRPC, InfluxDB, Kafka, Prometheus, REST API, vendor-specific device metrics, GitLab, GitHub, and Splunk. Additional data sources can be added with YAML file definitions and do not require a change to the data pipeline. As we encounter new use cases, we continue to add new data sources.
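As a rough illustration of this declarative model, the sketch below loads a hypothetical YAML source definition and registers it with a collector registry. The field names and the `register_source` helper are assumptions made for illustration, not Selector AI's actual schema.

```python
# Hypothetical illustration: new data sources are described declaratively in
# YAML and picked up by the collection layer without changing the pipeline.
# Field names and the registry below are assumptions, not the real schema.
import yaml

SOURCE_DEFINITION = """
name: edge-switch-metrics
type: snmp                 # e.g. snmp, gnmi, kafka, influxdb, rest
endpoint: 10.0.0.12:161
interval_seconds: 30
tags:
  site: sjc-01
  role: access-switch
"""

REGISTRY = {}

def register_source(definition: dict) -> None:
    """Add a collector entry keyed by source name (illustrative only)."""
    REGISTRY[definition["name"]] = definition

register_source(yaml.safe_load(SOURCE_DEFINITION))
print(REGISTRY["edge-switch-metrics"]["type"])  # -> snmp
```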

In one scenario, the customer uses Telegraf agents to collect metrics from various sources and write them to InfluxDB; Selector AI then reads the data from InfluxDB using GET API calls. In another scenario, Selector AI reads metrics from a Kafka message bus. Because the data collection engine receives a high volume of data and its requirements are bounded by the velocity of that data, we wrote the collection engine in Go to achieve high throughput.
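The snippet below is only a minimal sketch of the InfluxDB read pattern (the InfluxDB 1.x HTTP GET /query endpoint); the production collection engine is written in Go, and the host, database, and measurement names here are hypothetical.

```python
# Illustrative only: shows a GET call against InfluxDB's 1.x /query endpoint.
# The real collection engine is written in Go; names below are hypothetical.
import requests

resp = requests.get(
    "http://influxdb.example.internal:8086/query",
    params={
        "db": "telegraf",
        "q": "SELECT mean(usage_idle) FROM cpu WHERE time > now() - 5m GROUP BY host",
    },
    timeout=10,
)
resp.raise_for_status()
for series in resp.json().get("results", [{}])[0].get("series", []):
    print(series["tags"], series["values"][:1])
```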

The collection layer can also ingest proprietary databases exported as CSV files and join tables across relational and time-series databases. This join yields contextual insights by mapping user-defined tags such as customer and user to operational tags such as port, server, and switch.
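A minimal sketch of the kind of join described above, attaching a user-defined tag (customer) to operational tags (switch, port); the column names and data are hypothetical.

```python
# Join business context (CSV inventory) onto operational metrics.
import pandas as pd

# Proprietary inventory exported as CSV: which customer owns which switch port.
inventory = pd.DataFrame(
    {"switch": ["sw1", "sw1", "sw2"],
     "port": ["e1/1", "e1/2", "e2/1"],
     "customer": ["acme", "globex", "acme"]}
)

# Interface errors pulled from the time-series store.
errors = pd.DataFrame(
    {"switch": ["sw1", "sw2"],
     "port": ["e1/2", "e2/1"],
     "crc_errors": [142, 7]}
)

# The join attaches customer context to each operational metric row.
enriched = errors.merge(inventory, on=["switch", "port"], how="left")
print(enriched[["customer", "switch", "port", "crc_errors"]])
```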


CORRELATE

Selector AI normalizes, filters, clusters, and correlates metrics, events, and alarms using pre-built workflows to draw actionable insights. Selector AI adds automatic baseline bands by monitoring each metric's past behavior and alerts when an anomaly is detected, so users are not required to set baselines and thresholds for thousands of metrics. For peer-group metrics, cross-correlation detects outliers that deviate significantly, using statistical methods such as top violators, 3-sigma, and the 99th percentile.
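The sketch below illustrates the general idea of a rolling baseline band with a 3-sigma rule; the window size, threshold, and synthetic latency data are illustrative, not Selector AI's actual parameters.

```python
# Simplified auto-baselining: flag points outside a rolling 3-sigma band.
import numpy as np

rng = np.random.default_rng(0)
latency_ms = rng.normal(20, 2, size=500)
latency_ms[480] = 45  # injected anomaly

window = 60
for t in range(window, len(latency_ms)):
    history = latency_ms[t - window:t]
    mean, sigma = history.mean(), history.std()
    upper, lower = mean + 3 * sigma, mean - 3 * sigma  # baseline band
    if not (lower <= latency_ms[t] <= upper):
        print(f"t={t}: {latency_ms[t]:.1f} ms outside [{lower:.1f}, {upper:.1f}]")
```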

With auto/cross-correlation of metrics and events from heterogeneous data sources, Selector AI uses a ranking system to shortlist factors responsible for an outage. Users can get to the root cause of an outage faster by focusing on this ranked list. With automatic alert notification for an SLI or KPI violation, Selector AI presents a list of correlated metrics and labels, reducing the diagnosis time. Data science experts can customize the Selector AI data pipeline, and teams early in the data science journey can use it as-is.
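As a hedged sketch of ranking candidate factors by how strongly they co-move with a violated SLI, the example below scores hypothetical metrics by absolute correlation; the specific ranking method and metric names are assumptions, not necessarily what Selector AI uses internally.

```python
# Rank candidate factors by absolute correlation with a violated SLI.
import numpy as np

rng = np.random.default_rng(1)
sli = rng.normal(0, 1, 300)
candidates = {
    "bgp_flaps": sli * 0.9 + rng.normal(0, 0.3, 300),   # strongly related
    "cpu_util": sli * 0.2 + rng.normal(0, 1.0, 300),    # weakly related
    "fan_speed": rng.normal(0, 1, 300),                 # unrelated
}

ranked = sorted(
    ((name, abs(np.corrcoef(sli, series)[0, 1])) for name, series in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: correlation {score:.2f}")
```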

The key technologies in this stack are a local data store for metrics, logs, and events, and multiple data processing engines that detect anomalies, violations, and correlations. We use open-source databases such as Prometheus to store time-series data, Loki for logs, Postgres for relational data, and MongoDB for NoSQL data. With our focus on data-centric AI, we apply multiple models to improve data quality.

For example, we use LSTM (Long Short-Term Memory)-based deep learning techniques for anomaly detection and auto-baselining. We use recommender-system techniques to identify correlations between different metrics, logs, and events. We implemented most of this using PyTorch, a machine learning library for the Python programming language.
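The following is a minimal PyTorch sketch in the spirit of the approach described above: an LSTM forecasts the next value of a metric, and unusually large prediction errors suggest anomalies. The architecture, sizes, and toy data are illustrative only.

```python
# Minimal LSTM forecaster: predict the next metric value; large prediction
# errors at inference time suggest anomalies. Sizes are illustrative.
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, hidden_size: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):              # x: (batch, seq_len, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # predict the next point

model = LSTMForecaster()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Toy training data: sliding windows over a sine-wave "metric".
series = torch.sin(torch.linspace(0, 20, 400))
windows = torch.stack([series[i:i + 30] for i in range(300)]).unsqueeze(-1)
targets = torch.stack([series[i + 30] for i in range(300)]).unsqueeze(-1)

for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(windows), targets)
    loss.backward()
    optimizer.step()

# At inference, a prediction error far above the training error is flagged.
```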

COLLABORATE

Both technical and business leaders want short, instantaneous status reports without writing SQL queries or building new widgets or dashboards. To enable this data democratization, Selector AI supports Natural Language Queries directly integrated into common messaging platforms such as Slack and Microsoft Teams, providing a conversational chatbot experience. Selector AI's collaboration service engine learns from the labels associated with metrics and events; it interprets natural language queries and finds the closest match. A fully interactive web portal with on-demand dashboards and a structured query language is available for expert users. In the web portal, users can build widgets and dashboards using query auto-complete. Using high-performance APIs, users can share query outputs and dashboards with larger workflows such as IT service management ticketing, alerting, and automation tools.

The key technology components in this layer provide Natural Language Processing (NLP) and the interactive query interface in Slack/Teams and the web portal. For NLP, we use spaCy, an open-source software library for building advanced natural language processing models. Users ask questions in natural language, and our models translate them into the underlying SQL queries. The interactive web portal is built using JavaScript and React, with visual rendering using Highcharts.
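A simplified sketch of one step in this flow is shown below: matching a natural-language question to known metric labels with spaCy vector similarity. The real service translates questions into SQL against the underlying stores; the labels, the example question, and the model name (en_core_web_md, which ships word vectors) are illustrative assumptions.

```python
# Match a natural-language question to the closest known metric label.
# Requires a spaCy model with word vectors, e.g. en_core_web_md.
import spacy

nlp = spacy.load("en_core_web_md")

metric_labels = [
    "interface packet errors",
    "bgp session state",
    "cpu utilization",
    "memory usage",
]

question = nlp("which switches are dropping packets?")
best = max(metric_labels, key=lambda label: question.similarity(nlp(label)))
print(best)  # likely "interface packet errors"
```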

Selector AI responding to NL queries in Slack
Selector AI web portal for queries and widgets

Schedule Your Free Demo Today

You can modernize your operations across your application, network, and cloud infrastructure with Selector's cloud-native AIOps solution and deploy it in three possible modes: public, private, and cloud VPCs.
