
Did you know Selector’s platform can extract information from log messages without using regular expressions?

Network operations engineers must deal with log messages generated by many devices and technologies. Log messages carry valuable information, but because they are unstructured by design, extracting that data automatically is challenging. The standard approach so far has relied on regular expressions (regex, Grok, or similar) to find each field of interest and extract it. This has always been an arduous process for several reasons. First, building those expressions can be complex. Second, they need to be maintained and updated over time. That is difficult when the engineers who wrote the expressions left little documentation, which turns maintenance into a reverse-engineering effort. To complicate matters further, vendors may change the format of their syslog messages, which causes the regular expression to stop matching and prevents the extraction of key information.
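To make the fragility concrete, here is a minimal Python sketch. The pattern, hostname, and both log lines are made-up examples rather than output from any particular device; the point is only that a one-character change in message format is enough to make a hand-written expression stop matching.

```python
import re

# Hypothetical pattern written for one (invented) link state message format.
LINK_STATE = re.compile(
    r"^(?P<host>\S+) %LINK-3-UPDOWN: Interface (?P<interface>\S+), "
    r"changed state to (?P<state>\w+)$"
)

def extract(line):
    """Return the captured fields as a dict, or None when the pattern does not match."""
    match = LINK_STATE.match(line)
    return match.groupdict() if match else None

# Sample lines invented for illustration.
original = "edge-rtr-01 %LINK-3-UPDOWN: Interface Gi0/1, changed state to down"
reworded = "edge-rtr-01 %LINK-3-UPDOWN: Interface Gi0/1 changed state to down"  # comma dropped

print(extract(original))  # fields are extracted
print(extract(reworded))  # None: the same information is present, but the rule no longer fires
```

Nothing in the second message is harder for a person to read, yet the rule silently returns nothing until someone notices and rewrites it.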

The era of “human-based rules” to extract information has passed. As part of the broader digital transformation, data and algorithms can now accomplish many of the tasks that previously required manually configured rules. Log field extraction is a good example of this shift.

Selector’s platform uses state-of-the-art Natural Language Processing (NLP) techniques to manage log messages. In our previous blog post on regular expressions, we discussed automatic clustering and classification of logs. Now, we are also using NLP techniques to identify and extract key information from within a log.

Named Entity Recognition (NER) is a well-known Natural Language Processing technique for automatically extracting key information from text. In this case, the target text is a log message, and the relevant information can span a variety of entity types: hostnames, IP addresses, MAC addresses, fully qualified domain names, interface names, etc.

Selector’s Named Entity Recognition trains a model on the customer’s key data sources, such as inventories and customer databases. With that model, Selector’s log processing pipeline automatically extracts the fields identified by the NER algorithm and enriches log messages with them.
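As a rough illustration of how inventory data can drive field extraction without regular expressions, the Python sketch below labels tokens by looking them up in a small inventory and recognizes IP addresses by parsing them with the standard `ipaddress` module. This is a deliberately simple dictionary lookup standing in for a trained NER model, not Selector’s implementation; the `INVENTORY` contents, the `tag_entities` function, and the sample log line are all hypothetical.

```python
import ipaddress

# Hypothetical inventory; in practice this would come from a customer's
# inventory system or database.
INVENTORY = {
    "hostname": {"edge-rtr-01", "core-sw-02"},
    "interface": {"Gi0/1", "xe-0/0/1"},
}

def tag_entities(line, inventory=INVENTORY):
    """Label tokens of a log line via inventory lookup and IP address parsing."""
    entities = []
    for raw in line.split():
        token = raw.strip(",;()[]")  # drop surrounding punctuation
        if not token:
            continue
        # First try the inventory: which entity type, if any, knows this token?
        label = next(
            (name for name, values in inventory.items() if token in values),
            None,
        )
        if label is None:
            # Fall back to parsing the token as an IPv4 or IPv6 address.
            try:
                ipaddress.ip_address(token)
            except ValueError:
                continue  # not a known entity; skip it
            label = "ip_address"
        entities.append((label, token))
    return entities

# Sample line invented for illustration.
line = "core-sw-02 BGP neighbor 10.0.0.5 via xe-0/0/1, went down"
print(tag_entities(line))
```

Because the lookup depends on what the tokens are rather than where they sit in the message, rewording the log line does not break it. A trained model goes further than this sketch by using context to recognize entities that are not listed in any inventory.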

Once logs are enriched with the new labels, key analytics and insights can be generated so that the Selector platform can surface multidimensional anomalies that are otherwise very difficult to identify.

Chasing fields with regular expressions is not an effective use of time. We now have advanced techniques that can automate this process. Selector’s technology is designed to help network operations engineers focus on what matters most: the anomalies that provide the key insights needed for proactive management of underlying network issues. What matters is what a network engineer can do with the fields extracted from logs, not the extraction process itself.

Let’s explore some of the benefits of Selector’s Analytics platform:

First, operations teams no longer need to create and maintain numerous complex regular expressions to parse and extract data from logs.

Second, by automatically extracting key fields and performing multidimensional anomaly detection, Selector’s platform helps surface unseen anomalies in the logs and enables network operations engineers to identify issues that would otherwise be very difficult to detect.

Below are several screenshots of this unique feature:

Interested in learning more about this feature? Contact us today for a free demo!