The standard installation already contains many parsers for common sources. However, event formats can change from version to version, and the number of different sources that can be connected to a SIEM runs into the millions.
Both the commercial RuSIEM version and the free RvSIEM include tools for normalizing and connecting any event source.
What is normalization
Normalization is the process of preparing data by extracting keys and their values from an event. It is required so that events can be processed by existing correlation rules together with events from other sources, searched correctly once stored, used in complex reports, passed on to Data Learning, Machine Learning and Artificial Intelligence processing, and saved correctly to the database.
Consider an example: a company has two border routers, one from Cisco and one from Palo Alto Networks. The two devices report events in different formats.
The event from Palo Alto Networks looks like this:
<14>Feb 11 01:27:43 EXT01.uk 1,2019/02/11 01:27:42,001801044435,TRAFFIC,start,1,2017/11/29 01:27:42,10.0.1.23,184.108.40.206,10.160.1.240,220.127.116.11,Rule ID 57,,,zabbix,vsys1,Trust,untrust-micex,ethernet1/1,ethernet1/12
And the event from the Cisco device:
<113>Feb 11 01:27:59 firepower SFIMS: Protocol: TCP, SrcIP: 10.0.1.23, OriginalClientIP: ::, DstIP: 18.104.22.168, SrcPort: 50795, DstPort: 443, TCPFlags: 0x0, IngressInterface: inside, EgressInterface: outside, DE: Primary Detection Engine (13d1ae3e-1d3b-11e8-bae3-88193dbc790d), Policy: Access Policy, ConnectType: Start, AccessControlRuleName: TeamViewer_Block_All
The formats are very different. You could wrap the entire event in the message field and save it as is. But without normalization a problem arises as soon as you need a query such as "who connected to ip: 22.214.171.124", or a correlation rule such as "connection to the critical resource 126.96.36.199", that works regardless of which source reported the event in its logs. Querying each source separately does not give the full picture and makes such data hard to work with. Therefore, at the normalization stage, values are extracted into fields according to the taxonomy of the specific SIEM system. For example, src.ip: "10.0.1.23", dst.ip: "188.8.131.52". How many fields are extracted depends on how the events will be used and which values are actually needed.
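The idea can be illustrated with a minimal Python sketch. It is not RuSIEM parser syntax: the function names are hypothetical, the sample events are shortened copies of the two events above, and the Palo Alto field positions are read off that sample rather than taken from vendor documentation.

```python
import re

# Shortened copies of the two sample events shown above.
PAN_EVENT = ("<14>Feb 11 01:27:43 EXT01.uk 1,2019/02/11 01:27:42,001801044435,"
             "TRAFFIC,start,1,2017/11/29 01:27:42,10.0.1.23,184.108.40.206")
CISCO_EVENT = ("<113>Feb 11 01:27:59 firepower SFIMS: Protocol: TCP, "
               "SrcIP: 10.0.1.23, OriginalClientIP: ::, DstIP: 18.104.22.168, "
               "SrcPort: 50795, DstPort: 443")

def normalize_pan(raw):
    """Positional CSV: in the sample, fields 7 and 8 hold source and destination."""
    fields = raw.split(",")
    return {"src.ip": fields[7], "dst.ip": fields[8], "message": raw}

# "Key: value" pairs separated by commas, as in the Cisco sample.
CISCO_KV = re.compile(r"(\w+): ([^,]*)")

def normalize_cisco(raw):
    """Key/value text: pick SrcIP and DstIP out of the pairs after 'SFIMS: '."""
    body = raw.split("SFIMS: ", 1)[1]
    pairs = dict(CISCO_KV.findall(body))
    return {"src.ip": pairs["SrcIP"], "dst.ip": pairs["DstIP"], "message": raw}

def who_connected(events, dst_ip):
    """One query over all sources, possible only because field names are shared."""
    return [e["src.ip"] for e in events if e["dst.ip"] == dst_ip]

if __name__ == "__main__":
    events = [normalize_pan(PAN_EVENT), normalize_cisco(CISCO_EVENT)]
    print(who_connected(events, "18.104.22.168"))
```

Once both events carry src.ip and dst.ip, the query no longer needs to know which device produced the event.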
The following transports are used to collect events:
- syslog (plain, CEF, LEEF, TLS, etc)
- filelog (various text formats)
- msevt (event collection from any Windows logs, including applications and custom)
- mssql (universal collection transport with tables and views)
- mysql (universal transport)
- oracle (universal transport)
- 1C v8.3
- and others
New transports can be added with product updates.
Transports collect events from sources and send them on for further processing.
Events are received by the lsinput microservice. At startup, it loads configuration files from the /opt/rusiem/lsinput/etc/ directory on the file system: files with the .conf extension (system) and files with the _fser.conf postfix (user). The microservice opens ports for incoming connections and receives the syslog stream and the encrypted stream from agents. The task of lsinput is to accept the stream and route it, according to event type, to frs_server for normalization.
The frs_server microservice is responsible for normalization. At startup, it loads configuration files describing the passage of events and their normalization from the PostgreSQL database. This differs from earlier versions, in which frs_server also read configuration files from the file system. Now most of the configuration files are stored in PostgreSQL.
By default, frs_server runs each event through the suitable auto-parsers. Auto-parsers are a set of configuration files containing patterns and procedures for handling events according to their source. If a single input (for example, syslog) carries events from mysql, auditd, ids and other sources in one stream, the auto-parser splits this stream into separate ones and sends each to the corresponding parser.
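The splitting of a mixed stream can be pictured with a short Python sketch. This is an illustration only: real auto-parsers are RuSIEM configuration files, and the parser names, patterns and sample lines below are assumptions made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical patterns; each pair is (parser name, pattern that identifies its events).
AUTO_PARSERS = [
    ("auditd", re.compile(r"\btype=\w+ msg=audit\(")),
    ("mysql", re.compile(r"\bmysqld?\b", re.IGNORECASE)),
    ("cisco_firepower", re.compile(r"\bSFIMS:")),
]

def route(raw):
    """Return the name of the first parser whose pattern matches the raw event."""
    for name, pattern in AUTO_PARSERS:
        if pattern.search(raw):
            return name
    return "unparsed"  # nothing matched: keep the event, but mark it

def split_stream(lines):
    """Split one mixed stream into per-parser streams, preserving order."""
    streams = defaultdict(list)
    for raw in lines:
        streams[route(raw)].append(raw)
    return dict(streams)

if __name__ == "__main__":
    mixed = [
        "<30>Feb 11 01:28:00 db01 mysqld: Access denied for user 'root'",
        "<13>Feb 11 01:28:01 web01 audispd: type=USER_LOGIN msg=audit(1549848481.123:42): pid=812",
        "<113>Feb 11 01:27:59 firepower SFIMS: Protocol: TCP",
        "<14>Feb 11 01:29:00 host01 some unknown format",
    ]
    for parser, events in split_stream(mixed).items():
        print(parser, len(events))
```

The order of the pattern list matters in this sketch: the first match wins, so more specific patterns should come first.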
The lsfilter microservice is responsible for symptom enrichment and correlation. After normalization, events pass through a binary tree of symptoms and are sent for correlation (or, in the case of the free RvSIEM, to lselastic).
The lselastic microservice is responsible for interacting with the Elasticsearch database. It manages the flow of data, keeps track of errors, and determines how events are saved to the database.
Because problems can occur during processing or writing to the database (microservice failure, lack of resources, reboot), each microservice has a buffer for saving events (mq, message queue) so that events are not lost. Each microservice has its own queue, but the flow into the queues is controlled by the lsinput producer. If the buffers in some microservices overflow, lsinput continues to receive incoming events and caches them to the hard disk, but does not pass them further down the microservice chain until the microservices recover and the queues shrink.
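The overflow behaviour described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the lsinput implementation: the class name, the single in-memory downstream queue and the plain-text spill file are all hypothetical.

```python
import os
import tempfile
from collections import deque

class BufferedInput:
    """Always accepts events; spills to disk while the downstream queue is full."""

    def __init__(self, downstream_limit, spill_path):
        self.downstream = deque()          # stands in for the next microservice's queue
        self.downstream_limit = downstream_limit
        self.spill_path = spill_path       # on-disk cache used during overflow

    def _spilled(self):
        return os.path.exists(self.spill_path) and os.path.getsize(self.spill_path) > 0

    def receive(self, event):
        # If anything is already on disk, new events go to disk too, to keep order.
        if self._spilled() or len(self.downstream) >= self.downstream_limit:
            with open(self.spill_path, "a", encoding="utf-8") as f:
                f.write(event + "\n")
        else:
            self.downstream.append(event)

    def drain(self):
        """Move cached events downstream once the queue has room again."""
        if not self._spilled():
            return
        with open(self.spill_path, encoding="utf-8") as f:
            pending = f.read().splitlines()
        while pending and len(self.downstream) < self.downstream_limit:
            self.downstream.append(pending.pop(0))
        with open(self.spill_path, "w", encoding="utf-8") as f:
            f.writelines(line + "\n" for line in pending)

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "spill.log")
    inp = BufferedInput(downstream_limit=2, spill_path=path)
    for e in ["e1", "e2", "e3", "e4"]:
        inp.receive(e)
    print(list(inp.downstream))   # e3 and e4 are on disk, not lost
```

The sketch accepts every event and only delays forwarding, which mirrors the behaviour described in the text: intake never stops, delivery waits for the queue to shrink.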
System and user entities
Configuration files are either system or user files. Changes made to system configuration files are overwritten on update. User files let you create your own parsers and sources and override system parsers.
Configuration file format
The syntax of the configuration files is similar to that of logstash, but it has significant differences and is not compatible with it.
Each configuration file of any microservice (lsinput, frs_server, lsfilter, lselastic) has a structure that defines:
- where to get the event - input section
- what to do with it (handlers) - filter section
- where to send the event further - output section
The filter section may be empty when the file only routes events on to other routes.
A configuration file is processed sequentially. Changes to configuration files take effect after the microservice is restarted.
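The three-part structure can be modelled in a few lines of Python. This is only a conceptual sketch and does not reproduce the actual configuration syntax: the PipelineConfig class, the handler functions and the route names used as values are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Handler = Callable[[Event], Event]

@dataclass
class PipelineConfig:
    input: str                                            # where to get the event
    output: str                                           # where to send it next
    filter: List[Handler] = field(default_factory=list)   # handlers; may be empty

def set_type(event: Event) -> Event:
    return {**event, "type": "syslog"}

def tag_from_type(event: Event) -> Event:
    # Depends on set_type having run first, so handler order matters.
    return {**event, "tag": event.get("type", "unknown") + "-seen"}

def process(config: PipelineConfig, event: Event) -> Tuple[str, Event]:
    """Apply handlers in the order they are listed, then hand off to the output."""
    for handler in config.filter:
        event = handler(event)
    return config.output, event

if __name__ == "__main__":
    parsing = PipelineConfig(input="syslog", output="frs_server",
                             filter=[set_type, tag_from_type])
    routing_only = PipelineConfig(input="syslog", output="lselastic")
    print(process(parsing, {"message": "raw"}))
    print(process(routing_only, {"message": "raw"}))
```

A configuration with an empty filter list passes the event through unchanged, which corresponds to a file that only routes events.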
How to add your parser
As mentioned above, the parser format is simple and similar to that of logstash (but not compatible with it). Parsers are configured in Settings -> Parsers. In this section you can view the system parsers, create your own and test them.
You can test a parser both on a real stream (by selecting the appropriate transport) and on a sample event.
For the parser to take effect after testing, check the "Automatically load" checkbox on the Parsers page and enable the parser in the Settings -> Sources section, defining the flow parameters. After that, your events will be available along with the rest.
If the event format is not recognized or there is no parser for the source, the events should be accessible by searching for type: "unparsed" and are stored in separate database indexes.
Parsers and sources can also be managed remotely. As in other sections, you can select the desired remote server in the upper right corner and perform all the necessary operations on another node from the same interface.
More detailed instructions on parsers, operators and functions are available in the "Source Connection Guide".