Product architecture and technology used

Product architecture

The figure above shows the architecture at a high level, without detail.

The RuSIEM kernel, web interface, backend and related components are developed entirely by our team. The following languages, frameworks and tools are used in development:

C++11 (STL, Boost), C# (.NET 3-4), Java, PHP (Bootstrap, Laravel), Bash, Python, PowerShell (WMI), Scala, Maven, JSON, XML, HTML, g++, gdb, gprof, CMake, binutils, Markdown.

RuSIEM uses the following components and databases:

  • PostgreSQL
  • Redis MQ
  • Apache Kafka
  • Elasticsearch
  • Neo4j
  • Yandex ClickHouse
  • Apache Storm

The product consists of:

  1. Server side
  2. Agent for Windows

The agent for Windows acts as an intermediary that uses native transports to collect events from sources other than syslog. A single agent can collect events locally and, at the same time, remotely from multiple sources using an agentless method. The communication channel between the agent and the server is encrypted. The agent has a built-in database that temporarily stores events while there is no connection to the server. The agent works only with RuSIEM / RvSIEM and is not compatible with other products.
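The temporary storage of events while the server is unreachable can be illustrated with a minimal store-and-forward sketch. This is not the agent's actual code: the class name EventBuffer, the use of SQLite and the send callback are assumptions made only for illustration.

```python
import json
import sqlite3


class EventBuffer:
    """Hypothetical store-and-forward buffer; not the RuSIEM agent's real code."""

    def __init__(self, path=":memory:"):
        # SQLite stands in for the agent's built-in database (assumption).
        self.db = sqlite3.connect(path)
        with self.db:
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS pending "
                "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
            )

    def store(self, event):
        # Persist the event first, so it survives a lost connection.
        with self.db:
            self.db.execute(
                "INSERT INTO pending (body) VALUES (?)", (json.dumps(event),)
            )

    def pending(self):
        # Number of events still waiting to be delivered.
        return self.db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]

    def flush(self, send):
        """Send buffered events oldest first; stop at the first failure."""
        sent = 0
        rows = self.db.execute(
            "SELECT id, body FROM pending ORDER BY id"
        ).fetchall()
        for row_id, body in rows:
            if not send(json.loads(body)):
                break  # server unreachable: keep this and all later events
            with self.db:
                self.db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            sent += 1
        return sent
```

In this sketch an event is deleted only after the send callback reports success, so a failed delivery leaves it in the buffer for the next attempt.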

Syslog sources send their events directly to the server.

More than 80% of the kernel (microservices) is written in C++. The analytics module is written mainly in Java.

Message queues (MQ) are used for communication between microservices, for exchanging data pools between them, and for transferring events without loss during event peaks or when any component restarts. MQ is used because the loss of even a single event can affect the detection of a threat, failure or incident.

Within a single server (node), events are transferred between microservices mainly over TCP, with RuSIEM MQ acting as the producer that controls the queues on all microservices. If the microservices cannot process events (or the node is at peak load), RuSIEM MQ buffers the input queue directly to disk.
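The idea of buffering an input queue to disk when consumers cannot keep up can be sketched as follows. This is only an illustration under stated assumptions, not RuSIEM MQ itself: the class SpillQueue, its memory_limit parameter and the line-per-event file format are hypothetical.

```python
import collections
import json
import os


class SpillQueue:
    """Hypothetical FIFO queue that overflows to disk; not RuSIEM MQ itself."""

    def __init__(self, memory_limit, spill_path):
        self.memory = collections.deque()
        self.memory_limit = memory_limit
        self.spill_path = spill_path
        self.spilled = 0      # events currently waiting on disk
        self.read_offset = 0  # byte offset of the next spilled event

    def put(self, event):
        # Once anything is on disk, keep appending there to preserve order.
        if self.spilled or len(self.memory) >= self.memory_limit:
            with open(self.spill_path, "ab") as f:
                f.write(json.dumps(event).encode("utf-8") + b"\n")
            self.spilled += 1
        else:
            self.memory.append(event)

    def get(self):
        # In-memory events are always older than spilled ones.
        if self.memory:
            return self.memory.popleft()
        if self.spilled:
            with open(self.spill_path, "rb") as f:
                f.seek(self.read_offset)
                line = f.readline()
                self.read_offset = f.tell()
            self.spilled -= 1
            if self.spilled == 0:
                # Disk backlog drained: drop the file and start over.
                os.remove(self.spill_path)
                self.read_offset = 0
            return json.loads(line)
        return None  # queue is empty
```

New events keep going to disk until the backlog is fully drained, which is what keeps delivery order intact in this sketch.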

RuSIEM / RvSIEM nodes can be connected to each other using various methods:

  • Conditionally linked: other nodes are managed from a single interface.
  • Distributed correlation without transferring events between nodes.
  • Distributed search for events on selected nodes.
  • Transfer of all events, or of events selected by severity level or by filters, to other nodes.
  • Nodes without an event database that share the event-processing load.
  • Nodes that have separate components for load balancing, horizontal and vertical scaling.
  • Remote nodes with databases or database clusters.
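The option of transferring all events, or only those selected by severity level or by filters, can be pictured as a forwarding rule. The sketch below is hypothetical: the function make_forward_rule, the severity field name and the predicate list are assumptions, not the product's configuration format.

```python
def make_forward_rule(min_severity=None, predicates=()):
    """Build a function that decides whether an event goes to another node.

    Hypothetical illustration; field names are assumed, not taken from RuSIEM.
    """

    def should_forward(event):
        # With no severity threshold and no filters, every event is forwarded.
        if min_severity is not None and event.get("severity", 0) < min_severity:
            return False
        # Every additional filter must accept the event.
        return all(p(event) for p in predicates)

    return should_forward
```

For example, make_forward_rule(min_severity=7) would forward only events whose assumed severity field is 7 or higher, while make_forward_rule() forwards everything.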

Both complete nodes (with the full set of microservices and databases) and individual microservices can be scaled. For example, normalization can run on one node, correlation on another, and the event database on a third.

Nodes of the freely distributed RvSIEM can be connected to nodes of the commercial version of RuSIEM.

In both RuSIEM and RvSIEM, events are stored in the Elasticsearch database in JSON format. By default, each JSON document contains both the raw event and the normalized event. Saving raw events can be disabled. Only normalized events are passed to the analytics module; raw events are not. The analytics module has its own set of databases and storage formats.
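As a rough illustration of a stored event, the sketch below builds one JSON document holding both forms and extracts the part that would go to analytics. The field names raw and normalized, the helper names and the store_raw flag are assumptions; the actual document layout is not described here.

```python
import json


def build_document(raw_line, normalized, store_raw=True):
    """Return a JSON document for one event (hypothetical layout)."""
    doc = {"normalized": normalized}
    if store_raw:
        # Raw storage is on by default and can be switched off.
        doc["raw"] = raw_line
    return json.dumps(doc, sort_keys=True)


def analytics_payload(document):
    """Return only the normalized part, as passed on to analytics."""
    return json.loads(document)["normalized"]
```

Calling build_document with store_raw=False yields a document without the raw field, which mirrors the option to disable saving raw events.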

Database clusters scale flexibly. Because database nodes are not limited by the license, you can build, for example:

  • multi-node event storage cluster
  • cluster nodes with replication with the same data set for fault tolerance
  • cluster nodes with distributed data set to improve performance
  • separate nodes for recording and reading events with replication between them