Kafka 2.0 Queue Monitor Configuration Guide on the Elven Platform

The Kafka Monitor on the Elven Platform runs checks against your Kafka service. With it, you can configure continuous checks, set alerts, and define thresholds for automatic incident creation, so you are informed quickly about any irregularities in connectivity or service performance.

Apache Kafka is an open-source distributed streaming platform designed for real-time data stream processing and integration. It operates both as a pub/sub messaging system and a distributed log storage system, enabling the transport of large volumes of data with high resilience and low latency. Kafka organizes data into topics, which can be consumed by multiple clients simultaneously, ensuring horizontal scalability.

Its ability to handle millions of messages per second makes it ideal for applications such as real-time event analysis, data pipelines, system integrations, and microservices-based solutions.

Accessing Kafka 2.0 Monitoring

Navigate to the main menu and click on Services Hub.
In Queue, select the item Kafka 2.0.

Monitoring Configuration

Start by giving the resource you want to monitor a clear name (Resource Name) so it is easy to identify. Then set the Interval between checks and the Timeout for receiving a response.

Select where the monitoring agent is located by choosing the appropriate Checkpoint Cloud. If no environment exists, you can create one by clicking + Checkpoint. After this setup, in Form Type, you can choose between User and Password or Healthcheck URL.

If you select User and Password, enter the server address in Host, then fill in Port, Topic (the name of the Kafka topic you want to monitor or interact with), Consumer Group (the consumer group to be analyzed), and Lag Tolerance (the maximum consumption backlog, measured in unprocessed messages, tolerated before an alert is triggered or a system issue is identified).
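To illustrate how Lag Tolerance relates to the Topic and Consumer Group fields, here is a minimal sketch in Python. It assumes the conventional definition of consumer lag (the latest offset of each partition minus the offset the group has committed, summed over all partitions of the topic). This guide does not state how the Elven Platform computes lag internally, and the function names and sample offsets below are hypothetical.

```python
from typing import Dict


def total_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> int:
    """Sum per-partition lag: latest offset minus the group's committed offset."""
    lag = 0
    for partition, end in end_offsets.items():
        # Assumption: a partition with no committed offset is counted from offset 0.
        done = committed.get(partition, 0)
        # Guard against a committed offset that is ahead of a stale end offset.
        lag += max(end - done, 0)
    return lag


def exceeds_tolerance(lag: int, lag_tolerance: int) -> bool:
    """True when the backlog is larger than the configured Lag Tolerance."""
    return lag > lag_tolerance


# Hypothetical offsets for a topic with three partitions.
end_offsets = {0: 1500, 1: 1480, 2: 1510}
committed = {0: 1490, 1: 1400, 2: 1510}

lag = total_lag(end_offsets, committed)
print(lag, exceeds_tolerance(lag, lag_tolerance=50))
```

With these sample values, partition 1 accounts for most of the backlog, which is the kind of bottleneck a lag threshold is meant to surface.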

If you choose Healthcheck URL, simply enter the URL of the resource to be monitored. Remember that the Host and Healthcheck URL fields accept only URLs. If you need to use an IP address, it must be stored in a Secret to ensure security and organization of the information.
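Before filling in the form, you may want to check whether an address contains an IP literal and therefore belongs in a Secret. The following is a small sketch using only the Python standard library; the function name is hypothetical, the sample addresses are placeholders, and the platform's own validation rules may differ.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit


def host_is_ip_literal(url: str) -> bool:
    """Return True when the host part of the URL is an IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if host is None:
        # No host could be parsed (for example, the scheme is missing).
        return False
    try:
        ip_address(host)
    except ValueError:
        # The host is a name such as kafka.example.com, not an IP address.
        return False
    return True


print(host_is_ip_literal("https://kafka.example.com/health"))
print(host_is_ip_literal("https://10.0.0.5:8080/health"))
```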

If authentication is required in Kafka, check the Authentication option and fill in the fields according to the security mechanism implemented by the cluster. First, enter the Username and corresponding Password. These values are provided by the Kafka administrator and are used to authenticate the client.

Next, in the Mechanism field, select the authentication mechanism used by the Kafka broker. With PLAIN, the simplest method, only the username and password are required. With SCRAM, a more secure option, authentication is a challenge-response exchange based on salted, hashed passwords, so the password itself is not sent to the broker. With OAUTHBEARER, authentication uses OAuth 2.0 tokens, allowing secure integration with identity providers. With GSSAPI, authentication uses the Kerberos protocol, a robust solution widely adopted in enterprise environments.

Additionally, check whether Kafka is configured to use TLS or SASL. If the cluster encrypts communication, enable the TLS option. If the cluster authenticates clients through SASL, the framework that the mechanisms listed above (PLAIN, SCRAM, OAUTHBEARER, and GSSAPI) belong to, also check the corresponding box.
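For reference, Kafka clients usually express these two choices through the standard `security.protocol` setting, which takes one of four values: PLAINTEXT, SSL, SASL_PLAINTEXT, or SASL_SSL. The sketch below shows that mapping in Python. How the Elven Platform translates the TLS and SASL checkboxes internally is not documented here, so treat the function (a hypothetical name) as an illustration of the Kafka convention only. Note that Kafka's SCRAM mechanism comes in two variants, SCRAM-SHA-256 and SCRAM-SHA-512.

```python
from typing import Dict, Optional

# SASL mechanism names as Kafka clients spell them.
SASL_MECHANISMS = {"PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512", "OAUTHBEARER", "GSSAPI"}


def client_security_settings(
    tls: bool, sasl: bool, mechanism: Optional[str] = None
) -> Dict[str, str]:
    """Map the TLS and SASL choices to standard Kafka client settings."""
    if sasl:
        if mechanism not in SASL_MECHANISMS:
            raise ValueError(f"unsupported SASL mechanism: {mechanism!r}")
        protocol = "SASL_SSL" if tls else "SASL_PLAINTEXT"
        return {"security.protocol": protocol, "sasl.mechanism": mechanism}
    # Without SASL, only encryption (or its absence) is configured.
    return {"security.protocol": "SSL" if tls else "PLAINTEXT"}


print(client_security_settings(tls=True, sasl=True, mechanism="SCRAM-SHA-512"))
print(client_security_settings(tls=False, sasl=False))
```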

Configuration with User and Password

Configuration with User and Password and Authentication

Configuration with Healthcheck URL

Automatic Incident Opening

You can configure automatic incident opening to ensure a quick response to critical issues. To begin, define the incident severity, which lets you prioritize according to urgency. Next, set the Check Interval, which specifies in seconds how often the resource is checked, so that problems are detected as soon as they arise.

Additionally, select the team to be notified whenever an incident occurs and enable the "Enable to set up automatic incidents opening" option so that the configuration is active. With this setup, the platform opens incidents automatically, without manual intervention, which makes the response process faster and more efficient.

Maintenance Window

The Maintenance Window is a feature for managing planned maintenance periods in your application. During this interval, checks are temporarily paused, so no monitoring, alerts, or notifications are triggered while you perform updates or adjustments. Maintenance can then proceed without generating false alarms, and the expected downtime does not distort your performance reports.

For example, imagine you need to update the payment system of an e-commerce platform, making backend adjustments such as installing new security certificates. To do this, you can configure the Maintenance Window for a specific time, such as 12/13/2024, from 2:00 PM to 2:30 PM. During this period, the Elven Platform suspends checks, preventing the monitoring system from registering temporary failures or triggering false alerts. This way, you can make the necessary changes calmly, knowing that the monitoring system will not be impacted during maintenance.

This approach ensures that the update is carried out in an organized manner, without affecting the user experience or generating unwanted notifications.
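The rule described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the timestamps carry no time zone, and treating the end of the window as exclusive is an assumption, since this guide does not specify how the platform handles the boundary.

```python
from datetime import datetime


def in_maintenance_window(moment: datetime, start: datetime, end: datetime) -> bool:
    """True when a check scheduled at `moment` falls inside the window."""
    # Assumption: the start is inclusive and the end is exclusive.
    return start <= moment < end


# The window from the example: 12/13/2024, from 2:00 PM to 2:30 PM.
window_start = datetime(2024, 12, 13, 14, 0)
window_end = datetime(2024, 12, 13, 14, 30)

# A check scheduled for 2:10 PM would be skipped; one at 2:30 PM would run.
print(in_maintenance_window(datetime(2024, 12, 13, 14, 10), window_start, window_end))
print(in_maintenance_window(datetime(2024, 12, 13, 14, 30), window_start, window_end))
```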

Application Opening Hours

You can also rely on the Application Opening Hours feature, which allows you to configure your application's operating hours. This functionality is essential for customizing monitoring based on the periods when your application is actually active, avoiding alerts and notifications outside of business hours. This way, monitoring becomes more aligned with your business’s real needs, ensuring more accurate reports and efficient management.

For example, imagine your application operates only from Monday to Friday, between 9:00 AM and 6:00 PM. You can configure the Application Opening Hours to reflect this schedule by specifying the days and operating periods. With this setup, the Elven Platform automatically disables checks outside of these hours, preventing the logging of failures that don’t affect end users and avoiding unnecessary alerts.

This approach optimizes performance analysis, focusing only on relevant periods and providing a clearer view of your application's health during its operating hours.
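As an illustration of the Monday to Friday, 9:00 AM to 6:00 PM schedule described above, the following Python sketch decides whether a check should run at a given moment. The constant and function names are hypothetical, time zones are ignored, and the exclusive closing time is an assumption rather than documented platform behavior.

```python
from datetime import datetime, time

# datetime.weekday() numbers Monday as 0 and Sunday as 6.
OPEN_DAYS = {0, 1, 2, 3, 4}  # Monday to Friday
OPENS_AT = time(9, 0)
CLOSES_AT = time(18, 0)


def is_within_opening_hours(moment: datetime) -> bool:
    """True when the application is expected to be active at `moment`."""
    # Assumption: opening time is inclusive, closing time is exclusive.
    return moment.weekday() in OPEN_DAYS and OPENS_AT <= moment.time() < CLOSES_AT


# Friday 10:00 AM is inside the schedule; Saturday 10:00 AM is not.
print(is_within_opening_hours(datetime(2024, 12, 13, 10, 0)))
print(is_within_opening_hours(datetime(2024, 12, 14, 10, 0)))
```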

Glossary of Technical Terms

Kafka: An open-source event streaming platform designed to process, store, and transmit large volumes of data in real time. With support for publish-subscribe patterns, distributed queues, and durable storage, it offers high scalability, low latency, and high availability. Ideal for applications requiring continuous data processing, such as system integration, real-time analytics, and monitoring. Compatible with various programming languages, it is widely used in microservices architectures and modern data pipelines.

Interval: Time interval between automatic checks performed during monitoring.

Timeout: Maximum time allowed for monitoring to receive a response from the monitored resource before registering a failure.

Checkpoint Cloud: Location where the monitoring agent is hosted, which can be a preexisting environment or one created by the user.

Host: URL address of the monitored resource. If an IP address is needed, it must be stored in a Secret for enhanced security.

Port: Access port for Kafka.

Consumer Group: A group of consumers that share the load of processing messages from one or more Kafka topics. Ensures that each partition is processed by only one consumer within the group.

Lag Tolerance: Acceptable limit of message consumption delay, measured by the number of unprocessed messages. Used to identify bottlenecks and maintain system performance.

Secret: Resource used to store sensitive information, such as IP addresses or credentials, ensuring security and organization.

Username: Unique identifier configured in the Kafka broker to authenticate a client’s access to the cluster.

Password: Credential associated with the username, used for authentication and authorization in Kafka.

Mechanism: Authentication method used by Kafka, selected in the form. Examples include:

PLAIN: Simple method using username and password for authentication.
SCRAM (Salted Challenge Response Authentication Mechanism): More secure mechanism using hashed passwords for authentication.
OAUTHBEARER: Uses OAuth 2.0 tokens for authentication, allowing integration with identity providers.
GSSAPI (Generic Security Service Application Program Interface): Kerberos-based authentication, widely used in enterprise environments.

TLS (Transport Layer Security): Encryption protocol that protects communication between the Kafka client and server, ensuring data confidentiality.

SASL (Simple Authentication and Security Layer): Authentication layer supporting various mechanisms (such as PLAIN, SCRAM, OAUTHBEARER, and GSSAPI) for secure authentication in Kafka.

Kafka Broker: A component of a Kafka cluster responsible for storing and distributing messages between producers and consumers.

Kafka Cluster: A group of Kafka brokers working together to manage and process messages in a distributed and scalable manner.

Enable to set up automatic incidents opening: Option that, when enabled, activates automatic incident creation upon detection of critical issues.

Severity: Level of criticality assigned to an incident, allowing it to be prioritized according to urgency.

Check Interval: Time interval, in seconds, for performing continuous checks on the monitored resource.

Maintenance Window: Feature that temporarily pauses monitoring, alerts, and notifications during planned maintenance periods.

Application Opening Hours: Configuration that defines the operating hours of the application, aligning monitoring with active periods and avoiding alerts outside those hours.

Last updated 5 months ago