Real-Time Explorer Insights Management Guide on the Elven Platform

The Real-Time Explorer feature of the Elven Platform provides a modern, user-centered experience for monitoring system health and performance in real time. This tool allows teams to view continuously updated data, explore specific metrics, and analyze events as they occur. Using detailed charts, configurable filters, and precise indicators, the Real-Time Explorer facilitates the identification of anomalies, trends, and critical patterns, helping prioritize corrective actions with agility.

Accessing the Real-Time Explorer in the Insights Center

  • Navigate to the main menu and click on Insights.

  • In the submenu, select Real-Time Explorer.

Understanding the Metrics

The Real-Time Explorer lets you analyze key operational performance metrics in real time through three views. The Hits-Failures Pie shows the ratio of successful to failed operations, which helps you spot potential issues quickly. The Hits-Failures Bar Chart shows performance over time, so metrics can be tracked hour by hour and trends or spikes in failures are easier to detect. The Hits-Failures List presents detailed information about each request, including latency, operation type, and correlated events, to support root-cause analysis of failures.

With these tools, teams can make informed decisions, optimize performance, and align corrective actions with business goals. The sections below describe each view in more detail.

Hits-Failures Pie

The Hits-Failures Pie is a feature of the Elven Platform that provides a consolidated view of the ratio between successful and failed operations being monitored. This resource allows teams to track, in a visual and intuitive way, the system health in real time, quickly identifying potential issues and ensuring greater reliability and quality in application performance.

Example:

In an e-commerce payment API, it is possible to identify that, in the last 24 hours, 98.78% of requests were successful, while 1.22% resulted in failures. This enables a detailed analysis to understand whether the failures are related to infrastructure issues, such as high server latency, or to business logic errors in the application, such as payment rejections. With this information, teams can prioritize corrective actions and optimize the end-user experience.
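The percentages in a view like this come down to simple arithmetic over hit and failure counts. The following is a minimal sketch of that calculation, not the Elven Platform's implementation. The function name `hit_failure_ratio` and the sample counts (9,878 hits and 122 failures out of 10,000 requests) are assumptions chosen to reproduce the figures in the example above.

```python
def hit_failure_ratio(hits: int, failures: int) -> tuple[float, float]:
    """Return (hit %, failure %), each rounded to two decimals."""
    total = hits + failures
    if total == 0:
        # No operations recorded yet: report 0% for both slices.
        return 0.0, 0.0
    hit_pct = round(100 * hits / total, 2)
    failure_pct = round(100 * failures / total, 2)
    return hit_pct, failure_pct


# Hypothetical counts for a 24-hour window.
hit_pct, failure_pct = hit_failure_ratio(hits=9878, failures=122)
print(f"hits: {hit_pct}%  failures: {failure_pct}%")
```

The zero-total guard matters in practice: a newly monitored resource may have no requests in the selected window.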

Hits-Failures Bar Chart

The Hits-Failures Bar Chart is a feature of the Elven Platform that shows the distribution of successes and failures over time in a bar chart. This visualization allows teams to identify temporal patterns related to the performance of monitored operations, helping in the detection of anomalies and in the analysis of system behavior at different times of the day.

Example:

In a critical systems monitoring application, the chart may indicate that, between 12 PM and 2 PM on a specific day, failures increased significantly compared to the rest of the day. With this information, the team can investigate whether the issue was caused by a spike in user traffic, unplanned maintenance, or an integration problem with external systems. This targeted analysis enables a quick and efficient response to mitigate negative impacts and prevent recurrence.
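The hour-by-hour grouping behind a chart like this can be sketched as follows. This is an illustration under stated assumptions, not the platform's own logic. The event shape (a timestamp paired with a success flag), the function names, the sample data, and the spike rule (more than `factor` times the median hourly failure count) are all hypothetical.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def failures_per_hour(events):
    """Count failures per hour of day from (timestamp, succeeded) pairs."""
    counts = Counter()
    for timestamp, succeeded in events:
        if not succeeded:
            counts[timestamp.hour] += 1
    return counts


def spike_hours(counts, factor=2.0):
    """Return hours whose failure count exceeds `factor` times the median
    count across hours that had at least one failure (assumed rule)."""
    if not counts:
        return []
    threshold = factor * median(counts.values())
    return sorted(hour for hour, n in counts.items() if n > threshold)


# Hypothetical sample: a few failures in the morning and afternoon,
# many between 12 PM and 2 PM, plus successful requests at 10 AM.
events = (
    [(datetime(2024, 1, 15, 9, m), False) for m in range(2)]
    + [(datetime(2024, 1, 15, 12, m), False) for m in range(20)]
    + [(datetime(2024, 1, 15, 13, m), False) for m in range(18)]
    + [(datetime(2024, 1, 15, 16, m), False) for m in range(1)]
    + [(datetime(2024, 1, 15, 10, m), True) for m in range(30)]
)
counts = failures_per_hour(events)
print(spike_hours(counts, factor=1.5))  # -> [12, 13]
```

The median is used here rather than the mean so that the spike itself does not inflate the baseline it is compared against.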

Hits-Failures List

The Hits-Failures List is a feature of the Elven Platform that presents, in a detailed manner, a chronological log of monitored operations, categorized as hits (successes) or failures. This resource provides rich and structured information, such as the operation type, the timestamp, the resource name, the execution latency, and, in case of failures, the correlated events. This detailed list is essential for in-depth performance analysis and for identifying patterns or anomalies that may impact service quality.

Example:

In a front-end monitoring scenario, the Hits-Failures List may reveal that, within a 15-minute interval, all requests made to the Front resource were successful, with latencies ranging from 47 ms to 75 ms. This granular analysis allows teams to validate system stability and use the collected data to identify latency spikes and correlate them with possible external factors, such as high server load or concurrent operations. This way, it becomes possible to continuously optimize system performance and the end-user experience.
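To make the kind of analysis described above concrete, the sketch below summarizes a list of records for one resource. The `Record` fields mirror the attributes the list is described as showing (operation type, timestamp, resource name, latency, success or failure), but the field names, the `summarize` helper, and the sample values are assumptions for illustration and do not reflect an actual Elven Platform export format.

```python
from dataclasses import dataclass


@dataclass
class Record:
    timestamp: str      # ISO 8601 string (assumed format)
    resource: str       # resource name, e.g. "Front"
    operation: str      # operation type (hypothetical values)
    latency_ms: float   # execution latency in milliseconds
    succeeded: bool     # True for a hit, False for a failure


def summarize(records, resource):
    """Return counts and latency range for one resource, or None if absent."""
    selected = [r for r in records if r.resource == resource]
    if not selected:
        return None
    latencies = [r.latency_ms for r in selected]
    return {
        "total": len(selected),
        "failures": sum(1 for r in selected if not r.succeeded),
        "min_ms": min(latencies),
        "max_ms": max(latencies),
        "avg_ms": sum(latencies) / len(latencies),
    }


# Hypothetical records within a 15-minute interval.
records = [
    Record("2024-01-15T10:00:00Z", "Front", "GET", 47, True),
    Record("2024-01-15T10:05:00Z", "Front", "GET", 58, True),
    Record("2024-01-15T10:10:00Z", "Front", "GET", 75, True),
    Record("2024-01-15T10:14:00Z", "Front", "GET", 60, True),
    Record("2024-01-15T10:07:00Z", "Payments", "POST", 210, False),
]
summary = summarize(records, "Front")
print(summary)
```

For the sample data, the summary for `Front` reports four requests, no failures, and latencies between 47 ms and 75 ms, matching the scenario above.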

Glossary of Technical Terms

Insights Center: Central module of the Elven Platform that provides in-depth analysis of operational and business data, supporting strategic decision-making and performance improvement.

Real-Time Explorer: Feature of the Elven Platform that allows exploration of real-time performance metrics, displaying detailed indicators and interactive charts for analyzing events as they occur.

Hits-Failures Pie: Pie chart that displays the proportion between successes and failures in monitored operations, offering a quick view of system health and facilitating problem identification.

Hits-Failures Bar Chart: Bar chart that shows the distribution of successes and failures over time, enabling the identification of temporal patterns, such as failure spikes or improvement trends.

Hits-Failures List: Detailed list that displays chronological records of operations, classified as successes or failures. Provides the operation type, timestamp, resource name, execution latency, and, for failures, the correlated events.

Performance Metrics: Numerical indicators that measure system effectiveness and health, such as response time, failure rate, and latency, essential for evaluating and optimizing application performance.

Latency: Time the system takes to respond to a request. May be reported as a maximum, minimum, or average value, and is critical for assessing system efficiency and speed.

Success (Hit): Operation or request that was executed correctly and without failures. Represents the ideal system performance.

Failure: Operation or request that was not completed successfully. May indicate system issues, such as network failures, code errors, or infrastructure problems.

Correlated Events: Events that occur simultaneously or within a specific time frame, and can be analyzed to identify underlying causes of failures or performance patterns.

Anomalies: Unexpected behaviors or outlier patterns observed in performance metrics. May indicate issues that require investigation and corrective actions.

Trends: Patterns observed in metrics over time, such as an increase or decrease in failure rate, allowing for problem anticipation or improvement identification.

System Monitoring: Process of collecting and analyzing data about the performance of systems and applications, aiming to identify failures, optimize resources, and ensure continuous and efficient operation.
