SLI: Service Level Indicators

About

SLI (Service Level Indicators) are metrics that indicate how well a service is performing.

Types

Resources Metrics

See Counter - Resources Metrics

Process Metrics

Process (or work) metrics report on the system's internal health and performance (observability).

Example:

  • Web server (at time 2015-04-24 08:13:01 UTC)

Subtype       Description                                                    Value
Performance
  throughput  requests per second                                            312
  latency     90th percentile response time in seconds                       0.4
Exit status
  success     percentage of responses that are 2xx since last measurement    99.1
  error       percentage of responses that are 5xx since last measurement    0.1

  • Data store / Database (at time 2015-04-24 08:13:01 UTC)

Subtype       Description                                                            Value
Performance
  latency     90th percentile query time in seconds                                  0.02
  throughput  queries per second                                                     949
Exit status
  success     percentage of queries successfully executed since last measurement     100
  error       percentage of queries yielding exceptions since last measurement       0
  error       percentage of queries returning stale data since last measurement      4.2
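
As a rough illustration of how the web server figures above could be derived, the sketch below summarizes the raw request records of one measurement window into the same four indicators. The `Request` record, the `compute_slis` function, the dictionary keys and the sample data are all hypothetical names and values chosen for this example. The 90th percentile uses the nearest-rank method, which is only one of several common percentile definitions; a real monitoring system may compute it differently.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical record shape, assumed for this sketch)."""
    duration_s: float  # response time in seconds
    status: int        # HTTP status code


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def compute_slis(requests, window_s):
    """Summarize one measurement window into work-metric SLIs."""
    total = len(requests)
    if total == 0:
        raise ValueError("no requests observed in this window")
    ok = sum(1 for r in requests if 200 <= r.status < 300)
    failed = sum(1 for r in requests if 500 <= r.status < 600)
    return {
        "throughput_rps": total / window_s,                                  # requests per second
        "latency_p90_s": percentile([r.duration_s for r in requests], 90),   # seconds
        "success_pct": 100 * ok / total,                                     # share of 2xx responses
        "error_pct": 100 * failed / total,                                   # share of 5xx responses
    }


# Ten made-up requests observed over a 2-second window (illustrative data only).
durations = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
statuses = [200, 200, 200, 200, 200, 200, 200, 200, 404, 503]
sample = [Request(d, s) for d, s in zip(durations, statuses)]

slis = compute_slis(sample, window_s=2.0)
print(slis)
```

Note that the success and error percentages need not add up to 100: in this sample the 404 response counts as neither, which matches the table above, where 2xx and 5xx responses are reported separately.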
