Scalability

Describe load

Describe performance

Latency and Response time are different

Response time is what the client sees: it includes the time to process the request (the service time) plus network delays and queueing delays.

Latency is the duration that a request waits to be handled, i.e. the time it spends awaiting service.

Using percentile to measure the performance

Percentile: p50 is also known as the median. If the p50 response time is 200 ms, then half of the requests complete in less than 200 ms and the other half take longer. In production, p99 and p999 (the 99th and 99.9th percentiles) are commonly tracked as well. If a threshold is the p99 response time, then 99% of the requests are faster than that threshold and 1% are slower. High percentiles of response time are also known as tail latencies.
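As a rough illustration of the definition above, the sketch below computes percentiles from a list of response times using the nearest-rank method. The function name `percentile` and the sample values are made up for this example, and nearest-rank is only one of several common percentile definitions.

```python
import math

def percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that is >= p% of all samples.
    rank = max(math.ceil(p * len(ordered) / 100), 1)
    return ordered[rank - 1]

# Hypothetical response times in milliseconds; one request is an outlier.
response_times_ms = [120, 95, 200, 180, 1500, 210, 130, 160, 175, 190]

p50 = percentile(response_times_ms, 50)  # the median
p99 = percentile(response_times_ms, 99)  # dominated by the slowest request
print(p50, p99)
```

With only ten samples the p99 is simply the slowest request, which is why high percentiles need a large number of samples to be meaningful.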

SLOs and SLAs: It is expensive to fix the slowest 1% of requests, since the root cause of that slowness could be random. So most of the time, SLOs and SLAs define the expected performance, often in terms of percentiles.

Head-of-line blocking: A server can only process a small number of requests in parallel, so a small number of slow requests can hold up the requests queued behind them. Even if those subsequent requests are fast to process, the client sees a slow overall response time because of the time spent waiting.
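A minimal sketch of head-of-line blocking, assuming a single worker that serves requests strictly in arrival order and ignoring network delay. The function `simulate_fifo` and all timings are hypothetical. It also shows the difference between the time spent waiting (latency) and the response time.

```python
def simulate_fifo(arrivals_ms, service_ms):
    """Single-worker FIFO server: return (queueing_delay, response_time) per request."""
    results = []
    server_free_at = 0
    for arrival, service in zip(arrivals_ms, service_ms):
        start = max(arrival, server_free_at)  # wait until the worker is free
        finish = start + service
        results.append((start - arrival, finish - arrival))
        server_free_at = finish
    return results

# One slow request (1000 ms) arrives first, followed by three fast ones (10 ms each).
arrivals_ms = [0, 10, 20, 30]
service_ms = [1000, 10, 10, 10]

timings = simulate_fifo(arrivals_ms, service_ms)
for queueing_delay, response_time in timings:
    print(f"waited {queueing_delay} ms, response time {response_time} ms")
```

The three fast requests each need only 10 ms of processing, yet each one has a response time of 1000 ms because it waits 990 ms behind the slow request.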

Where to measure the performance?

Server side:

  • Measure the request processing time (service time)

Client side:

  • Because of head-of-line blocking, it is important to also measure response times on the client side.

  • The client needs to keep sending requests independently of the response time. (Do not wait for the previous request to complete before sending the next one.)
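The client-side points above can be sketched with `asyncio`: the client below paces its sends at a fixed interval and never waits for an earlier response before sending the next request. `fake_request` is a hypothetical stand-in for a real network call (request 0 is artificially slow), and the interval and request count are arbitrary.

```python
import asyncio
import time

async def fake_request(i):
    """Hypothetical stand-in for a real network call; request 0 is slow."""
    await asyncio.sleep(0.2 if i == 0 else 0.01)

async def timed_request(i, results):
    """Record the client-observed response time of request i, in seconds."""
    start = time.perf_counter()
    await fake_request(i)
    results[i] = time.perf_counter() - start

async def open_loop_client(n, interval_s):
    """Send n requests at a fixed interval without waiting for earlier ones."""
    results = {}
    tasks = []
    for i in range(n):
        tasks.append(asyncio.create_task(timed_request(i, results)))
        await asyncio.sleep(interval_s)  # pace the sends, not the responses
    await asyncio.gather(*tasks)
    return results

results = asyncio.run(open_loop_client(n=5, interval_s=0.01))
for i in sorted(results):
    print(f"request {i}: {results[i] * 1000:.0f} ms")
```

If the client instead waited for each response before sending the next request, the slow request would artificially shorten the queue on the server and the measurement would look better than reality.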

How to measure in practice

Keep a rolling window (sliding window) of the response times of requests over a recent period of time, and compute the percentiles over that window.

There are algorithms that can calculate a good approximation of percentiles at minimal CPU and memory cost, such as forward decay, t-digest, or HdrHistogram.
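A minimal sketch of the rolling-window idea, assuming the simplest possible approach: store every sample from the window and sort on demand. The class name `RollingWindow`, its parameters, and the injectable `clock` (used here only to make the example deterministic) are all hypothetical.

```python
import math
import time
from collections import deque

class RollingWindow:
    """Keep response times from the last `window_s` seconds and report percentiles."""

    def __init__(self, window_s=600, clock=time.monotonic):
        self.window_s = window_s
        self.clock = clock
        self.samples = deque()  # (timestamp, response_time_ms), oldest first

    def record(self, response_time_ms):
        self.samples.append((self.clock(), response_time_ms))
        self._evict()

    def _evict(self):
        # Drop samples that are older than the window.
        cutoff = self.clock() - self.window_s
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()

    def percentile(self, p):
        """Nearest-rank percentile over the current window, or None if empty."""
        self._evict()
        if not self.samples:
            return None
        ordered = sorted(value for _, value in self.samples)
        rank = max(math.ceil(p * len(ordered) / 100), 1)
        return ordered[rank - 1]

# Usage with a fake clock so the behaviour does not depend on real time.
now = [0.0]
window = RollingWindow(window_s=10, clock=lambda: now[0])
for value in (100, 200, 300):
    window.record(value)
now[0] = 11.0  # 11 seconds later the first three samples have expired
window.record(50)
median = window.percentile(50)
print(median)
```

Sorting the whole window on every query costs O(n log n) time and O(n) memory, which is acceptable for small windows but becomes expensive at high request rates.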


Last updated 5 years ago


GitHub - tdunning/t-digest: A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means
HdrHistogram