Kubernetes API Priority And Fairness
Kubernetes API Priority and Fairness (APF) allows requests to the Kubernetes API server to be classified, isolated, and queued in a fine-grained way.
The APF metrics can be monitored to determine how well the API servers are handling the workload. The metrics are intended to be interpreted by tools like Prometheus or VictoriaMetrics. This document will use them in their raw form.
The metrics covered by this document are of the counter type. A counter is only ever incremented, never decremented. When counters are sampled in raw form, their values will appear to bounce from one sample to the next, but on an idle system a given counter should reveal its current high value after it has appeared in 3-5 samples.
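One way to find a counter's current high value is to collect several raw samples and keep the largest value seen for each series. The sketch below assumes the samples arrive as Prometheus text-format lines on standard input and that kubectl can reach the cluster; max_per_series is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: keep the highest value seen for each metric series.
# Reads Prometheus text-format lines on stdin; comment and blank lines are skipped.
max_per_series() {
  awk '
    /^#/ || NF < 2 { next }               # skip HELP/TYPE comments and blank lines
    {
      value = $NF + 0                     # the last field is the sample value
      series = $0
      sub(/[ \t]+[^ \t]+$/, "", series)   # the text before the value names the series
      if (!(series in max) || value > max[series]) max[series] = value
    }
    END { for (s in max) printf "%s %.0f\n", s, max[s] }
  ' | sort
}

# Example use (assumes kubectl access; five samples, one second apart):
#   for i in 1 2 3 4 5; do
#     kubectl get --raw /metrics | grep '^apiserver_flowcontrol_rejected_requests_total'
#     sleep 1
#   done | max_per_series
```

Keeping the maximum is only meaningful for counters, since they never decrease.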
Concepts
Requests coming into the API server are classified by FlowSchemas and assigned to priority levels. The FlowSchema assigns the request to a flow and gives it a flow distinguisher. The flow distinguisher indicates the origin of the request: a user, service account, controller, namespace, or nothing. A priority level may take requests from multiple flows, and it attempts to give equal response time to each flow.
To view FlowSchemas and their assigned priority levels, list the FlowSchema objects, for example with: kubectl get flowschemas
FlowSchema sample output:
NAME                      PRIORITYLEVEL    MATCHINGPRECEDENCE  DISTINGUISHERMETHOD  AGE   MISSINGPL
[...]
system-leader-election    leader-election  100                 ByUser               112d  False
endpoint-controller       workload-high    150                 ByUser               112d  False
workload-leader-election  leader-election  200                 ByUser               112d  False
system-node-high          node-high        400                 ByUser               112d  False
system-nodes              system           500                 ByUser               112d  False
[...]
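Because several FlowSchemas can feed one priority level, it can help to filter the listing by its PRIORITYLEVEL column. A minimal sketch, assuming the listing has the column layout shown above (as printed by kubectl get flowschemas); flowschemas_for_level is a hypothetical helper name.

```shell
# Hypothetical helper: keep the header row plus the rows whose second column
# (PRIORITYLEVEL) equals the requested priority level.
flowschemas_for_level() {
  awk -v level="$1" 'NR == 1 || $2 == level'
}

# Example use (assumes kubectl access to the cluster):
#   kubectl get flowschemas | flowschemas_for_level leader-election
```

With the sample output above, filtering on leader-election would keep the system-leader-election and workload-leader-election rows.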
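To view priority levels, list the PriorityLevelConfiguration objects, for example with: kubectl get prioritylevelconfigurations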
Priority level sample output:
NAME             TYPE     NOMINALCONCURRENCYSHARES  QUEUES  HANDSIZE  QUEUELENGTHLIMIT  AGE
[...]
global-default   Limited  20                        128     6         50                112d
leader-election  Limited  10                        16      4         50                112d
node-high        Limited  40                        64      6         50                112d
system           Limited  30                        64      6         50                112d
workload-high    Limited  40                        128     6         50                112d
workload-low     Limited  100                       128     6         50                112d
[...]
Metric types
As noted earlier, the metrics are viewed in their raw form, and all of them are of the counter type. An individual counter must be sampled multiple times before its current high value can be clearly identified.
To view a counter's type:
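One way to do this, assuming the raw metrics are fetched with kubectl get --raw /metrics, is to keep only the HELP and TYPE comment lines for the metric of interest. metric_help_and_type is a hypothetical helper name used for illustration.

```shell
# Hypothetical helper: print the "# HELP" and "# TYPE" lines for one metric.
# Reads Prometheus text-format metrics on stdin; $1 is the exact metric name.
metric_help_and_type() {
  grep -E "^# (HELP|TYPE) $1 "
}

# Example use (assumes kubectl access to the cluster):
#   kubectl get --raw /metrics | metric_help_and_type apiserver_flowcontrol_rejected_requests_total
```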
The output will describe the counter and its type:
# HELP apiserver_flowcontrol_rejected_requests_total [BETA] Number of requests rejected by API Priority and Fairness subsystem
# TYPE apiserver_flowcontrol_rejected_requests_total counter
Examples
A quick way to get a summary of requests by priority level:
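One possible approach is sketched below. It assumes the summary is built from the apiserver_flowcontrol_dispatched_requests_total counter, which carries flow_schema and priority_level labels, and that the raw metrics come from kubectl get --raw /metrics. sum_by_priority_level is a hypothetical helper name, and the choice of metric is an assumption rather than something this document specifies.

```shell
# Hypothetical helper: total one counter per priority_level label.
# Reads Prometheus text-format metrics on stdin; $1 is the exact metric name.
sum_by_priority_level() {
  awk -v metric="$1" '
    index($0, metric "{") == 1 {           # only series of the requested metric
      level = $0
      if (!sub(/.*priority_level="/, "", level)) next
      sub(/".*/, "", level)                # trim everything after the label value
      total[level] += $NF                  # the last field is the counter value
    }
    END { for (l in total) printf "%s %.0f\n", l, total[l] }
  ' | sort
}

# Example use (assumes kubectl access to the cluster):
#   kubectl get --raw /metrics | sum_by_priority_level apiserver_flowcontrol_dispatched_requests_total
```

Because raw counters can bounce between samples, it may be worth repeating the command a few times before trusting the totals.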
From here one can drill down into the FlowSchemas that feed a given priority level to see which one is generating the traffic.
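The drill-down can be sketched as a per-FlowSchema breakdown of one counter within a single priority level. This assumes the same raw metrics source (kubectl get --raw /metrics) and a counter that carries both flow_schema and priority_level labels; flows_in_level is a hypothetical helper name.

```shell
# Hypothetical helper: list each flow_schema and its counter value for one
# priority level, busiest first.
# $1 is the exact metric name, $2 is the priority level name.
flows_in_level() {
  awk -v metric="$1" -v level="$2" '
    index($0, metric "{") == 1 && index($0, "priority_level=\"" level "\"") {
      fs = $0
      sub(/.*flow_schema="/, "", fs)       # drop everything before the label value
      sub(/".*/, "", fs)                   # drop everything after it
      printf "%s %.0f\n", fs, $NF
    }
  ' | sort -k2,2nr
}

# Example use (assumes kubectl access to the cluster):
#   kubectl get --raw /metrics | flows_in_level apiserver_flowcontrol_dispatched_requests_total workload-high
```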
View activity that uses the nnf-clientmount credentials:
View activity that uses the viewer user credential:
Resources
Kubernetes
A description of APF: API Priority and Fairness
Debugging guide: Flow Control
Other sources
An excellent, though dated, description of tunables: Kubernetes API and flow control: Managing request quantity and queuing procedure
Slide deck that gets into the algorithms: Kubernetes API Priority and Fairness