Description
Nomad version
Nomad v1.10.5
(We updated to 1.11.1 last week, so I no longer have the exact build number, and I have no system for testing this with 1.11.1 yet, but I saw nothing related to this topic in the changelogs.)
Operating system and Environment details
RHEL 8.10
Issue
When the Nomad process is restarted (e.g. after updating CONSUL_HTTP_TOKEN), Nomad loses track of the Consul tokens currently in use by running Envoy proxy containers.
Normally the Consul token used inside the Envoy proxies is the same one configured in the Nomad configuration. But after updating the token and restarting Nomad, the token in the Envoy proxies is still the old one.
This means the proxy containers stop working after a short time, because their Consul token has expired.
As a workaround we have to drain all applications from the Nomad hosts before updating the token and restarting the Nomad process, as sketched below.
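The workaround looks roughly like this (a sketch, assuming systemd manages the Nomad client; the node ID is a placeholder):

nomad node drain -enable -yes <node-id>    # move all allocations off the host
# update CONSUL_HTTP_TOKEN in /etc/nomad.d/nomad.env
systemctl restart nomad
nomad node drain -disable -yes <node-id>   # make the node eligible again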
Reproduction steps
- Use the job file below to create a job with a Consul Connect sidecar.
- Compare the tokens in use:
  - the token in "/secrets/.envoy_bootstrap.env" (proxy container)
  - the token in "/etc/nomad.d/nomad.env" (host)
  - the token in "/secrets/consul_token" (application container)
- Validate all tokens via "consul acl token read -self -token=$TOKEN".
The proxy container uses the same agent token as the Nomad host, while the application container receives a dynamic token via workload identity.
After updating CONSUL_HTTP_TOKEN on the host and restarting Nomad, the tokens on the host and in the application container are updated, but the proxy container still uses the old Consul token. The commands below sketch how we compare the tokens.
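A sketch of the comparison; the alloc ID and the sidecar task name are placeholders (the exact task name can be taken from "nomad alloc status <alloc-id>"):

# token configured on the host
grep CONSUL_HTTP_TOKEN /etc/nomad.d/nomad.env

# token used by the Envoy sidecar
nomad alloc exec -task connect-proxy-httpd-test-httpd <alloc-id> cat /secrets/.envoy_bootstrap.env

# workload-identity token in the application container
nomad alloc exec -task httpd <alloc-id> cat /secrets/consul_token

# validate a token against Consul
consul acl token read -self -token=$TOKEN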
Expected Result
After the token renewal, the Envoy proxy container should use the updated Consul token for its connections to the service mesh.
Actual Result
After the old Consul token expires (after 1h), the Envoy proxy loses its connection to the service mesh, and the Consul logs show "ACL not found" errors.
Unfortunately the Envoy proxy itself does not fail either, so the health checks do not trigger and the proxy is not restarted.
Job file (if appropriate)
job "httpd-test" {
group "httpd" {
network {
mode = "bridge"
port "http" {}
}
constraint {
attribute = "${attr.unique.hostname}"
value = "q10i22"
}
service {
provider = "consul"
#name = "playground-httpd"
port = "http"
identity {
aud = ["inf293.consul"]
ttl = "1h"
}
connect {
sidecar_service {}
}
}
task "httpd" {
driver = "docker"
consul {}
config {
image = "busybox:1.36"
command = "httpd"
args = ["-f", "-p", "${NOMAD_PORT_http}"]
ports = ["http"]
}
identity {
name = "consul_default"
aud = ["inf293.consul"]
ttl = "1h"
}
template {
data = <<EOF
Consul Services:
{{- range services}}
* {{.Name}}{{end}}
Consul KV for "httpd/config":
{{- range ls "httpd/config"}}
* {{.Key}}: {{.Value}}{{end}}
EOF
destination = "local/consul-info.txt"
}
}
}
}
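For reference, the job is submitted the usual way (the file name is a placeholder):

nomad job run httpd-test.nomad.hcl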
Nomad Server logs (if appropriate)
Nomad Client logs (if appropriate)