From e1d62ce450daaba624e5ff73fec4dbafd1af8b89 Mon Sep 17 00:00:00 2001
From: Konrad Bańka
Date: Mon, 25 Jan 2021 07:44:49 +0100
Subject: [COMMON][ETCD] Skip startup self-discovery for etcd nodes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The current etcd startup script checks whether all other expected nodes
are already running before it proceeds. This check, however, also covers
the local node, and because pod DNS discovery goes through the headless
service of the statefulset, the node's own name doesn't resolve
immediately. In some deployments the k8s DNS server may be laggy, so the
startup script fails to finish before the liveness check.

This patch fixes such failures for 1-pod etcd clusters and improves
startup time for clusters of any size.

Signed-off-by: Konrad Bańka
Issue-ID: OOM-2668
Change-Id: I2f9263a0f4964b0a495631775d0cbbceef25e85b
---
 kubernetes/common/etcd/templates/statefulset.yaml | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kubernetes/common/etcd/templates/statefulset.yaml b/kubernetes/common/etcd/templates/statefulset.yaml
index f5592bd252..e39b8c4ca2 100644
--- a/kubernetes/common/etcd/templates/statefulset.yaml
+++ b/kubernetes/common/etcd/templates/statefulset.yaml
@@ -133,6 +133,10 @@ spec:
           # we should wait for other pods to be up before trying to join
           # otherwise we got "no such host" errors when trying to resolve other members
           for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do
+              if [ "${SET_NAME}-${i}" == "${HOSTNAME}" ]; then
+                echo "Skipping self-checking"
+                continue
+              fi
               while true; do
                   echo "Waiting for ${SET_NAME}-${i}.${SERVICE_NAME} to come up"
                   ping -W 1 -c 1 ${SET_NAME}-${i}.${SERVICE_NAME} > /dev/null && break
-- 
cgit 1.2.3-korg
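
The self-skip logic described in the commit message can be tried outside the chart with a small standalone sketch. The function name `peers_to_wait_for` and the example values (`etcd`, `etcd-headless`, a cluster size of 3) are assumptions for illustration and do not appear in the patch. The sketch only lists the peers the loop would wait for and does not ping anything. It also uses the POSIX `=`/`!=` string comparison rather than the `==` used in the patch, on the assumption that the script might run under a shell whose `[` does not accept `==`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the peer hostnames the
# startup loop would wait for, skipping this pod's own name.
# Arguments: statefulset name, headless service name, cluster size, own hostname.
peers_to_wait_for() {
    set_name=$1
    service_name=$2
    cluster_size=$3
    self=$4
    i=0
    while [ "$i" -lt "$cluster_size" ]; do
        # Same comparison as in the patch, inverted: only emit names that
        # differ from our own pod name.
        if [ "${set_name}-${i}" != "$self" ]; then
            echo "${set_name}-${i}.${service_name}"
        fi
        i=$((i + 1))
    done
}

# Example values are assumptions for illustration only.
peers_to_wait_for etcd etcd-headless 3 etcd-1
```

With these example values the pod `etcd-1` would wait only for `etcd-0.etcd-headless` and `etcd-2.etcd-headless`. For a cluster size of 1 the list is empty, which matches the single-pod case the commit message says the patch fixes.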