Kubernetes Cluster Deployment using Kubespray

Raman Pandey
9 min read · Apr 24, 2019

Here we will walk through a Kubernetes cluster deployment on virtual machines. We need at least two VMs: one VM will act as the master node and the other as the worker node.

Before starting, make sure IP forwarding is enabled on every node, i.e. the value of /proc/sys/net/ipv4/ip_forward should be 1.
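
A quick way to check and enable it on each node (note that a value set with sysctl -w does not persist across reboots):
cat /proc/sys/net/ipv4/ip_forward    # should print 1
sysctl -w net.ipv4.ip_forward=1      # enable IP forwarding for the running kernel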

We will use Kubespray (https://github.com/kubernetes-sigs/kubespray) to set up the cluster on these two nodes.

The ansible and jinja2 Python packages must be installed, along with a recent Python version, for Kubespray to work.
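
For example, you can clone the repository and install these dependencies from the requirements.txt file that ships with Kubespray (exact package versions depend on the release you check out):
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray
pip install -r requirements.txt    # installs ansible, jinja2, netaddr and the other required Python packages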

We will use Flannel as the overlay network and MetalLB as the load balancer.

Kubespray is a set of Ansible roles wired together in cluster.yml. These roles consist of several plays that install the core components required for the Kubernetes control plane, plus the other system-level components needed on both nodes to set up the cluster.
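
If you want to preview what the playbook will do, Ansible can list the tags and tasks defined in cluster.yml; for example, once the inventory described below has been created:
root@master:~/kubespray# ansible-playbook -i inventory/mycluster/inventory.ini cluster.yml --list-tags
root@master:~/kubespray# ansible-playbook -i inventory/mycluster/inventory.ini cluster.yml --list-tasks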

You must have root access on every node for Kubespray to work, and public key SSH login must be enabled on all nodes. Use the link below to enable public key SSH login, and make sure each node can access every other node using its public key.
https://www.thegeekstuff.com/2008/11/3-steps-to-perform-ssh-login-without-password-using-ssh-keygen-ssh-copy-id
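
In short, the key-based login setup looks like this (adjust the user and host placeholders to your environment):
ssh-keygen -t rsa                     # generate a key pair, accepting the defaults
ssh-copy-id <ssh-user>@<node-1-ip>    # copy the public key to every node, including the local one
ssh-copy-id <ssh-user>@<node-2-ip>
ssh <ssh-user>@<node-2-ip> hostname   # verify that login works without a password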

Now execute the following command to create a cluster-specific copy of the sample inventory.
cp -r kubespray/inventory/sample kubespray/inventory/<cluster-name>
For example, if your cluster name is mycluster, execute:
cp -r kubespray/inventory/sample kubespray/inventory/mycluster

Now navigate to inventory/mycluster and update the inventory.ini file as follows:
# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
# ## We should set etcd_member_name for the etcd cluster. Nodes that are not etcd members do not need to set the value, or can set it to an empty string.
[all]
master ansible_host=<node-1-ip> ip=<node-1-ip> ansible_user=<ssh-user> ansible_sudo=yes
worker ansible_host=<node-2-ip> ip=<node-2-ip> ansible_user=<ssh-user> ansible_sudo=yes

[kube-master]
master

[etcd]
master

[kube-node]
worker

[k8s-cluster:children]
kube-node
kube-master

[calico-rr]
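
Before going further, you can verify that Ansible can reach both nodes using this inventory (run from the kubespray directory):
root@master:~/kubespray# ansible -i inventory/mycluster/inventory.ini all -m ping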

Now navigate to inventory/mycluster/group_vars/all/all.yml and change the following values as per your environment:
root@master:~/kubespray# cat inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd

## Directory where the binaries will be installed
#bin_dir: /usr/local/bin

bin_dir: /opt/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful in AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
#access_ip: 1.1.1.1
bootstrap_os: ubuntu

## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
loadbalancer_apiserver:
  address: <master_node_ip>
  port: 6443

## Internal loadbalancers for apiservers
loadbalancer_apiserver_localhost: true
loadbalancer_apiserver_type: haproxy

## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443

## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081

### OTHER OPTIONAL VARIABLES
## For some things, kubelet needs to load kernel modules. For example, dynamic kernel services are needed
## for mounting persistent volumes into containers. These may not be loaded by preinstall kubernetes
## processes. For example, ceph and rbd backed volumes. Set to true to allow kubelet to load kernel
## modules.
# kubelet_load_modules: false

## Upstream dns servers
upstream_dns_servers:
- 8.8.8.8
- 8.8.4.4

docker_dns_servers_strict: false

kubeadm_enabled: true
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
## Note: The 'external' cloud provider is not supported.
## TODO(riverzhang): https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/#running-cloud-controller-manager
# cloud_provider:

## Set these proxy values in order to update package manager and docker daemon to use proxies
HTTP_PROXY: "http://<your_proxy_ip>:<your_proxy_port>"
#http_proxy: "http://<your_proxy_ip>:<your_proxy_port>/"
#https_proxy: "http://<your_proxy_ip>:<your_proxy_port>/"

## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
no_proxy: "<master_node_ip>,<worker_node_ip>,10.96.0.0/12,localhost,127.0.0.1,192.168.0.0/16"

## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False

## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
additional_no_proxy: "<master_node_ip>,<worker_node_ip>,10.96.0.0/12,localhost,127.0.0.1,192.168.0.0/16"

## Certificate Management
## This setting determines whether certs are generated via scripts.
## Choose 'none' if you provide your own certificates.
## Options are "script" or "none".
## note: vault is removed
cert_management: script

## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false

## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
kube_read_only_port: 10255

## Set true to download and cache container
download_container: true

## Deploy container engine
# Set false if you want to deploy container engine manually.
deploy_container_engine: true

## Set Pypi repo and cert accordingly
# pyrepo_index: https://pypi.example.com/simple
# pyrepo_cert: /etc/ssl/certs/ca-certificates.crt
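
Since all.yml is plain YAML, a quick syntax check after editing helps catch indentation mistakes (PyYAML is available once the Kubespray Python requirements are installed):
root@master:~/kubespray# python -c "import yaml; yaml.safe_load(open('inventory/mycluster/group_vars/all/all.yml'))"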

Now navigate to inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml and change the following values as per your environment:
root@master:~/kubespray# cat inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
---
# Kubernetes configuration dirs and system namespace.
# Those are where all the additional config stuff goes
# the kubernetes normally puts in /srv/kubernetes.
# This puts them in a sane location and namespace.
# Editing those values will almost surely break something.
kube_config_dir: /etc/kubernetes
kube_script_dir: "{{ bin_dir }}/kubernetes-scripts"
kube_manifest_dir: "{{ kube_config_dir }}/manifests"

# This is where all the cert scripts and certs will be located
kube_cert_dir: "{{ kube_config_dir }}/ssl"

# This is where all of the bearer tokens will be stored
kube_token_dir: "{{ kube_config_dir }}/tokens"

# This is where to save basic auth file
kube_users_dir: "{{ kube_config_dir }}/users"

kube_api_anonymous_auth: true

## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.13.5

# kubernetes image repo define
kube_image_repo: "gcr.io/google-containers"

# Where the binaries will be downloaded.
# Note: ensure that you've enough disk space (about 1G)
local_release_dir: "/tmp/releases"
# Random shifts for retrying failed ops like pushing/downloading
retry_stagger: 5

# This is the group that the cert creation scripts chgrp the
# cert files to. Not really changeable…
kube_cert_group: kube-cert

# Cluster Loglevel configuration
kube_log_level: 2

# Directory where credentials will be stored
credentials_dir: "{{ inventory_dir }}/credentials"

# Users to create for basic auth in Kubernetes API via HTTP
# Optionally add groups for user
kube_api_pwd: "{{ lookup('password', credentials_dir + '/kube_user.creds length=15 chars=ascii_letters,digits') }}"
kube_users:
  kube:
    pass: "{{kube_api_pwd}}"
    role: admin
    groups:
      - system:masters

## It is possible to activate / deactivate selected authentication methods (basic auth, static token auth)
# kube_oidc_auth: false
# kube_basic_auth: false
# kube_token_auth: false

## Variables for OpenID Connect Configuration https://kubernetes.io/docs/admin/authentication/
## To use OpenID you have to deploy additional an OpenID Provider (e.g Dex, Keycloak, …)

# kube_oidc_url: https:// …
# kube_oidc_client_id: kubernetes
## Optional settings for OIDC
# kube_oidc_ca_file: “{{ kube_cert_dir }}/ca.pem”
# kube_oidc_username_claim: sub
# kube_oidc_username_prefix: oidc:
# kube_oidc_groups_claim: groups
# kube_oidc_groups_prefix: oidc:

# Choose network plugin (cilium, calico, contiv, weave or flannel. Use cni for generic cni plugin)
# Can also be set to ‘cloud’, which lets the cloud provider setup appropriate routing
kube_network_plugin: flannel

# Setting multi_networking to true will install Multus: https://github.com/intel/multus-cni
kube_network_plugin_multus: false

# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18

# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18

# internal network node size allocation (optional). This is the size allocated
# to each node on your network. With these defaults you should have
# room for 4096 nodes with 254 pods per node.
kube_network_node_prefix: 24

# The port the API Server will be listening on.
kube_apiserver_ip: "{{ kube_service_addresses|ipaddr('net')|ipaddr(1)|ipaddr('address') }}"
kube_apiserver_port: 6443 # (https)
#kube_apiserver_insecure_port: 8080 # (http)
# Set to 0 to disable insecure port — Requires RBAC in authorization_modes and kube_api_anonymous_auth: true
kube_apiserver_insecure_port: 0 # (disabled)

# Kube-proxy proxyMode configuration.
# Can be ipvs, iptables
kube_proxy_mode: iptables

# A string slice of values which specify the addresses to use for NodePorts.
# Values may be valid IP blocks (e.g. 1.2.3.0/24, 1.2.3.4/32).
# The default empty string slice ([]) means to use all local addresses.
# kube_proxy_nodeport_addresses_cidr is retained for legacy config
kube_proxy_nodeport_addresses: >-
  {%- if kube_proxy_nodeport_addresses_cidr is defined -%}
  [{{ kube_proxy_nodeport_addresses_cidr }}]
  {%- else -%}
  []
  {%- endif -%}

# If non-empty, will use this string as identification instead of the actual hostname
# kube_override_hostname: >-
# {%- if cloud_provider is defined and cloud_provider in [ ‘aws’ ] -%}
# {%- else -%}
# {{ inventory_hostname }}
# {%- endif -%}

## Encrypting Secret Data at Rest (experimental)
kube_encrypt_secret_data: false

# DNS configuration.
# Kubernetes cluster name, also will be used as DNS domain
cluster_name: cluster.local
# Subdomains of DNS domain to be resolved via /etc/resolv.conf for hostnet pods
ndots: 2
# Can be coredns, coredns_dual, manual or none
dns_mode: coredns
# Set manual server if using a custom cluster DNS server
# manual_dns_server: 10.x.x.x
# Enable nodelocal dns cache
enable_nodelocaldns: false
#nodelocaldns_ip: 169.254.25.10

# Can be docker_dns, host_resolvconf or none
resolvconf_mode: docker_dns
# Deploy netchecker app to verify DNS resolve as an HTTP service
deploy_netchecker: false
# Ip address of the kubernetes skydns service
skydns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(3)|ipaddr('address') }}"
skydns_server_secondary: "{{ kube_service_addresses|ipaddr('net')|ipaddr(4)|ipaddr('address') }}"
dns_domain: "{{ cluster_name }}"

## Container runtime
## docker for docker and crio for cri-o.
container_manager: docker

## Settings for containerized control plane (etcd/kubelet/secrets)
etcd_deployment_type: docker
kubelet_deployment_type: host
helm_deployment_type: host

# K8s image pull policy (imagePullPolicy)
k8s_image_pull_policy: IfNotPresent

# audit log for kubernetes
kubernetes_audit: true

# dynamic kubelet configuration
dynamic_kubelet_configuration: false

# define kubelet config dir for dynamic kubelet
# kubelet_config_dir:
default_kubelet_config_dir: "{{ kube_config_dir }}/dynamic_kubelet_dir"
dynamic_kubelet_configuration_dir: "{{ kubelet_config_dir | default(default_kubelet_config_dir) }}"

# pod security policy (RBAC must be enabled either by having ‘RBAC’ in authorization_modes or kubeadm enabled)
podsecuritypolicy_enabled: false

# Make a copy of kubeconfig on the host that runs Ansible in {{ inventory_dir }}/artifacts
kubeconfig_localhost: true
# Download kubectl onto the host that runs Ansible in {{ bin_dir }}
kubectl_localhost: true

# Enable creation of QoS cgroup hierarchy, if true top level QoS and pod cgroups are created. (default true)
# kubelet_cgroups_per_qos: true

# A comma separated list of levels of node allocatable enforcement to be enforced by kubelet.
# Acceptable options are ‘pods’, ‘system-reserved’, ‘kube-reserved’ and ‘’. Default is “”.
# kubelet_enforce_node_allocatable: pods

## Supplementary addresses that can be added in kubernetes ssl keys.
## That can be useful for example to setup a keepalived virtual IP
# supplementary_addresses_in_ssl_keys: [10.0.0.1, 10.0.0.2, 10.0.0.3]

## Running on top of openstack vms with cinder enabled may lead to unschedulable pods due to NoVolumeZoneConflict restriction in kube-scheduler.
## See https://github.com/kubernetes-sigs/kubespray/issues/2141
## Set this variable to true to get rid of this issue
volume_cross_zone_attachment: false
# Add Persistent Volumes Storage Class for corresponding cloud provider ( OpenStack is only supported now )
persistent_volumes_enabled: false

## Container Engine Acceleration
## Enable container acceleration feature, for example use gpu acceleration in containers
# nvidia_accelerator_enabled: true
## Nvidia GPU driver install. Install will by done by a (init) pod running as a daemonset.
## Important: if you use Ubuntu then you should set in all.yml ‘docker_storage_options: -s overlay2’
## Array with nvidia_gpu_nodes, leave empty or comment if you don't want to install drivers.
## Labels and taints won't be set to nodes if they are not in the array.
# nvidia_gpu_nodes:
#   - kube-gpu-001
# nvidia_driver_version: "384.111"
## flavor can be tesla or gtx
# nvidia_gpu_flavor: gtx
## NVIDIA driver installer images. Change them if you have trouble accessing gcr.io.
# nvidia_driver_install_centos_container: atzedevries/nvidia-centos-driver-installer:2
# nvidia_driver_install_ubuntu_container: gcr.io/google-containers/ubuntu-nvidia-driver-installer@sha256:7df76a0f0a17294e86f691c81de6bbb7c04a1b4b3d4ea4e7e2cccdc42e1f6d63
## NVIDIA GPU device plugin image.
# nvidia_gpu_device_plugin_container: “k8s.gcr.io/nvidia-gpu-device-plugin@sha256:0842734032018be107fa2490c98156992911e3e1f2a21e059ff0105b07dd8e9e”
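
For reference, with the default kube_service_addresses of 10.233.0.0/18 the ipaddr expressions above resolve to 10.233.0.1 for kube_apiserver_ip and 10.233.0.3/10.233.0.4 for the DNS service addresses. If you change the service CIDR, you can preview the result with an ad-hoc debug call (assuming the netaddr package from Kubespray's requirements.txt is installed):
root@master:~/kubespray# ansible localhost -m debug -a "msg={{ '10.233.0.0/18' | ipaddr('net') | ipaddr(1) | ipaddr('address') }}"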

Now navigate to inventory/mycluster/group_vars/k8s-cluster/k8s-net-flannel.yml and change the following values as per your environment:
root@master:~/kubespray# cat inventory/mycluster/group_vars/k8s-cluster/k8s-net-flannel.yml
# see roles/network_plugin/flannel/defaults/main.yml

## interface that should be used for flannel operations
## This is actually an inventory cluster-level item
# flannel_interface:

## Select interface that should be used for flannel operations by regexp on Name or IP
## This is actually an inventory cluster-level item
## example: select interface with ip from net 10.0.0.0/23
## single quote and escape backslashes
#flannel_interface_regexp: '10\\.0\\.[0-2]\\.\\d{1,3}'

# You can choose what type of flannel backend to use: 'vxlan' or 'host-gw'
# for experimental backend
# please refer to flannel's docs: https://github.com/coreos/flannel/blob/master/README.md
flannel_backend_type: "vxlan"
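
Since the vxlan backend relies on kernel vxlan support, a quick sanity check on each node before deploying does not hurt (on most distributions vxlan is a loadable module; if it is built into the kernel, lsmod will not list it):
modprobe vxlan && lsmod | grep vxlan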

Now navigate to the kubespray directory and execute the following command to run the Ansible plays that set up the Kubernetes control plane, Docker, Flannel, Helm and the other required components:
root@master:~/kubespray# ansible-playbook --flush-cache -i inventory/mycluster/inventory.ini cluster.yml --ask-pass --become --ask-become-pass
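
Once the playbook finishes, the kubeconfig_localhost and kubectl_localhost options enabled earlier place an admin kubeconfig under inventory/mycluster/artifacts and a kubectl binary under the configured bin_dir, so the cluster can be verified from the Ansible host (the kubeconfig file is typically named admin.conf):
root@master:~/kubespray# /opt/bin/kubectl --kubeconfig inventory/mycluster/artifacts/admin.conf get nodes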
