Preface

OpenShift was first launched in 2011 and relied on Linux containers to deploy and run user applications. In 2014, Kubernetes was open-sourced by Google, pitched as "a system for automating deployment, scaling, and operations of application containers."

The release of OpenShift V3 was quite substantial. OpenShift began using containers and images, and V3 introduced Kubernetes to orchestrate those images.

Red Hat proved to be at the forefront of container technology, second only to Google in contributions to Cloud Native Computing Foundation (CNCF) projects. Moreover, Red Hat acquired CoreOS in January 2018. The CoreOS flagship product was a lightweight Linux operating system designed to run containerized applications, which Red Hat made available in OpenShift V4 as "Red Hat Enterprise Linux CoreOS".

Red Hat OpenShift is one of those technologies causing a lot of noise and demand for skills in the Information Technology industry. Still relatively young, it is under massive development and evolving faster than professionals can keep up with.

This pragmatic OpenShift guide aims to provide a hands-on approach to deploying and configuring OpenShift 4.6.

OpenShift 4.6

Introduction

This document aims to remove the ambiguity sometimes found in the official documentation and, using clear examples, demonstrate how to deploy an OpenShift cluster and complete the most common post-installation configuration tasks.

The primary purpose of the information detailed is for learning and building a personal lab environment. Therefore, the scenarios are not intended for production use. However, it is the know-how that counts.

OpenShift is a rapidly moving target, with minor releases often incrementing weekly, so this document focuses on OpenShift 4.6.

A challenge for a would-be OpenShift administrator is access to the technology. Minimum requirements are enormous, and let us remember that OpenShift (based on Kubernetes) is a cloud-native platform first. This was evident when OpenShift 4.1 was first released: initially, it only supported Amazon Web Services (AWS) using the Installer Provisioned Infrastructure (IPI) method.

Today, OpenShift supports AWS, Azure, GCP, IBM, OpenStack, RHV, vSphere and bare metal. All of these have their nuances, and for a home lab, most are too costly for learning and experimenting.

Bare metal allows us to provision the infrastructure ourselves, and the User Provisioned Infrastructure (UPI) installation enables customisations. The process of doing a UPI bare-metal installation is far more involved than, say, an AWS IPI. However, the knowledge gained is invaluable, and the result is a local cluster, albeit a minimal three-node cluster.

INFRASTRUCTURE

Infrastructure is the compute resources that software platforms are deployed onto. Traditionally, and as demonstrated in this documentation, that can be physical hardware. It might mean virtualisation using hypervisors such as VMware, Red Hat Virtualisation or Hyper-V. Most commonly, it will likely mean cloud infrastructure. Cloud infrastructure builds on virtualisation, providing on-demand abstracted resources such as hardware, storage, and network resources.

In reality, cloud infrastructure resources are costly and move capital expenditure (CapEx) to operational expenditure (OpEx). For a lab environment, a monthly bill using cloud resources would equate to the outright purchase of sufficient hardware to keep forever. For production enterprise environments, the resilience, scalability and flexibility of OpEx infrastructure are highly appealing and make sense. However, for learning, experimentation and keeping costs down, the long-term purchase of hardware is appealing.

This guide demonstrates using Intel® NUCs for master nodes, which are affordable and compact with low power consumption. Furthermore, a Raspberry Pi is used for the core utility services, including DNS, load balancing and an Apache web server.

Architecture overview

The following diagram is a high-level overview of the lab environment deployed in this document. It depicts both the physical hosts and virtual hosts that make up a hybrid cluster. The virtual hosts include the temporary bootstrap host, only needed during the initial deployment of the three master nodes.

Three master nodes make up a minimal cluster. The nodes will play the role of "master", "worker" and "infra" nodes.

Further scaling of the cluster is optional and done using virtual machines (VM) with network bridging. Using VMs provides flexibility where resources are limited. Temporarily adding and removing worker and infrastructure nodes is excellent for trying various activities while keeping the three core physical master nodes permanently.

Architecture Overview

The following table includes the details of the environment used throughout this document:

Table 1. Host Details
DNS Name                    IP Address      Description
client.local                192.168.0.35    RHEL8/Fedora32 client laptop
utilities.cluster.lab.com   192.168.0.101   Raspberry Pi 3 Model B, 4 core, 1GB RAM, 8GB storage
bootstrap.cluster.lab.com   192.168.0.102   KVM VM, 4 core, 16GB RAM, 120GB storage
master1.cluster.lab.com     192.168.0.111   Intel NUC i5, 4 core, 16GB RAM, 120GB storage
master2.cluster.lab.com     192.168.0.112   Intel NUC i5, 4 core, 16GB RAM, 120GB storage
master3.cluster.lab.com     192.168.0.113   Intel NUC i5, 4 core, 16GB RAM, 120GB storage

Table 2. Worker and Infra node examples
DNS Name                    IP Address      Description
worker1.cluster.lab.com     192.168.0.121   KVM VM, 4 core, 16GB RAM, 120GB storage
worker2.cluster.lab.com     192.168.0.122   KVM VM, 4 core, 16GB RAM, 120GB storage
worker3.cluster.lab.com     192.168.0.123   KVM VM, 4 core, 16GB RAM, 120GB storage
infra1.cluster.lab.com      192.168.0.131   KVM VM, 4 core, 16GB RAM, 120GB storage
infra2.cluster.lab.com      192.168.0.132   KVM VM, 4 core, 16GB RAM, 120GB storage
infra3.cluster.lab.com      192.168.0.133   KVM VM, 4 core, 16GB RAM, 120GB storage

PREREQUISITES

Getting the prerequisites right is the essential part of deploying OpenShift. Mistakes with DNS, load balancing or networking in general will only lead to problems with the deployment of the cluster. Troubleshooting OpenShift deployments is notoriously challenging and often misleading as to the real root cause of issues.

With OpenShift 4, begin with a minimal working deployment, adding subsequent nodes and performing any cluster configuration post-deployment.

Once bootstrapping is completed, the minimal cluster will look like this:

Minimal Target Architecture Overview

In this guide, a Raspberry Pi is used, but it does not need to be; any host, either physical or virtual (providing it is either on the same subnet or has sufficient routing configured), with CentOS 7 or 8 installed will do.

Regardless of the device, Domain Name System (DNS) needs to be in place, along with the provisioning of two load balancers (LB): one LB for the Application Programming Interface (API) and another LB for ingress application traffic flowing in from outside the cluster. A web server is also needed to serve files and images used for provisioning hosts.

All steps documented assume a Linux client computer either Fedora, CentOS or Red Hat Enterprise Linux.

Raspberry Pi

Refer to the following for information regarding CentOS for Raspberry Pi: wiki.centos.org

In this document CentOS-Userland-7-armv7hl-RaspberryPI-Minimal-2009-sda.raw.xz was used.

xz is a lossless compression program. If not already installed, install it on your client:

dnf install xz -y

Decompress the file:

unxz CentOS-Userland-7-armv7hl-RaspberryPI-Minimal-2009-sda.raw.xz

Use fdisk to identify existing storage devices on your system, then insert the MicroSD card, using fdisk again to identify the card:

fdisk -l
[ ... output omitted ... ]
Disk /dev/sda: 14.9 GiB, 15931539456 bytes, 31116288 sectors
[ ... output omitted ... ]

Using dd, write the image to the SD card:

sudo dd if=CentOS-Userland-7-armv7hl-RaspberryPI-Minimal-2009-sda.raw of=/dev/sda bs=8192 status=progress; sync

Insert the SD card and power on the Raspberry Pi, logging in as root with the default password of centos:

Username: root
Password: centos

Expand the filesystem with /usr/bin/rootfs-expand.

/usr/bin/rootfs-expand

Set the hostname:

hostnamectl set-hostname utilities.cluster.lab.com

Remove NetworkManager:

yum remove NetworkManager

Edit /etc/sysconfig/network-scripts/ifcfg-eth0 and configure it with a static IP. I’m setting it to 192.168.0.101:

DEVICE=eth0
TYPE=Ethernet
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.0.101
PREFIX=24
GATEWAY=192.168.0.1
DNS1=192.168.0.1
As a rule of thumb, take the time and effort to manage SELinux and firewalld correctly. In this case, to save time and focus on the prerequisites and deployment of OpenShift, disable both:

Disable SELinux:

vi /etc/sysconfig/selinux
SELINUX=disabled

Disable firewall:

systemctl stop firewalld
systemctl disable firewalld

These can be configured later, but the fewer potential causes of issues the better, because troubleshooting can be tricky if numerous problems are in the equation. Get the most basic working deployment complete, then introduce things one at a time, making the process of troubleshooting "cause and effect" easier.

Install any updates and reboot:

yum update -y
reboot

Assuming the changes made are correct, test the static IP address using SSH from a client:
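For example (assuming SSH is running on the Pi and using the static address configured above):

ssh [email protected]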

DNS

Install dnsmasq:

yum install dnsmasq -y

Backup the original configuration:

cp /etc/dnsmasq.conf /etc/dnsmasq.conf.bak

The following dnsmasq.conf configuration file includes a few things, but a key line to point out is the apps.cluster.lab.com line, which provides wildcard DNS resolution such as foo.apps.cluster.lab.com or bar.apps.cluster.lab.com:

Edit dnsmasq.conf:

vi /etc/dnsmasq.conf
server=192.168.0.1
server=8.8.8.8
server=8.8.4.4
local=/cluster.lab.com/
address=/apps.cluster.lab.com/192.168.0.101
interface=eth0
listen-address=::1,127.0.0.1,192.168.0.101
expand-hosts
domain=cluster.lab.com
addn-hosts=/etc/dnsmasq.openshift.hosts
conf-dir=/etc/dnsmasq.d,.rpmnew,.rpmsave,.rpmorig
srv-host=_etcd-server-ssl._tcp.cluster,master1.cluster.lab.com,2380,0,10
srv-host=_etcd-server-ssl._tcp.cluster,master2.cluster.lab.com,2380,0,10
srv-host=_etcd-server-ssl._tcp.cluster,master3.cluster.lab.com,2380,0,10

Next, I’m adding all the DNS entries I might ever need for a cluster:

vi /etc/dnsmasq.openshift.hosts
192.168.0.101 utilities.cluster.lab.com dns.cluster.lab.com lb.cluster.lab.com api.cluster.lab.com api-int.cluster.lab.com
192.168.0.102 bootstrap.cluster.lab.com
192.168.0.111 master1.cluster.lab.com etcd-0.cluster.lab.com
192.168.0.112 master2.cluster.lab.com etcd-1.cluster.lab.com
192.168.0.113 master3.cluster.lab.com etcd-2.cluster.lab.com
192.168.0.121 worker1.cluster.lab.com
192.168.0.122 worker2.cluster.lab.com
192.168.0.123 worker3.cluster.lab.com
192.168.0.131 infra1.cluster.lab.com
192.168.0.132 infra2.cluster.lab.com
192.168.0.133 infra3.cluster.lab.com

Next, configure this host to use itself for DNS resolution:

vi /etc/resolv.conf
search Home cluster.lab.com
nameserver 192.168.0.101

Lock resolv.conf from being modified:

chattr +i /etc/resolv.conf

Start and enable the service:

systemctl enable dnsmasq.service --now

Install bind-utils:

yum install bind-utils -y

Test some lookups, both IP Addresses and DNS entries should be resolvable, including anything.apps.cluster.lab.com:

nslookup www.google.com
nslookup master1.cluster.lab.com
nslookup 192.168.0.111
nslookup foo.apps.cluster.lab.com
nslookup bar.apps.cluster.lab.com

HAProxy

Install HAProxy:

yum install haproxy -y

Back up the original configuration file:

cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak

And add the following configuration (changing IPs for your environment):

vi /etc/haproxy/haproxy.cfg
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    30s
    timeout queue           1m
    timeout connect         30s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 30s
    timeout check           30s
    maxconn                 4000

frontend api
    bind 0.0.0.0:6443
    option tcplog
    mode tcp
    default_backend api

backend api
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    balance roundrobin
    server bootstrap bootstrap.cluster.lab.com:6443 check check-ssl verify none
    server master1 master1.cluster.lab.com:6443 check check-ssl verify none
    server master2 master2.cluster.lab.com:6443 check check-ssl verify none
    server master3 master3.cluster.lab.com:6443 check check-ssl verify none

frontend api-int
    bind 0.0.0.0:22623
    option tcplog
    mode tcp
    default_backend api-int

backend api-int
    mode tcp
    balance roundrobin
    server bootstrap 192.168.0.102:22623 check
    server master1 192.168.0.111:22623 check
    server master2 192.168.0.112:22623 check
    server master3 192.168.0.113:22623 check

frontend apps-http
    bind 192.168.0.101:80
    option tcplog
    mode tcp
    default_backend apps-http

backend apps-http
    mode tcp
    balance roundrobin
    server master1 master1.cluster.lab.com:80 check
    server master2 master2.cluster.lab.com:80 check
    server master3 master3.cluster.lab.com:80 check

frontend apps-https
    bind 192.168.0.101:443
    option tcplog
    mode tcp
    default_backend apps-https

backend apps-https
    mode tcp
    balance roundrobin
    option ssl-hello-chk
    server master1 192.168.0.111:443 check
    server master2 192.168.0.112:443 check
    server master3 192.168.0.113:443 check

listen stats
    bind 0.0.0.0:9000
    mode http
    balance
    timeout client 5000
    timeout connect 4000
    timeout server 30000
    stats uri /stats
    stats refresh 5s
    stats realm HAProxy\ Statistics
    stats auth admin:changeme
    stats admin if TRUE

This haproxy.cfg example purposely uses inconsistent methods of configuration between the load balancers to provide good working examples. The configuration here is correct for serving OpenShift requirements. Notice the configuration includes an HAProxy statistics page that auto-refreshes, and that the apps-http backend excludes the SSL check.

Enable and start HAProxy:

systemctl enable haproxy.service --now

View the graphical statistics report at http://192.168.0.101:9000/stats. In this example the username is admin and password is changeme. If you’ve pointed your local client to use 192.168.0.101 for its DNS, try http://lb.cluster.lab.com:9000/stats.

Apache web server

Install and configure httpd on port 8080 (because port 80 is already used by HAProxy):

yum install httpd -y

Edit httpd.conf:

vi /etc/httpd/conf/httpd.conf
Listen 8080

Enable and start the service:

systemctl enable httpd.service --now

Remember to append port 8080 when referring to this service, for example: http://192.168.0.101:8080/
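A quick way to confirm httpd is answering on the new port, for example:

curl -I http://192.168.0.101:8080/

A response header is enough; typically a 403 against the default welcome page, since nothing has been copied into /var/www/html yet.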

For OpenShift bare metal installations, files can be copied into /var/www/html on this utilities server.

INSTALLATION

At the time of writing, the latest version is 4.6.4. The downloads necessary are publicly available; however, download a legitimate pull secret from the Red Hat cluster portal. Visit https://cloud.redhat.com/openshift/install/metal/user-provisioned for the latest versions and to obtain your pull secret.

Client tools

Download the OpenShift installer program on your client computer:

Download the OpenShift command-line tools:
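For example, assuming the public mirror layout for this release (the download page above always links to the current files):

wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.6.4/openshift-client-linux.tar.gz
wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.6.4/openshift-install-linux.tar.gz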

Checksums:

Check the integrity of the downloaded files:

sha256sum openshift-client-linux.tar.gz
sha256sum openshift-install-linux.tar.gz
c1f39a966fc0dbd4f8f0bfec0196149d54e0330de523bf906bbe2728b10a860b  openshift-client-linux.tar.gz
b81e1d25d77a05eaae8c0f154ed563c2caee21ed63401d655a7ad3206fdfd53c  openshift-install-linux.tar.gz

Make a bin directory in your home directory:

mkdir ~/bin
You may prefer to extract to /usr/local/bin/

Extract the CLI tools:

tar -xvf openshift-client-linux.tar.gz -C ~/bin
tar -xvf openshift-install-linux.tar.gz -C ~/bin

Check the oc version:

oc version
Client Version: 4.6.4

Check the openshift-install version:

openshift-install version
openshift-install 4.6.4

Install preparation

Create a working directory, for example:

mkdir -p ~/ocp4/cluster && cd ~/ocp4

Generate a new SSH key pair that will be embedded into the OpenShift deployment; this will enable you to SSH to OpenShift nodes.

ssh-keygen -t rsa -b 4096 -N '' -f cluster_id_rsa
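The matching public key is written to cluster_id_rsa.pub; its contents are what get pasted into the sshKey field of the install-config.yaml below:

cat cluster_id_rsa.pub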

Create an installation configuration file. The compute replicas value is always set to zero for bare metal; this refers to worker nodes, which are manually added post-deployment. The key option here is controlPlane: replicas, which needs to be either 1 for a single node cluster or 3 for the minimal three-node cluster. The bootstrap process does not complete until this defined criterion is met, so plan ahead!

vi install-config.yaml.orig
apiVersion: v1
baseDomain: lab.com
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: cluster
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths": ...}'
sshKey: 'ssh-ed25519 AAAA...'

Copy the configuration file into the cluster directory. It’s important to keep the original copy because the installation process destroys it! It’s handy to keep for reference, and because in reality it usually takes a few attempts to get right.

Remember to paste in your real pull secret and public key.
cp install-config.yaml.orig cluster/install-config.yaml

The following two commands create and initiate the installation process. The first, create manifests, gives you an opportunity to make further tweaks to the deployment. The create ignition-configs step uses those manifests to create the ignition files.

openshift-install create manifests --dir=cluster
openshift-install create ignition-configs --dir=cluster

The files in the cluster directory should now look like this:

auth
bootstrap.ign
master.ign
metadata.json
worker.ign

Dependencies

Download the installer ISO image and the compressed metal RAW:
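For example, assuming the public mirror layout for RHCOS 4.6 (the exact paths here are an assumption, so verify the URLs against the Red Hat download page referenced earlier):

wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/latest/rhcos-installer.x86_64.iso
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/latest/rhcos-metal.x86_64.raw.gz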

Checksums:

Check the integrity of the downloaded files:

sha256sum rhcos-installer.x86_64.iso
sha256sum rhcos-metal.x86_64.raw.gz
d15bd7ae942573eece34ba9c59e110e360f15608f36e9b83ab9f2372d235bef2  rhcos-installer.x86_64.iso
7e61bbe56735bc26d0808d4fffc4ccac25554df7d3c72c7b678e83e56c7ac5ba  rhcos-metal.x86_64.raw.gz

Copy the three ignition files and the Red Hat CoreOS image to utilities.cluster.lab.com, to be served by Apache:

scp cluster/*.ign [email protected]:/var/www/html/

Copy the Red Hat CoreOS image:

scp rhcos-metal.x86_64.raw.gz [email protected]:/var/www/html/

On utilities.cluster.lab.com ensure file permissions are correct:

chmod 644 /var/www/html/*

From the client computer test these files are available to download via HTTP:

wget http://192.168.0.101:8080/bootstrap.ign

Bootstrap node

Using either Virtual Machine Manager to create a KVM VM, or VirtualBox, create a virtual machine with 4 cores, 16GB RAM (16384) and 120GB of storage. This VM is destroyed after the cluster installation is complete.

Using the rhcos-installer.x86_64.iso boot the VM up, until you arrive at a command prompt:

The VM will have an IP Address assigned via DHCP; we need to set a static IP.

View current interface IP Address:

ip a

View the connection:

nmcli con show

Connection

Set the IP Address for the bootstrap node:

nmcli con mod 'Wired Connection' ipv4.method manual ipv4.addresses 192.168.0.102/24 ipv4.gateway 192.168.0.1 ipv4.dns 192.168.0.101 connection.autoconnect yes

Restart Network Manager and bring up the connection:

sudo systemctl restart NetworkManager
nmcli con up 'Wired Connection'

Start the CoreOS installer, providing the bootstrap.ign ignition file:

sudo coreos-installer install --ignition-url=http://192.168.0.101:8080/bootstrap.ign /dev/sda --insecure-ignition --copy-network

Reboot the VM with reboot, making sure the VM boots from the hard disk storage (eject the ISO before it boots), or shut down the VM and remove the CD-ROM from the boot order.

Make sure the VM boots up with the correct IP Address previously assigned:

Bootstrap login prompt

Once the bootstrap node is up and at the login prompt with the correct IP Address, the VM should provision itself, and eventually come up in the load balancer http://192.168.0.101:9000/stats:

Bootstrap load balancer

From a Linux client you should be able to SSH to it using the private key generated earlier:

ssh -i cluster_id_rsa [email protected]

Check the progress on the bootstrap node with:

journalctl -b -f -u release-image.service -u bootkube.service

The following pods should eventually be up and running:

sudo crictl pods

...Ready               bootstrap-kube-scheduler-bootstrap.cluster.lab.com...
...Ready               bootstrap-kube-controller-manager-bootstrap.cluster.lab.com...
...Ready               bootstrap-kube-apiserver-bootstrap.cluster.lab.com...
...Ready               cloud-credential-operator-bootstrap.cluster.lab.com...
...Ready               bootstrap-cluster-version-operator-bootstrap.cluster.lab.com...
...Ready               bootstrap-machine-config-operator-bootstrap.cluster.lab.com...
...Ready               etcd-bootstrap-member-bootstrap.cluster.lab.com...

List the running containers and tail the logs of any one:

sudo crictl ps

sudo crictl logs <CONTAINER_ID>

From the Linux client the following command should return ok:

curl -X GET https://api.cluster.lab.com:6443/healthz -k

Export the kubeconfig and test getting cluster operators with oc get co:

export KUBECONFIG=cluster/auth/kubeconfig
oc get co

You’ll only see the cloud-credential operator is available at this stage:

NAME                        VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication
cloud-credential            True        False         False      26m
cluster-autoscaler
config-operator
console
csi-snapshot-controller
dns
etcd
...
All of these tests MUST work as documented else it’s pointless continuing any further.

Any other responses or errors mean there are issues with either networking, DNS or load balancing configurations. Go back and troubleshoot any issues until you get the expected results at this stage.

On your client you can see the progress of the installation and that it’s moved on a step because api.cluster.lab.com is up and working:

openshift-install --dir=cluster wait-for bootstrap-complete
INFO Waiting up to 20m0s for the Kubernetes API at https://api.cluster.lab.com:6443...
INFO API v1.19.0+9f84db3 up
INFO Waiting up to 30m0s for bootstrapping to complete...

The bootstrapping process will not complete until all three master nodes have been provisioned.

Master nodes

For physical host installations, write the rhcos-installer.x86_64.iso image to a USB pen drive.

Use fdisk to identify existing storage devices on your system, then insert the USB pen drive, using fdisk again to identify the device:

fdisk -l
[ ... output omitted ... ]
Disk /dev/sda: 58.5 GiB, 62763565056 bytes, 122585088 sectors
[ ... output omitted ... ]

Write the image to the device:

sudo dd if=rhcos-installer.x86_64.iso of=/dev/sda status=progress; sync

The next steps repeat the process of booting the three physical nodes using the Red Hat CoreOS ISO. Make sure to use master.ign, and the right IP Address and hostname for each master node. In the case of an Intel NUC, F10 is used to interrupt the host BIOS and select a boot device.

master1

Using the rhcos-installer.x86_64.iso USB device, boot the node up until you arrive at a command prompt:

The node will have an IP Address assigned via DHCP; we need to set a static IP.

View current interface IP Address:

ip a

View the connection:

nmcli con show

Set the IP Address for the first master node:

nmcli con mod 'Wired Connection' ipv4.method manual ipv4.addresses 192.168.0.111/24 ipv4.gateway 192.168.0.1 ipv4.dns 192.168.0.101 connection.autoconnect yes

Restart Network Manager and bring up the connection:

sudo systemctl restart NetworkManager
nmcli con up 'Wired Connection'

Start the CoreOS installer, providing the master.ign ignition file:

sudo coreos-installer install --ignition-url=http://192.168.0.101:8080/master.ign /dev/sda --insecure-ignition --copy-network

Reboot the node with reboot, making sure it boots from the hard disk storage (remove the USB/ISO before it boots), or shut the node down, remove the USB device from the boot order and power it back on.

Hit tab at the RHCOS GRUB menu and add the following:

ip=192.168.0.111::192.168.0.1:255.255.255.0:master1.cluster.lab.com:ens3:none nameserver=192.168.0.101

A screenshot of a physical host’s GRUB configuration could not be captured; here is an example from repeating this process for an infra node:

GRUB
It’s unclear why this step is needed, but with nodes other than the bootstrap node, this intervention was required. There are better methods for provisioning nodes, but this documentation is focused on the most fundamental approach.

Prior to OCP 4.6, all the CoreOS parameters were added at the GRUB stage; for reference, here are the original parameters:

coreos.inst=yes
coreos.inst.install_dev=sda
coreos.inst.image_url=http://192.168.0.101:8080/rhcos-metal.raw.gz
coreos.inst.ignition_url=http://192.168.0.101:8080/master.ign
ip=192.168.0.111::192.168.0.1:255.255.255.0:master1.cluster.lab.com:eno1:none:192.168.0.101
nameserver=192.168.0.101
master2

Repeat the process for the second master node:

nmcli con mod 'Wired Connection' ipv4.method manual ipv4.addresses 192.168.0.112/24 ipv4.gateway 192.168.0.1 ipv4.dns 192.168.0.101 connection.autoconnect yes
sudo systemctl restart NetworkManager
nmcli con up 'Wired Connection'
sudo coreos-installer install --ignition-url=http://192.168.0.101:8080/master.ign /dev/sda --insecure-ignition --copy-network
ip=192.168.0.112::192.168.0.1:255.255.255.0:master2.cluster.lab.com:ens3:none nameserver=192.168.0.101

Original GRUB parameters:

coreos.inst=yes
coreos.inst.install_dev=sda
coreos.inst.image_url=http://192.168.0.101:8080/rhcos-metal.raw.gz
coreos.inst.ignition_url=http://192.168.0.101:8080/master.ign
ip=192.168.0.112::192.168.0.1:255.255.255.0:master2.cluster.lab.com:eno1:none:192.168.0.101
nameserver=192.168.0.101
master3

Repeat the process for the third master node:

nmcli con mod 'Wired Connection' ipv4.method manual ipv4.addresses 192.168.0.113/24 ipv4.gateway 192.168.0.1 ipv4.dns 192.168.0.101 connection.autoconnect yes
sudo systemctl restart NetworkManager
nmcli con up 'Wired Connection'
sudo coreos-installer install --ignition-url=http://192.168.0.101:8080/master.ign /dev/sda --insecure-ignition --copy-network
ip=192.168.0.113::192.168.0.1:255.255.255.0:master3.cluster.lab.com:ens3:none nameserver=192.168.0.101

Original GRUB parameters:

coreos.inst=yes
coreos.inst.install_dev=sda
coreos.inst.image_url=http://192.168.0.101:8080/rhcos-metal.raw.gz
coreos.inst.ignition_url=http://192.168.0.101:8080/master.ign
ip=192.168.0.113::192.168.0.1:255.255.255.0:master3.cluster.lab.com:eno1:none:192.168.0.101
nameserver=192.168.0.101

Completion

Once all three master nodes are provisioning, the process can take some time to complete, as indicated by the installer INFO "Waiting up to 40m0s for bootstrapping to complete".

The two key things to watch are the load balancers and cluster operators. Once a master node boots up to the login prompt, it will download a bunch of images and do some initial installation; the host will perform a reboot and come back to the login prompt during this process.

Once all load balancer backends are showing up, and all cluster operators are "Available", the openshift-install command should complete and advise removing the bootstrap node.

openshift-install --dir=cluster wait-for bootstrap-complete
INFO Waiting up to 20m0s for the Kubernetes API at https://api.cluster.lab.com:6443...
INFO API v1.19.0+9f84db3 up
INFO Waiting up to 30m0s for bootstrapping to complete...
INFO It is now safe to remove the bootstrap resources
INFO Time elapsed: 0s
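From here, the remainder of the installation can be followed with wait-for install-complete, which also prints the web console URL and the kubeadmin password once the cluster is ready:

openshift-install --dir=cluster wait-for install-complete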

Check all nodes are "Ready":

oc get nodes
NAME                      STATUS   ROLES           AGE   VERSION
master1.cluster.lab.com   Ready    master,worker   14h   v1.19.0+9f84db3
master2.cluster.lab.com   Ready    master,worker   13h   v1.19.0+9f84db3
master3.cluster.lab.com   Ready    master,worker   13h   v1.19.0+9f84db3

Check all operators are available:

oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.4     True        False         False      8m34s
cloud-credential                           4.6.4     True        False         False      15h
cluster-autoscaler                         4.6.4     True        False         False      13h
config-operator                            4.6.4     True        False         False      13h
console                                    4.6.4     True        False         False      7m35s
csi-snapshot-controller                    4.6.4     True        False         False      13h
dns                                        4.6.4     True        False         False      13h
etcd                                       4.6.4     True        False         False      11h
image-registry                             4.6.4     True        False         False      11h
ingress                                    4.6.4     True        False         False      8m40s
insights                                   4.6.4     True        False         False      13h
kube-apiserver                             4.6.4     True        False         False      11h
kube-controller-manager                    4.6.4     True        False         False      13h
kube-scheduler                             4.6.4     True        False         False      13h
kube-storage-version-migrator              4.6.4     True        False         False      13h
machine-api                                4.6.4     True        False         False      13h
machine-approver                           4.6.4     True        False         False      13h
machine-config                             4.6.4     True        False         False      13h
marketplace                                4.6.4     True        False         False      8m23s
monitoring                                 4.6.4     True        False         False      8m18s
network                                    4.6.4     True        False         False      13h
node-tuning                                4.6.4     True        False         False      13h
openshift-apiserver                        4.6.4     True        False         False      8m55s
openshift-controller-manager               4.6.4     True        False         False      13h
openshift-samples                          4.6.4     True        False         False      8m21s
operator-lifecycle-manager                 4.6.4     True        False         False      13h
operator-lifecycle-manager-catalog         4.6.4     True        False         False      13h
operator-lifecycle-manager-packageserver   4.6.4     True        False         False      8m57s
service-ca                                 4.6.4     True        False         False      13h
storage                                    4.6.4     True        False         False      13h

Power off the bootstrap node (and destroy it) and comment out the node in both the api and api-int load balancers in /etc/haproxy/haproxy.cfg.
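For example, in /etc/haproxy/haproxy.cfg the bootstrap entries become:

    #server bootstrap bootstrap.cluster.lab.com:6443 check check-ssl verify none
    #server bootstrap 192.168.0.102:22623 check

Then reload HAProxy to pick up the change:

systemctl reload haproxy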

The load balancers should look like the following screenshots; note that the ingress LB only has two replicas and will therefore show as down on one of the nodes.

API LB
Ingress LB

Login

During installation and from your client you can access the cluster using the system:admin account:

export KUBECONFIG=cluster/auth/kubeconfig
oc whoami
system:admin

Log in as kubeadmin:

cat cluster/auth/kubeadmin-password
oc login -u kubeadmin -p kLsUd-GkkRt-GwPI7-n2cku  https://api.cluster.lab.com:6443

To log in to the OpenShift web console, find its route:

oc project openshift-console
oc get routes
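The console route is served via the wildcard DNS entry, so the URL should look something like this:

https://console-openshift-console.apps.cluster.lab.com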

Troubleshooting

Single master

It is possible to deploy a single node "cluster" if defined in the install-config.yaml; however, the installation never completes, with operators pending. Apply the following patch for the installation to complete with a single master configuration:

oc patch etcd cluster -p='{"spec": {"unsupportedConfigOverrides": {"useUnsupportedUnsafeNonHANonProductionUnstableEtcd": true}}}' --type=merge
Unknown authority

The following error can sometimes occur when attempting to login to the API via the command line:

error: x509: certificate signed by unknown authority

Switch projects:

oc project openshift-authentication

List the pods in the openshift-authentication project:

oc get pods

Using one of the pod names export the ingress certificate:

oc rsh -n openshift-authentication oauth-openshift-568bcc5d8f-84zh2 cat /run/secrets/kubernetes.io/serviceaccount/ca.crt > ingress-ca.crt

Copy and update your certificate authority certificates on your client host:

sudo cp ingress-ca.crt /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust extract
Missing Console

In one deployment, both the openshift-samples and console operators were absent. Powering off all three masters and powering them back on brought all the operators up.

NODES

For a home lab, it might be common to leave the cluster as a minimal three-node cluster. Resources can be tight; however, in the real world, it is likely to provision dedicated infra nodes and many worker nodes. It is then possible to label nodes accordingly and mark specific applications to only run on designated infrastructure.

Whatever the labelling conventions used, all nodes from this point are "worker" nodes with labels.

Infra nodes

To avoid duplication, just the key details are included. In this case, three infra VMs are provisioned with bridged networking, each with 8GB memory, 2 cores and 50GB storage. Pay close attention to the IP Addresses and the use of the worker.ign ignition file.

infra1.cluster.lab.com

nmcli con mod 'Wired Connection' ipv4.method manual ipv4.addresses 192.168.0.131/24 ipv4.gateway 192.168.0.1 ipv4.dns 192.168.0.101 connection.autoconnect yes
sudo systemctl restart NetworkManager
nmcli con up 'Wired Connection'
sudo coreos-installer install --ignition-url=http://192.168.0.101:8080/worker.ign /dev/sda --insecure-ignition --copy-network
ip=192.168.0.131::192.168.0.1:255.255.255.0:infra1.cluster.lab.com:ens3:none nameserver=192.168.0.101

The node should deploy as seen while doing the master nodes, but worker nodes get provisioned by the masters. Once a worker node has booted up off its hard drive and arrived at the login prompt, there will be a reboot.

Check for certificate signing requests, as a cluster administrator view any pending csr:

oc get csr

Refined:

oc get csr | grep -i pending

Approve them:

oc adm certificate approve csr-xyz

Typically, approving the first pending CSR will cause a second one to appear shortly afterwards.
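To save approving them one at a time, all CSRs returned by oc get csr can be approved in one go; a common shortcut is:

oc get csr -o name | xargs oc adm certificate approve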

Once both CSRs are approved you’ll see the node with a status of NotReady:

oc get nodes
NAME                      STATUS     ROLES           AGE   VERSION
infra1.cluster.lab.com    NotReady   worker          25s   v1.19.0+9f84db3
master1.cluster.lab.com   Ready      master,worker   16h   v1.19.0+9f84db3
master2.cluster.lab.com   Ready      master,worker   15h   v1.19.0+9f84db3
master3.cluster.lab.com   Ready      master,worker   14h   v1.19.0+9f84db3

The deployment of the node is still in progress; eventually the node will change status to "Ready".

Example when all three infra nodes are initially added to the cluster:

NAME                      STATUS     ROLES           AGE   VERSION
infra1.cluster.lab.com    Ready      worker          68s   v1.19.0+9f84db3
infra2.cluster.lab.com    Ready      worker          32m   v1.19.0+9f84db3
infra3.cluster.lab.com    Ready      worker          11m   v1.19.0+9f84db3
master1.cluster.lab.com   Ready      master,worker   16h   v1.19.0+9f84db3
master2.cluster.lab.com   Ready      master,worker   15h   v1.19.0+9f84db3
master3.cluster.lab.com   Ready      master,worker   15h   v1.19.0+9f84db3

Repeat this for any other infra nodes desired, most commonly at least three to achieve high availability, as depicted in the example above.

Notice the ROLE for the infra node is currently set to worker.
Label infra nodes

Create the infra machine config pool infra-mcp.yaml:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
  maxUnavailable: 1
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/infra: ""
  paused: false
oc create -f infra-mcp.yaml

Label infra nodes and remove worker label:

Adding the infra label forces a reboot of the node; wait for the reboot before removing the worker label.
oc label node infra1.cluster.lab.com node-role.kubernetes.io/infra=
oc label node infra2.cluster.lab.com node-role.kubernetes.io/infra=
oc label node infra3.cluster.lab.com node-role.kubernetes.io/infra=

Each node will be marked as non-schedulable and reboot in turn:

watch oc get nodes
infra1.cluster.lab.com    Ready,SchedulingDisabled   infra,worker    5m3s   v1.19.0+9f84db3

Once completed, each infra node will be labelled both infra,worker. Remove the worker label from each:

oc label node infra1.cluster.lab.com node-role.kubernetes.io/worker-
oc label node infra2.cluster.lab.com node-role.kubernetes.io/worker-
oc label node infra3.cluster.lab.com node-role.kubernetes.io/worker-

The nodes should now look like this:

oc get nodes
NAME                      STATUS   ROLES           AGE   VERSION
infra1.cluster.lab.com    Ready    infra           11m   v1.19.0+9f84db3
infra2.cluster.lab.com    Ready    infra           42m   v1.19.0+9f84db3
infra3.cluster.lab.com    Ready    infra           21m   v1.19.0+9f84db3
master1.cluster.lab.com   Ready    master,worker   16h   v1.19.0+9f84db3
master2.cluster.lab.com   Ready    master,worker   15h   v1.19.0+9f84db3
master3.cluster.lab.com   Ready    master,worker   15h   v1.19.0+9f84db3

Worker nodes

Add worker nodes by repeating the same process as adding infra nodes, without any labelling, making sure IP Addresses and host names are correct during the process.

A typical deployment might look like this:

NAME                      STATUS   ROLES           AGE   VERSION
infra1.cluster.lab.com    Ready    infra           11m   v1.19.0+9f84db3
infra2.cluster.lab.com    Ready    infra           42m   v1.19.0+9f84db3
infra3.cluster.lab.com    Ready    infra           21m   v1.19.0+9f84db3
master1.cluster.lab.com   Ready    master,worker   16h   v1.19.0+9f84db3
master2.cluster.lab.com   Ready    master,worker   15h   v1.19.0+9f84db3
master3.cluster.lab.com   Ready    master,worker   15h   v1.19.0+9f84db3
worker1.cluster.lab.com   Ready    worker          68s   v1.19.0+9f84db3
worker2.cluster.lab.com   Ready    worker          32m   v1.19.0+9f84db3
worker3.cluster.lab.com   Ready    worker          11m   v1.19.0+9f84db3

Move ingress router

It’s common practice to move the ingress router off the master nodes and run three replicas on the infra nodes. With the correctly labelled infra nodes in place, this is done with the following two patches.

oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"nodePlacement": {"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra": "" }}}}}' --type=merge
oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 3}}' --type=merge
Update the backend ingress load balancer, in this case HAProxy, to point at the three infra nodes.
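For example, the apps-http and apps-https backends in the earlier /etc/haproxy/haproxy.cfg would change to reference the infra nodes, a sketch based on the addresses in Table 2:

backend apps-http
    mode tcp
    balance roundrobin
    server infra1 infra1.cluster.lab.com:80 check
    server infra2 infra2.cluster.lab.com:80 check
    server infra3 infra3.cluster.lab.com:80 check

backend apps-https
    mode tcp
    balance roundrobin
    option ssl-hello-chk
    server infra1 192.168.0.131:443 check
    server infra2 192.168.0.132:443 check
    server infra3 192.168.0.133:443 check

Reload HAProxy afterwards with systemctl reload haproxy.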

Disable master scheduler

Disabling the master scheduler removes the "worker" label from master nodes, preventing unwanted applications from running on master nodes, reserving their resources for running the cluster.

oc edit scheduler

Set mastersSchedulable to false:

apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  creationTimestamp: null
  name: cluster
spec:
  mastersSchedulable: false
  policy:
    name: ""
status: {}
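Alternatively, the same change can be made non-interactively; a one-liner equivalent of the edit above is:

oc patch scheduler cluster --type=merge --patch '{"spec":{"mastersSchedulable":false}}'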

Validate:

oc get nodes
NAME                      STATUS   ROLES           AGE   VERSION
infra1.cluster.lab.com    Ready    infra           11m   v1.19.0+9f84db3
infra2.cluster.lab.com    Ready    infra           42m   v1.19.0+9f84db3
infra3.cluster.lab.com    Ready    infra           21m   v1.19.0+9f84db3
master1.cluster.lab.com   Ready    master          16h   v1.19.0+9f84db3
master2.cluster.lab.com   Ready    master          15h   v1.19.0+9f84db3
master3.cluster.lab.com   Ready    master          15h   v1.19.0+9f84db3
worker1.cluster.lab.com   Ready    worker          68s   v1.19.0+9f84db3
worker2.cluster.lab.com   Ready    worker          32m   v1.19.0+9f84db3
worker3.cluster.lab.com   Ready    worker          11m   v1.19.0+9f84db3

Delete nodes

oc get nodes
oc delete node infra1.cluster.lab.com
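Before deleting a node it is good practice to cordon and drain it first so workloads are rescheduled gracefully, for example:

oc adm cordon infra1.cluster.lab.com
oc adm drain infra1.cluster.lab.com --ignore-daemonsets --delete-local-data
oc delete node infra1.cluster.lab.com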

NTP/CHRONY

In a lab environment, chronyd will already be configured on nodes with the default pool 2.rhel.pool.ntp.org iburst. Should the configuration need to be changed, the process involves adding a MachineConfig. Machine Configs are found via the Web Console under Compute → Machine Config. Think of Machine Configs as configuration management for the cluster.

SSH to a master node and switch to root, for example:

ssh -i cluster_id_rsa [email protected]
sudo su -

Get a working minimalistic chrony.conf:

grep -v -e '^#' -e '^$' /etc/chrony.conf > chrony.conf

On a client make a copy of the chrony.conf configuration file:

vi chrony.conf
pool 2.rhel.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
keyfile /etc/chrony.keys
leapsectz right/UTC
logdir /var/log/chrony

Encode the file:

base64 chrony.conf > chrony.conf.encoded
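Note that base64 wraps its output at 76 characters by default, and the MachineConfig below needs the encoded content as a single line; if your output wraps, use the -w0 flag instead:

base64 -w0 chrony.conf > chrony.conf.encoded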

Create a MachineConfig for worker nodes, pasting in the chrony.conf.encoded content:

vi worker-chrony.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: worker-chrony
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,cG9vbCAyLnJoZWwucG9vbC5udHAub3JnIGlidXJzdApkcmlmdGZpbGUgL3Zhci9saWIvY2hyb255L2RyaWZ0Cm1ha2VzdGVwIDEuMCAzCnJ0Y3N5bmMKa2V5ZmlsZSAvZXRjL2Nocm9ueS5rZXlzCmxlYXBzZWN0eiByaWdodC9VVEMKbG9nZGlyIC92YXIvbG9nL2Nocm9ueQo=
          verification: {}
        filesystem: root
        mode: 0644
        path: /etc/chrony.conf
Applying the configuration causes nodes to schedule a reboot; expect each worker node to bounce in sequence.
watch oc get nodes
oc create -f worker-chrony.yaml

And repeat for master:

vi master-chrony.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: master-chrony
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,cG9vbCAyLnJoZWwucG9vbC5udHAub3JnIGlidXJzdApkcmlmdGZpbGUgL3Zhci9saWIvY2hyb255L2RyaWZ0Cm1ha2VzdGVwIDEuMCAzCnJ0Y3N5bmMKa2V5ZmlsZSAvZXRjL2Nocm9ueS5rZXlzCmxlYXBzZWN0eiByaWdodC9VVEMKbG9nZGlyIC92YXIvbG9nL2Nocm9ueQo=
          verification: {}
        filesystem: root
        mode: 0644
        path: /etc/chrony.conf
Remember, applying the configuration causes nodes to schedule a reboot; expect each master node to bounce in sequence.
oc create -f master-chrony.yaml
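The rollout can be followed through the machine config pools; the UPDATED column returns to True once every node in the pool has rebooted with the new configuration:

watch oc get machineconfigpool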

IMAGE REGISTRY

In typical OpenShift deployments using a cloud provider with credentials, storage classes will be made available for the target infrastructure. In a bare-metal situation, this is not a luxury we have. It is simple to define NFS for shared storage, which is OK for specific services and tasks, but for something like logging, performance will take a hit. It is possible to add physical block devices to nodes and use the "Local Storage Operator". However, local storage volumes are fixed to nodes, so moving pods that depend on block storage will run into difficulties, not being able to mount storage to different nodes on demand.

For the image registry, NFS shared storage is the right choice.

NFS server

Using a RHEL/CentOS 8.2 host on the same network as your OpenShift cluster, install nfs-utils:

sudo dnf install nfs-utils -y
systemctl start nfs-server
systemctl enable nfs-server
systemctl status nfs-server
sudo mkdir -p /mnt/openshift/registry
vi /etc/exports

Add the following, including the options for rw,no_wdelay,no_root_squash:

/mnt/openshift/registry         192.168.0.1/24(rw,sync,no_wdelay,no_root_squash,insecure)

Export the new share with:

exportfs -arv

And confirm the share is visible:

exportfs  -s
showmount -e 127.0.0.1

If required, open up the firewall ports needed:

firewall-cmd --permanent --add-service=nfs
firewall-cmd --permanent --add-service=rpc-bind
firewall-cmd --permanent --add-service=mountd
firewall-cmd --reload

NFS storage class

Add a storage class with the no-provisioner option, making it a manual process:

vi nfs-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs
provisioner: no-provisioner
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
oc create -f nfs-storage-class.yaml

See storage classes via the web console Storage → Storage Classes

Configuration

You can now add persistent volume(s) (PV) using the nfs storage class:

vi registry-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: registry-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany
  nfs:
    path: /mnt/openshift/registry
    server: 192.168.0.15
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
oc create -f registry-pv.yaml

And view the result:

oc get pv
NAME          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
registry-pv   50Gi       RWX            Retain           Available           nfs                     3s

To update the registry storage, you can add a persistent volume claim (PVC) using the new NFS storage class:

vi registry-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-storage
  namespace: openshift-image-registry
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  storageClassName: nfs
oc create -f registry-pvc.yaml
The default name for the PVC created is image-registry-storage, which is the name expected by the image registry operator.

Switch to the openshift-image-registry project and view the pending PVC:

oc project openshift-image-registry
oc get pvc

It will be currently pending:

NAME                     STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
image-registry-storage   Pending                                      nfs            15s

And edit the image registry configuration:

oc edit configs.imageregistry.operator.openshift.io

Three things need changing under spec: managementState, replicas and storage:

...
  managementState: Managed
...
...
  replicas: 3
...
...
  storage:
    pvc:
      claim: image-registry-storage

You can check the state/progress of these changes by viewing the pods:

oc project openshift-image-registry
oc get pods
NAME                                               READY   STATUS      RESTARTS   AGE
cluster-image-registry-operator-6c55f65c7d-sst5g   2/2     Running     0          18h
image-pruner-1605225600-cpm8d                      0/1     Completed   0          10h
image-registry-659c75894d-28mp4                    1/1     Running     0          18h
image-registry-659c75894d-5mx25                    1/1     Running     0          18h
image-registry-659c75894d-zqxcq                    1/1     Running     0          18h
node-ca-vj6ql                                      1/1     Running     0          3d
node-ca-wjk57                                      1/1     Running     0          3d
node-ca-ww946                                      1/1     Running     0          3d

And see the PVC has been claimed:

oc get pvc
NAME                     STATUS   VOLUME        CAPACITY   ACCESS MODES   STORAGECLASS
image-registry-storage   Bound    registry-pv   50Gi       RWX            nfs

Expose registry

Finally, you can expose the OpenShift image registry to enable you to work with it using Docker or Podman to tag and push images. Make sure you’re in the openshift-image-registry project or add -n openshift-image-registry to include the namespace with the command:

oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge
oc get routes
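With the default route exposed, you can log in to the registry with Podman using your OpenShift token; the route hostname below assumes the defaults for this cluster:

oc whoami -t
podman login -u kubeadmin -p $(oc whoami -t) default-route-openshift-image-registry.apps.cluster.lab.com --tls-verify=false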

Migrate registry

To move the image registry to run on infra nodes, apply the following patch:

oc patch configs.imageregistry.operator.openshift.io/cluster -n openshift-image-registry --type=merge --patch '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'

Check where pods are running by adding -o wide to the following command:

oc get pods -o wide

Troubleshooting

No route to host

If pods never get past ContainerCreating, use oc describe pod to see details:

oc project openshift-image-registry
oc get pods
oc describe pod image-registry-5cc87cc5b8-4k6l6

If you see:

mount.nfs: No route to host

It’s either that the PV is configured incorrectly, pointing to a wrong NFS server, or the NFS server/share is being blocked by a firewall or unavailable.

Unexpected status

If you see errors with OpenShift deployments later like this:

Registry server Address:
Registry server User Name: serviceaccount
Registry server Email: [email protected]
Registry server Password: <<non-empty>>
error: build error: Failed to push image: error copying la... received unexpected HTTP status: 500 Internal Server Error

The permissions on the share directory need fixing:

chmod 775 /mnt/openshift/registry
Undo storage config

If you need to revert back to a known working configuration, you can make it ephemeral by replacing the registry storage with:

oc edit configs.imageregistry.operator.openshift.io
  storage:
    emptyDir: {}

Delete PVC:

oc delete pvc image-registry-storage

Delete PV:

oc delete pv registry-pv

LOCAL STORAGE

The Local Storage Operator deals with adding real disks (block storage) to nodes.

Prior to OCP 4.6, a project needed to be created manually:

oc new-project local-storage

To enable local storage on all nodes including masters and infras:

oc annotate project local-storage openshift.io/node-selector=''

Navigate to Operators → OperatorHub and type "Local Storage" into the filter box to locate the Local Storage Operator, which now creates the openshift-local-storage project namespace.

Local storage class

As with NFS, next create a new custom storage class for local block storage:

vi local-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-sc
provisioner: no-provisioner
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
parameters:
  diskformat: thin
oc create -f local-storage-class.yaml

See storage classes via the web console Storage → Storage Classes

Adding block storage

Adding disks to servers is either a physical activity or a simple case of adding disks to VMs in whichever virtualisation platform is in use.

Once block storage is added, determine the device paths for the new devices, SSH to each node and use fdisk -l to see devices.

ssh -i cluster_id_rsa [email protected]
sudo fdisk -l

Commonly, new devices will begin with /dev/sdb (sda used by RHCOS).

It’s good practice to manage each node individually because device paths might differ depending on the environment.

Assuming three infra nodes, each with a new 50GB disk attached, here is infra1:

vi local-disks-infra1.yaml
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
  name: "local-disks-infra1"
  namespace: "openshift-local-storage"
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - infra1.cluster.lab.com
  storageClassDevices:
    - storageClassName: "local-sc"
      volumeMode: Filesystem
      fsType: xfs
      devicePaths:
        - /dev/sdb
oc create -f local-disks-infra1.yaml

In a short time, see the PVs appear:

oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM STORAGECLASS   REASON   AGE
local-pv-297ca047   50Gi       RWO            Retain           Available         local-sc                53s
local-pv-dc609890   50Gi       RWO            Retain           Available         local-sc                63s
local-pv-fe609342   50Gi       RWO            Retain           Available         local-sc                70s

Navigate to Administration → Custom Resource Definitions → LocalVolume → Instances to view Local Volumes.

Repeat this process for each node in the cluster that needs local block storage made available.

LOGGING

OpenShift comes with its native logging stack using Elasticsearch, Fluentd, and Kibana (EFK). It can be very resource-intensive; in a production environment, dedicate resource-plentiful "infra" nodes to run Elasticsearch and use decent block storage.

For home lab environments, we make do with limited resources.

Install the "Elasticsearch" and "Cluster Logging" operators via the Web Console. See https://docs.openshift.com/container-platform/4.6/logging/cluster-logging-deploying.html

Make sure you select operators provided by Red Hat, Inc and not proprietary or community versions.

Check the current state of the openshift-logging project:

oc project openshift-logging
oc get pods
NAME                                       READY   STATUS    RESTARTS   AGE
cluster-logging-operator-f58c98989-2jrxx   1/1     Running   0          28m

Ephemeral logging

Logging can run with ephemeral storage, meaning all of a pod’s data is lost upon restart because no real storage is provided, which is perfect for a lab.

Via the web console, Administration → Custom Resource Definitions → ClusterLogging → Instances → Create ClusterLogging.

Make note of the resources: limits:. The following example has reduced memory defined and emptyDir: {} for storage; nodeSelector can be omitted if no infra nodes are defined.

Paste in the following logging instance:

apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    retentionPolicy:
      application:
        maxAge: 1d
      infra:
        maxAge: 7d
      audit:
        maxAge: 7d
    elasticsearch:
      nodeCount: 3
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      storage:
        emptyDir: {}
      redundancyPolicy: "SingleRedundancy"
      resources:
        limits:
          memory: 3Gi
        requests:
          memory: 3Gi
  visualization:
    type: "kibana"
    kibana:
      replicas: 1
  curation:
    type: "curator"
    curator:
      schedule: "30 3 * * *"
  collection:
    logs:
      type: "fluentd"
      fluentd: {}

A working deployment should look something like this:

oc project openshift-logging
oc get pods
NAME                                            READY   STATUS    RESTARTS   AGE
cluster-logging-operator-f58c98989-2jrxx        1/1     Running   0          53m
elasticsearch-cdm-2p9fwrm5-1-8fff599cb-j7xh2    2/2     Running   0          3m4s
elasticsearch-cdm-2p9fwrm5-2-944758ff6-zrv9p    2/2     Running   0          3m1s
elasticsearch-cdm-2p9fwrm5-3-68bfc4b584-vbdqp   2/2     Running   0          2m58s
fluentd-bzmw4                                   1/1     Running   0          3m11s
fluentd-msv2p                                   1/1     Running   0          3m11s
fluentd-sglqw                                   1/1     Running   0          3m12s
kibana-86f69c8b84-62b7r                         2/2     Running   0          3m7s

Fluentd runs an instance on every node in the cluster.
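This can be confirmed by listing the daemonsets; the desired count should match the number of nodes in the cluster:

oc get daemonset -n openshift-logging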

Local storage logging

This assumes the PVs are available with a storage class of local-sc, as described in the https://www.richardwalker.dev/pragmatic-openshift/#_local_storage section of this document. The following logging instance includes the storage class definition; both the storageClassName and size are added:

apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    retentionPolicy:
      application:
        maxAge: 1d
      infra:
        maxAge: 7d
      audit:
        maxAge: 7d
    elasticsearch:
      nodeCount: 3
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      storage:
        storageClassName: local-sc
        size: 50G
      redundancyPolicy: "SingleRedundancy"
      resources:
        limits:
          memory: 3Gi
        requests:
          memory: 3Gi
  visualization:
    type: "kibana"
    kibana:
      replicas: 1
  curation:
    type: "curator"
    curator:
      schedule: "30 3 * * *"
  collection:
    logs:
      type: "fluentd"
      fluentd: {}
oc project openshift-logging
oc get pods

If successful, the PVC should be claimed and bound:

oc get pvc
NAME                                         STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
elasticsearch-elasticsearch-cdm-lk4f9958-1   Bound    local-pv-d4d267e6   50Gi       RWO            local-sc       22s
elasticsearch-elasticsearch-cdm-lk4f9958-2   Bound    local-pv-297ca047   50Gi       RWO            local-sc       22s
elasticsearch-elasticsearch-cdm-lk4f9958-3   Bound    local-pv-6317e505   50Gi       RWO            local-sc       22s

Log Forwarding

To test log forwarding in a lab environment, external services need to be deployed and configured to receive the logs.

Elasticsearch
Deploy External EFK

Create a Virtual Machine; this example uses 4 CPU cores, 8GB of memory and 60GB of storage with bridged networking, so the IP address of the EFK VM is on the same network as my OpenShift 4.6 home lab.

Assuming CentOS 8.2 is installed on the VM, make sure all is up-to-date:

dnf update -y
reboot

Install Java:

dnf install java-11-openjdk-devel -y

Add EPEL:

dnf install epel-release -y

To reduce steps in this document and remove potential issues, disable both SELinux and firewalld:

vi /etc/sysconfig/selinux
SELINUX=disabled
systemctl stop firewalld
systemctl disable firewalld

Elasticsearch

Add the Elasticsearch repository:

vi /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Import the key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Install Elasticsearch:

dnf install elasticsearch -y

Back up the original configuration:

cp /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml.original

Strip out the noise:

grep -v -e '^#' -e '^$' /etc/elasticsearch/elasticsearch.yml.original > /etc/elasticsearch/elasticsearch.yml

Add the following settings to expose Elasticsearch to the network:

cluster.name: my-efk
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
transport.host: localhost
transport.tcp.port: 9300
http.port: 9200
network.host: 0.0.0.0
cluster.initial_master_nodes: node-1

Start and enable the service:

systemctl enable elasticsearch.service --now

Kibana

Install Kibana:

dnf install kibana -y

Back up the original configuration:

cp /etc/kibana/kibana.yml /etc/kibana/kibana.yml.original

Update the configuration for the Elasticsearch host:

vi /etc/kibana/kibana.yml
elasticsearch.hosts: ["http://localhost:9200"]

Start and enable Kibana:

systemctl enable kibana.service --now

NGINX

Install NGINX:

dnf install nginx -y

Create a user name and password for Kibana:

echo "kibana:`openssl passwd -apr1`" | tee -a /etc/nginx/htpasswd.kibana

Back up the original configuration:

cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.original

Add the following configuration:

vi /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events {
    worker_connections 1024;
}
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    include /etc/nginx/conf.d/*.conf;
    server {
        listen 80;
        server_name _;
        auth_basic "Restricted Access";
        auth_basic_user_file /etc/nginx/htpasswd.kibana;
    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        }
    }
}

Start and enable NGINX:

systemctl enable nginx.service --now
Smoke testing

With all that in place, test Elasticsearch is up and running, the following should return a JSON response:

curl http://127.0.0.1:9200/_cluster/health?pretty

You should be able to access Kibana via a browser at the IP address of your instance, in my case http://192.168.0.70

Once in there, navigate to "Management" → "Stack Management", Under "Kibana" → "Index Patterns" and click "Create Index Pattern". This is where you will see various sources to index.
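
You can also list the indices Elasticsearch currently holds directly from the command line, which is handy when an expected index has not appeared:

curl http://192.168.0.70:9200/_cat/indices?v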

From a command line PUT an example data:

curl -X PUT "192.168.0.70:9200/characters/_doc/1?pretty" -H 'Content-Type: application/json' -d '{"name": "Mickey Mouse"}
curl -X PUT "192.168.0.70:9200/characters/_doc/2?pretty" -H 'Content-Type: application/json' -d '{"name": "Daffy Duck"}
curl -X PUT "192.168.0.70:9200/characters/_doc/3?pretty" -H 'Content-Type: application/json' -d '{"name": "Donald Duck"}
curl -X PUT "192.168.0.70:9200/characters/_doc/4?pretty" -H 'Content-Type: application/json' -d '{"name": "Bugs Bunny"}

In Kibana, when you go to "Create Index Pattern" as described before, you should now see that characters has appeared. Type characters* and click "Next step" to create the index pattern. Navigate to "Kibana" → "Discover" and, if you have more than one index pattern, select the characters* index from the drop-down menu (near the top left); you should see the data you PUT into Elasticsearch.

This pattern is what I use to see and add indexes to Kibana when adding forwarders.

For reference you can return individual results using:

curl -X GET "localhost:9200/characters/_doc/1?pretty"
Forwarding

Example of OCP 4.6 log forwarding of application logs to an external Elasticsearch stack:

vi log-forwarding.yaml
apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
   - name: elasticsearch-insecure
     type: "elasticsearch"
     url: http://192.168.0.70:9200
  pipelines:
   - name: application-logs
     inputRefs:
     - application
     outputRefs:
     - elasticsearch-insecure
     labels:
       logs: application
oc create -f log-forwarding.yaml
oc project openshift-logging
oc get pods
Rsyslog
Rsyslog receiver

To test rsyslog forwarding, configure rsyslog on a RHEL/CentOS 8 host. This example uses UDP with a DNS name of syslog.cluster.lab.com.

Rsyslog should already be enabled and running:

systemctl status rsyslog
vi /etc/rsyslog.conf

Uncomment the lines:

module(load="imudp")
input(type="imudp" port="514")

Add a rule for local0, something like:

local0.*                       /var/log/openshift.log

Either stop and disable firewalld or add the following rule:

firewall-cmd  --add-port=514/udp  --zone=public  --permanent
firewall-cmd --reload

Restart rsyslog:

systemctl restart rsyslog

Test the receiving

From any other Linux host, configure rsyslog to forward UDP:

vi /etc/rsyslog.conf
*.* @syslog.cluster.lab.com:514     # Use @ for UDP protocol
systemctl restart rsyslog

Send a test message:

logger -p local0.notice "Hello, this is test!"
Forwarding

Here is an example of creating a syslog forwarder for just a single project:

vi rsyslog-forwarder.yaml
apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  inputs:
    - application:
        namespaces:
          - my-project
      name: django-logger-logs
  outputs:
    - name: rsyslog-test
      syslog:
        appName: cluster-apps
        facility: local0
        msgID: cluster-id
        procID: cluster-proc
        rfc: RFC5424
        severity: debug
      type: syslog
      url: 'udp://192.168.0.145:514'
  pipelines:
    - inputRefs:
        - django-logger-logs
      labels:
        syslog: rsyslog-test
      name: syslog-test
      outputRefs:
        - rsyslog-test
oc create -f rsyslog-forwarder.yaml

Testing app

The following application was written to trigger events in log files for testing.

Create a new project:

oc new-project logging-project

Import my s2i-python38-container image:

oc import-image django-s2i-base-img --from quay.io/richardwalkerdev/s2i-python38-container --confirm

Deploy the application:

oc new-app --name django-logger django-s2i-base-img~https://github.com/richardwalkerdev/django-logger.git

And expose the route:

oc expose service/django-logger
Forwarding

With the testing application deployed, the following example combines forwarding to the external Elasticsearch (v7) and rsyslog. This example also includes "forwarding" to the EFK stack (v6) deployed on OpenShift by specifying default in the outputRefs. Moreover, the forwarding is limited to just the logging-project project/namespace.

vi log-forwarding.yaml
apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  inputs:
    - application:
        namespaces:
          - logging-project
      name: project-logs
  outputs:
    - name: elasticsearch-insecure
      type: elasticsearch
      url: 'http://192.168.0.70:9200'
    - name: rsyslog-insecure
      syslog:
        appName: cluster-apps
        facility: local0
        msgID: cluster-id
        procID: cluster-proc
        rfc: RFC5424
        severity: debug
      type: syslog
      url: 'udp://192.168.0.145:514'
  pipelines:
    - inputRefs:
        - project-logs
      labels:
        logs: application
      name: application-logs
      outputRefs:
        - elasticsearch-insecure
        - rsyslog-insecure
        - default
oc create -f log-forwarding.yaml

Going to the application, for example: http://django-logger-logging-project.apps.cluster.lab.com/

Logger app

Generate some logs by clicking the buttons.

Example from OCP EFK - Kibana v6:

Kibana V6

Example from External - Kibana v7:

Kibana V7

Example of rsyslog:

Rsyslog

Troubleshooting

oc project openshift-logging
oc get pods
Insufficient memory
oc describe pod elasticsearch-cdm-uz12dkcd-1-6cf9ff6cb9-945gg
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  33m   default-scheduler  0/6 nodes are available: 3 Insufficient memory, 3 node(s) didn't match node selector.

Resource limits set for Elasticsearch must be available on the nodes; either increase memory on the hosts or decrease the memory in the settings.
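
A quick way to compare the requested limits against what each node can actually offer, a sketch using custom columns:

oc get nodes -o custom-columns=NAME:.metadata.name,ALLOCATABLE_MEMORY:.status.allocatable.memory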

Delete cluster logging

Administration → Custom Resource Definitions → ClusterLogging → Instances

Delete the cluster logging instance.

MONITORING

Configuring the cluster monitoring stack on OpenShift Container Platform.

The document demonstrates deploying the monitoring stack using NFS storage.

NFS is NOT recommended and decent block storage should be used; refer to the official documentation and substitute the storage class for a recommended one.

NFS requirements

Prepare the following NFS shares:

/mnt/openshift/alertmanager1    192.168.0.1/24(rw,sync,no_wdelay,no_root_squash,insecure)
/mnt/openshift/alertmanager2    192.168.0.1/24(rw,sync,no_wdelay,no_root_squash,insecure)
/mnt/openshift/alertmanager3    192.168.0.1/24(rw,sync,no_wdelay,no_root_squash,insecure)
/mnt/openshift/prometheus1      192.168.0.1/24(rw,sync,no_wdelay,no_root_squash,insecure)
/mnt/openshift/prometheus2      192.168.0.1/24(rw,sync,no_wdelay,no_root_squash,insecure)

Ensure permissions on the share directories:

chmod 775 /mnt/openshift/*

Export the new shares with:

exportfs -arv

And confirm the shares are visible:

exportfs  -s
showmount -e 127.0.0.1

Alert manager

These steps demonstrate how to add persistent storage for Alert Manager.

Create PVs

With the shares available, add the Alert Manager PVs. NOTE: the accessModes is set to ReadWriteOnce:

vi alert-manager-nfs-pv.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alertmanager-pv1
spec:
  capacity:
    storage: 40Gi
  accessModes:
  - ReadWriteOnce
  nfs:
    path: /mnt/openshift/alertmanager1
    server: 192.168.0.15
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alertmanager-pv2
spec:
  capacity:
    storage: 40Gi
  accessModes:
  - ReadWriteOnce
  nfs:
    path: /mnt/openshift/alertmanager2
    server: 192.168.0.15
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alertmanager-pv3
spec:
  capacity:
    storage: 40Gi
  accessModes:
  - ReadWriteOnce
  nfs:
    path: /mnt/openshift/alertmanager3
    server: 192.168.0.15
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
oc create -f alert-manager-nfs-pv.yaml

Use oc get pv to display PVs.

Configure

First, check whether the cluster-monitoring-config ConfigMap object exists:

oc -n openshift-monitoring get configmap cluster-monitoring-config
Error from server (NotFound): configmaps "cluster-monitoring-config" not found

If not, create it:

vi cluster-monitoring-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
oc apply -f cluster-monitoring-config.yaml
oc -n openshift-monitoring get configmap cluster-monitoring-config
NAME                        DATA   AGE
cluster-monitoring-config   1      3s

This step is easier via the web console: amend Workloads → Config Maps (select project openshift-monitoring) → "cluster-monitoring-config" → YAML
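
Alternatively, the same ConfigMap can be edited from the CLI:

oc -n openshift-monitoring edit configmap cluster-monitoring-config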

Add the following:

data:
  config.yaml: |+
    alertmanagerMain:
      volumeClaimTemplate:
        metadata:
          name: alertmanager-claim
        spec:
          storageClassName: nfs
          resources:
            requests:
              storage: 40Gi

Take note of the storage size, storage class name and node selector (if applicable) for your environment.

Make sure you are in the right project:

oc project openshift-monitoring

You should see the three alertmanager-main pods recreating:

oc get pods
NAME                                           READY   STATUS              RESTARTS   AGE
alertmanager-main-0                            0/5     ContainerCreating   0          36s
alertmanager-main-1                            0/5     ContainerCreating   0          36s
alertmanager-main-2                            0/5     ContainerCreating   0          36s

And that the PVCs have been claimed:

oc get pvc
NAME                                     STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS   AGE
alertmanager-claim-alertmanager-main-0   Bound    alertmanager-pv1   40Gi       RWO            nfs            26h
alertmanager-claim-alertmanager-main-1   Bound    alertmanager-pv3   40Gi       RWO            nfs            26h
alertmanager-claim-alertmanager-main-2   Bound    alertmanager-pv2   40Gi       RWO            nfs            26h

Prometheus

Create PVs

Add the prometheus PVs:

vi prometheus-nfs-pv.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv1
spec:
  capacity:
    storage: 40Gi
  accessModes:
  - ReadWriteOnce
  nfs:
    path: /mnt/openshift/prometheus1
    server: 192.168.0.15
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv2
spec:
  capacity:
    storage: 40Gi
  accessModes:
  - ReadWriteOnce
  nfs:
    path: /mnt/openshift/prometheus2
    server: 192.168.0.15
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
oc create -f prometheus-nfs-pv.yaml
Configure

Again via the web console, amend Workloads → Config Maps (Select Project openshift-monitoring) → "cluster-monitoring-config" → YAML

And add the following:

    prometheusK8s:
      volumeClaimTemplate:
          metadata:
            name: prometheus-claim
          spec:
            storageClassName: nfs
            resources:
              requests:
                storage: 40Gi

Note, this is appended so the whole configuration should look like this:

data:
  config.yaml: |+
    alertmanagerMain:
      volumeClaimTemplate:
        metadata:
          name: alertmanager-claim
        spec:
          storageClassName: nfs
          resources:
            requests:
              storage: 40Gi
    prometheusK8s:
      volumeClaimTemplate:
          metadata:
            name: prometheus-claim
          spec:
            storageClassName: nfs
            resources:
              requests:
                storage: 40Gi

Once completed, you should see all the PVCs have been claimed:

oc get pvc
NAME                                     STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS   AGE
alertmanager-claim-alertmanager-main-0   Bound    alertmanager-pv1   40Gi       RWO            nfs            4m32s
alertmanager-claim-alertmanager-main-1   Bound    alertmanager-pv3   40Gi       RWO            nfs            4m32s
alertmanager-claim-alertmanager-main-2   Bound    alertmanager-pv2   40Gi       RWO            nfs            4m32s
prometheus-claim-prometheus-k8s-0        Bound    prometheus-pv1     40Gi       RWO            nfs            14s
prometheus-claim-prometheus-k8s-1        Bound    prometheus-pv2     40Gi       RWO            nfs            14s

And everything running correctly:

oc get pods
NAME                                           READY   STATUS    RESTARTS   AGE
alertmanager-main-0                            5/5     Running   0          26h
alertmanager-main-1                            5/5     Running   0          26h
alertmanager-main-2                            5/5     Running   0          26h
cluster-monitoring-operator-75f6b78475-4f4s9   2/2     Running   3          2d2h
grafana-74564f7ff4-sqw8g                       2/2     Running   0          2d2h
kube-state-metrics-b6fb95865-hzsst             3/3     Running   0          2d2h
node-exporter-ccmbm                            2/2     Running   0          2d2h
node-exporter-n5sdt                            2/2     Running   0          2d2h
node-exporter-psbt4                            2/2     Running   0          2d2h
openshift-state-metrics-5894b6c4df-fv9km       3/3     Running   0          2d2h
prometheus-adapter-58d9999987-9lltc            1/1     Running   0          27h
prometheus-adapter-58d9999987-lhtvc            1/1     Running   0          27h
prometheus-k8s-0                               7/7     Running   1          26h
prometheus-k8s-1                               7/7     Running   1          26h
prometheus-operator-68f6b4f6bb-4mxcn           2/2     Running   0          47h
telemeter-client-79d6fc74c-wjqgw               3/3     Running   0          2d2h
thanos-querier-66f4b4c758-2z4f6                4/4     Running   0          2d2h
thanos-querier-66f4b4c758-fsqfz                4/4     Running   0          2d2h

Node selectors

If using infra nodes, add node selectors to the configuration, here is a complete example for OCP 4.6:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |+
    alertmanagerMain:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      volumeClaimTemplate:
        metadata:
          name: alertmanager-claim
        spec:
          storageClassName: local-sc
          resources:
            requests:
              storage: 40Gi
    prometheusK8s:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      volumeClaimTemplate:
          metadata:
            name: prometheus-claim
          spec:
            storageClassName: local-sc
            resources:
              requests:
                storage: 40Gi
    prometheusOperator:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    grafana:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    k8sPrometheusAdapter:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    kubeStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    telemeterClient:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    openshiftStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    thanosQuerier:
      nodeSelector:
        node-role.kubernetes.io/infra: ""

CERTIFICATES

This section replicates a local Certificate Authority (CA) for generating SSL certificates and applying them to OpenShift. The first step is to generate a root certificate and a private key. Add the root certificate to any host, and certificates generated and signed by the CA will then be inherently trusted.

Local CA

On a local Linux client, generate a private key; you’ll be prompted for a pass phrase:

openssl genrsa -des3 -out local_ca.key 2048
Generating RSA private key, 2048 bit long modulus (2 primes)
.............+++++
.......................................................................................................................+++++
e is 65537 (0x010001)
Enter pass phrase for local_ca.key: changeme
Verifying - Enter pass phrase for local_ca.key: changeme

This generates the private key file local_ca.key.

Generate a root certificate:

openssl req -x509 -new -nodes -key local_ca.key -sha256 -days 1825 -out local_ca.pem

Enter the pass phrase you just set; I used the following bogus details:

Country Name (2 letter code) [XX]:UK
State or Province Name (full name) []:CA County
Locality Name (eg, city) [Default City]:CA City
Organization Name (eg, company) [Default Company Ltd]:Local Certificate Authority
Organizational Unit Name (eg, section) []:CA Unit
Common Name (eg, your name or your server's hostname) []:ca.local
Email Address []:noreply@ca.local

View root certificate:

openssl x509 -in local_ca.pem --text

Install root certificate

On a local Linux RHEL 8/CentOS 8 client:

sudo cp local_ca.pem /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust extract

Signed certificate

Create a private key:

openssl genrsa -out cluster.lab.com.key 2048

Create a CSR, with a Common Name (CN) in this case cluster.lab.com:

openssl req -new -key cluster.lab.com.key -out cluster.lab.com.csr
This example relies on the alt_names; you might wish to create two certificates with the Common Names *.apps.cluster.lab.com and api.cluster.lab.com.
Country Name (2 letter code) [XX]:UK
State or Province Name (full name) []:OCP County
Locality Name (eg, city) [Default City]:OCP City
Organization Name (eg, company) [Default Company Ltd]:OpenShift Container Platform
Organizational Unit Name (eg, section) []:OCP Unit
Common Name (eg, your name or your server's hostname) []:cluster.lab.com
Email Address []:noreply@cluster.lab.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

Create a configuration file to define the Subject Alternative Name (SAN) extension; this allows multiple, alternative DNS validations. For OpenShift two are required: one for the ingress traffic for users to access deployed applications, and a second for the API. This method means one certificate can be used for both cases.

vi cluster.lab.com.ext
authorityKeyIdentifier=keyid,issuer
basicConstraints=CA:FALSE
keyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment
subjectAltName = @alt_names

[alt_names]
DNS.1 = *.apps.cluster.lab.com
DNS.2 = api.cluster.lab.com

Create the certificate; you’ll be prompted for the root certificate pass phrase again:

openssl x509 -req -in cluster.lab.com.csr -CA local_ca.pem -CAkey local_ca.key -CAcreateserial -out cluster.lab.com.crt -days 825 -sha256 -extfile cluster.lab.com.ext

You should have all these files:

cluster.lab.com.crt
cluster.lab.com.csr
cluster.lab.com.ext
cluster.lab.com.key
local_ca.key
local_ca.pem
local_ca.srl

An additional recommended, yet optional, step is to add the root certificate at the end of the new client certificate file:

cat local_ca.pem >> cluster.lab.com.crt

View the final certificate:

openssl x509 -in cluster.lab.com.crt -text

Verify the certificate:

openssl verify -CAfile local_ca.pem cluster.lab.com.crt
cluster.lab.com.crt: OK

Ingress certificate

Create a secret in the openshift-ingress namespace containing both the certificate and private key:

oc create secret tls apps-cert --cert=cluster.lab.com.crt --key=cluster.lab.com.key -n openshift-ingress

And apply the patch, making sure the name matches the name of the secret just added, in this case apps-cert:

oc patch ingresscontroller.operator default --type=merge -p '{"spec":{"defaultCertificate": {"name": "apps-cert"}}}' -n openshift-ingress-operator

To view the changes taking place:

oc project openshift-ingress

You should see the two router pods rebuild:

oc get pods
NAME                             READY   STATUS    RESTARTS   AGE
router-default-8d9fbbfb7-55xpt   1/1     Running   0          102s
router-default-8d9fbbfb7-w6zhn   1/1     Running   0          118s
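
Once the routers have redeployed, the new wildcard certificate can be checked against any route, for example the console (assuming the default console route name):

openssl s_client -connect console-openshift-console.apps.cluster.lab.com:443 -servername console-openshift-console.apps.cluster.lab.com </dev/null | openssl x509 -noout -subject -issuer -dates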

API certificate

Create a secret in the openshift-config namespace containing both the certificate and private key:

oc create secret tls api-cert --cert=cluster.lab.com.crt --key=cluster.lab.com.key -n openshift-config

Again, apply the patch, making sure the name matches the name of the secret just added, in this case api-cert, and the domain matches your API URL, in this case api.cluster.lab.com:

oc patch apiserver cluster --type=merge -p '{"spec":{"servingCerts": {"namedCertificates":[{"names": ["api.cluster.lab.com"], "servingCertificate": {"name": "api-cert"}}]}}}'

To see the effect of the previous patch:

oc get apiserver cluster -o yaml
spec:
  servingCerts:
    namedCertificates:
    - names:
      - api.cluster.lab.com
      servingCertificate:
        name: api-cert

To view the changes taking place:

oc project openshift-kube-apiserver
oc get pods

You should see three kube-apiserver pods redeploy (this took a while for me):

kube-apiserver-master1.cluster.lab.com       4/4     Running     0          3m58s
kube-apiserver-master2.cluster.lab.com       4/4     Running     0          10m
kube-apiserver-master3.cluster.lab.com       4/4     Running     0          7m11s

Once all three pods have completed redeployment, check and validate the certificate has been applied:

openssl s_client -connect api.cluster.lab.com:6443

And/or:

curl -vvI https://api.cluster.lab.com:6443

Sometimes the trusted certs on a client don’t take full effect; you can provide the CA certificate explicitly:

curl --cacert local_ca.pem -vvI https://api.cluster.lab.com:6443

Test logging in:

oc login -u admin -p changeme https://api.cluster.lab.com:6443

Another trick is to specify your certificate-authority certificate:

oc login --certificate-authority=ca.crt https://api.cluster.lab.com:6443

Replace certificates

To replace certificates the following commands can be used:

Example for ingress (*.apps):

oc create secret tls apps-cert --cert=api.cluster.lab.com.crt --key=api.cluster.lab.com.key -n openshift-ingress --dry-run=client -o yaml| oc replace -f -

Example for api:

oc create secret tls api-cert --cert=api.cluster.lab.com.crt --key=api.cluster.lab.com.key -n openshift-config --dry-run=client -o yaml| oc replace -f -

Add CA Bundle

Using your CA certificate:

vi user-ca-bundle.yaml
apiVersion: v1
data:
  ca-bundle.crt: |
    -----BEGIN CERTIFICATE-----
    MIIEKTCCAxGgAwIBAgIUTO5Cn1LKQtoaWrfcOnHSdRBmpvwwDQYJKoZIhvcNAQEL
    BQAwgaMxCzAJBgNVBAYTAlVLMRMwEQYDVQQIDApPQ1AgQ291bnR5MREwDwYDVQQH
    DAhPQ1AgQ2l0eTElMCMGA1UECgwcT3BlblNoaWZ0IENvbnRhaW5lciBQbGF0Zm9y
    bTERMA8GA1UECwwIT0NQIFVuaXQxETAPBgNVBAMMCGNhLmxvY2FsMR8wHQYJKoZI
    hvcNAQkBFhBub3JlcGx5QGNhLmxvY2FsMB4XDTIwMTExMjE1NDYxNVoXDTI1MTEx
    MTE1NDYxNVowgaMxCzAJBgNVBAYTAlVLMRMwEQYDVQQIDApPQ1AgQ291bnR5MREw
    DwYDVQQHDAhPQ1AgQ2l0eTElMCMGA1UECgwcT3BlblNoaWZ0IENvbnRhaW5lciBQ
    bGF0Zm9ybTERMA8GA1UECwwIT0NQIFVuaXQxETAPBgNVBAMMCGNhLmxvY2FsMR8w
    HQYJKoZIhvcNAQkBFhBub3JlcGx5QGNhLmxvY2FsMIIBIjANBgkqhkiG9w0BAQEF
    AAOCAQ8AMIIBCgKCAQEAuSidKVFVoKFv3QBHTTgjfhPyvsL4O8H530ehb7iap71b
    Bw2bzxSnrB84Vh4EeZ+pF4cAfK8jquvq2kJjPOGzuflc0aAVWzq6DYJLGRP5T6Sw
    v8Zzlnf0EwSBQRxKs3MNlfM36uRkJMsTxxlKYsBsMP51bT9PNYzPqQ6WcDZyclf+
    OGhnb2uUDud9oGLVapeHfibiyfSahgnnds3UyjWtYUP3sgWDPCKOpXIqFGcCqdfs
    rgRndEq6Leu3/yxnxNwQmB5v3+XAUybUSU8U+cJDYrsyxu5wtYDI75Eo6ocIbVxx
    T+waMwQPLhzMv8YfhNn91l4S0lHR5GL1c7RY3ms+xQIDAQABo1MwUTAdBgNVHQ4E
    FgQURK6H+pSQSQqce0NyZiEbVjbCdukwHwYDVR0jBBgwFoAURK6H+pSQSQqce0Ny
    ZiEbVjbCdukwDwYDVR0TAQH/BAUwAwEB/zANBgkqhkiG9w0BAQsFAAOCAQEAhMRz
    F+e6pV7eQXyyiExIMoTI3hqubsRTANmNbXkNBrCRswUoe7T1F3146G9B2wAFQAtH
    vda4NcS+i1yW4QG0cgnfJRcPsRnTSEmezia4aHn7vUW3oA8HGL47zc+tlQPV6EKd
    hjtdH8R2GIB5CeBhEp1I9DuX2owWEemAnrZxfGjQJxTvCEOprkCJBzozNumwMhZZ
    gmzBUeKYbQHVH0oGATGqKph8X36NGPtUdIDY80INThMS0XvvH7fndX1HOEuB37mn
    UW7CPsnoMWXf8SsPon4g6aMuKsDpKUqsuvT3RNFHofZIBnqXdCYzbdYbzrZ5ppBH
    sx3KXS6+lZijVVMwoA==
    -----END CERTIFICATE-----
kind: ConfigMap
metadata:
  name: user-ca-bundle
  namespace: openshift-config
oc create -f user-ca-bundle.yaml

Now edit the cluster proxy configuration (even though a proxy might not be in use):

oc edit proxy/cluster
This causes a scheduled reboot of all your nodes.

Replace the spec: with:

spec:
  trustedCA:
    name: user-ca-bundle

This change is a Machine Config change and adds the bundle to each node’s ca-trust.
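
You can watch the Machine Config Pools roll the change out across the nodes:

oc get machineconfigpool
watch oc get nodes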

SSH to a node, for example:

ssh -i cluster_id_rsa core@master1.cluster.lab.com
sudo su -

The following file gets updated with your certificates:

/etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
openssl x509 -in openshift-config-user-ca-bundle.crt --text

Once all the nodes have rebooted, your CA bundle is included.

IDENTITY PROVIDERS

It is essential to break down OpenShift components and concepts into digestible chunks and avoid the risk of being overwhelmed with complexity.

  • An Identity provider deals with the authentication layer and is responsible for identifying a user.

  • The authorisation layer determines whether requests are honoured; Role-Based Access Control (RBAC) policy determines what a user is authorised to do.

The combination of groups and roles deals with authorisation.
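
For example, granting the view role to a group within a single project ties the two layers together (the group and project names here are purely illustrative):

# "qa-team" and "my-project" are illustrative names
oc adm policy add-role-to-group view qa-team -n my-project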

HTPasswd

On a Linux client install the tools:

dnf install httpd-tools -y

Create an HTPasswd file containing users:

htpasswd -c -B -b users.htpasswd admin changeme
htpasswd -b users.htpasswd tom changeme
htpasswd -b users.htpasswd dick changeme
htpasswd -b users.htpasswd harry changeme

Which should look something like this:

cat users.htpasswd
admin:$2y$05$GTvOfcm2An9XdAIyDtwzGOvjGrroac78.NHrDdySO0KOBKAPaYGgi
tom:$apr1$kouuYCYa$wlB2AB4.Ykxn/4QgHUtD9.
dick:$apr1$IETeTG0v$g0P0gqR6aQJTCaGS15QWa0
harry:$apr1$qhyrJZzc$HBCYSf9OFHRpM6he0LJ9k.

The following command runs locally and generates the needed yaml file for OpenShift:

oc create secret generic htpasswd-secret --from-file=htpasswd=users.htpasswd -n openshift-config -o yaml --dry-run=client > htpasswd-secret.yaml

Which can then be used to create or replace the secret:

oc create -f htpasswd-secret.yaml
oc replace -f htpasswd-secret.yaml

For reference, if you wish to extract an existing htpasswd file out of OpenShift, use the following:

oc get secret htpasswd-secret -ojsonpath={.data.htpasswd} -n openshift-config | base64 -d > test.htpasswd
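
The extracted file can then be amended and pushed back; for example, adding a hypothetical user and replacing the secret in one go:

# "newuser" is an illustrative account name
htpasswd -b test.htpasswd newuser changeme
oc create secret generic htpasswd-secret --from-file=htpasswd=test.htpasswd -n openshift-config -o yaml --dry-run=client | oc replace -f -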

Next, either via the web console, Administration → Cluster Settings → Global Configuration → OAuth → YAML

Or via the command line:

vi oauth.yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpasswd-secret
oc replace -f oauth.yaml

Cluster admin

Using RBAC to add the role cluster-admin to the new admin account:

oc adm policy add-cluster-role-to-user cluster-admin admin

If the account has not been used to log into the cluster, a warning will be displayed:

Warning: User 'admin' not found
clusterrole.rbac.authorization.k8s.io/cluster-admin added: "admin"

Log into OpenShift as each user first to "create the user" in OpenShift and avoid the warning.
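
For example, a quick login as one of the users defined earlier:

oc login -u tom -p changeme https://api.cluster.lab.com:6443
oc whoami

Remember to log back in as a cluster administrator before running the policy commands below.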

oc adm policy add-cluster-role-to-user edit tom
clusterrole.rbac.authorization.k8s.io/edit added: "tom"

You can also add users limited to projects:

oc adm policy add-role-to-user edit harry -n logging-sensitive-data

Remove kubeadmin

OpenShift clusters are deployed with an install-generated kubeadmin account. Once identity providers are fully configured, it is recommended security best practice to remove this default account.

The kubeadmin password is stored in cluster/auth/kubeadmin-password.

Ensuring you have added at least one other user with cluster-admin role, the kubeadmin account can be removed using:

oc delete secrets kubeadmin -n kube-system

LDAP

Deploy LDAP

This process covers deploying an application using container images, in this case a basic LDAP service for testing identity providers. Such a service would always be external to the cluster. This example uses the image by Rafael Römhild: https://github.com/rroemhild/docker-test-openldap

On another host on the same subnet as the cluster and load balancer, pull the image:

podman pull docker.io/rroemhild/test-openldap

Create a pod:

podman pod create -p 389 -p 636 -n ldappod
If you see the error error from slirp4netns while setting up port redirection: map[desc:bad request: add_hostfwd: slirp_add_hostfwd failed], you need to add the following kernel parameter:
vi /etc/sysctl.conf
net.ipv4.ip_unprivileged_port_start = 0
sudo sysctl -p

Launch the container:

podman run --privileged -d --pod ldappod rroemhild/test-openldap

Open the firewall ports for LDAP for accessing it directly from external hosts:

firewall-cmd --permanent --add-port=389/tcp
firewall-cmd --permanent --add-port=636/tcp
firewall-cmd --reload

Test the service locally:

ldapsearch -h 127.0.0.1 -p 389 -D cn=admin,dc=planetexpress,dc=com -w GoodNewsEveryone -b "dc=planetexpress,dc=com" -s sub "(objectclass=*)"

Optionally, add a DNS entry ldap.cluster.lab.com and a load balancer in HAProxy:

frontend ldap
    bind 0.0.0.0:389
    option tcplog
    mode tcp
    default_backend ldap

backend ldap
    mode tcp
    balance roundrobin
    server ldap 192.168.0.15:389 check

List everything:

ldapsearch -h ldap.cluster.lab.com -p 389 -D cn=admin,dc=planetexpress,dc=com -w GoodNewsEveryone -b "dc=planetexpress,dc=com" -s sub "(objectclass=*)"

List only users returning only the common names and uid:

ldapsearch -h ldap.cluster.lab.com -p 389 -D cn=admin,dc=planetexpress,dc=com -w GoodNewsEveryone -x -s sub -b "ou=people,dc=planetexpress,dc=com" "(objectclass=inetOrgPerson)" cn uid

List only groups:

ldapsearch -h ldap.cluster.lab.com -p 389 -D cn=admin,dc=planetexpress,dc=com -w GoodNewsEveryone -x -s sub -b "ou=people,dc=planetexpress,dc=com" "(objectclass=Group)"
LDAP Identity Provider

Add a secret to OpenShift that contains the LDAP bind password:

Admin account:

cn=admin,dc=planetexpress,dc=com

Bind password:

GoodNewsEveryone

Create a secret called ldap-bind-password in the openshift-config name-space:

oc create secret generic ldap-bind-password --from-literal=bindPassword=GoodNewsEveryone -n openshift-config

Either use the web console to append the LDAP identity provider by navigating to Administration → Cluster Settings → Global Configuration → OAuth.

Or via the CLI:

oc project openshift-authentication
oc get OAuth
oc edit OAuth cluster

Below spec: add the - ldap: part, for example:

spec:
  identityProviders:
    - htpasswd:
        fileData:
          name: htpasswd-secret
      mappingMethod: claim
      name: htpasswd_provider
      type: HTPasswd
    - ldap:
        attributes:
          email:
            - mail
          id:
            - dn
          name:
            - cn
          preferredUsername:
            - uid
        bindDN: 'cn=admin,dc=planetexpress,dc=com'
        bindPassword:
          name: ldap-bind-password
        insecure: true
        url: 'ldap://ldap.cluster.lab.com/DC=planetexpress,DC=com?uid?sub?(memberOf=cn=admin_staff,ou=people,dc=planetexpress,dc=com)'
      mappingMethod: claim
      name: ldap
      type: LDAP

In the openshift-authentication project, there will be two pods named oauth-openshift-xxxxxxxxxx-xxxxx. These will be terminated and recreated every time you make a change to the configuration. Once changes are saved, expect to see something like this:

oc get pods
NAME                                 READY   STATUS              RESTARTS   AGE
oauth-openshift-7f95bc7996-5vl2z     1/1     Terminating         0          13m
oauth-openshift-7f95bc7996-854xb     1/1     Terminating         0          13m
oauth-openshift-ccd6bc654-mrbc6      1/1     Running             0          17s
oauth-openshift-ccd6bc654-qh29m      1/1     Running             0          7s

For troubleshooting issues you can tail the logs for each of the running pods, for example:

oc logs oauth-openshift-ccd6bc654-mrbc6 -f

The web console will now have a "Log in with" option for LDAP, and in this case, the user hermes (with password hermes) should be able to log in because that user is a member of the admin_staff group. Trying the user fry (with password fry) fails because they are NOT a member of the admin_staff group.

The example in this document is fundamental. In the real world, there is often trial and error; the key is being able to search LDAP and understand the directory information tree (DIT).

To include more than one group in the LDAP identity provider, you can use the following syntax:

ldap://ldap.cluster.lab.com/DC=planetexpress,DC=com?uid?sub??(|(memberOf=cn=admin_staff,ou=people,dc=planetexpress,dc=com)(memberOf=cn=ship_crew,ou=people,dc=planetexpress,dc=com))

This will allow any user from either the admin_staff or ship_crew group to log in.

Where TLS is in use, add a configmap in the openshift-config namespace:

oc create configmap ldap-ca-bundle --from-file=ca.crt=/root/ocp4/ssl/ca.crt -n openshift-config

Include the following options and use the ldaps syntax for port 636:

ca:
  name: ldap-ca-bundle
insecure: false
url: >-
  ldaps://ldap.cluster.lab.com/...
LDAP Group Sync

Two groups exist in the testing directory: admin_staff and ship_crew. To add groups in OpenShift that match those two groups in LDAP, automate the synchronisation within OpenShift at regular intervals using a Cron Job. The Cron Job needs a Service Account, a Cluster Role, a Cluster Role Binding and a ConfigMap.

Pre-testing

Before creating anything in OpenShift, try things out from the CLI first and make sure that the sync configuration (later stored in the ConfigMap as ldap-group-sync.yaml) is correct and returns the desired results.

Create a file called ldap_sync_config.yaml:

kind: LDAPSyncConfig
apiVersion: v1
url: ldap://ldap.cluster.lab.com:389
insecure: true
bindDN: "cn=admin,dc=planetexpress,dc=com"
bindPassword: "GoodNewsEveryone"
rfc2307:
    groupsQuery:
        baseDN: "ou=people,dc=planetexpress,dc=com"
        scope: sub
        filter: "(objectClass=Group)"
        derefAliases: never
    groupUIDAttribute: dn
    groupNameAttributes: [ cn ]
    groupMembershipAttributes: [ member ]
    usersQuery:
        baseDN: "ou=people,dc=planetexpress,dc=com"
        scope: sub
        derefAliases: never
    userUIDAttribute: dn
    userNameAttributes: [ uid ]
    tolerateMemberNotFoundErrors: true
    tolerateMemberOutOfScopeErrors: true

Experiment with ldap_sync_config.yaml using this safe "dry-run" command to get your desired results:

oc adm groups sync --sync-config=ldap_sync_config.yaml

Nothing is final or committed until you add --confirm to the command:

oc adm groups sync --sync-config=ldap_sync_config.yaml --confirm

The example provided should return the two groups admin_staff and ship_crew.

You could just run it once and create the groups in OpenShift as a one-off task, but in the real world, directories can be huge and often change with starters and leavers etc.

Cron Job

This requires the bind password in the project openshift-authentication:

oc create secret generic ldap-sync-bind-password --from-literal=bindPassword=GoodNewsEveryone -n openshift-authentication

The next three steps are generic: adding a Service Account, a Cluster Role and a Cluster Role Binding. They can be applied individually or amalgamated into one file to create all three in one go; I’ve split them out for clarity of each component:

Service Account:

vi ldap_sync_sa.yaml
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: ldap-group-syncer
  namespace: openshift-authentication
  labels:
    app: cronjob-ldap-group-sync
oc create -f ldap_sync_sa.yaml

Cluster Role:

vi ldap_sync_cr.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ldap-group-syncer
  labels:
    app: cronjob-ldap-group-sync
rules:
  - apiGroups:
      - ''
      - user.openshift.io
    resources:
      - groups
    verbs:
      - get
      - list
      - create
      - update
oc create -f ldap_sync_cr.yaml

Cluster Role Binding:

vi ldap_sync_crb.yaml
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ldap-group-syncer
  labels:
    app: cronjob-ldap-group-sync
subjects:
  - kind: ServiceAccount
    name: ldap-group-syncer
    namespace: openshift-authentication
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ldap-group-syncer
oc create -f ldap_sync_crb.yaml

ConfigMap:

This ConfigMap holds the ldap-group-sync.yaml file, the same sync configuration used earlier for testing and for synchronising groups manually from the CLI. The ConfigMap is a resource made available in OpenShift that the final Cron Job will utilise:

vi ldap_sync_cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ldap-group-syncer
  namespace: openshift-authentication
  labels:
    app: cronjob-ldap-group-sync
data:
  ldap-group-sync.yaml: |
    kind: LDAPSyncConfig
    apiVersion: v1
    url: ldap://ldap.cluster.lab.com:389
    insecure: true
    bindDN: "cn=admin,dc=planetexpress,dc=com"
    bindPassword:
      file: "/etc/secrets/bindPassword"
    rfc2307:
      groupsQuery:
          baseDN: "ou=people,dc=planetexpress,dc=com"
          scope: sub
          filter: "(objectClass=Group)"
          derefAliases: never
          pageSize: 0
      groupUIDAttribute: dn
      groupNameAttributes: [ cn ]
      groupMembershipAttributes: [ member ]
      usersQuery:
          baseDN: "ou=people,dc=planetexpress,dc=com"
          scope: sub
          derefAliases: never
          pageSize: 0
      userUIDAttribute: dn
      userNameAttributes: [ uid ]
      tolerateMemberNotFoundErrors: true
      tolerateMemberOutOfScopeErrors: true
oc create -f ldap_sync_cm.yaml

Add the Cron Job

vi ldap_sync_cj.yaml
kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: ldap-group-syncer
  namespace: openshift-authentication
  labels:
    app: cronjob-ldap-group-sync
spec:
  schedule: '*/2 * * * *'
  concurrencyPolicy: Forbid
  suspend: false
  jobTemplate:
    metadata:
      creationTimestamp: null
      labels:
        app: cronjob-ldap-group-sync
    spec:
      backoffLimit: 0
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: cronjob-ldap-group-sync
        spec:
          restartPolicy: Never
          activeDeadlineSeconds: 500
          serviceAccountName: ldap-group-syncer
          schedulerName: default-scheduler
          terminationGracePeriodSeconds: 30
          securityContext: {}
          containers:
            - name: ldap-group-sync
              image: 'openshift/origin-cli:latest'
              command:
                - /bin/bash
                - '-c'
                - >-
                  oc adm groups sync
                  --sync-config=/etc/config/ldap-group-sync.yaml --confirm
              resources: {}
              volumeMounts:
                - name: ldap-sync-volume
                  mountPath: /etc/config
                - name: ldap-sync-bind-password
                  mountPath: /etc/secrets
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              imagePullPolicy: Always
          serviceAccount: ldap-group-syncer
          volumes:
            - name: ldap-sync-volume
              configMap:
                name: ldap-group-syncer
                defaultMode: 420
            - name: ldap-sync-bind-password
              secret:
                secretName: ldap-sync-bind-password
                defaultMode: 420
          dnsPolicy: ClusterFirst
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
oc create -f ldap_sync_cj.yaml

You can now pick out the key lines in this file to make sense of how it all ties together. It uses the service account created earlier:

serviceAccountName: ldap-group-syncer

Mounts a volume for the ldap-group-sync.yaml file:

--sync-config=/etc/config/ldap-group-sync.yaml --confirm

And mounts password as a file:

    bindPassword:
      file: "/etc/secrets/bindPassword"

Study the volumeMounts, volumes and the command; it should be clear how all the components fit together.

The first scheduled run will kick in after the designated time, in this case two minutes, and takes a little longer to complete because it has to pull the image openshift/origin-cli:latest. Subsequent runs will be much quicker.

Testing

Test the schedule by deleting one of the groups under User Management → Groups, wait for the Cron Job to run, and the group should get successfully recreated. Monitor the events to follow the status:

oc project openshift-authentication
oc get events --watch

List the cronjob

oc get cronjobs.batch

Trigger a job run:

oc create job --from=cronjob/ldap-group-syncer test-sync-1
RBAC

Bind, for example, the cluster-admin OpenShift role to the admin_staff group:

oc adm policy add-cluster-role-to-group cluster-admin admin_staff

And for example basic-user to the ship_crew group:

oc adm policy add-cluster-role-to-group basic-user ship_crew

Log into OpenShift with different accounts to test out the results.

For example, cluster administrators:

hermes/hermes
professor/professor

And basic users:

fry/fry
leela/leela

Make sure the LDAP Identity provider is configured to include both groups for basic users:

ldap://ldap.cluster.lab.com/DC=planetexpress,DC=com?uid?sub??(|(memberOf=cn=admin_staff,ou=people,dc=planetexpress,dc=com)(memberOf=cn=ship_crew,ou=people,dc=planetexpress,dc=com))

ETCD

Encryption

oc edit apiserver
spec:
  encryption:
    type: aescbc
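
Alternatively, the same change can be applied with a patch:

oc patch apiserver cluster --type=merge -p '{"spec":{"encryption":{"type":"aescbc"}}}'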

Check the Encrypted status condition of the OpenShift API server:

oc get openshiftapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="Encrypted")]}{.reason}{"\n"}{.message}{"\n"}'

Check the Encrypted status condition of the Kubernetes API server:

oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="Encrypted")]}{.reason}{"\n"}{.message}{"\n"}'

Backups

Change project:

oc project openshift-config

Create Service Account:

oc create sa approver

Make service account cluster-admin:

oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:openshift-config:approver
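
An optional sanity check that the binding took effect (the approver service account lives in the openshift-config project used above):

oc auth can-i '*' '*' --as=system:serviceaccount:openshift-config:approver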

Add service account to scc "privileged":

oc edit scc privileged

Example, under users:

users:
- system:admin
- system:serviceaccount:openshift-infra:build-controller
- system:serviceaccount:openshift-config:approver

Provision an NFS share for backups. Ref: https://www.richardwalker.dev/pragmatic-openshift/#_nfs_server

Example for /etc/exports on NFS server:

/mnt/openshift/backups          192.168.0.1/24(rw,sync,no_wdelay,no_root_squash,insecure)

Create directory on NFS server:

mkdir /mnt/openshift/backups
chmod 775 /mnt/openshift/backups

Create a PV using nfs storage class for backups:

vi backups-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: backups-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany
  nfs:
    path: /mnt/openshift/backups
    server: 192.168.0.15
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
oc create -f backups-pv.yaml
oc get pv

Create a PVC for backups:

vi backup-nfs-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: etcd-backup
  namespace: openshift-config
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  storageClassName: nfs
  mountOptions:
    - nfsvers=4.2
oc create -f backup-nfs-pvc.yaml
oc get pvc

Create a ConfigMap containing a customised copy of cluster-backup.sh.

At the end of this script is a modification. On each master node, cluster-backup.sh is unique, containing a hardcoded reference to its own host’s IP address, so the copy below is tweaked to obtain the IP address at runtime.

Take notice of the following lines:

IP_ADDR=$(hostname -i)
ETCDCTL_ENDPOINTS="https://${IP_ADDR}:2379" etcdctl snapshot save "${SNAPSHOT_FILE}"
Create the ConfigMap containing the customised cluster-backup.sh script:

vi backup-configmap.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: etcd-backup-script
  namespace: openshift-config
data:
  etcd-backup.sh: |
    #!/usr/bin/env bash

    ### Created by cluster-etcd-operator. DO NOT edit.

    set -o errexit
    set -o pipefail
    set -o errtrace

    # example
    # cluster-backup.sh $path-to-snapshot

    if [[ $EUID -ne 0 ]]; then
      echo "This script must be run as root"
      exit 1
    fi

    function usage {
      echo 'Path to backup dir required: ./cluster-backup.sh <path-to-backup-dir>'
      exit 1
    }

    # If the first argument is missing, or it is an existing file, then print usage and exit
    if [ -z "$1" ] || [ -f "$1" ]; then
      usage
    fi

    if [ ! -d "$1" ]; then
      mkdir -p "$1"
    fi

    # backup latest static pod resources
    function backup_latest_kube_static_resources {
      RESOURCES=("[email protected]")

      LATEST_RESOURCE_DIRS=()
      for RESOURCE in "${RESOURCES[@]}"; do
        # shellcheck disable=SC2012
        LATEST_RESOURCE=$(ls -trd "${CONFIG_FILE_DIR}"/static-pod-resources/"${RESOURCE}"-[0-9]* | tail -1) || true
        if [ -z "$LATEST_RESOURCE" ]; then
          echo "error finding static-pod-resource ${RESOURCE}"
          exit 1
        fi

        echo "found latest ${RESOURCE}: ${LATEST_RESOURCE}"
        LATEST_RESOURCE_DIRS+=("${LATEST_RESOURCE#${CONFIG_FILE_DIR}/}")
      done

      # tar latest resources with the path relative to CONFIG_FILE_DIR
      tar -cpzf "$BACKUP_TAR_FILE" -C "${CONFIG_FILE_DIR}" "${LATEST_RESOURCE_DIRS[@]}"
      chmod 600 "$BACKUP_TAR_FILE"
    }

    function source_required_dependency {
      local path="$1"
      if [ ! -f "${path}" ]; then
        echo "required dependencies not found, please ensure this script is run on a node with a functional etcd static pod"
        exit 1
      fi
      # shellcheck disable=SC1090
      source "${path}"
    }

    BACKUP_DIR="$1"
    DATESTRING=$(date "+%F_%H%M%S")
    BACKUP_TAR_FILE=${BACKUP_DIR}/static_kuberesources_${DATESTRING}.tar.gz
    SNAPSHOT_FILE="${BACKUP_DIR}/snapshot_${DATESTRING}.db"
    BACKUP_RESOURCE_LIST=("kube-apiserver-pod" "kube-controller-manager-pod" "kube-scheduler-pod" "etcd-pod")

    trap 'rm -f ${BACKUP_TAR_FILE} ${SNAPSHOT_FILE}' ERR

    source_required_dependency /etc/kubernetes/static-pod-resources/etcd-certs/configmaps/etcd-scripts/etcd.env
    source_required_dependency /etc/kubernetes/static-pod-resources/etcd-certs/configmaps/etcd-scripts/etcd-common-tools

    # TODO handle properly
    if [ ! -f "$ETCDCTL_CACERT" ] && [ ! -d "${CONFIG_FILE_DIR}/static-pod-certs" ]; then
      ln -s "${CONFIG_FILE_DIR}"/static-pod-resources/etcd-certs "${CONFIG_FILE_DIR}"/static-pod-certs
    fi

    IP_ADDR=$(hostname -i)

    #dl_etcdctl
    backup_latest_kube_static_resources "${BACKUP_RESOURCE_LIST[@]}"
    ETCDCTL_ENDPOINTS="https://${IP_ADDR}:2379" etcdctl snapshot save "${SNAPSHOT_FILE}"
    echo "snapshot db and kube resources are successfully saved to ${BACKUP_DIR}"
oc create -f backup-configmap.yaml

Before creating the cronjob, SSH to each master node and create a directory /mnt/backup:

ssh -i cluster_id_rsa core@master1.cluster.lab.com
sudo su -
mkdir /mnt/backup

SSH to a master node and get the correct quay.io/openshift-release-dev/ocp-v4.0-art-dev image from the file /etc/kubernetes/manifests/etcd-pod.yaml:

ssh -i cluster_id_rsa core@master1.cluster.lab.com
sudo su -
cat /etc/kubernetes/manifests/etcd-pod.yaml | grep quay.io/openshift-release-dev/ocp-v4.0-art-dev

Example:

spec:
  initContainers:
    - name: etcd-ensure-env-vars
      image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:326516b79a528dc627e5a5d84c986fd35e5f8ff5cbd74ff0ef802473efccd285

Adjust the schedule as needed, use the right image as found in the previous step, and ensure you have created the /mnt/backup directory on each master node:

vi backup-cronjob.yaml
kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: cronjob-etcd-backup
  namespace: openshift-config
  labels:
    purpose: etcd-backup
spec:
  schedule: "10 10 * * *"
  startingDeadlineSeconds: 200
  concurrencyPolicy: Forbid
  suspend: false
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        spec:
          nodeSelector:
            node-role.kubernetes.io/master: ''
          restartPolicy: Never
          activeDeadlineSeconds: 200
          serviceAccountName: approver
          hostNetwork: true
          containers:
            - resources:
                requests:
                  cpu: 300m
                  memory: 250Mi
              terminationMessagePath: /dev/termination-log
              name: etcd-backup
              command:
                - /bin/sh
                - '-c'
                - >-
                  /usr/local/bin/etcd-backup.sh /mnt/backup
              securityContext:
                privileged: true
              imagePullPolicy: IfNotPresent
              volumeMounts:
                - name: certs
                  mountPath: /etc/ssl/etcd/
                - name: conf
                  mountPath: /etc/etcd/
                - name: kubeconfig
                  mountPath: /etc/kubernetes/
                - name: etcd-backup-script
                  mountPath: /usr/local/bin/etcd-backup.sh
                  subPath: etcd-backup.sh
                - name: etcd-backup
                  mountPath: /mnt/backup
                - name: scripts
                  mountPath: /usr/local/bin
              terminationMessagePolicy: FallbackToLogsOnError
              image: >-
                quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c9487f25868eafe55b72932010afa4b2728955a3a326b4823a56b185dd10ec50
          serviceAccount: approver
          tolerations:
            - operator: Exists
              effect: NoSchedule
            - operator: Exists
              effect: NoExecute
          volumes:
            - name: certs
              hostPath:
                path: /etc/kubernetes/static-pod-resources/etcd-member
                type: ''
            - name: conf
              hostPath:
                path: /etc/etcd
                type: ''
            - name: kubeconfig
              hostPath:
                path: /etc/kubernetes
                type: ''
            - name: scripts
              hostPath:
                path: /usr/local/bin
                type: ''
            - name: etcd-backup
              persistentVolumeClaim:
                claimName: etcd-backup
            - name: etcd-backup-script
              configMap:
                name: etcd-backup-script
                defaultMode: 493
oc create -f backup-cronjob.yaml

List the cronjob:

oc get cronjobs.batch

Run the cronjob on-demand:

oc create job --from=cronjob/cronjob-etcd-backup test-backup-001

The PVC should now be claimed:

oc get pvc
NAME          STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
etcd-backup   Bound    backups-pv   50Gi       RWX            nfs            14m

The job should run and the backup should be created in the file share. On the NFS server (or mount the NFS share):

cd /mnt/openshift/backups/
-rw-------. 1 root root 140324896 Nov 27 11:18 snapshot_2020-11-28_111840.db
-rw-------. 1 root root     70093 Nov 27 11:18 static_kuberesources_2020-11-28_111840.tar.gz

SUMMARY

This document is likely to be updated and evolve. I hope the information and examples serve you well.