Kubernetes: Persistent Storage using NFS-client-provisioner

In this post, I will demonstrate how to use NFS-client as a provisioner. The NFS client provisioner is an automatic provisioner that leverages your existing NFS server to create persistent volumes. In this setup, we have one master, one worker node, and one NFS node.

Requirements:
An NFS server
A Kubernetes cluster

How to Set Up the NFS Server:
On the NFS node, perform the following steps.

sudo apt update
sudo apt upgrade -y

Next, we install the NFS kernel server and create the directory we will export:

sudo apt install nfs-kernel-server -y
sudo mkdir /var/nfs/general -p

Next, we change the ownership and group of the directory:

sudo chown nobody:nogroup /var/nfs/general

Next, we edit the file /etc/exports and add the line that follows.

/var/nfs/general     x.x.x.x(rw,sync,no_subtree_check) y.y.y.y(rw,sync,no_subtree_check)

Here x.x.x.x and y.y.y.y represent the IP addresses of the servers that will access the NFS server.

Next, we restart the NFS server:

sudo systemctl restart nfs-kernel-server
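
To confirm that the export is active, exportfs lists the current exports and their options:

sudo exportfs -v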

On each node that requires access to the NFS server, we need to install nfs-common:

sudo apt install nfs-common
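
To confirm that a client can actually reach the export, a quick manual mount test can be done from the client node (just a sketch; replace <nfs-server-ip> with your NFS server's address and unmount afterwards):

sudo mount -t nfs <nfs-server-ip>:/var/nfs/general /mnt
sudo touch /mnt/test-file && ls /mnt
sudo umount /mnt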

At this point, the NFS server is set up and the two nodes can access it.
Next, we deploy the NFS client provisioner into our Kubernetes cluster.

Requirements:
nfs-deploy.yaml #deploys the nfs-client provisioner
nfs-serviceaccount.yaml #creates the service account for the nfs-client provisioner
nfs-class.yaml #creates the storage class used by the nfs-client provisioner
nfs-clusterrole.yaml #creates the cluster role for the nfs-client provisioner
nfs-clusterrolebind.yaml #creates the cluster role binding for the nfs-client provisioner
test-claim.yaml #a claim used to test that we can now dynamically provision persistent volumes

We will create a directory called nfs-client to hold these files:

mkdir nfs-client
cd nfs-client

Next, create each of the files below.

nfs-deploy.yaml

kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: fuseim.pri/ifs
            - name: NFS_SERVER
              value: "<nfs-server-ip>"   # replace with the IP address of your NFS server
            - name: NFS_PATH
              value: /var/nfs/general
      volumes:
        - name: nfs-client-root
          nfs:
            server: "<nfs-server-ip>"    # replace with the IP address of your NFS server
            path: /var/nfs/general

nfs-serviceaccount.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner

nfs-class.yaml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
provisioner: fuseim.pri/ifs

nfs-clusterrole.yaml

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]

nfs-clusterrolebind.yaml

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io

test-claim.yaml

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
  annotations:
    volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi

Set NFS_SERVER and the nfs volume's server field in nfs-deploy.yaml to your NFS server's IP address. The value of PROVISIONER_NAME in nfs-deploy.yaml must match the provisioner field in nfs-class.yaml.

kubectl create -f nfs-client/
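
The test-claim above only creates the claim; to prove that data actually lands on the NFS export, a small pod can mount the claim and write a file. This is a minimal sketch (the busybox image and the test-pod.yaml file name are my own choices, not part of the original set of files):

test-pod.yaml

kind: Pod
apiVersion: v1
metadata:
  name: test-pod
spec:
  restartPolicy: Never
  containers:
  - name: test-pod
    image: busybox
    command: ["/bin/sh", "-c", "touch /mnt/SUCCESS && exit 0 || exit 1"]
    volumeMounts:
    - name: nfs-pvc
      mountPath: /mnt
  volumes:
  - name: nfs-pvc
    persistentVolumeClaim:
      claimName: test-claim

kubectl create -f test-pod.yaml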

On the NFS server, verify that a directory for the claim was created under /var/nfs/general and that the file written to the claim shows up inside it.

Kubernetes: Admission Controller

An admission controller intercepts requests to the Kubernetes API server: after authentication and authorization have passed, admission control attends to the request and gives us access to the content of the objects in it. Admission controllers can deny a request, validate it, or modify its content. They are enabled by passing the --admission-control flag (in versions before 1.10) or the --enable-admission-plugins flag (in 1.10 and later) to the API server, or by modifying the API server configuration file. The set of admission controllers enabled by default in Kubernetes release v1.9 is as follows.

--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota

Note that the order of the admission controllers in the --admission-control list matters.
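
On a kubeadm-provisioned cluster (an assumption; other installers place the flag elsewhere), the list is typically set in the API server's static pod manifest, which the kubelet reloads automatically when the file changes. A sketch of the relevant excerpt:

# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota
    # ...remaining flags unchanged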

Kubernetes: Custom Resource Definitions

A Custom Resource Definition, also known as a CRD, allows us to create custom resources. It is a way of extending the Kubernetes API to create resources for our own purposes, and it is one of the two ways to add custom resources, the other being aggregated APIs. CustomResourceDefinition currently lives under apiextensions.k8s.io/v1beta1 and is limited to the existing functionality of the API. A custom resource can be scoped to either the cluster or a namespace. Below is an example of a Custom Resource Definition, followed by a custom resource created from it.

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: networks.alphatribe.com
spec:
  group: alphatribe.com
  version: v1
  scope: Cluster
  names:
    plural: networks
    singular: network
    shortNames:
    - net
    kind: Network

And here is a Network custom resource created from this definition:

apiVersion: networks.alphatribe.com/v1
kind: Network
metadata:
  name: dev
spec:
  subnet: "10.5.2.0/24"
  bandwidthMb: 100
  enableARP: true
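
Once both manifests are applied, the new resource behaves like any built-in object and can be managed with kubectl. A short usage sketch (the file names here are placeholders for the two manifests above):

kubectl create -f network-crd.yaml
kubectl create -f network-dev.yaml
kubectl get networks        # or use the short name: kubectl get net
kubectl describe network dev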

Kubernetes: CNI

Container Network Interface (CNI) is a CNCF project that consists of a specification and libraries for writing network plugins to configure network interfaces in Linux containers. When a Kubernetes cluster is bootstrapped with kubeadm, it uses CNI as its default network plugin mechanism. Some of the popular CNI plugins are Flannel, Calico, Weave, Cilium, and Contiv networking. Kubernetes invokes each CNI plugin as an executable. Each plugin supports a different set of features, and it is up to the cluster administrator to determine which one is suitable for their environment.
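
On each node, the kubelet reads the CNI configuration and invokes the plugin binaries from well-known directories (these are the defaults; some distributions relocate them):

ls /etc/cni/net.d     # network configuration files, e.g. 10-flannel.conflist
ls /opt/cni/bin       # the plugin executables that Kubernetes invokes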

Kubernetes: Resource Quota

A ResourceQuota in Kubernetes is an object that allows us to limit the amount of resources pods can consume. Without a resource quota, pods can consume as much of the cluster's resources as are available. Once a resource quota is applied to a namespace, resource consumption is limited within that namespace. Setting a resource quota is important because it enables fair usage and a favorable QoS within the cluster. There are three kinds of resource quotas: compute, storage, and object count quotas. Once a resource quota is in place, a pod whose request would exceed it won't start. Here is an example of a ResourceQuota that sets compute, storage, and object count limits.

kind: ResourceQuota
apiVersion: v1
metadata:
  name: dev-quota
spec:
  hard:
   requests.cpu: "2"              
   requests.memory: 2Gi                     
   limits.cpu: "4"           
   limits.memory: 4Gi        
   requests.storage: 128Gi    
   pods: 3    
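
To take effect, the quota has to be created inside a namespace; kubectl describe then shows current usage against the hard limits. A usage sketch, assuming the manifest above is saved as dev-quota.yaml and applied to a namespace called dev:

kubectl create namespace dev
kubectl create -f dev-quota.yaml --namespace=dev
kubectl describe resourcequota dev-quota --namespace=dev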

Kubernetes: Statefulset

A StatefulSet is a Kubernetes resource that we use for stateful applications. It is a special kind of controller, similar to a ReplicaSet, that provides consistency and predictability to deployments with stateful data. Containers are inherently ephemeral: when a container dies, it loses all of its data, so we use volumes to persist the state of the containers in a pod beyond the life of the pod. Whenever we use persistent volumes with a StatefulSet, a storage class is needed for dynamic provisioning, and we use volumeClaimTemplates to map each pod to the storage that will be provisioned for it. A StatefulSet preserves a persistent volume while its pod is being replaced.

When a StatefulSet is created, the pods associated with it are assigned ordered names numbered from 0 up to N-1, and when the StatefulSet is terminated the pods are removed from N-1 down to 0 by default. We can set podManagementPolicy to Parallel if we want all pods launched or deleted at the same time. If a single pod in the StatefulSet is deleted, a replacement with the same name is created. Because of this predictable naming, Kubernetes easily associates specific pods with their network names and persistent volumes. Below is an example of a StatefulSet.

The headless service (clusterIP: None) that will be used by the StatefulSet pods:

apiVersion: v1
kind: Service
metadata:
  name: atribe-svc
  labels:
    app: hello
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: hello

The StatefulSet configuration with 3 replicas:

kind: StatefulSet
apiVersion: apps/v1beta1
metadata:
  name: aquatribe-statefulset
spec:
  serviceName: atribe-svc
  replicas: 3
  template:
    metadata:
      labels:
        app: hello
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: hello
        image: jonbaier/httpwhalesay:0.2
        command: ["node", "index.js", "Hello World!."]
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
      annotations:
        volume.beta.kubernetes.io/storage-class: solidstate
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
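
Once the StatefulSet is running, the ordered naming described above is easy to verify: the pods are named <statefulset-name>-0 through -2, and each pod gets its own claim named <volumeClaimTemplate-name>-<pod-name>. A quick check, assuming the manifests above were applied:

kubectl get pods -l app=hello     # aquatribe-statefulset-0, -1, -2
kubectl get pvc                   # www-aquatribe-statefulset-0, -1, -2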

Kubernetes: Security Contexts

A security context provides operating-system-level security restrictions for pods and containers; it is used to control what processes can do within a container. It can be set in the pod specification or in a container specification. Restrictions can be based on the filesystem group, the UID, and SELinux roles. Security settings applied to a pod also apply to the volumes attached to it. Below are some sample security contexts.

The first example sets a pod-level security context that enforces that no container within the pod runs as the root user.

kind: Pod
apiVersion: v1
metadata:
  name: hello
spec:
  securityContext:
    runAsNonRoot: true
  containers:
  - image: busybox
    name: busybox

In this example, the security context is set on the container itself: it makes the container privileged and also sets seLinuxOptions.

apiVersion: v1 
kind: Pod 
metadata: 
  name: hello
spec: 
  containers: 
  - image: busybox
    name: busybox
    securityContext:
      privileged: true
      seLinuxOptions:
        level: "s0:c123,c456"

Kubernetes: Pod Disruption Budget (PDB)

Pod Disruption Budget is a resource in Kubernetes that lets us manage disruption behavior in our cluster when voluntary disruptions are caused by cluster administrators or human operators. It provides a way to specify the minimum number of available pods, and as of Kubernetes version 1.7 it also allows us to specify the maximum number of unavailable pods in our cluster.

A PDB uses a label selector to specify the pods it assigns a budget to. The fields minAvailable and maxUnavailable specify, respectively, the minimum number of pods that must remain available and the maximum number of pods that may be unavailable. Either field can be given as an absolute number or as a percentage.

We use a PDB to place safety constraints on pods. It determines how cluster-level actions such as autoscaling, node draining, and pod-priority scheduling affect the pods that have a disruption budget assigned to them, and it is crucial for quorum-based applications. Pods can also be evicted through the eviction API, for example with a curl call. Below are examples of creating a Pod Disruption Budget and calling the eviction API.

tunde:~ babatundeolu-isa$  kubectl create pdb aquatribe-pdb --selector=app=aquatribe --min-available=3
poddisruptionbudget.policy "aquatribe-pdb" created

The same budget can also be expressed declaratively. A PDB manifest using minAvailable:

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: aquatribe-pdb
spec:
  selector:
    matchLabels:
      app: aquatribe
  minAvailable: 60%

Alternatively, the budget can be expressed with maxUnavailable (a PDB may set one of the two fields, not both):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: aquatribe-pdb
spec:
  selector:
    matchLabels:
      app: aquatribe
  maxUnavailable: 40%

tunde:~ babatundeolu-isa$ curl -k -v -H 'Content-type: application/json' https://localhost:6443/api/v1/namespaces/default/pods/busybox-59485cf84c-2kxms/eviction -d @disruption.json
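
The disruption.json payload sent to the eviction API is an Eviction object; a minimal sketch (the pod name matches the one in the URL above) looks like this:

{
  "apiVersion": "policy/v1beta1",
  "kind": "Eviction",
  "metadata": {
    "name": "busybox-59485cf84c-2kxms",
    "namespace": "default"
  }
}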