Project Calico recently made Calico for Windows open-source. Before it was only available through a subscription. This means there is now another open-source networking plugin next to Flannel (and cloud-specific ones like Azure-CNI) which you can use for your mixed Kubernetes cluster with both Linux and Windows nodes. Calico for Windows only supports VXLAN for now and like Flannel also uses overlay networking. In addition Calico also allows to create network policies to control the traffic in your cluster.
To quickly try it out I created Terraform definitons to automatically spin up a Kubernetes cluster on Azure for testing. If you want to set it up yourself, you can have a look at the provisioning scripts described later. After setting up the infrastructure we’ll have a look at how the network policies work.
TL;DR
You need to install the Azure CLI as well as Terraform. To setup the mixed cluster with Calico for Windows as a networking plugin run the following commands:
# Create a mixed Kubernetes cluster with Calico for Windows
git clone https://github.com/lippertmarkus/terraform-azurerm-k8s-calico-windows.git
cd terraform-azurerm-k8s-calico-windows
az login # log in to your Azure account
terraform init # initialize terraform
terraform apply -auto-approve # provision infrastructure
# Check the cluster and deploy an example Windows workload
ssh -i output/primary_pk azadmin@$(terraform output primary_ip)
kubectl get node
kubectl get pod -A
kubectl apply -f https://raw.githubusercontent.com/lippertmarkus/terraform-azurerm-k8s-calico-windows/master/example_workloads/win-webserver.yml
If you don’t want to know the details around the Calico for Windows installation, try setting up network policies.
Details
Kubernetes is a container orchestration system which makes deploying and managing containerized applications easy. A Kubernetes cluster consists of multiple nodes, mostly VMs or physical machines. Each of those nodes has one or many roles, for example to schedule the container workloads (Control Plane), storing the cluster data (etcd) or running workloads (Worker). Kubernetes supports both Linux and Windows worker nodes in mixed clusters.
Networking for Windows containers is done through Container Networking Interface (CNI) Plugins like Flannel or Calico for Windows which in a heterogeneous cluster must support both Windows and Linux nodes. Find out more about networking for Windows containers in the Kubernetes docs.
Creating a mixed cluster with Calico for Windows
The Terraform definitions spin up two VMs on Azure with a shared virtual network and public IPs to form a Kubernetes cluster:
- Linux VM
primary
using Ubuntu as a control plane, for etcd and for running Linux workloads - Windows VM
minion
using Windows Server 1903 with Docker preinstalled for running Windows workloads
To use the definitions you need to install the Azure CLI as well as Terraform and run the following commands to clone the Git repository, log in to your Azure account and provision the infrastructure:
# Create a mixed Kubernetes cluster with Calico for Windows
git clone https://github.com/lippertmarkus/terraform-azurerm-k8s-calico-windows.git
cd terraform-azurerm-k8s-calico-windows
az login # log in to your Azure account
terraform init # initialize terraform
terraform apply -auto-approve # provision infrastructure
The more or less interesting part happens inside the provisioning scripts. The script for the primary
node initializes a new Kubernetes cluster with kubeadm
through the steps outlined in the kubeadm
docs. Before the cluster is initialized the networking gets set up and the needed prerequisites docker
, kubelet
, kubeadm
and kubectl
are installed.
Afterwards Calico with etcd as a datastore is installed on the Linux node via the manifest like described in the Calico docs. I adapted the Calico manifest to use VXLAN and TLS for etcd. VXLAN is required for Calico for Windows and you also need to enable strict affinity for Calico to prevent Linux nodes from borrowing IP addresses from Windows nodes like stated in the docs and done by the provisioning script. The script also adds the secret values for the etcd TLS connection as well as the etcd endpoint itself to the manifest before installing it.
As the cluster is now created, the Windows node is automatically joined to the cluster. The PowerShell script which is run on the minion
Windows node first copies the kubeconfig as well as the etcd TLS certificates from the primary
node. The kubeconfig is needed by Calico for accessing the API server to correctly set up networking for each pod and the TLS certificates are used for securely connect to the etcd datastore. Lastly the installation script provided by Calico fully sets up Calico for Windows as well as the Kubernetes services and starts them. For creating the PowerShell provisioning script I followed the docs about the standard install as well as the quickstart and combined the information. The script mainly follows the documentation of the quickstart but this path unfortunately lacks options to automatically set up the calico-kubeconfig as well as TLS for etcd for self managed clusters although there are already functions for that. I therefore modified the installation script.
Accessing the cluster
After terraform apply
is finished, you can connect to the primary
node to access and manage the cluster as well as verify that everything works as expected:
ssh -i output/primary_pk azadmin@$(terraform output primary_ip)
kubectl get node
kubectl get pod -A
Now you can deploy a simple Windows workload to make sure Calico on Windows was installed properly:
kubectl apply -f https://raw.githubusercontent.com/lippertmarkus/terraform-azurerm-k8s-calico-windows/master/example_workloads/win-webserver.yml
The following network connections should work:
- Node-to-Pod from the
primary
Linux node using the Pod IP - Pod-to-Pod on the same node and across nodes
- Pod-to-Service from within a Pod and Service-to-Pod from the
primary
Linux node using the cluster IP of the service - Service Discovery via DNS from within a Pod
- Inbound-to-Pod from the
primary
Linux node using theminion
IP and the services’ NodePort
Like for Flannel you can’t access the Pod IP, the services’ cluster IP or the NodePort on the Windows container host. Let’s have a look at the Pod-to-Pod connections and how we can control them by applying network policies.
Creating network policies
The Calico docs have a simple guide illustrating the use of network policies. I adapted the examples a bit to work for Windows Server 1903 and to make them a bit more easy to understand. Let’s see how they are used by following the steps:
- Deploy a client and server Pod on each the Linux and Windows node
kubectl apply -f https://raw.githubusercontent.com/lippertmarkus/terraform-azurerm-k8s-calico-windows/master/example_workloads/policy-demo-pods.yml
- Verify the connection from
linux-client
towindows-server
:kubectl exec -n calico-demo linux-client -- nc -vz $(kubectl get po windows-server -n calico-demo -o 'jsonpath={.status.podIP}') 80 # This should succeed with output similar to # 192.168.115.74 (192.168.115.74:80) open
- Verify the connection from
windows-client
tolinux-server
:kubectl exec -n calico-demo windows-client -- powershell Invoke-WebRequest -Uri http://$(kubectl get po linux-server -n calico-demo -o 'jsonpath={.status.podIP}') -UseBasicParsing -TimeoutSec 5 # This should succeed with output similar to # ... # <title>Welcome to nginx!</title> # ...
- Verify the connection from
windows-client
towindows-server
:kubectl exec -n calico-demo windows-client -- powershell Invoke-WebRequest -Uri http://$(kubectl get po windows-server -n calico-demo -o 'jsonpath={.status.podIP}') -UseBasicParsing -TimeoutSec 5 # This should succeed with output similar to # ... # <html><body><H1>Windows Container Web Server</H1><p>IP # ...
- Apply the network policy to only allow connections from
linux-client
towindows-server
:wget https://raw.githubusercontent.com/lippertmarkus/terraform-azurerm-k8s-calico-windows/master/example_workloads/policy-demo-policy.yml calicoctl apply -f policy-demo-policy.yml
- Verify that the
linux-client
is still able to reachwindows-server
:kubectl exec -n calico-demo linux-client -- nc -vz $(kubectl get po windows-server -n calico-demo -o 'jsonpath={.status.podIP}') 80
- Verify that other connections like e.g. from
windows-client
towindows-server
are not working:kubectl exec -n calico-demo windows-client -- powershell Invoke-WebRequest -Uri http://$(kubectl get po windows-server -n calico-demo -o 'jsonpath={.status.podIP}') -UseBasicParsing -TimeoutSec 5 # This should time out like # Invoke-WebRequest : The operation has timed out. # ...
That’s how you can isolate Pod-to-Pod traffic. You can remove the demo resources with kubectl delete ns calico-demo
.
Conclusion
It’s great to have another option next to Flannel for networking within mixed clusters. While the basic networking works fine, keep in mind that there are still limitations for Calico for Windows.
Comments