Kubernetes with Gcloud and Terraform

Recently I was looking into ways to create a container cluster on Google Cloud Platform. In this article I’ll walk through the setup and configuration of Kubernetes.

First, an exploratory command I ran per the instructions here.

gcloud container clusters create working-space \
--zone us-central1-a \
--additional-zones us-central1-b,us-central1-c

A container cluster listing.

This is the list of running instances providing the nodes for the cluster.
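If you’re following along without the screenshots, the same listings can be pulled straight from the CLI. These are standard gcloud commands, nothing specific to my setup.

# Clusters in the current project.
$ gcloud container clusters list

# Compute Engine instances backing the cluster nodes.
$ gcloud compute instances list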

I went ahead and deleted that cluster via the Google Cloud Console. The next command I tried out was gcloud container clusters create secondary-delete --network worker-space.

Since this was a fresh gcloud install on a new machine, I’d missed that kubectl wasn’t installed. However, gcloud kindly informed me.

$ gcloud container clusters create secondary-delete \
> --network worker-space
WARNING: Accessing a Container Engine cluster requires the kubernetes commandline
client [kubectl]. To install, run
$ gcloud components install kubectl

ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=Subnetwork must be provided for manual-subnet Network "worker-space".

A quick gcloud components install kubectl fixed that. I ran the command again and, oops, I needed to designate the subnetwork, not just the network.

I eventually wanted to bake this into a Terraform config, and that’s exactly what I’d wire together next. For now though, a quick change and I reran the command.

$ gcloud container clusters create secondary-delete --network worker-space --subnetwork worker-space-default
ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=Subnetwork "projects/that-big-universe/regions/us-west1/subnetworks/worker-space-default" is not valid for Network "worker-space".

Dammit, error central. I went and did some RTFMing at this point. The section for network and subnetwork read as follows:

--network=NETWORK The Compute Engine Network that the cluster will connect to. Google Container Engine will use this network when creating routes and firewalls for the clusters. Defaults to the ‘default’ network.

--subnetwork=SUBNETWORK The name of the Google Compute Engine subnetwork (https://cloud.google.com/compute/docs/subnetworks) to which the cluster is connected. If specified, the cluster’s network must be a “custom subnet” network.
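As an aside, for anyone recreating this setup: a custom-subnet (manual subnet) network like my worker-space gets created roughly like this. The flag spellings below are from current gcloud releases and the range just mirrors what shows up in the subnet listing further down, so treat it as a sketch rather than the exact commands I ran.

# Create a custom-mode network (no auto-created subnets).
$ gcloud compute networks create worker-space --subnet-mode custom

# Add a subnet for it in us-central1.
$ gcloud compute networks subnets create worker-space-default \
  --network worker-space \
  --region us-central1 \
  --range 10.128.0.0/20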

Ok, seems like gcloud container clusters create secondary-delete --network worker-space --subnetwork worker-space-default should have worked. I tried out a few more ideas to see what the issue could actually be.

$ gcloud compute networks subnets list
NAME                  REGION       NETWORK       RANGE
...default networks were listed here...
worker-space-default  us-central1  worker-space  10.128.0.0/20

So even gcloud finds that the subnetwork name is worker-space-default and the network is worker-space? Is that right? That’s actually a little confusing. In the interface it looks like this.

Next I try this.

$ gcloud container clusters create secondary-delete --network worker-space --subnetwork us-central1
ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=Subnetwork "projects/that-big-universe/regions/us-west1/subnetworks/us-central1" is not valid for Network "worker-space".

Then I realize that it’s inserting us-west1 into that string for the subnetwork. But that’s not right; my worker-space network and its subnetwork are in us-central1, not us-west1. So, after a quick RTFM again, I figured that by adding the zone back, with these network and subnetwork parameters set, I might get a successful cluster creation. I tried this.

$ gcloud container clusters create secondary-delete --network worker-space --subnetwork worker-space-default --zone us-central1

This also didn’t work, since us-central1 is a region rather than a zone, so next I tried appending the zone letter.

$ gcloud container clusters create secondary-delete --network worker-space --subnetwork worker-space-default --zone us-central1b
ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=zone "us-central1b" does not exist.

Ok, reading the docs didn’t help at this point either. I doubt I’ll use gcloud to build this in production anyway, so I’ll go ahead and try to get Terraform building it.
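(One retrospective note for anyone retracing these steps: zone names carry a hyphen before the letter, e.g. us-central1-b, as in the very first command at the top of this post, which is probably what the us-central1b error above was complaining about.) The valid zone names can be listed directly:

# List the us-central1 zones to confirm their exact names.
$ gcloud compute zones list --filter="name~us-central1"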

Terraform for Google Cloud Container Cluster

I added this to the script in the adron-infrastructure branch of my repo here.

resource "google_container_cluster" "development" {
name = "development-systems"
zone = "us-west1-b"
initial_node_count = 3

master_auth {
username = "someusername"
password = "willchange"
}

node_config {
oauth_scopes = [
"https://www.googleapis.com/auth/compute",
"https://www.googleapis.com/auth/devstorage.read_only",
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring"
]
}
}
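With the usual Google provider setup in place (credentials and project configured, as covered in the GCE article I link a bit further down), applying this is just the standard Terraform workflow:

# Preview what Terraform will create, then build the cluster.
$ terraform plan
$ terraform apply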

The console shows the container cluster once creation completes.

The console lists the instances. This time, with the node count set to 3 in the config above and the zone set only to us-west1-b, I’ve got 3 instances.

Alright, that worked beautifully. If anybody has any idea what the issue was with my aforementioned attempts to create a cluster using gcloud with the network and subnetwork flags, please ping me via @Adron and we can DM or email. (Also, for more info on getting started with GCP and Terraform, check out my article “Working With Google Compute Engine (GCE) using Terraform (With a load of Bash Scripts too)”.)

Next I needed to connect kubectl to the cluster. This is one of the spaces where a lot of Google documentation is lacking, since it assumes you’ve followed some pre-baked route to gain credentials. For instance, in the interface I pointed to earlier, kubectl and gcloud do some crazy black magic in the background that requires tertiary RTFMing to figure out what is actually going on, retrace those steps, and connect yourself. The first thing I tried was to set the cluster.

kubectl config set-cluster development-systems

Then I did a quick kubectl config view, which showed that things were set correctly. Then I tried running kubectl cluster-info just to see what’s up.

$ kubectl cluster-info
error: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.

So, no go. Default credentials… another thing I’m not really sure about. I tried to run gcloud container clusters get-credentials development-systems --zone us-west1-b --project that-big-universe and it failed. Grumble grumble, come on. I tried again however, because something had seemed odd with the network before, and this time I got some results! Win!

$ gcloud container clusters get-credentials development-systems \
> --zone us-west1-b --project that-big-universe
Fetching cluster endpoint and auth data.
kubeconfig entry generated for development-systems.

I then try kubectl cluster-info again.

$ kubectl cluster-info
error: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.

Nope. Oh yeah, I gotta set kube’s proxy!

$ kubectl proxy
error: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.

No. Alright, what then? At this point I’m pretty frustrated, but whatevs, this is gonna work, so I keep researching. A quick search for default credentials leads to this. I read through it, and the gist is that this GOOGLE_APPLICATION_CREDENTIALS environment variable needs to be set. Ok, that’s cool. So I set it. It needs to point to a credentials file for the service account being used. That actually makes sense, but only because I’ve used Google Cloud before; for somebody just diving into using the container service on Google, this is kind of a whole derailment into reading up on a bunch of other topics. Albeit, it is necessary reading. Anyway… back to setting the credentials file. For more info on how to set this up and working with GCP, check out my previous article “Working With Google Compute Engine (GCE) using Terraform (With a load of Bash Scripts too)”.
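If you don’t already have a key file for the service account, one can be generated along these lines. The service account e-mail here is just a placeholder, so swap in your own.

# Create a JSON key for an existing service account (placeholder e-mail).
$ gcloud iam service-accounts keys create ~/thepathtosecrets/account.json \
  --iam-account my-service-account@that-big-universe.iam.gserviceaccount.com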

export GOOGLE_APPLICATION_CREDENTIALS=~/thepathtosecrets/account.json

I added this to my .bash_profile and sourced that file.

source ~/.bash_profile

Then I ran kubectl cluster-info.

$ kubectl cluster-info
Kubernetes master is running at https://104.196.234.30
GLBCDefaultBackend is running at https://104.196.234.30/api/v1/proxy/namespaces/kube-system/services/default-http-backend
Heapster is running at https://104.196.234.30/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at https://104.196.234.30/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://104.196.234.30/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

Ah, magic! It’s working! So, a few other commands just to verify and determine what the state of things is.

$ kubectl top node
NAME                                                 CPU(cores)  CPU%  MEMORY(bytes)  MEMORY%
gke-development-systems-default-pool-f30f476e-gtnj  34m         3%    1529Mi         41%
gke-development-systems-default-pool-f30f476e-ncln  47m         4%    1879Mi         50%
gke-development-systems-default-pool-f30f476e-s5pc  38m         3%    1691Mi         45%

That looks good. So on to getting some containers launched.

A couple of days ago I wrote up the experience of getting a Google Container Cluster up and running and adding a Terraform config to automate the process. Today my plan is to dig into getting containers into the Google Container Registry and then using those containers to launch various things within the Google Container Cluster. My end goal is to get a Drone.io setup going for production use.

In the last blog entry of this series, I covered the steps and some of the issues I ran into getting a Google Cloud Container cluster up and running. In this article I’m going to dive into working with that cluster, specifically via the gcloud and kubectl commands. I’m assuming the prerequisites at this point include gcloud and kubectl being installed, with gcloud also being set up already via the gcloud init command.

I’ve also made a few small changes to the Terraform config file since its previous use. I’ve copied the file contents below, with the changes listed after them.

 resource "google_container_cluster" "drone" {
 name = "drone"
 zone = "us-west1-b"
 initial_node_count = 3

 additional_zones = [
 "us-west1-b"
 ]

 network = "developer-space"
 subnetwork = "developer-space-west1"

 master_auth {
 username = "firsttry"
 password = "willchange"
 }

 node_config {
 oauth_scopes = [
 "https://www.googleapis.com/auth/compute",
 "https://www.googleapis.com/auth/devstorage.read_only",
 "https://www.googleapis.com/auth/logging.write",
 "https://www.googleapis.com/auth/monitoring"
 ]

 machine_type = "g1-small"
 }
 }

The changes included:

  • I changed the name of the cluster to drone.
  • I’ve added the additional zone, which is the same as the primary zone. Since the additional zone is identical to the primary zone, there will only be 3 instances created, per the initial node count. If the additional zone were different from the primary zone, it would create 6 instances: 3 in the additional zone and 3 in the primary zone. I’m not particularly concerned about the high availability that additional instances in another zone would provide, so I’ve configured it this way to cut down on prospective costs. The other effect of setting the additional zone is that Terraform won’t re-create the entire cluster every single time it runs. (A quick way to check the resulting instance count is sketched just after this list.)
  • I’ve also set network and subnetwork, which you wouldn’t particularly need to set if you’re following along, but I have my networks configured in a particular way, so I like to designate which network a cluster or set of instances is going to be created in.
  • Just a note, not a change, but a g1-small is about the smallest machine type you can use and still have a working cluster. Anything smaller than that and it tends to choke on itself, which is unfortunate.
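As referenced in the list above, here’s a quick way to double-check how many node instances actually got created. This leans on GKE’s default instance naming (gke-<cluster-name>-...), so adjust the filter if your names differ.

# Instances backing the drone cluster's nodes; there should be 3 for this config.
$ gcloud compute instances list --filter="name~gke-drone"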

The next step is getting connected to this cluster. Here are a few of the things you’ll do over and over again when working with a container cluster.

Using Google Container Cluster via gcloud and kubectl

For the next steps of what I need to do (to set up the cluster for use), there are a host of commands that are useful to determine what’s going on, troubleshoot any issues that come up, and generally get insight into the cluster. Here’s a rundown of those key commands.

  • gcloud container clusters describe [NAME] - where NAME is the name of the cluster to get information from. The results look like this.

    $ gcloud container clusters describe drone
    clusterIpv4Cidr: 10.132.0.0/14
    createTime: '2017-02-07T01:09:42+00:00'
    currentMasterVersion: 1.5.2
    currentNodeCount: 3
    currentNodeVersion: 1.5.2
    endpoint: 104.196.239.145
    initialClusterVersion: 1.5.2
    initialNodeCount: 3
    instanceGroupUrls:
    - https://www.googleapis.com/compute/v1/projects/that-big-universe/zones/us-west1-b/instanceGroupManagers/gke-drone-default-pool-d9a3b45a-grp
    locations:
    - us-west1-b
    loggingService: logging.googleapis.com
    masterAuth:
      clientCertificate: THIS IS WHERE A BUNCH OF AUTH KEY STUFF GOES FOR THE CLIENT CERT
      clientKey: THIS IS WHERE A BUNCH OF CLIENT KEY AUTH STUFF GOES
      clusterCaCertificate: THIS IS WHERE A BUNCH OF CLIENT AUTH CA CERTIFICATE STUFF GOES
      password: THIS SHOWS THE PASSWORD THE CLUSTER IS SETUP WITH - SEE ABOVE TERRAFORM FILE FOR CORRELATION.
      username: THIS SHOWS THE USERNAME THE CLUSTER IS SETUP WITH - SEE ABOVE TERRAFORM FILE FOR CORRELATION.
    monitoringService: monitoring.googleapis.com
    name: drone
    network: developer-space
    nodeConfig:
      diskSizeGb: 100
      imageType: GCI
      machineType: g1-small
      oauthScopes:
      - https://www.googleapis.com/auth/compute
      - https://www.googleapis.com/auth/devstorage.read_only
      - https://www.googleapis.com/auth/logging.write
      - https://www.googleapis.com/auth/monitoring
      serviceAccount: default
    nodeIpv4CidrSize: 24
    nodePools:
    - config:
        diskSizeGb: 100
        imageType: GCI
        machineType: g1-small
        oauthScopes:
        - https://www.googleapis.com/auth/compute
        - https://www.googleapis.com/auth/devstorage.read_only
        - https://www.googleapis.com/auth/logging.write
        - https://www.googleapis.com/auth/monitoring
        serviceAccount: default
      initialNodeCount: 3
      instanceGroupUrls:
      - https://www.googleapis.com/compute/v1/projects/that-big-universe/zones/us-west1-b/instanceGroupManagers/gke-drone-default-pool-d9a3b45a-grp
      management: {}
      name: default-pool
      selfLink: https://container.googleapis.com/v1/projects/that-big-universe/zones/us-west1-b/clusters/drone/nodePools/default-pool
      status: RUNNING
      version: 1.5.2
    selfLink: https://container.googleapis.com/v1/projects/that-big-universe/zones/us-west1-b/clusters/drone
    servicesIpv4Cidr: 10.135.240.0/20
    status: RUNNING
    subnetwork: developer-space-west1
    zone: us-west1-b
    

    More info: https://cloud.google.com/sdk/gcloud/reference/container/clusters/describe

  • gcloud container clusters get-credentials [NAME] - where NAME is the name of the cluster; this command retrieves credentials for the cluster that work will be done against.

    $ gcloud container clusters get-credentials drone
    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for drone.
    

    More info: https://cloud.google.com/sdk/gcloud/reference/container/clusters/get-credentials

  • gcloud container clusters list - This command provides a listing of the clusters that are in service. It’s the first command to run to get the name of the cluster you want to work with. In this case, I’m working with the only cluster I have, named drone.

    $ gcloud container clusters list
    NAME   ZONE        MASTER_VERSION  MASTER_IP        MACHINE_TYPE  NODE_VERSION  NUM_NODES  STATUS
    drone  us-west1-b  1.5.2           104.196.239.145  g1-small      1.5.2         3          RUNNING
    

    More info: https://cloud.google.com/sdk/gcloud/reference/container/clusters/list

  • kubectl config set-cluster [NAME] - To use kubectl to manage the cluster, the cluster to work against must first be set, where NAME is the name of that cluster. Use set-cluster to get this done.

    $ kubectl config set-cluster drone
    Cluster "drone" set.
    

    More info: https://kubernetes.io/docs/user-guide/kubectl/kubectl_config_set-cluster/

  • gcloud container get-server-config - This gcloud command gets information about the Container Engine server configuration for the zone, such as the default and valid cluster versions and image types.

    $ gcloud container get-server-config
    Fetching server config for us-west1-b
    defaultClusterVersion: 1.5.2
    defaultImageType: GCI
    validImageTypes:
    - CONTAINER_VM
    - GCI
    validMasterVersions:
    - 1.5.2
    - 1.4.8
    validNodeVersions:
    - 1.5.2
    - 1.5.1
    - 1.4.8
    - 1.4.7
    - 1.4.6
    - 1.3.10
    - 1.2.7
    

    More info: https://cloud.google.com/sdk/gcloud/reference/container/get-server-config

  • kubectl config get-clusters - This command simply lists out the available clusters.
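    On this setup it would list the kubeconfig entries created above: the drone entry added with set-cluster and the gke_that-big-universe_us-west1-b_drone entry generated by get-credentials. This is sketched from those entries rather than captured output.

    $ kubectl config get-clusters
    NAME
    drone
    gke_that-big-universe_us-west1-b_drone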

  • kubectl config current-context - This command prints out the current context in which the kubectl command is working.

    $ kubectl config current-context
    gke_that-big-universe_us-west1-b_drone
    

    More info: https://kubernetes.io/docs/user-guide/kubectl/kubectl_config_current-context/

  • kubectl cluster-info - This command provides information about the cluster as shown.

    $ kubectl cluster-info
    Kubernetes master is running at https://104.196.239.145
    GLBCDefaultBackend is running at https://104.196.239.145/api/v1/proxy/namespaces/kube-system/services/default-http-backend
    Heapster is running at https://104.196.239.145/api/v1/proxy/namespaces/kube-system/services/heapster
    KubeDNS is running at https://104.196.239.145/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at https://104.196.239.145/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
    
    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    

    More info: https://kubernetes.io/docs/user-guide/kubectl/kubectl_cluster-info/

  • kubectl describe nodes - This command provides a lot of information about the actual running nodes in the cluster.

    $ kubectl describe nodes
    Name:                 gke-drone-default-pool-d9a3b45a-l082
    Role:
    Labels:               beta.kubernetes.io/arch=amd64
                          beta.kubernetes.io/instance-type=g1-small
                          beta.kubernetes.io/os=linux
                          cloud.google.com/gke-nodepool=default-pool
                          failure-domain.beta.kubernetes.io/region=us-west1
                          failure-domain.beta.kubernetes.io/zone=us-west1-b
                          kubernetes.io/hostname=gke-drone-default-pool-d9a3b45a-l082
    Taints:               <none>
    CreationTimestamp:    Mon, 06 Feb 2017 17:13:11 -0800
    Phase:
    Conditions:
      Type                Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
      ----                ------  -----------------                ------------------               ------                      -------
      NetworkUnavailable  False   Mon, 06 Feb 2017 17:14:09 -0800  Mon, 06 Feb 2017 17:14:09 -0800  RouteCreated                RouteController created a route
      OutOfDisk           False   Mon, 06 Feb 2017 17:50:06 -0800  Mon, 06 Feb 2017 17:13:11 -0800  KubeletHasSufficientDisk    kubelet has sufficient disk space available
      MemoryPressure      False   Mon, 06 Feb 2017 17:50:06 -0800  Mon, 06 Feb 2017 17:13:11 -0800  KubeletHasSufficientMemory  kubelet has sufficient memory available
      DiskPressure        False   Mon, 06 Feb 2017 17:50:06 -0800  Mon, 06 Feb 2017 17:13:11 -0800  KubeletHasNoDiskPressure    kubelet has no disk pressure
      Ready               True    Mon, 06 Feb 2017 17:50:06 -0800  Mon, 06 Feb 2017 17:13:41 -0800  KubeletReady                kubelet is posting ready status. AppArmor enabled
    Addresses:            10.140.0.4,35.185.193.72,gke-drone-default-pool-d9a3b45a-l082
    Capacity:
      alpha.kubernetes.io/nvidia-gpu:  0
      cpu:                             1
      memory:                          1740088Ki
      pods:                            110
    Allocatable:
      alpha.kubernetes.io/nvidia-gpu:  0
      cpu:                             1
      memory:                          1740088Ki
      pods:                            110
    System Info:
      Machine ID:                 9d264f9182ffa64366cd05fda65a9c20
      System UUID:                9D264F91-82FF-A643-66CD-05FDA65A9C20
      Boot ID:                    eb0a6a3f-0316-4898-a0a8-03b8207a2125
      Kernel Version:             4.4.21+
      OS Image:                   Google Container-VM Image
      Operating System:           linux
      Architecture:               amd64
      Container Runtime Version:  docker://1.11.2
      Kubelet Version:            v1.5.2
      Kube-Proxy Version:         v1.5.2
    PodCIDR:              10.132.1.0/24
    ExternalID:           7008904420460372122
    Non-terminated Pods:  (5 in total)
      Namespace    Name                                                         CPU Requests  CPU Limits  Memory Requests  Memory Limits
      ---------    ----                                                         ------------  ----------  ---------------  -------------
      kube-system  fluentd-cloud-logging-gke-drone-default-pool-d9a3b45a-l082  100m (10%)    0 (0%)      200Mi (11%)      200Mi (11%)
      kube-system  heapster-v1.2.0-2168613315-m2sv2                             138m (13%)    138m (13%)  301856Ki (17%)   301856Ki (17%)
      kube-system  kube-dns-autoscaler-2715466192-2csx9                         20m (2%)      0 (0%)      10Mi (0%)        0 (0%)
      kube-system  kube-proxy-gke-drone-default-pool-d9a3b45a-l082              100m (10%)    0 (0%)      0 (0%)           0 (0%)
      kube-system  kubernetes-dashboard-3543765157-17nxs                        100m (10%)    100m (10%)  50Mi (2%)        50Mi (2%)
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.
      CPU Requests  CPU Limits  Memory Requests  Memory Limits
      ------------  ----------  ---------------  -------------
      458m (45%)    238m (23%)  568096Ki (32%)   557856Ki (32%)
    Events:
      FirstSeen  LastSeen  Count  From                                                SubObjectPath  Type     Reason                   Message
      ---------  --------  -----  ----                                                -------------  ----     ------                   -------
      37m        37m       1      {kubelet gke-drone-default-pool-d9a3b45a-l082}                     Normal   Starting                 Starting kubelet.
      37m        37m       1      {kubelet gke-drone-default-pool-d9a3b45a-l082}                     Warning  ImageGCFailed            unable to find data for container /
      37m        37m       1      {kube-proxy gke-drone-default-pool-d9a3b45a-l082}                  Normal   Starting                 Starting kube-proxy.
      37m        37m       19     {kubelet gke-drone-default-pool-d9a3b45a-l082}                     Normal   NodeHasSufficientDisk    Node gke-drone-default-pool-d9a3b45a-l082 status is now: NodeHasSufficientDisk
      37m        37m       19     {kubelet gke-drone-default-pool-d9a3b45a-l082}                     Normal   NodeHasSufficientMemory  Node gke-drone-default-pool-d9a3b45a-l082 status is now: NodeHasSufficientMemory
      37m        37m       19     {kubelet gke-drone-default-pool-d9a3b45a-l082}                     Normal   NodeHasNoDiskPressure    Node gke-drone-default-pool-d9a3b45a-l082 status is now: NodeHasNoDiskPressure
      36m        36m       1      {kubelet gke-drone-default-pool-d9a3b45a-l082}                     Normal   NodeReady                Node gke-drone-default-pool-d9a3b45a-l082 status is now: NodeReady
    
    ETC ETC ETC EACH NODE WOULD HAVE INFORMATION DISPLAYED HERE
    
  • kubectl describe pods - This command describes the pods that are running. Currently I have none running, so this isn’t a super useful command until some pods are actually running in the Kubernetes cluster.

  • kubectl get pods --all-namespaces - This command gives a full list of pods in all the namespaces of the cluster. This provides insight into what other pods the Kubernetes system itself has put onto the nodes.

  • kubectl config view - This command prints out the client configuration (the kubeconfig) that kubectl is currently working from.
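    The output covers the cluster, context, and user entries that the earlier get-credentials and set-cluster commands created. A sketch of the general shape, trimmed to the entry that get-credentials generated (not captured output; certificate and key data elided):

    $ kubectl config view
    apiVersion: v1
    clusters:
    - cluster:
        server: https://104.196.239.145
        ...
      name: gke_that-big-universe_us-west1-b_drone
    contexts:
    - context:
        cluster: gke_that-big-universe_us-west1-b_drone
        user: gke_that-big-universe_us-west1-b_drone
      name: gke_that-big-universe_us-west1-b_drone
    current-context: gke_that-big-universe_us-west1-b_drone
    kind: Config
    preferences: {}
    users:
    - name: gke_that-big-universe_us-west1-b_drone
      user:
        ...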