Blender Fox


CKAD

It's taken me nearly a year, but I finally figured out one of the questions that stumped me in my CKAD exam (writeup: https://blenderfox.com/2019/12/01/ckad-writeup/).

In the exam, the question was to terminate a CronJob if it ran for longer than 17 seconds. My notes at the time: there’s a startup deadline but not a duration deadline; it could be implemented within the command of the application itself, or by specifying that any previously running instance of the job should be replaced.

Well, I finally had that situation recently at work and wanted to terminate a CronJob if it was active for more than 5 minutes, since the job shouldn't take that long. I finally found out that the answer was not in the CronJob documentation, but in the Job documentation.

A CronJob spawns Job resources, and the Job specification accepts spec.activeDeadlineSeconds (in a CronJob manifest, that is spec.jobTemplate.spec.activeDeadlineSeconds). Once the Job has been active for that many seconds, Kubernetes terminates its pods and marks the Job as failed.
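A minimal sketch of what that can look like, written out as a manifest file. The name `report-job`, the schedule, the busybox image and the sleep command are placeholders for illustration, and `batch/v1` assumes Kubernetes 1.21 or later (older clusters used `batch/v1beta1`).

```shell
# Write a CronJob manifest whose Jobs are terminated after 5 minutes (300 s).
# Name, schedule, image and command are hypothetical placeholders.
cat > cronjob-deadline.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: report-job
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      # Deadline for the whole Job, counted from when the Job starts.
      activeDeadlineSeconds: 300
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: report
            image: busybox
            # Deliberately runs longer than the deadline.
            command: ["sh", "-c", "sleep 600"]
EOF
```

Applying it with `kubectl apply -f cronjob-deadline.yaml` should give Jobs that are cut off after 300 seconds and reported as failed.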

CKAD Exam Passed

Whilst I was concerned about my scoring, I still passed. I scored 72%, with a pass mark of 66%.

CKA Exam Passed

There were 5 questions I could not answer, and one I could, but arguably that question was ambiguous:

  1. Fix a broken cluster -- kubelet was started but couldn't connect to itself.
  2. Add node to cluster. Nodes do not have kubeadm installed.
  3. Static pod. Couldn't find the path where the manifest YAML was supposed to go.

I can't remember questions 4 and 5, but will update this if I do.

Ambiguous Question:

  1. Create a pod with a persistent volume that isn't persistent; the question also didn't say how big to make the PV. I used emptyDir, but that's not really a PV (I didn't create a PV or a PVC).

CKAD Writeup

So I did the CKAD exam, and it was one of the latest at night I've ever sat an exam, starting at 22:45 and finishing at 00:45. The CKAD exam is 2 hours versus the CKA's 3 hours.

And I went into the exam feeling relatively confident. But, damn, the 2 hours goes by really quickly.

I had several questions I wasn't able to complete, or could only partially complete.

Liveness and Readiness Probes

This question wanted a pod to be restarted if an endpoint returned 500. Simple enough, but there was a catch: if another endpoint returned 500, the application was still starting, so the check should be disregarded.

I've done something similar in a real-life scenario by implementing the check as a curl command (I should write a blog entry on that some time).

So in the exam, I made both the liveness and readiness checks chain two curl commands together: if the first endpoint (/starting in this case) returned 200, the check would then call the next endpoint (/healthz) and fail if that gave a 500.

Buuuuut, the image didn't have curl installed, so the probes failed. I could have used the hack I've used in my own image and installed curl as part of the check, but time constraints wouldn't let me.
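A rough sketch of that chained check as an exec liveness probe. The image name, pod name and port 8080 are made up for illustration, and it only works if the image ships both `sh` and `curl`, which is exactly what bit me in the exam.

```shell
# Write a pod manifest with a liveness probe that chains two curl calls.
# Image, names and port are hypothetical placeholders.
cat > probe-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: app
    image: example/app:1.0   # hypothetical image; must contain sh and curl
    livenessProbe:
      exec:
        command:
        - sh
        - -c
        - |
          # Only consult /healthz once /starting answers without an HTTP error.
          # curl -f exits non-zero on HTTP 4xx/5xx; an "if" whose condition
          # is false exits 0, so the probe passes while the app is starting.
          if curl -fs http://localhost:8080/starting >/dev/null; then
            curl -fs http://localhost:8080/healthz >/dev/null
          fi
      periodSeconds: 5
EOF
```

The same command could be reused under `readinessProbe` if the readiness check should behave the same way.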

Persistent Volumes

Similar to the CKA question, there was a quirkily worded question here which wanted me to add a file to a node, create a pod that used hostPath and reserve a 1Gi PV. The documentation does not provide an example of that, just a pod with a hostPath as an internal volume: https://kubernetes.io/docs/concepts/storage/volumes/#hostpath
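One possible reading of that question, sketched as a hostPath-backed PV, a claim against it, and a pod that mounts the claim. All names, the `/mnt/data` path, the `manual` storage class and the nginx image are assumptions for illustration, not what the exam asked for.

```shell
# Write a 1Gi hostPath PersistentVolume, a matching claim, and a pod using the claim.
# Names, path, storage class and image are hypothetical placeholders.
cat > hostpath-pv.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/data        # directory on the node that holds the file
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pvc
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: task-pod
spec:
  containers:
  - name: web
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: task-pvc
EOF
```

Giving the PV and the PVC the same `storageClassName` is what lets the claim bind to this particular volume rather than triggering dynamic provisioning.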

Network Policies

A technology I haven't used in Kubernetes yet. They gave several policies, one that allowed "app:proxy" and one that allowed "app:db", and wanted us to edit a pod so that it was only allowed to talk to those.

We were not allowed to modify the policies. I can't remember whether we were allowed to create new policies for this question.

But both those policies use the app label, and the pod can't have the same label with two values (I did try).

Though thinking about it now, and after a few checks, the NetworkPolicy object describes how to restrict traffic to the pods it selects -- so those selectors may have identified the pods each policy was restricting, not the pods allowed to connect. I think I should have looked inside the policies more carefully to see what the ingress rule said, whether it was something like "app:frontend", and then made sure the pod was labelled accordingly.
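To illustrate the distinction, here is a hypothetical policy of the shape I now suspect the exam used. The policy name and the `app: frontend` label are invented; only the `app: db` label comes from the question.

```shell
# Write a NetworkPolicy that protects app=db pods and admits app=frontend pods.
# Policy name and the frontend label are hypothetical.
cat > db-policy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-to-db
spec:
  # Pods this policy applies to (the traffic destination).
  podSelector:
    matchLabels:
      app: db
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Pods that are allowed to connect (the traffic source).
    - podSelector:
        matchLabels:
          app: frontend
EOF
```

With a policy like this, the pod being edited would need the label from the `from` selector (for example via `kubectl label pod <pod> app=frontend --overwrite`), not `app=db`.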

"Ambassador" Sidecar Pattern

A big chunk of the exam time was taken up by the sidecar questions -- far more time than I would have liked, to be honest.

They had a question on the adapter pattern, using fluentd, which was fine and I got it to work, but there was another where I had to use HAProxy to proxy requests to a different port (the ambassador pattern). A useful use case, but I ran out of time to finish it. I wanted to come back and revisit it if I had time, but didn't.

CronJobs

Terminate a CronJob if it lasts longer than 17 seconds. There's a startup deadline but not a duration deadline. It could be implemented within the command of the application itself, or by specifying that any previously running instance of the job should be replaced.
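For reference, the two CronJob options mentioned there correspond, as far as I can tell, to the `startingDeadlineSeconds` and `concurrencyPolicy` fields. The sketch below shows where they sit; the name, schedule, image and command are placeholders, and `batch/v1` assumes Kubernetes 1.21 or later.

```shell
# Write a CronJob manifest showing the startup deadline and the replace policy.
# Name, schedule, image and command are hypothetical placeholders.
cat > cronjob-replace.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: every-minute
spec:
  schedule: "* * * * *"
  # Startup deadline: a run that cannot start within 30 s of its slot is skipped.
  startingDeadlineSeconds: 30
  # When the next run is due, replace a run that is still going.
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: task
            image: busybox
            command: ["sh", "-c", "sleep 120"]
EOF
```

Neither field caps how long a run may take: with a one-minute schedule, `Replace` only cuts a run short when the next one is due, so it cannot enforce a 17-second limit on its own.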

Thoughts

I don't think I passed this; having so many issues is probably going to take me into the 60s.

CKA Exam Passed

I've totally forgotten to write this up, but I successfully passed my CKA exam on the third attempt with a 78% score, scraping a pass.

I'll write up details of some of the questions I couldn't answer so I can come back and look them up later.

CKA Exam: Strike #2

I took my CKA exam for the second time – and failed again. This time, however, I got much closer to the pass mark than my first time.

Things I think I fluffed on:

Cluster DNS

Pods, services and how they show up using nslookup. I got caught up in trying to figure out why my DNS wasn’t working, and I think it’s because I was trying to nslookup from outside the cluster, which obviously would not resolve the “.cluster.local” domain correctly. I forgot that you can get an interactive, in-cluster shell using:

[code lang=text] kubectl run -i --tty busybox --image=busybox -- sh [/code]

Not to mention that doing nslookup {service}.svc.cluster.local won’t work, and you have to pass -type=a to nslookup to get the IP address of the service and confirm it is resolving.

etcd Snapshots

This got me both times. The first time, I had no idea why the snapshot command was failing. The second time, I figured out how to do the backup and how to invoke it from the pod, but still got it wrong. Now I've figured it out (and it was right in front of my face):

[code lang=text]
WARNING: Environment variable ETCDCTL_API is not set; defaults to etcdctl v2. Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API.

USAGE: etcdctl [global options] command [command options] [arguments...]

VERSION: 3.2.18

[/code]

I wasn’t setting the ETCDCTL_API variable beforehand, so etcdctl was falling back to the v2 API, which doesn’t have the snapshot command:

[code lang=text]
# etcdctl
NAME: etcdctl - A simple command line client for etcd.

WARNING: Environment variable ETCDCTL_API is not set; defaults to etcdctl v2. Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API.

USAGE: etcdctl [global options] command [command options] [arguments...]

VERSION: 3.2.18

COMMANDS:
  backup          backup an etcd directory
  cluster-health  check the health of the etcd cluster
  mk              make a new key with a given value
  mkdir           make a new directory
  rm              remove a key or a directory
  rmdir           removes the key if it is an empty directory or a key-value pair
  get             retrieve the value of a key
  ls              retrieve a directory
  set             set the value of a key
  setdir          create a new directory or update an existing directory TTL
  update          update an existing key with a given value
  updatedir       update an existing directory
  watch           watch a key for changes
  exec-watch      watch a key for changes and exec an executable
  member          member add, remove and list subcommands
  user            user add, grant and revoke subcommands
  role            role add, grant and revoke subcommands
  auth            overall auth controls
  help, h         Shows a list of commands or help for one command

GLOBAL OPTIONS:
  --debug                          output cURL commands which can be used to reproduce the request
  --no-sync                        don't synchronize cluster information before sending request
  --output simple, -o simple       output response in the given format (simple, extended or json) (default: "simple")
  --discovery-srv value, -D value  domain name to query for SRV records describing cluster endpoints
  --insecure-discovery             accept insecure SRV records describing cluster endpoints
  --peers value, -C value          DEPRECATED - "--endpoints" should be used instead
  --endpoint value                 DEPRECATED - "--endpoints" should be used instead
  --endpoints value                a comma-delimited list of machine addresses in the cluster (default: "http://127.0.0.1:2379,http://127.0.0.1:4001")
  --cert-file value                identify HTTPS client using this SSL certificate file
  --key-file value                 identify HTTPS client using this SSL key file
  --ca-file value                  verify certificates of HTTPS-enabled servers using this CA bundle
  --username value, -u value       provide username[:password] and prompt if password is not supplied.
  --timeout value                  connection timeout per request (default: 2s)
  --total-timeout value            timeout for the command execution (except watch) (default: 5s)
  --help, -h                       show help
  --version, -v                    print the version

# ETCDCTL_API=3 etcdctl
NAME: etcdctl - A simple command line client for etcd3.

USAGE: etcdctl

VERSION: 3.2.18

API VERSION: 3.2

COMMANDS:
  get                     Gets the key or a range of keys
  put                     Puts the given key into the store
  del                     Removes the specified key or range of keys [key, range_end)
  txn                     Txn processes all the requests in one transaction
  compaction              Compacts the event history in etcd
  alarm disarm            Disarms all alarms
  alarm list              Lists all alarms
  defrag                  Defragments the storage of the etcd members with given endpoints
  endpoint health         Checks the healthiness of endpoints specified in --endpoints flag
  endpoint status         Prints out the status of endpoints specified in --endpoints flag
  watch                   Watches events stream on keys or prefixes
  version                 Prints the version of etcdctl
  lease grant             Creates leases
  lease revoke            Revokes leases
  lease timetolive        Get lease information
  lease keep-alive        Keeps leases alive (renew)
  member add              Adds a member into the cluster
  member remove           Removes a member from the cluster
  member update           Updates a member in the cluster
  member list             Lists all members in the cluster
  snapshot save           Stores an etcd node backend snapshot to a given file
  snapshot restore        Restores an etcd member snapshot to an etcd directory
  snapshot status         Gets backend snapshot status of a given file
  make-mirror             Makes a mirror at the destination etcd cluster
  migrate                 Migrates keys in a v2 store to a mvcc store
  lock                    Acquires a named lock
  elect                   Observes and participates in leader election
  auth enable             Enables authentication
  auth disable            Disables authentication
  user add                Adds a new user
  user delete             Deletes a user
  user get                Gets detailed information of a user
  user list               Lists all users
  user passwd             Changes password of user
  user grant-role         Grants a role to a user
  user revoke-role        Revokes a role from a user
  role add                Adds a new role
  role delete             Deletes a role
  role get                Gets detailed information of a role
  role list               Lists all roles
  role grant-permission   Grants a key to a role
  role revoke-permission  Revokes a key from a role
  check perf              Check the performance of the etcd cluster
  help                    Help about any command

OPTIONS:
      --cacert=""                         verify certificates of TLS-enabled secure servers using this CA bundle
      --cert=""                           identify secure client using this TLS certificate file
      --command-timeout=5s                timeout for short running command (excluding dial timeout)
      --debug[=false]                     enable client-side debug logging
      --dial-timeout=2s                   dial timeout for client connections
      --endpoints=[127.0.0.1:2379]        gRPC endpoints
  -h, --help[=false]                      help for etcdctl
      --hex[=false]                       print byte strings as hex encoded strings
      --insecure-skip-tls-verify[=false]  skip server certificate verification
      --insecure-transport[=true]         disable transport security for client connections
      --key=""                            identify secure client using this TLS key file
      --user=""                           username[:password] for authentication (prompt if password is not supplied)
  -w, --write-out="simple"                set the output format (fields, json, protobuf, simple, table)

[/code]

And then I can run

ETCDCTL_API=3 etcdctl snapshot save snapshot.db --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key

to create the snapshot.

Certificate Rotation

I need to look this one up – I had no idea how to rotate the certificates.

Static Pods

I’d never directly dealt with static pods before this exam, and I don’t think I had this question in my first run, so it was one I didn’t know the answer to. A bit of hunting on the k8s site led me to figure out it was a static pod question, but I couldn’t find out where the exam cluster was looking for its static pod manifests. The question told me a directory, but my yaml didn’t seem to be picked up by the kubelet.
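For next time, a small helper sketch for finding that directory. It assumes the kubelet reads a config file containing a `staticPodPath` entry, with kubeadm's default location `/var/lib/kubelet/config.yaml` as the fallback; the function name is my own, and the exam cluster may well have been set up differently.

```shell
# Print the directory a kubelet watches for static pod manifests,
# read from its config file (kubeadm's default location is assumed).
static_pod_path() {
  config="${1:-/var/lib/kubelet/config.yaml}"
  # Print the value that follows the "staticPodPath:" key.
  awk '$1 == "staticPodPath:" { print $2 }' "$config"
}
```

Run on the node as `static_pod_path`, or with an explicit config file as the first argument. If it prints nothing, the kubelet may instead have been started with the older `--pod-manifest-path` flag, which would show up in its process arguments.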

Final note

Generally, a lot of the questions from my first exam run showed up again in this run, which let me run through over half of the exam fairly quickly. I thought I was going to do better than my first run, and I did, but not by much.

LPIC-1

Linux Professional Institute

I’ve finished studying for the first of two exams for the LPIC-1 certification. I have found some exam questions (about 600 of them) and have started to go through them.

The first thing that struck me about these questions is that either I’ve not been studying all the topics, or some topics have been removed from the exam. For example, some of the questions reference LILO, but the LPI page on the 101 exam makes no mention of LILO (though it does mention Grub 2 and Grub Legacy). Then again, LILO and Grub Legacy are quite limited by today’s standards, so it could be that they really have been removed from the exam. Guess I’ll have to take that chance.