So, a weird thing occurred in Kubernetes on the GKE cluster we have at the office. I figured I would do a write-up here before I forget everything, and maybe let the Kubernetes devs read over this as an issue (https://github.com/kubernetes/kubernetes/issues/93783).
We noticed some weirdness on our cluster when Jobs and CronJobs started behaving strangely.
Jobs were being created but didn't seem to spawn any pods to go with them; even over an hour later, they were sitting there without a pod.
Investigating other jobs, I found a crazy large number of pods in one of our namespaces: over 900, to be exact. These were all completed pods from a CronJob.
The CronJob was scheduled to run every minute, and its definition had sensible values set for .spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit. Even if they hadn't been set, the defaults would (or should) have been used.
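For reference, the spec looked something along these lines (a minimal sketch only; the name, namespace and image here are placeholders, not our real ones):

# Minimal sketch of a CronJob with history limits and a Forbid concurrency policy.
# The name, namespace and image are placeholders.
cat <<EOF | kubectl apply -n {namespace} -f -
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: example-cron
spec:
  schedule: "* * * * *"            # run every minute
  concurrencyPolicy: Forbid        # don't start a new run while one is still active
  successfulJobsHistoryLimit: 3    # keep only the last 3 successful Jobs
  failedJobsHistoryLimit: 1        # keep only the last failed Job
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: example
            image: busybox
            command: ["sh", "-c", "echo hello"]
EOF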
So why did we have over 900 cron pods, and why weren't they being cleaned up upon completion?
Just in case the sheer number of pods was causing problems, I suspended the CronJob and cleared out the completed pods:
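It was something along these lines ({namespace} is a placeholder; completed pods show up with a Completed status in kubectl get pods):

# Delete every pod in the namespace whose status column shows Completed
kubectl delete pod -n {namespace} $(kubectl get pods -n {namespace} | grep Completed | awk '{print $1}' | xargs)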
But that also didn't help; pods were still being generated. Which is weird -- why is a CronJob still spawning pods even when it's suspended?
Then I remembered that CronJobs actually generate Job objects, and it's the Job that spawns the pod. So I checked the Job objects and found over 3,000 of them. Okay, something is seriously wrong here; there shouldn't be 3,000 Job objects for something that only runs once a minute.
So I went and deleted all the CronJob-related Job objects:
kubectl delete job -n {namespace} $(kubectl get jobs -n {namespace} | grep {cronjob-name} | awk '{print $1}' | xargs)
This brought the pod count down, but did not help us determine why the Job objects were not spawning pods.
I decided to get Google onto the case and raised a support ticket.
Their first investigation turned up something interesting. They sent me this (redacted) snippet from the master logs:
2020-08-05 10:05:06.555 CEST - Job is created
2020-08-05 11:21:16.546 CEST - Pod is created
2020-08-05 11:21:16.569 CEST - Pod (XXXXXXX) is bound to node
2020-08-05 11:24:11.069 CEST - Pod is deleted
2020-08-05 12:45:47.940 CEST - Job is created
2020-08-05 12:57:22.386 CEST - Pod is created
2020-08-05 12:57:22.401 CEST - Pod (XXXXXXX) is bound to node
Spot the problem?
The time between "Job is created" and "Pod is created" is around 80 minutes in the first case, and around 12 minutes in the second. That's right: it took 80 minutes for the pod to be spawned.
And this is where it dawned on me what was possibly going on.
The CronJob spawned a Job object. The Job tried to spawn a pod, and that took a significant amount of time, far more than the 1 minute between runs.
On the next cycle, the CronJob checked whether it already had a running pod, because of its .spec.concurrencyPolicy value.
It did not find a running pod, so it generated another Job object, which also got stuck waiting for pod creation.
And so on, and so on.
Each time, a new Job was added and got stuck waiting for pod creation for an abnormally long time, which caused yet another Job to be added to the namespace, which also got stuck...
Eventually the pods did get created, but by then there was a backlog of Jobs, meaning that even suspending the CronJob would have no effect until the Jobs in the backlog were cleared or deleted (I had deleted them).
Google investigated further, and found the culprit:
Failed calling webhook, failing open www.up9.com: failed calling webhook "www.up9.com": Post https://up9-sidecar-injector-prod.up9.svc:443/mutate?timeout=30s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
We had been testing up9, and it uses a mutating webhook, so it looks like a misbehaving webhook was causing this problem. We removed the webhook and everything started working again.
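For anyone hitting something similar, the admission webhooks registered in a cluster can be listed, inspected and removed with kubectl (the name below is a placeholder):

# List the mutating admission webhooks registered in the cluster
kubectl get mutatingwebhookconfigurations

# Inspect one to see its failurePolicy, timeout and target service
kubectl describe mutatingwebhookconfiguration {webhook-name}

# Delete it if it turns out to be the culprit
kubectl delete mutatingwebhookconfiguration {webhook-name}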
So where does this leave us? Well, a few thoughts:
A misbehaving/misconfigured webhook can cause a snowball effect in the cluster, causing multiple runs of a single CronJob without cleanup -- the successfulJobsHistoryLimit and failedJobsHistoryLimit values are seemingly ignored.
This could break systems where the CronJob is supposed to run mutually exclusively, since the delay in pod generation could allow two cron pods to run at the same time, even though the CronJob has its concurrencyPolicy set to Forbid.
If someone managed (whether accidentally or maliciously) to install a webhook that causes this pod-spawning delay, and then added a CronJob that runs once a minute, crafted so that the job never finishes, this snowball effect would cause the cluster to run out of resources and/or scale up nodes forever, or until it hits the maximum allowed by your configuration.
There is a new story doing the rounds about how Twitter found that it had stored users’ passwords in the clear in an internal log. Whilst reading it, I got this email from Twitter:
While this isn’t the first time a big company has done this (GitHub, for one, also did it), it seems unbelievable that a big company like Twitter would get caught out on such a basic, common-sense security practice. Pretty much every YouTube video and article about correctly handling passwords will tell you not to store them in the clear and to only store them as hashes (preferably with salts). Hashing algorithms are meant to be really difficult or impossible to reverse, meaning you can’t (easily) use the hashes to determine the original passwords.
Some examples from a quick YouTube search – Tom Scott’s video is really good btw :), although his comment about using “login with Twitter” and letting them store your password for you is a bit ironic :P
The fact that Twitter has our unencrypted passwords on disk… does this mean Twitter has been saving our original passwords before hashing them?
More to the point: whilst Twitter are quick to point out that, thanks to the masking, no one at the company can normally see your password, they don’t mention who has (or had) access to the unmasked passwords in the internal log. Or for how long…
Twitter users who had their accounts on private may not have been as private as they initially thought….
Facebook have been having a lot of bad publicity lately (and I would personally say it’s long overdue) and a lot of it over privacy. Now, there’s talk about Facebook lifting SMS and phone call information from Android phones with consent. Yes, Facebook asks for it, but you can (and should) refuse it access.
Later versions of Android allow you to revoke and change the permissions given to an app, and also prompt you again if the app asks for it.
My Facebook app has very little permissions on my device because I don’t trust it a single bit.
I also have Privacy Guard enabled and restricted. Whenever it wants to know my location, I can refuse it.
Whilst having vulnerabilities is a bad thing, having them found by white-hat hackers is a good thing. Hackathons like this one prove that it can be constructive to get a group of them in to find, and help fix, vulnerabilities in your system before they are found in public and exploited to death before you have a chance to fix them.
The US Air Force's second security hackathon has paid dividends... both for the military and the people finding holes in its defenses. HackerOne has revealed the results of the Hack the Air Force 2.0 challenge from the end of 2017, and it led to volunteers discovering 106 vulnerabilities across roughly 300 of the USAF's public websites. Those discoveries proved costly, however. The Air Force paid out a total of $103,883, including $12,500 for one bug -- the most money any federal bounty program has paid to date.
Kubernetes is an awesome piece of kit. You can set applications to run within the cluster, make them visible only to apps within the cluster, and/or expose them to applications outside the cluster.
As part of my tinkering, I wanted to set up a Docker registry to store my own images without having to make them public via Docker Hub. Doing this proved a bit more complicated than expected, since by default it requires SSL, which normally means purchasing and installing a certificate.
Enter Let’s Encrypt, which allows you to get SSL certificates for free, and by using their API you can set them to renew regularly. Kubernetes has the kube-lego project, which handles this integration. So here I’ll go through enabling SSL for an application (in this case it’s a Docker registry, but it can be anything).
First, let’s ignore the lego project and set up the application so that it is accessible normally. As mentioned above, this is the Docker registry.
I’m tying the registry storage to a PV claim, though you can modify this to use S3 or similar instead.
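Something along these lines (a rough sketch; the storage size and labels are assumptions, and the service maps port 9000, which I use below, to the registry container's default port 5000):

# Rough sketch of the registry PV claim, deployment and LoadBalancer service.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: registry-storage
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: registry
spec:
  replicas: 1
  selector:
    matchLabels:
      app: registry
  template:
    metadata:
      labels:
        app: registry
    spec:
      containers:
      - name: registry
        image: registry:2
        ports:
        - containerPort: 5000
        volumeMounts:
        - name: storage
          mountPath: /var/lib/registry
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: registry-storage
---
apiVersion: v1
kind: Service
metadata:
  name: registry
spec:
  type: LoadBalancer
  selector:
    app: registry
  ports:
  - port: 9000
    targetPort: 5000
EOF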
Once you’ve applied this, verify your config is correct by ensuring you have an external endpoint for the service (use kubectl describe service registry | grep "LoadBalancer Ingress"). On AWS, this will be an ELB; on other clouds, you might get an IP. If you get an ELB, CNAME a friendly name to it. If you get an IP, create an A record for it. I’m going to use registry.blenderfox.com for this test.
Verify by doing the following. Bear in mind it can take a while before DNS records update, so be patient.
host ${SERVICE_DNS}
So if I had set the service to be registry.blenderfox.com, I would do
host registry.blenderfox.com
If done correctly, this should resolve to the ELB name, which in turn resolves to the ELB's IP addresses.
Next, tag a Docker image in the format registry-host:port/imagename, so, for example, registry.blenderfox.com:9000/my-image.
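Assuming a locally built image called my-image (a placeholder name), that would be:

# Tag the local image with the registry's hostname and port
docker tag my-image registry.blenderfox.com:9000/my-image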
Next try to push it.
docker push registry.blenderfox.com:9000/my-image
It will fail because the registry can't yet talk over HTTPS:
docker push registry.blenderfox.com:9000/my-image
The push refers to repository [registry.blenderfox.com:9000/my-image]
Get https://registry.blenderfox.com:9000/v2/: http: server gave HTTP response to HTTPS client
So let’s now fix that.
Now let’s start setting up kube-lego
Check out the code
git clone git@github.com:jetstack/kube-lego.git
cd into the relevant folder
cd kube-lego/examples/nginx
Open up nginx/configmap.yaml and change the body-size: "64m" line to a bigger value. This is the maximum size you can upload through nginx. You’ll see why this is an important change later.
Now, look for the external endpoint for the nginx service
kubectl describe service nginx -n nginx-ingress | grep "LoadBalancer Ingress"
Look for the value next to LoadBalancer Ingress. On AWS, this will be the ELB address.
CNAME your domain for your service (e.g. registry.blenderfox.com in this example) to that ELB. If you’re not on AWS, this may be an IP, in which case, just create an A record instead.
Open up lego/configmap.yaml and change the email address in there to be the one you want to use to request the certs.
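Apply the lego and nginx manifests from the example, then create an ingress for the registry so kube-lego knows to request a certificate for it. Roughly like this (the secret name is an assumption I picked; the tls-acme annotation is what triggers kube-lego):

# Apply the kube-lego and nginx-ingress manifests from the example
kubectl apply -f lego/
kubectl apply -f nginx/

# Create an ingress for the registry. The tls-acme annotation tells kube-lego
# to request a certificate for the host listed in the tls section; the secret
# name (registry-tls) is just a name I chose.
cat <<EOF | kubectl apply -f -
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: registry
  annotations:
    kubernetes.io/ingress.class: "nginx"
    kubernetes.io/tls-acme: "true"
spec:
  tls:
  - hosts:
    - registry.blenderfox.com
    secretName: registry-tls
  rules:
  - host: registry.blenderfox.com
    http:
      paths:
      - path: /
        backend:
          serviceName: registry
          servicePort: 9000
EOF

Once kube-lego has obtained the certificate, re-tag and push the image, this time through the ingress hostname: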
docker tag registry.blenderfox.com:9000/my-image registry.blenderfox.com/my-image
docker push registry.blenderfox.com/my-image
Note we are not using a port this time, as the registry is now served over SSL.
BOOM! Success.
The tls section indicates the host to request the cert for, and the backend section indicates which backend to pass the request on to. The body-size config is at the nginx level, so if you don’t change it, you can only upload a maximum of 64m even if the backend service (the Docker registry in this case) can support more. I have it set here at "1g" so I can upload 1GB (some Docker images can be pretty large).
And he’s got proof, sort of. Lama performed a test. For two days, all he talked about was Kit-Kats.
“The next day, all I saw on my Instagram and Facebook were Kit-Kat ads,” Lama said.
After his Kit-Kat experiment, he successfully repeated it with chatter about Lysol. The 23-year-old musician is now more convinced than ever that Facebook is listening to his conversations through his phone’s microphone.
“It listens to key words. If you say a word enough times, the algorithm catches those words and it sets off targeted ads,” Lama theorized.
Lama is far from alone. The belief that Facebook is actively listening to people through their phones has become a full-on phenomenon. Facebook has, of course, denied it does this. That has done little to dampen the ongoing paranoia around the theory.
The malware backdoor in this story is quite intriguing. The attackers are targeting specific companies (Samsung, Akamai, Cisco, and Microsoft amongst them) and only attempt the second-stage attack if the malware detects it is being installed there.
The advice mentioned in the article is that anyone who installed the software on their system should REFORMAT THEIR DRIVE. Quite an extreme recommendation. My suggestion - stop using Windows.
Torvalds is not a huge fan of the ‘security community’ as he doesn’t see it as black and white. He maintains that bugs are part of the software development process and they cannot be avoided, no matter how hard you try. “constant absolute security does not exist, even if we do a perfect job,” said Torvalds in a conversation with Jim Zemlin, the executive director of the Linux Foundation.
“As a technical person, I’m always very impressed by some of the people who are attacking our code,” Torvalds said. “I get the feeling that these smart people are doing really bad things that I wish they were on our side because they are so smart and they could help us.”
Another vulnerability hits the news. Whilst similar to Heartbleed in leaking memory contents, it does not seem to be too risky if you’re running it as a single user, and the memory isn’t leaked in huge quantities.
That said, this vulnerability may also affect cloud systems. For example, on AWS (which ships its own build of httpd), doing a version check:
$ httpd -v
Server version: Apache/2.4.27 (Amazon)
Server built: Aug 2 2017 18:02:45
However, without knowing how Amazon has set up Apache behind the scenes, are we able to say definitively whether we are or aren’t affected?
Looking forward to when LineageOS can upgrade to Oreo. There are a lot of new features that may make life a lot easier generally. Take a look at the article for details.
We take a 20,000 word deep-dive on Android's "foundational" upgrades.
Microsoft patching systems as far back as XP? WannaCry is BIG, and the problem is… there are going to be systems out there that are still not patched, due to laziness or no internet connection, and are still vulnerable.
Seen a couple of XP boxes around – some self-service tills, ATMs, and payphones all use XP…
Decommissioned for years, Windows XP, 8, and Server 2003 get emergency update.
A new ransomware has been doing the rounds, but rather than asking you to pay to unlock your files, it asks for a high score in a bullet-hell game (think Touhou Project). The game is fun anyway, but would you be willing to play it to get your files back? High stakes :)
Creator apologizes for a “joke” that really requires expert play to unlock files.
For the most part, these aren't too much of a concern but these two might be:
Phone
read phone status and identity
Device ID & call information
read phone status and identity
These relate to reading device information such as the IMEI, and call information. I'm not too concerned about the call side -- you can block this with later versions of Android's permission manager (and I use that a lot with different apps), but I'm not sure if I can block attempts to read the phone status.
Their justification for this, tracking usage in China because it is blocked there, does make sense I guess, but am I the only one who thinks doing it this way leaves it way too open to abuse and misuse?
Despite being a library that most people outside of the technology industry have never heard of, the Heartbleed bug in OpenSSL caught the attention of the mainstream press when it was uncovered in April 2014 because so many websites were vulnerable to theft of sensitive server and user data. At LinuxCon Europe, Rich Salz and Tim Hudson from the OpenSSL team did a deep dive into what happened with Heartbleed and the steps the OpenSSL team are taking to improve the project.
A lot of projects are corporate, but some, like NTP, are small or even solo projects. When these small projects become really important yet still don’t have enough resources to maintain them, security issues can’t be patched as fast as they would be by, say, Oracle, Microsoft or Google.
Everyone benefits from Network Time Protocol, but the project struggles to pay its sole maintainer or fund its various initiatives
Guys, gals, aardvarks, fishes: I'm running out of ways to say this. Your Android device is not in any immediate danger of being taken over a super-scary malware monster.
It’s a silly thing to say, I realize, but we go through this same song and dance every few months: Some company comes out with a sensational headline about how millions upon millions of Android users are in danger (DANGER!) of being infected (HOLY HELL!) by a Big, Bad Virus™ (A WHAT?!) any second now. Countless media outlets (cough, cough) pick up the story and run with it, latching onto that same sensational language without actually understanding a lick about Android security or the context that surrounds it.
To wit: As you’ve no doubt seen by now, our latest Android malware scare du jour is something an antivirus software company called Check Point has smartly dubbed “Quadrooter” (a name worthy of Batman villain status if I’ve ever heard one). The company is shouting from the rooftops that 900 million (MILLION!) users are at risk of data loss, privacy loss, and presumably also loss of all bladder control – all because of this hell-raising “Quadrooter” demon and its presence on Qualcomm’s mobile processors.
“Without an advanced mobile threat detection and mitigation solution on the Android device, there is little chance a user would suspect any malicious behavior has taken place,” the company says in its panic-inducing press release.
Well, crikey: Only an advanced mobile threat detection and mitigation solution can stop this? Wait – like the one Check Point itself conveniently sells as a core part of its business? Hmm…that sure seems awfully coincidental.
TL;DR: A “mobile threat detection and mitigation solution” is already present on practically all of those 900 million Android devices. It’s a native part of the Android operating system called Verify Apps, and it’s been present in the software since 2012….. Android has had its own built-in multilayered security system for ages now. There’s the threat-scanning Verify Apps system we were just discussing. The operating system also automatically monitors for signs of SMS-based scams, and the Chrome Android browser keeps an eye out for any Web-based boogeymen.
Everyone loves hearing about pentesting and ethical hacking distros these days, and it looks like it is even becoming a trend among aspiring security professionals.
Therefore, today we have some good news for those who want to try one of the best penetration testing and security auditing operating systems based on the Linux kernel, Kali Linux, the successor of the popular BackTrack, and don’t have the resources to run the Live CD or install the OS on their computers.
Network security specialist Jerry Gamblin has created a project called KaliBrowser, which, if you haven’t already guessed, allows you to run the famous Kali Linux operating system in a web browser, using the Kali Linux Docker image, the Openbox window manager, and the NoVNC HTML5-based VNC client.
Tor Messenger is a cross-platform chat program that aims to be secure by default and sends all of its traffic over Tor. It supports a wide variety of transport networks, including Jabber (XMPP), IRC, Google Talk, Facebook Chat, Twitter, Yahoo, and others; enables Off-the-Record (OTR) Messaging automatically; and has an easy-to-use graphical user interface localized into multiple languages.
SPOTIFY RELEASED A new privacy policy that is now in effect, and it turns out that the company wants to learn a lot more about you and there’s not much you can do about it.
We encourage everyone to read the whole privacy policy before downloading the update or checking off the “Accept” box, but in case you have better things to do, here are some highlights from it.
…
“With your permission, we may collect information stored on your mobile device, such as contacts, photos, or media files. Local law may require that you seek the consent of your contacts to provide their personal information to Spotify, which may use that information for the purposes specified in this Privacy Policy.” – Spotify
Like a jealous ex, Spotify wants to see (and collect) your photos and see who you’re talking to. What kind of media files Spotify will collect from you is vague, and why the company needs it is unclear, but it’s doing it regardless. Also, the fact that Spotify expects you to go through your contact list and ask everyone for their consent in sharing their data with Spotify is–what’s the word? Oh yes: it’s ridiculous.
…
“You may integrate your Spotify account with Third Party Applications. If you do, we may receive similar information related to your interactions with the Service on the Third Party Application, as well as information about your publicly available activity on the Third Party Application. This includes, for example, your “Like”s and posts on Facebook.” – Spotify
It shouldn’t surprise you that if you connect your Spotify account to Facebook, Spotify will be able to see the information you post there. If this bothers you, we suggest that you log into your Spotify preferences and disconnect Spotify from your Facebook account (more information on how to do this can be found here). After all, Facebook isn’t all that necessary to use Spotify (unless, of course, you want your friends to know you’re listening to Owl City).
…
“If you don’t agree with the terms of this Privacy Policy, then please don’t use the Service.” – Spotify
…
I value my privacy, so I’ll stop using Spotify. Bye Spotify, I won’t miss you.
As of January 2015, more than 23.3% of the top 10 million websites are using WordPress (source). To say it’s a popular choice for a content management system is an understatement. Part of its appeal are the thousands of free and commercial, pre-made themes available for the system. They are an enticing way to publish a website with little or no knowledge of programming required.
…
It helps to understand the motivation different parties may have in creating a WordPress theme for sale or free download.
Individual programmers are often motivated to create a theme to upload it to a site that sells them at low cost. Much like a stock photo, think of these themes as stock themes. You pay a fee that is a fraction of the cost of hiring a professional to create a custom design and theme, you download it for your website, and the individual programmer gets a small cut of that fee. With free themes, the original programmer usually requires that a link back to them appear on the site, gaining them more internet exposure.
However, there’s also a third, more nefarious reason for creating free themes – to spread malware and other malicious code. That’s right, some unscrupulous individuals will code nasty stuff right into a theme, hoping to cash in on the popularity of themes and the ease of installing them, as well as uneducated or uninformed users. So how do you avoid this one? Of course I’d recommend going custom (more on that shortly), but if you’re determined to use a pre-made theme, be careful where you get them. There are several popular sites that sell themes, and WordPress.org has a directory of themes. Those are your best bets, but you often have little recourse if you purchase or download a free theme, install it yourself, and any of these occur:
you manage to screw something up on the site
your site is hacked
your site is flagged by Google for containing malware