Jonathan Gazeley

The Liturgical Colour app

This is an article about the ancient traditions of the Christian Church, and the modern principles of developing software. Probably not much of an intersection there!

Seasons

For those who don’t know, most churches have a concept of liturgical seasons and colours. These vary a bit between denominations (e.g. Anglican, Catholic, Protestant, Episcopal, Lutheran, etc.) and countries, so for this article I’m specifically talking about the Church of England, which is a member of the Anglican Communion. There is a lot of jargon involved so I’ll try to keep to plain English wherever I can – that includes church jargon and software jargon!

Some seasons, everyone will have heard of (like Advent, Christmas and Lent) but there are some other seasons that you might only know about if you go to church (like Pentecost and Trinity). Some of these seasons are related to Christmas – which has a fixed date every year, and some are related to Easter – which moves around each year according to the cycles of the moon.

On top of the irregular seasons, there is a calendar of Holy Days which includes the big days that everyone knows (Christmas and Easter) but also many days throughout the year dedicated to saints, martyrs and Biblical events. There is a priority system to work out what happens if one of the movable days like Easter happens to land on one of the non-movable days. Sometimes they share, and the day therefore marks both dedications. Sometimes the more important dedication cancels the less important dedication. If a dedication is cancelled, sometimes it gets shunted to a different day. Other times it just gets skipped for that year.

Colours

Each season has its own colour which is typically reflected in the altar hangings (tablecloths), the garments worn by the priest, other decorations around the church building, and maybe even the flowers that are on display.

Some, but not all, of these Holy Days have their own colour which overrides the colour of the season.
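As a rough illustration of the two rules above (the most important dedication wins, and a Holy Day's colour overrides the season's colour only if it has one), here is a minimal Python sketch. The `Observance` class, the numeric precedence scale and the sample entries are all assumptions made up for this example, and it deliberately ignores shared and transferred dedications.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Observance:
    name: str
    prec: int                     # assumed scale: higher number = more important
    colour: Optional[str] = None  # None means "no colour of its own"


def colour_for_day(season_colour: str, observances: List[Observance]) -> Tuple[str, str]:
    """Return (observance name, colour) for a single day.

    The observance with the highest precedence wins. If it has no colour
    of its own, the colour of the season is used instead.
    """
    if not observances:
        return ("", season_colour)
    winner = max(observances, key=lambda o: o.prec)
    return (winner.name, winner.colour or season_colour)


# Hypothetical entries, not taken from the real calendar data
print(colour_for_day("green", []))
print(colour_for_day("green", [Observance("A lesser commemoration", 3)]))
print(colour_for_day("purple", [Observance("A festival", 7, "white"),
                                Observance("A lesser commemoration", 3)]))
```

A real implementation also has to decide when two dedications share a day and when a cancelled one moves to another date, which this sketch leaves out.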

Rules

All of these rules mean it is rather complicated to work out what this year’s seasons are, what dedications fall on which days, and what the colour is. Fortunately, the Church of England publishes an annual book called the Lectionary, which is basically a calendar that contains all of that year’s information about seasons, Holy Days, colours, readings, etc. This is usually good enough for typical church business like planning services and remembering when to change the altar hangings.

Algorithm

It’s not good enough for me, though. I want a way of automatically working out the dates, the seasons, and the colours. This means I have to reimplement the same algorithm used to work all this stuff out, and then seed the algorithm with a list of Holy Days.
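To give a flavour of what working out the dates involves, here is a small Python sketch that computes Easter Day with the well-known anonymous Gregorian algorithm and derives two movable days from it. It is not the code used in the library described below; the function names are invented for this example, and only the offsets for Ash Wednesday (46 days before Easter) and Pentecost (49 days after) are included.

```python
from datetime import date, timedelta


def easter_day(year: int) -> date:
    """Western (Gregorian) Easter Day, using the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    ell = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * ell) // 451
    month, day = divmod(h + ell - 7 * m + 114, 31)
    return date(year, month, day + 1)


def movable_days(year: int) -> dict:
    """A few days whose dates are fixed relative to Easter Day."""
    easter = easter_day(year)
    return {
        "Ash Wednesday": easter - timedelta(days=46),
        "Easter Day": easter,
        "Pentecost": easter + timedelta(days=49),
    }


for name, day in movable_days(2024).items():
    print(f"{name}: {day.isoformat()}")
```

For 2024 this gives 31st March for Easter Day. The seasons tied to Christmas are simpler, because they count from a fixed date.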

I looked around for existing open-source projects that can calculate liturgical dates and colours. There were a few, but they were mostly focused on the Roman Catholic Church, and none of them were written in languages I knew well.

Then I found a Perl module called DateTime::Calendar::Liturgical::Christian which is aimed at the US Episcopal Church, which seems to have a fair bit of commonality with the Anglican church. I know a bit of Perl, so I ported this module to Python (the main language I use now), tweaked the algorithm to reflect the Church of England’s seasons, and replaced the Episcopal Church’s Calendar of the Church Year with the Church of England’s Calendar of Saints, along with its revised priority system.

The end result is a Python library that I have called Liturgical Colour, and which is published on PyPI. Here’s sample output when run for 31st March 2024 – Easter Day:

name : Easter
url : https://en.wikipedia.org/wiki/Easter
prec : 9
type : Principal Feast
type_url : https://en.wikipedia.org/wiki/Principal_Feast
season : Easter
season_url : https://en.wikipedia.org/wiki/Eastertide
weekno : 1
week : Easter 1
date : 2024-03-31
colour : white
colourcode : #ffffff

The output is basic, but includes all the key information – including the name of the colour and the HTML colour code, which leads neatly onto the next section.

App

Now we have a library that can calculate all the required information for any date, we just need a way of displaying the information. I’m no frontend developer, but I managed to throw together a simple Python app based on Flask which uses the Liturgical Colour library and renders a simple display for the user.

Here are some screenshots of example output from different dates. There are also two days per year when the app will display rose pink!

The Liturgical Colour App, aside from lacking a catchy name, has its source code on GitHub and its container image on Docker Hub. There’s also a Helm chart to allow you to deploy it on Kubernetes.

The Liturgical Colour app is publicly available at liturgical.gazeley.uk so everyone can check which saint’s day is on their birthday!

What’s in my Facebook feed?

I’ve been using Facebook since 2006, back when it still required a college or university email address to sign up. These days, I use it for two main purposes:

To keep up with friends and family
To participate in groups related to my interests

My use of Facebook groups to discuss and read about my interests has more-or-less completely replaced my use of forums and mailing lists for this purpose.

Recently, I’ve noticed an increasing number of posts in my feed that are not from groups or pages I follow and it feels like the whole thing is inundated with sponsored ads.

So I decided to do a little survey, and I scrolled through the first 150 items in my feed, categorising them into the following:

Posts from Friends – exactly what it says on the tin.
Posts from groups & pages I follow – for example, groups like Woodworking UK that I have joined
Posts from groups & pages I don’t follow – groups and pages that Facebook thinks I might be interested in, even though I haven’t joined
Sponsored ads – mostly picture and video ads which link to external sites like Amazon

This is not a discussion on how Facebook decides what to show me, and how “the algorithm” uses my personal data – merely the composition of my feed.

Category	Items	Share of feed
Posts from Friends	7	5%
Posts from groups & pages I follow	66	44%
Posts from groups & pages I don’t follow	31	21%
Sponsored ads	46	31%

It seems my gut feeling was correct! About a fifth of my feed consists of items from pages and groups I don’t follow, and a third is sponsored ads. In total, a little over half my feed is content that I didn’t ask for!
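For anyone who wants to check the arithmetic, the percentages can be reproduced from the raw counts with a few lines of Python. The variable and function names are just for this example.

```python
# Counts from the survey of the first 150 feed items
counts = {
    "Posts from Friends": 7,
    "Posts from groups & pages I follow": 66,
    "Posts from groups & pages I don't follow": 31,
    "Sponsored ads": 46,
}
total = sum(counts.values())


def share(n: int) -> int:
    """Percentage of the whole feed, rounded to the nearest whole number."""
    return round(100 * n / total)


for category, n in counts.items():
    print(f"{category}: {n} ({share(n)}%)")

# Content that wasn't asked for: unfollowed groups/pages plus sponsored ads
unrequested = counts["Posts from groups & pages I don't follow"] + counts["Sponsored ads"]
print(f"Unrequested: {unrequested} of {total} ({share(unrequested)}%)")
```

The rounded shares add up to 101% rather than 100%, which is just a rounding effect.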

I saw ads from a variety of sectors – technology, food & drink, holidays, clothing, entertainment, and I saw some of the ads several times.

The content from groups I don’t follow is irritating to me. For example, I like classic Jaguars (the car, not the big cat) and so I’ve joined a couple of groups dedicated to Jaguars. “The algorithm” has picked up that I like cars, and a lot of the content it shows me is from groups dedicated to other cars (such as BMWs), or other geographic areas (such as the Pittsburgh Classic Car Meet).

I noticed that some of the posts from friends were several days old, so I don’t know why they are only now appearing in my feed.

I don’t have data to back this up, but I believe Facebook is prioritising content from groups over content from friends, and making a big effort to push users into as many groups as possible and to follow as many pages as possible, not to mention serving as many ads as possible. It feels like its usefulness to me is declining.

Why don’t EVs have solar panels on the roof?

It’s a question I’ve heard asked quite a few times: why don’t EVs have solar panels on the roof? Then they wouldn’t need to be charged with a charger, would have infinite range, and would be using only green electricity!

It’s a fair question. To answer it, let’s have a look at some numbers.

How much energy do we actually get from the sun? Well, solar irradiance at the surface of the earth, the amount of sunlight energy falling on each square metre of Earth’s surface, is approximately 1100W/m². But this is an average figure across the whole globe. It is more at the equator and less at the poles. Here at my location in the UK at 51.5° North, my weather station has measured peak solar irradiance of ~1000W/m² at the height of summer. This limits the maximum energy we can ever get from the sun, regardless of the type of solar panels used.

This graph shows the solar irradiance at my weather station over the daylight hours (08:00 – 20:00) of a typical sunny September day. The peak solar irradiance was more like 700W/m² with an average much below that. Note the gaps in the yellow/orange area, where clouds pass over.

Solar irradiance over 12 hours, September, UK

Let’s be generous, and take 700W/m² as our figure of solar irradiance. We must also account for the efficiency of solar panels. The conversion efficiency of commodity solar panels is only about 25%, which further reduces the amount of useful electricity we can extract down to 175W/m².

How large actually is a car roof? I’ve just been outside to measure the useful roof space of a typical family hatchback (a Hyundai i30) and it measured 1.0×1.7 m, for a total area of 1.7m². A larger family car (a Ford Mondeo) would be able to offer 1.1×1.9 m for a total of 2.09m² of useful roof area if the roof rails weren’t fitted.

The Mondeo is more comparable in size to a Tesla Model 3, so let’s run with those dimensions. Multiplying the available roof area for solar panels (2.09m²) by the useful solar power per square metre (175W/m²) yields a total electrical power of 366W or 0.366kW from the roof of a large family car.

To put this into perspective, a typical home EV charger operates at 7kW – more than 19 times more power (and therefore 19 times faster) than on-roof solar charging. Standard chargers found in public locations are typically 22kW and rapid chargers are 43kW, which is 60 or 117 times more power than solar charging respectively. Which? has a good overview of charger types.

Staying with the Tesla Model 3, the base model has a battery pack rated for 50kWh, which is fairly typical for today’s EVs. With solar charging alone, it would take 137 hours (almost 6 days) to fully charge. But solar panels only work in daylight hours, so in summer it would take more like 12 days!
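The whole chain of numbers can be written down as a short Python calculation, which makes it easy to try other assumptions. The figures are the ones used in this post; the 12 hours of usable daylight per day is an assumption made for this sketch (and a generous one, since it treats every daylight hour as full sun), and charging losses are ignored.

```python
IRRADIANCE_W_M2 = 700        # generous figure for a sunny UK day
PANEL_EFFICIENCY = 0.25      # commodity solar panel
ROOF_AREA_M2 = 1.1 * 1.9     # large family car, no roof rails
BATTERY_KWH = 50             # base Tesla Model 3
DAYLIGHT_HOURS_PER_DAY = 12  # assumption: every daylight hour counts as full sun

# Electrical power available from the roof
solar_w = IRRADIANCE_W_M2 * PANEL_EFFICIENCY * ROOF_AREA_M2
solar_kw = solar_w / 1000

# Time to charge from empty to full on solar alone
hours_to_full = BATTERY_KWH / solar_kw
days_to_full = hours_to_full / DAYLIGHT_HOURS_PER_DAY

print(f"Roof power: {solar_w:.0f} W")
print(f"Continuous sun needed: {hours_to_full:.0f} hours")
print(f"Days at {DAYLIGHT_HOURS_PER_DAY} h of sun per day: {days_to_full:.1f}")

# How mains chargers compare with the roof
for name, charger_kw in [("home", 7), ("public", 22), ("rapid", 43)]:
    print(f"{name} charger: about {charger_kw / solar_kw:.0f}x the power")
```

Changing the efficiency or roof area by a realistic amount doesn't alter the conclusion: the result stays in the range of days, not hours.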

The available power is obviously reduced even further in the winter months, on cloudy days or if you park in a multi-storey car park or under a tree.

So I guess this is our answer – it’s just too underpowered and too slow to be practical.

But why couldn’t EVs come with solar panels on the roof for emergency backup? Just enough charging capability to make sure you don’t get stranded? Or to add a few miles to the range?

Well, they could. But cars are manufactured with slim profit margins and solar panels are an expensive component to include for such little benefit. It just doesn’t make business sense.

Building a large format camera

Introduction

I’ve wanted to build a wooden large format camera for years, but I made myself wait until my skills were up to scratch. I reasoned that a wooden camera isn’t really too much more difficult than a sort of trinket box / picture frame hybrid, so I decided to take the plunge.

I looked at various plans for cameras that are available online, but eventually decided to follow Jon Grepstad‘s freely available plans for a monorail camera – partly because his instructions were complete and contained diagrams, and partly because his measurements were in metric. Can’t be doing with sixty-fourths of an inch!

My decision to get started was cemented when I found some wooden window frames being ripped out. I saved them from the skip, and they turned out to be meranti, sometimes also known as Philippine mahogany. It’s a hardwood commonly used for window frames, because of its resistance to warping when it gets wet. My camera won’t need to withstand bad weather, but the reddish hue of meranti is very pleasant.

Preparation

The meranti needed a bit of a clean-up – first scraping off remnants of glue and sealant, and secondly checking carefully for screws and nails, and removing them. The camera plans call for thin boards of meranti ranging between 6mm and 15mm rather than thick chunks, so I ripped these pieces on my table saw to reduce them down to an 80mm width that my bandsaw can take, then used the bandsaw to resaw them into boards that were slightly thicker than required. Finally I passed them through the thicknesser to reduce them down to the precise dimension and leave a nice, planed finish. The old varnish and effect of weathering had only gone into the wood a couple of millimetres so I was left with beautiful fresh timber and no indication that it had been window frames for decades.

Construction

I didn’t actually remember to take many pictures of the construction process, but there was a theme with the woodwork side of the project: further rip the planed boards into narrower strips, and then glue them back together in various L-shaped profiles as specified in the plans. Then almost all of the camera parts are cut from these long L-shaped profiles with a mitre saw, working around the occasional nail hole, and assembled in the same way as a picture frame with ordinary wood glue. Band clamps came in very handy here.

Front and rear frames complete, balanced on sliders and monorail

I used plywood for a couple of parts, such as the lens board, and the main body of the spring back film holder. The spring back holder is perhaps the most complex part of the build, as the ground glass focusing screen is held to the back of the camera by leaf springs, and when it’s time to take a photo you slide the film holder underneath the ground glass, which springs up, clamping the film holder in place and ensuring that the film is in the same plane that the ground glass was in during focusing.

One aspect of this project that was totally new to me was metalwork. I’ve never worked with brass before, but I was pleasantly surprised how easy it was to cut 1-2mm brass sheets on the bandsaw, drill on a pillar drill and work with a file by hand. The parts I needed to make were all simple, but it took quite a lot of effort to sand and polish out the tool marks.

Jon Grepstad’s plans described a rotating back system, but I decided to omit this as I rarely shoot in portrait orientation, and it seemed like an opportunity to save a little weight and complexity.

I completed the build by adding a bellows and ground glass I ordered from a photographic supplier, and a lens I salvaged from a broken camera. The bellows is conveniently the same dimension as a Toyo 45G. The ground glass is the traditional, non-Fresnel variety and the image is quite dim, but I don’t mind that. The lens is a Bausch & Lomb Rapid Rectilinear in a Kodak shutter, dated 1897. The focal length isn’t given but the maximum aperture is US 4 (Uniform System, not United States), equivalent to f/8.

To really bring out the mahogany colour and deepen the pinkish-red shade of the meranti, I finished with several coats of mahogany-tinted Danish oil. This also gives the wood a pleasant sheen rather than a high gloss finish.

Tools

I wanted to make a quick note about the tools I used. There’s a lot of discussion on the various woodwork groups online about the “best” tools and how cheap tools aren’t even worth bothering with.

I’m using budget tools exclusively, and while the build quality and precision aren’t always great compared to expensive brands, they have been sufficient for my modest needs as a hobbyist. With careful setting up and use of tools, I’ve managed to achieve sufficient precision.

Titan TTB579PLN planer/thicknesser
Titan TTB763TAS table saw with Saxton blade
LIDL Parkside PBS 350 band saw with Tuffsaws blades for wood and metal
Aldi Workzone pillar drill with Bosch drill bits
Wickes sliding mitre saw with Saxton blade

The finished item

I took the camera out to Trooper’s Hill Nature Reserve to take pictures of it in a scenic background.

Movements

A key part of the specification of any large format camera is its movements. The length of the monorail is 380mm so this allows for generous extension of the bellows at typical focal lengths.

	Front	Rear
Rise/Fall	+55mm, -35mm	0mm
Shift	±50mm	±50mm
Tilt	Limited only by bellows	±20°
Swing	Limited only by bellows	Limited only by bellows

Testing

Here’s the first test shot taken with this camera, also at Trooper’s Hill.

I’m really happy with the results. For a lens that is 126 years old, it has good, even coverage and decent sharpness at f/16, two stops down from wide open. There is no evidence of light leaks.
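The aperture arithmetic here is easy to check. In the old Uniform System, US 1 corresponds to f/4 and each doubling of the US number is one stop, so the f-number is 4 times the square root of the US number. A small Python sketch (the function names are just for this example):

```python
import math


def us_to_f_number(us: float) -> float:
    """Convert a Uniform System (US) aperture marking to a modern f-number."""
    return 4 * math.sqrt(us)


def stops_between(f_wide: float, f_narrow: float) -> float:
    """Number of stops between two f-numbers (one stop = a factor of sqrt(2))."""
    return 2 * math.log2(f_narrow / f_wide)


wide_open = us_to_f_number(4)  # the lens is marked US 4
print(f"US 4 is f/{wide_open:g}")
print(f"f/16 is {stops_between(wide_open, 16):g} stops down from wide open")
```

So the US 4 marking is f/8, and shooting at f/16 is two stops down.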

I slightly missed critical focus, and I think that’s largely because I used wood for the monorail instead of the recommended aluminium tube. The varnish-on-varnish sliding made it quite hard to make tiny adjustments accurately. Hopefully waxing the monorail will help with a smoother action.

I also found that the leaf springs I chose are slightly too strong and clamp down on the film holder slightly too hard. You have to be quite forceful when inserting the film holder, and it is quite easy to accidentally knock the alignment of the camera. Hopefully I’ll be able to bend the springs a bit so they’re not quite as aggressive.

Finally, I noticed while using the camera that it can be quite tricky to get it aligned when setting up. None of the adjustments have a centre “click” detent, so you have to take your time aligning and adjusting when composing an image. This isn’t a fast camera to use!

Overall, I am very pleased with my first camera build. It has turned out well, and I have made a usable camera that I am proud of, and learned some new skills along the way. I’m also very grateful to Jon Grepstad for publishing his plans.

Ecowitt weather stations, Prometheus, and Grafana

I recently received an Ecowitt WS2910 weather station for Father’s Day. I’ve always wanted a weather station so I was very excited. This isn’t supposed to be a review of the product, but I think it would be helpful to go over the basics before we dive into the meat of this blog post (the integration with Prometheus).

The outdoor portion of this weather station (model number WS69 if you buy it separately) measures wind speed, wind direction, temperature, humidity, rainfall and solar irradiation. It takes AA batteries which are supposed to last for 2-3 years as it gets the bulk of its power from the solar panel during daylight hours. There are several different Ecowitt weather stations which include the same sensor array.

The sensor array communicates with the indoor base unit periodically, via UHF radio. My model uses 868MHz but judging by the box, there are at least 4 different frequencies available for different parts of the world. The base unit doesn’t need pairing with the outdoor part like a Bluetooth connection – it just waits for the next radio signal to come in and starts displaying data.

The base station is simple but effective. It shows almost all the information you want on the main screen (not a touch screen), and there are several touch-sensitive buttons that let you change view or look at old data. This is a bit fiddly, so I’m just using it as a “current status” dashboard and not touching the buttons. It’s also not a true colour screen, but a backlit LCD which seems to have coloured filters over the backlight to illuminate each section.

Ecowitt sell several different WiFi-enabled base stations and the main differentiator seems to be the size and quality of the screen. All of the models that have WiFi appear to behave the same way when it comes to data logging and integration with various online weather services. At the time of writing, everything from here onwards in this blog post should apply at least to the following models (and I’d love to hear from you if you find this works or doesn’t work for your model):

Ecowitt WS2910 with colour LCD display (this one)
Ecowitt WS2320 with mono LCD display
Ecowitt HP2551 with colour TFT display
Ecowitt HP3500B with colour TFT display
Ecowitt GW1101 headless weather station, no display

What I’m really interested in are the different options for integrations. There’s a smartphone app called WSView Plus (for iOS and Android) which pairs with the base station via local WiFi and allows you to configure your real WiFi network name and passphrase. It also allows you to set up integrations with various weather services, including the manufacturer’s own service Ecowitt Weather, Weather Underground, Weathercloud, and the Met Office.

The Ecowitt Weather service requires authentication to see the weather data, so it’s no use for sharing with other people. Weather Underground has a nice dashboard which is publicly available, so I registered as station IBRIST302.

There is also the option for a “custom” integration to send weather data to a custom endpoint. There are no details in the manual about the format of the protocol but from reading around, it seems that the Ecowitt weather station sends the data as an HTTP POST request with form-encoded data.

I had a look around to see if anyone else had already written a tool that could receive data from the Ecowitt for local processing:

Almost all of the existing solutions very sensibly use InfluxDB, and that’s probably what you should use if you don’t have any existing infrastructure. However, I don’t want to run a new service unless I have to – and I already have a Prometheus instance that I want to import the weather data into. So I set about writing an exporter that could receive the HTTP POST requests from the weather station, transform the data where necessary, and present it where it can be scraped by Prometheus.

In the end I took parts of pa3hcm/ecowither and parts of bentasker/Ecowitt_to_InfluxDB, removed the InfluxDB bits and added the official Prometheus Python client to create an exporter. I included code to select output in Metric/SI or Imperial, and to toggle this independently for each instrument, to cope with the slightly odd “hybrid British units”, because we like most things in metric, except for wind speed, which we prefer in mph.
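To show the shape of the transformation (form-encoded POST body in, Prometheus metrics out, with a per-instrument choice of units), here is a standard-library-only Python sketch. It is not the code of ecowitt-exporter, which uses the official Prometheus client: the form field names (`tempf`, `humidity`, `windspeedmph`) are assumptions about what the station sends, and the metric names are invented for this example.

```python
from urllib.parse import parse_qs


def f_to_c(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


# form field -> (metric name, converter). All names here are illustrative.
# Temperature is converted to metric; wind speed is deliberately kept in mph.
FIELDS = {
    "tempf": ("weather_temperature_celsius", f_to_c),
    "humidity": ("weather_humidity_percent", float),
    "windspeedmph": ("weather_wind_speed_mph", float),
}


def to_prometheus(body: str) -> str:
    """Turn a form-encoded POST body into Prometheus text exposition lines."""
    form = parse_qs(body)
    lines = []
    for field, (metric, convert) in FIELDS.items():
        if field in form:
            value = convert(float(form[field][0]))
            lines.append(f"{metric} {value:.2f}\n")
    return "".join(lines)


# A made-up request body in the assumed format
sample = "tempf=68.0&humidity=55&windspeedmph=4.5"
print(to_prometheus(sample), end="")
```

Keeping the conversion in a per-field table is one simple way to toggle units independently for each instrument: swapping `f_to_c` for `float` would leave the temperature in Fahrenheit.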

I did run into a bug where a malformed HTTP header being sent by the Ecowitt on firmware version v5.1.1 was causing bad behaviour in Flask. At the time of writing there is a PR in the pipeline to fix this, but I worked around the problem by fronting my Flask server with an NGINX reverse proxy – no special config required. NGINX seems to magically fix the broken header in flight.

My effort is excitingly named ecowitt-exporter and is available as a Docker image too.

But wait! That’s not all. If you simply run ecowitt-exporter, all it does is host an HTTP endpoint and wait for Prometheus to scrape it, which you will have to configure manually.

But as I’m running Kubernetes, I’m also running the Prometheus Operator which supports automatically configuring Prometheus to scrape targets based on the ServiceMonitor custom resource. I created an ecowitt-exporter Helm chart which installs ecowitt-exporter and configures a ServiceMonitor resource to enable scraping. It can optionally also configure an Ingress (which at the moment is required, to work around the header bug described above).

The last piece of the puzzle is a Grafana dashboard to visualise this data. I’ve tried to recreate the basic information shown on the weather station base unit, but with history. The top row is a “conditions right now” view, while all the other rows show 24 hours history (customisable).

At the time of writing, this dashboard is still evolving so I haven’t published it on Grafana’s list of community dashboards, but you can grab it from my git repo. It should “just work” if you are using the ecowitt-exporter for Prometheus. If you are using Grafana to visualise data from InfluxDB, this dashboard will need minor modification as the data sources will probably have different names.

All of these components are open source, so improvements, bug fixes, and new features are welcome.

Kubernetes Homelab Part 6: Deployments

Welcome to part 6 of my Kubernetes Homelab series. In the previous posts we’ve discussed the architecture of the hardware, networking, Kubernetes cluster, and infrastructure services. Today we’re going to look at deployment strategies for applications on Kubernetes.

My goal for my homelab is not necessarily 100% automation, but I would like the ability to deploy applications with a consistent config, redeploy them if necessary, manage upgrades, have versioned config, handle secrets safely, and to reduce the scope for human error.

Full CD or GitOps tools like Argo CD, Flux CD and others are a bit of a heavy hammer for this homelab environment, so let’s have a look at what it’s possible to achieve in a very lightweight solution.

Helm

Helm brands itself as a package manager for Kubernetes. Personally, I think calling it a package manager is a bit of a stretch as it lacks key functionality that we’ve seen with other package managers such as yum, pip and other distro and language-specific package managers.

What Helm can do is install & upgrade an application with a config file (which it calls a values file), which we can keep in git.

So I have created a private git repo called kubernetes-manifests (in hindsight I could’ve chosen something shorter), which contains a directory for each app I want to deploy. That directory contains a README to explain what the app is, a values file, and a deployment script.

aboutme/
├── deploy.sh
├── README.md
└── values.yaml

Every Helm chart ships with a values.yaml file that contains all possible values (i.e. config options) so I usually copy that file into my repo and edit it for my use case, removing redundant options to keep the file short.

The deployment script just wraps a Helm command so I don’t forget which args to pass next time I want to deploy.

Let’s have a look at a real example – my About Me page I use as a biosite, to pull some links together. It deploys an app called Homer.

#!/bin/sh
helm upgrade -i --create-namespace \
    -n about about \
    -f values.yaml \
    djjudas21/homer

My values.yaml is derived from the upstream values.yaml, with my config added – which is just a yaml structure listing all the links that appear on the live site.

Using helm upgrade -i instead of helm install just tells Helm to perform an upgrade, or do an installation if there is no existing deployment. I can safely run the deploy.sh script at any time and it will install the app if it needs to, upgrade the installation if necessary, configure it with new values if there are any, otherwise do nothing.

This satisfies my requirements of being able to make repeatable deployments, redeploy apps in the case of cluster loss, upgrade them at will, and keep my config in version control.

Secrets

The above example of my About Me app is a simple one because no secrets are required to deploy. What if I needed to provide the app with credentials, API keys or other secrets? I wouldn’t want to store those in git.

I use Helm Secrets with age to be able to store secrets encrypted in a git repo. I won’t go into the full procedure for setting it up because that is documented, but let’s have a look at how it works in practice.

For this example, let’s look at a photo sharing app called PhotoPrism. The default values.yaml requires you to set the root password. It’s mixed in with a bunch of other config values:

# -- environment variables. See docs for more details.
env:
  # -- Set the container timezone
  TZ: UTC
  # -- Photoprism storage path
  PHOTOPRISM_STORAGE_PATH: /photoprism/storage
  # -- Photoprism originals path
  PHOTOPRISM_ORIGINALS_PATH: /photoprism/originals
  # -- Initial admin password. **BE SURE TO CHANGE THIS!**
  PHOTOPRISM_ADMIN_PASSWORD: "please-change"

It’s possible to encrypt the entire values.yaml but I prefer to encrypt only the secret values, so it’s still easy to read the non-secrets. Helm lets you supply multiple values files, so let’s split out the secrets into a separate file called secrets.yaml, leaving the publicly readable values in values.yaml. The env hash will be merged by Helm upon deployment.

# secrets.yaml
env:
  # -- Initial admin password. **BE SURE TO CHANGE THIS!**
  PHOTOPRISM_ADMIN_PASSWORD: "please-change"

# values.yaml
# -- environment variables. See docs for more details.
env:
  # -- Set the container timezone
  TZ: UTC
  # -- Photoprism storage path
  PHOTOPRISM_STORAGE_PATH: /photoprism/storage
  # -- Photoprism originals path
  PHOTOPRISM_ORIGINALS_PATH: /photoprism/originals

Now we can encrypt secrets.yaml without affecting the readability of values.yaml.

$ helm secrets enc secrets.yaml 
Encrypting secrets.yaml
Encrypted secrets.yaml

The contents of the file are encrypted and safe to check into git:

env:
    #ENC[AES256_GCM,data:fhKUc7wQ1aRoUI2QWnu37hiIG84R5gAOvMr8mvGXRz7DpmxCgf7VRxtJtfOtjfA6gvgp33ocfg==,iv:Ldvmh5KSN1++g0VHRY01S5PQXQf51No7SXkahtSMi14=,tag:NM5wiY9Bl0VwrxFTAsm+Gw==,type:comment]
    PHOTOPRISM_ADMIN_PASSWORD: ENC[AES256_GCM,data:7Tt9mKkZ+U7zAtskQw==,iv:37AJkEmUk8VaA3wSaH5jPc2VwIB/hXCxM/FFxa9fPTc=,tag:S4hts6NO6zDA+DGsttIdoQ==,type:str]
sops:
    kms: []
    gcp_kms: []
    azure_kv: []
    hc_vault: []
    age:
        - recipient: age1xeguyqecm3zx2talea7jfawpgzfymula3f9e7cyr76czeh3qdqhs6ap9sp
          enc: |
            -----BEGIN AGE ENCRYPTED FILE-----
            YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBSQXpnczVlUWUrUGN3NkZM
            TndmVkdnOEY5UlJWMkFDWm1IL2tuUFozdndNCnQyYjVTVC9DKy9IV0s0SkdRbnFC
            bWNwRjBsNC9aNzcrenQ0MG1weWk5SGMKLS0tIGFLRlMrNmtEMVRJd1h5cU83QnI5
            TGdraWVHdjk2djZpUmN3bW05V2lsaUEKCKlvHPTDmr6sDCkqderSk0f+3w7x87pZ
            V5XZxx5cblsq9tYTutA2tnxxuLFloY/jFen2wdsvHSrxhmCwjdsJ9Q==
            -----END AGE ENCRYPTED FILE-----
    lastmodified: "2023-04-18T14:25:13Z"
    mac: ENC[AES256_GCM,data:F3pV6Ly+eP5ZfMTerWxfrgOny/CK6O2M3bhAQLM6+SxpmK2Ya+9rDYskcKSLaq8w7WVWJ/XAz3plg2Gx8gCAYJ+2SMgo6TVsENO8tu/xMRSgmbr6NViOlHTvNc/EcYl5NOj420r8TmF31B5OArvH4BSfoTijphKppnv/546hUco=,iv:2uwIeUXUVZigJR0j0FH2gYt4KlXAx9OMHh0yx52NqMw=,tag:ET16bfdlua7E8c0n9tGC3Q==,type:str]
    pgp: []
    unencrypted_suffix: _unencrypted
    version: 3.7.3

We can easily view or edit this file with helm secrets view secrets.yaml or helm secrets edit secrets.yaml.

The last piece of the puzzle is to tweak the deploy script deploy.sh to be able to decrypt our secrets on the fly. We do this by changing helm upgrade -i to helm secrets upgrade -i and specifying two values files with -f. Values files on the right override ones on the left. In this case, both values files specify an env key but the values are merged.

#!/bin/sh
helm secrets upgrade -i --create-namespace \
    -n photoprism photoprism \
    -f values.yaml -f secrets.yaml \
    djjudas21/photoprism
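The merging behaviour is worth spelling out, because it is what makes the split into two files safe. The following Python sketch approximates how nested maps from several values files are combined: maps are merged key by key, and for anything else the right-hand file wins. It is an illustration of the idea only, not Helm's actual implementation, and `merge_values` is a name invented for this example.

```python
def merge_values(base: dict, override: dict) -> dict:
    """Recursively merge two values structures; `override` wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are maps: merge them key by key
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars, lists, or mismatched types: the right-hand side replaces
            merged[key] = value
    return merged


values = {
    "env": {
        "TZ": "UTC",
        "PHOTOPRISM_STORAGE_PATH": "/photoprism/storage",
        "PHOTOPRISM_ORIGINALS_PATH": "/photoprism/originals",
    }
}
secrets = {"env": {"PHOTOPRISM_ADMIN_PASSWORD": "please-change"}}

# Equivalent in spirit to: -f values.yaml -f secrets.yaml
merged = merge_values(values, secrets)
for key, value in merged["env"].items():
    print(f"{key}: {value}")
```

The result has a single env map containing all four keys, which is why the password can live in its own encrypted file without repeating the other settings.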

Keeping up to date

We have the facility to upgrade a deployed app by first running

helm repo update

to update our Helm charts, then simply

./deploy.sh

to run the Helm upgrade from our deploy script.

As this is just my homelab, I don’t mind running upgrades manually, and I don’t need a fully automated solution. But it would be nice to know that there are updates available for my charts, without having to go checking manually.

There is a tool called Nova which can do exactly this.

$ nova find --format table --show-old
Release Name            Installed    Latest     Old     Deprecated
============            =========    ======     ===     ==========
about                   8.1.5        8.1.6      true    false
oauth2-proxy            6.8.0        6.10.1     true    false
graphite-exporter       0.1.5        0.1.6      true    false
node-problem-detector   2.3.3        2.3.4      true    false
prometheus-stack        45.8.0       45.10.1    true    false
rook-ceph               v1.11.2      1.11.4     true    false
rook-ceph-cluster       v1.11.3      1.11.4     true    false

This output lists the outdated Helm deployments on my cluster (in the current Kubernetes context). Nova doesn’t use local Helm chart repositories – it checks ArtifactHub as an index of Helm charts so any charts you want to check must be published there.

To update your Helm deployments, don’t forget to freshen your local Helm repositories so you have the latest charts:

helm repo update
./deploy.sh

At the moment, running Nova is a manual step that I do as and when I remember, but it does support output in different formats and could easily be run as a cron job or metrics exporter in the cluster.
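
As a rough idea of what such a cron job could do, the Python sketch below parses the table format shown above and picks out the releases flagged as old. The column layout is assumed from that sample output only, and the function name is hypothetical. A real job would more likely consume one of Nova’s machine-readable formats.

```python
SAMPLE = """\
Release Name            Installed    Latest     Old     Deprecated
============            =========    ======     ===     ==========
about                   8.1.5        8.1.6      true    false
oauth2-proxy            6.8.0        6.10.1     true    false
rook-ceph               v1.11.2      1.11.4     true    false
"""


def outdated_releases(table_text):
    """Return (release, installed, latest) tuples for rows flagged as old."""
    results = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows have exactly five columns; the header splits into six
        # words and the ruler row starts with '='
        if len(fields) != 5 or fields[0].startswith("="):
            continue
        name, installed, latest, old, _deprecated = fields
        if old == "true":
            results.append((name, installed, latest))
    return results


for name, installed, latest in outdated_releases(SAMPLE):
    print(f"{name}: {installed} -> {latest}")
```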

I have started writing a Nova exporter to get the output of Nova into Prometheus so I can get alerts when I have outdated deployments, but it’s not finished yet. I’ll share here when I’ve had some time to finish it off.

Kubernetes Homelab Part 5: Hyperconverged Storage (again)

Part 4 of this series was supposed to cover hyperconverged storage with OpenEBS and cStor, but while I was in the middle of writing that guide, it all exploded and I lost data, so the blog post turned into an anti-recommendation for various reasons.

I’ve now redesigned my architecture using Rook/Ceph and it has had a few weeks to bed in, so let’s cover that today as a pattern that I’m happy recommending.

Ceph architecture

First, let’s have a brief look at Rook/Ceph architecture and topology, and clear up some of the terminology. I’ll keep it as short as I can – if you need more detail, check out the Ceph architecture guide and glossary.

Ceph is a long-standing software-defined clustered storage system, and also exists outside of Kubernetes. Rook is a Kubernetes operator that installs and manages Ceph on Kubernetes. On Kubernetes, people seem to interchangeably use the terms Rook, Ceph, and Rook/Ceph to refer to the whole system.

Ceph has many components to make up its stack. We don’t need to know about all of them here – partly because we don’t need to use all of them, and partly because Rook shields us from much of the complexity.

The most fundamental component of Ceph is the OSD (object storage daemon). Ceph runs exactly one OSD for each storage device available to the cluster. In my case, I have 4 NVMe devices, so I also have 4 OSDs – one per node. These run as pods. The OSD can only claim an entire storage device, separate from what you’re booting the node OS from. My nodes each have a SATA SSD for the OS and an NVMe for Ceph.

The part of my diagram that I have simply labelled Ceph Cluster consists of several components, but the key components are Monitors (mons). Monitors are the heart of the cluster, as they decide which pieces of data get written to each OSD, and maintain that mapping. Monitors run as pods and 3 are required to maintain quorum.
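
The quorum arithmetic is worth spelling out. Monitors need a strict majority to agree, so the short Python sketch below shows how many monitors must be up, and how many can fail, for a given monitor count. The function names are just for illustration.

```python
def quorum_size(voters):
    """Smallest number of voters that forms a strict majority."""
    return voters // 2 + 1


def failures_tolerated(voters):
    """How many voters can be lost while a majority still remains."""
    return voters - quorum_size(voters)


for n in (1, 3, 4, 5):
    print(f"{n} monitors: quorum={quorum_size(n)}, can lose {failures_tolerated(n)}")
```

With 3 monitors, 2 must be available, so the cluster survives the loss of one. A fourth monitor would not improve on that, which is why odd numbers are the norm.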

At its core, a Ceph cluster has a distributed object storage system called RADOS (Reliable Autonomic Distributed Object Store) – not to be confused with S3-compatible object storage. Everything is stored as a RADOS object. In order to actually use a Ceph cluster, an additional presentation layer is required, and 3 are available:

Ceph Block Device (aka RADOS Block Device, RBD) – a block device image that can be mounted by one pod as ReadWriteOnce
Ceph File System (aka CephFS) – a POSIX-compliant filesystem that can be mounted by multiple pods as ReadWriteMany
Ceph Object Store (aka RADOS Gateway, RGW) – an S3-compatible gateway backed by RADOS

As I already have a ReadWriteMany storage class provided by TrueNAS, and I don’t need S3 object storage, I’m only going to enable RBD to provide block storage, mostly for databases which don’t play nicely with NFS.

Deployment

As everything else on my cluster is deployed with Helm, I will also deploy Rook with Helm. It’s a slightly strange method as you have to install two Helm charts.

The first chart installs the Rook Ceph Operator, and sets up Ceph CRDs (custom resource definitions). The operator waits to be fed a CR (custom resource) which describes a Ceph cluster and its config.

The second chart generates and feeds in the CR, which the operator will use to create the cluster and provision OSDs, Monitors and the other components.

I’m using almost entirely the default values from the charts. The only things I have customised are to:

enable RBD but disable CephFS and RGW
set appropriate requests & limits for my cluster (Ceph can be quite hungry)
define which devices Ceph can claim

By default, Ceph will claim all unused block devices in all nodes, which is a reasonable default. As all my Ceph devices are NVMe, there is only one in each node, and I’m not using NVMe for anything else, I can play it a bit safer by disabling useAllDevices and specifying a deviceFilter. I would need to change this if I had a node with multiple NVMes, or if I wanted to introduce a SATA device into the cluster.

cephClusterSpec:
  storage:
    useAllNodes: true 
    useAllDevices: false
    deviceFilter: "^nvme0n1"
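
The deviceFilter value is a regular expression that is matched against device names. As a quick sanity check, this Python sketch applies the same pattern to a made-up list of device names. It assumes the filter behaves like an ordinary regex search, which may differ in detail from how Rook evaluates it.

```python
import re

# Same pattern as in the values file above
DEVICE_FILTER = re.compile(r"^nvme0n1")


def claimable(devices):
    """Return the device names that the filter would select."""
    return [name for name in devices if DEVICE_FILTER.search(name)]


# Hypothetical devices on one node: the SATA OS disk, a spare disk and two NVMes
devices = ["sda", "sdb", "nvme0n1", "nvme1n1"]
print(claimable(devices))
```

Only nvme0n1 is selected, which shows why the filter would need widening for a node with a second NVMe or a SATA device intended for Ceph.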

Resources

As I mentioned above, a Ceph cluster can consume quite a lot of CPU and memory resources (which is one of the reasons I started off with cStor). Here’s a quick snapshot of the actual CPU and memory usage by Ceph in my cluster, which is serving 14 Ceph block volumes to a handful of not-very-busy databases.

$ kubectl top po -n rook-ceph
NAME                                         CPU   MEMORY
csi-rbdplugin-6s42w                          1m    80Mi            
csi-rbdplugin-l75kh                          1m    23Mi            
csi-rbdplugin-provisioner-694f54898b-67nnf   1m    47Mi            
csi-rbdplugin-provisioner-694f54898b-s9vpx   7m    108Mi           
csi-rbdplugin-pvhcx                          1m    76Mi            
csi-rbdplugin-vt7gs                          1m    20Mi            
rook-ceph-crashcollector-kube05-65d87b7d8b   0m    6Mi             
rook-ceph-crashcollector-kube06-64b798c4bc   0m    6Mi             
rook-ceph-crashcollector-kube07-887878456    0m    6Mi             
rook-ceph-crashcollector-kube08-688f948ddf   0m    6Mi             
rook-ceph-exporter-kube05-b6d6c6c9c-splt6    1m    16Mi            
rook-ceph-exporter-kube06-f9757c848-j47qm    1m    6Mi             
rook-ceph-exporter-kube07-5bdbb94f47-8kt8d   2m    16Mi            
rook-ceph-exporter-kube08-c98496b8b-8tnrz    3m    16Mi            
rook-ceph-mgr-a-6cb6484ff7-9gh8r             54m   571Mi           
rook-ceph-mgr-b-686bcb7f66-5nkvp             70m   446Mi           
rook-ceph-mon-a-86cbcbcfc7-6bsn6             28m   428Mi           
rook-ceph-mon-b-579f857b7f-rkkpc             23m   407Mi           
rook-ceph-mon-d-59f97f97-9r4b8               25m   427Mi           
rook-ceph-operator-6bcf46667-gv426           39m   57Mi            
rook-ceph-osd-0-77c56c774c-2jtff             24m   959Mi
rook-ceph-osd-1-67df8f6ccd-4qbrw             28m   1386Mi          
rook-ceph-osd-2-66cf8c8f55-6m6zt             31m   1310Mi          
rook-ceph-osd-3-74f794b458-hbhvr             31m   1296Mi          
rook-ceph-tools-c679447fc-cjpcs              3m    2Mi

There are quite a few pods in this deployment but the heaviest memory usage is by the OSDs, which consume over 1Gi each (my nodes have 16Gi RAM each). Bear this in mind if you’re running on a more lightweight cluster.

None of the pods have high CPU usage, but the Monitor pods tend to spike a little during activity (such as provisioning a new volume).

To save you the adding up, this is a total of 375m CPU (or 2% of the total cluster CPU) and 7721Mi memory (or 12% of the total cluster memory). In other words, it’s not exactly lightweight.
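
For anyone who wants to repeat the adding up on their own cluster, here is a small Python sketch that totals the CPU and memory columns of the kubectl top output above. The parsing assumes CPU is always reported in millicores (m) and memory in Mi, as it is in this snapshot. The memory percentage assumes 4 nodes with 16Gi each.

```python
TOP_OUTPUT = """\
csi-rbdplugin-6s42w                          1m    80Mi
csi-rbdplugin-l75kh                          1m    23Mi
csi-rbdplugin-provisioner-694f54898b-67nnf   1m    47Mi
csi-rbdplugin-provisioner-694f54898b-s9vpx   7m    108Mi
csi-rbdplugin-pvhcx                          1m    76Mi
csi-rbdplugin-vt7gs                          1m    20Mi
rook-ceph-crashcollector-kube05-65d87b7d8b   0m    6Mi
rook-ceph-crashcollector-kube06-64b798c4bc   0m    6Mi
rook-ceph-crashcollector-kube07-887878456    0m    6Mi
rook-ceph-crashcollector-kube08-688f948ddf   0m    6Mi
rook-ceph-exporter-kube05-b6d6c6c9c-splt6    1m    16Mi
rook-ceph-exporter-kube06-f9757c848-j47qm    1m    6Mi
rook-ceph-exporter-kube07-5bdbb94f47-8kt8d   2m    16Mi
rook-ceph-exporter-kube08-c98496b8b-8tnrz    3m    16Mi
rook-ceph-mgr-a-6cb6484ff7-9gh8r             54m   571Mi
rook-ceph-mgr-b-686bcb7f66-5nkvp             70m   446Mi
rook-ceph-mon-a-86cbcbcfc7-6bsn6             28m   428Mi
rook-ceph-mon-b-579f857b7f-rkkpc             23m   407Mi
rook-ceph-mon-d-59f97f97-9r4b8               25m   427Mi
rook-ceph-operator-6bcf46667-gv426           39m   57Mi
rook-ceph-osd-0-77c56c774c-2jtff             24m   959Mi
rook-ceph-osd-1-67df8f6ccd-4qbrw             28m   1386Mi
rook-ceph-osd-2-66cf8c8f55-6m6zt             31m   1310Mi
rook-ceph-osd-3-74f794b458-hbhvr             31m   1296Mi
rook-ceph-tools-c679447fc-cjpcs              3m    2Mi
"""


def totals(top_output):
    """Sum the CPU (millicores) and memory (Mi) columns of `kubectl top po`."""
    cpu_m = mem_mi = 0
    for line in top_output.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not NAME CPU MEMORY
        if len(fields) != 3 or not fields[1].endswith("m"):
            continue
        cpu_m += int(fields[1][:-1])   # strip the trailing "m"
        mem_mi += int(fields[2][:-2])  # strip the trailing "Mi"
    return cpu_m, mem_mi


cpu_m, mem_mi = totals(TOP_OUTPUT)
cluster_mem_mi = 4 * 16 * 1024  # assumed: 4 nodes with 16Gi each
print(f"{cpu_m}m CPU, {mem_mi}Mi memory ({mem_mi / cluster_mem_mi:.0%} of cluster memory)")
```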

Monitoring

The Rook/Ceph Helm chart comes with metrics endpoints and Prometheus rules which I enabled. I then added the Ceph Cluster Grafana dashboard for an out-of-the-box dashboard.

The only problem I have found with this dashboard is the Throughput and IOPS counters towards the top-left usually display 0 even when this is not true, and intermittently show the real numbers, before returning to zero. Likewise, the IOPS and Throughput graphs in the middle always register 0, and don’t record the spikes. I haven’t had a chance to look at this yet.

You can see that my cluster isn’t being stressed at all, and I’m sure any storage experts are laughing at my rookie numbers. My OSDs are inexpensive consumer-grade NVMe devices, each of which claims performance of up to 1700 MB/s throughput and 200,000 IOPS. A clustered system can theoretically beat this, so I’m nowhere near any limits.

One thing to note is the available capacity. Ceph aggregates the size of all OSDs into a pool (4 × 256GB ≈ 1TB) but doesn’t account for the fact that it also stores multiple replicas of each object it stores (this is configurable). The default is 3 replicas, so a 1MB object would consume 3MB of the total capacity. My 1TB pool will actually store about 333GB of data.
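
The capacity arithmetic can be written down as a tiny Python sketch. It ignores Ceph’s own metadata overhead and any headroom you would want to keep free, so treat the result as an upper bound. The function name is just for illustration.

```python
def usable_capacity_gb(osd_count, osd_size_gb, replicas):
    """Raw pool size divided by the number of copies kept of each object."""
    raw_gb = osd_count * osd_size_gb
    return raw_gb / replicas


usable = usable_capacity_gb(osd_count=4, osd_size_gb=256, replicas=3)
print(f"about {usable:.0f}GB usable")
```

Using the exact raw figure of 1024GB gives roughly 341GB. Rounding the pool to 1TB first gives the 333GB quoted above.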

Support

It’s hard to make a meaningful assessment of the support available for Rook/Ceph, but as lack of support was a key reason for abandoning OpenEBS/cStor it makes sense to have a look.

Ceph is a more mature product, and its documentation is more complete. There are pages about disaster recovery and detailed guides on how to restore/replace OSDs that break. There is also the Ceph Toolbox which provides a place to run the ceph command to perform a variety of maintenance and repair tasks.

Remember my problem with cStor wasn’t cStor’s fault – it was the Kubernetes control plane that lost quorum, and cStor used Kubernetes’ data store for its own state. This made it very hard to recover a cStor cluster. I was then unable to create a new cStor cluster and adopt the existing volumes, and no support was available to help me do that.

The Kubernetes control plane could explode again, so how would this affect Ceph? Sneakily, Ceph doesn’t use the Kubernetes data store for its state – it keeps it in /var/lib/rook on the host filesystem of each node. In the event of total cluster loss, it would be possible to create a new Kubernetes cluster and for Ceph to discover its state from the node filesystem.

$ tree -d /var/lib/rook
/var/lib/rook
├── exporter
├── mon-c
│   └── data
│       └── store.db
└── rook-ceph
    ├── crash
    │   └── posted
    ├── d4ec2a82-4b19-4b03-a4e0-7951a45eec35_47391857-0c95-4b47-9ab9-41721e101eff
    ├── e7b6c3ad-b460-4e77-9b5f-3522bc69c1e8_0d464f95-e171-4bed-b785-03c665b8e411
    └── log

In fact, as Ceph can work outside of Kubernetes, if for some reason a new Kubernetes cluster can’t be created, it should be possible to install Ceph right on the node OS, tell it where to find its state, and tell it where the local NVMe devices are. Ceph block devices can be mounted manually on the node as /dev/rbd0 similarly to iSCSI. It’s sketchy, but it should be OK to temporarily reconstruct a Ceph cluster, to pull the data off it. I’m not saying I would enjoy doing it, but it would be an option in an emergency.

Lastly, I know of quite a few large corporations using Ceph, and it also forms the basis of Red Hat’s OpenShift Data Foundation product. This gives me confidence in its reliability.

Kubernetes Homelab Part 4: Hyperconverged Storage

Sorry it’s taken a while to get to the next part of my blog series. This section was supposed to be about hyperconverged clustered storage in my cluster, but I unfortunately ran into a total cluster loss due to some bugs in MicroK8s and/or dqlite that maintainers haven’t managed to get to the bottom of.

The volumes that were provisioned on my off-cluster storage, I was able to re-associate with my rebuilt-from-scratch cluster. The volumes that were provisioned on the containerised, clustered storage were irrecoverably lost.

Therefore, I have decided to rework this part of my blog series into a cautionary tale – partly about specific technologies, and partly to push my pro-backup agenda.

It’s worth looking at the previous posts in the series for some background, especially the overview.

My design

Let’s have a look at my original design for hyperconverged, containerised, clustered storage. And before we get stuck in, let me quickly demystify some of the jargon:

hyperconverged means the storage runs on the same nodes as the compute
containerised means the storage controller runs as Kubernetes pods inside the cluster
clustered means many nodes provide some kind of storage hardware, and your volumes are split up into replicas on more than one node, so you can tolerate a node failure without losing a volume

Several clustered storage solutions are available. Perhaps Rook/Ceph is the best known, but as MicroK8s packages OpenEBS, I decided to use that. The default setup you get if you simply do microk8s enable openebs creates a file on the root filesystem and provisions block volumes out of that file. In my case, that file would have ended up on the same SATA SSD as the OS, and I didn’t want that.

So I went poking at OpenEBS, and found that it offers various storage backends: Mayastor, cStor or Jiva. Mayastor is the newest engine, but has higher hardware requirements. In the end I decided on cStor as it appeared to be lightweight (i.e. didn’t consume much CPU or memory) and was also based on ZFS, which is a technology I already rely on in my TrueNAS storage. I ended up deploying OpenEBS from its Helm chart.

This diagram is quite complex, so let me walk you through it – starting at the bottom. Each physical node has an M.2 NVMe storage device installed, and this is separate from the SATA SSD that runs the OS. When you install OpenEBS, it creates a DaemonSet of a component called Node Disk Manager (NDM) which runs on each node and looks for available storage devices, and makes them available to OpenEBS as BlockDevices. When you have several BlockDevices, you can create a storage cluster. From this cluster, you can provision Volumes which will be replicated across multiple NVMe devices (by default you get 3 replicas). Creating a Volume also creates a Pod that acts as an iSCSI target for the volume. The Volume can now be mounted by workload Pods from any node in the usual way. It’s important to note that the workload Pod does not have to be on the same node as the Volume Target, and the three VolumeReplicas are placed according to the nodes with most capacity.

Architecture of OpenEBS/cStor on Kubernetes

The problem

MicroK8s uses dqlite as its cluster datastore instead of etcd like most other Kubernetes distributions. I ran into some problems with MicroK8s where dqlite started consuming all CPU, running at high latency and eventually silently lost quorum. The Kubernetes API server then also silently went read-only, so any requests to change cluster state would silently fail, and any requests to read cluster state would effectively be snapshots from the moment the cluster went read-only, and might vary depending on which of the dqlite replicas was being queried.

The further complication is that as a clustered storage engine, cStor uses CRDs to represent its objects and therefore relies on the Kubernetes API server and the underlying datastore to track its own volumes, replicas, block devices, etc. By extension, cStor then also lost quorum.

Recovery

I followed the MicroK8s guide on how to restore lost quorum several times, but it never worked for me. I worked with MicroK8s developers for a while on recovery.

Even without cluster quorum, I attempted to recover my cStor volumes. However, actions like creating a pod to mount a volume rely on having a kube API that is not read only!

Eventually I had no other choice but to reset the cluster and start from scratch. I made sure I did not wipe the NVMe devices, and assumed I would be able to reassociate them on a new cluster. I exported all of the OpenEBS/cStor CRs to yaml files as a backup.

After the cluster was rebuilt, I reimported the BlockDevice resources but doing so did not discover the NVMe drives as they seemed to change UUID in the new cluster. I tweaked my yaml to adopt them under their new names, but I was not able to rebuild them as an OpenEBS cluster and rediscover my old volumes.

The documentation for cStor is quite minimal, and focuses on installing it rather than fixing it. The only relevant page is the Troubleshooting page, and it didn’t cater to my problem. Which seems surprising, because a common question with any storage system must be “how do I get my stuff back when it goes wrong?”

I contacted the OpenEBS community via Slack and my question was ignored for a week, despite my nudges. Eventually, an engineer contacted me and we worked through some steps, but were not able to reassociate a previous cluster’s cStor volumes with a new cluster.

All my cStor volumes were either MariaDB or PostgreSQL databases, and fortunately I had recent backups of all of them and was able to create new volumes on TrueNAS external storage (slower, but reliable) and restore the databases.

Lessons

First and foremost, take backups. Backups saved my bacon here in what would otherwise have been a significant data loss. I’ll cover my backup solutions in a later part of this blog post series.
Volume snapshots are not backups. cStor provides volume snapshot functionality and it is very easy to take snapshots automatically. However, using those snapshots requires a functioning kube API.
The control plane is fragile. It doesn’t take a lot for your datastore to lose quorum, and then all bets are off.
I advise against hyperconverged storage in your homelab, unless you really need it. As soon as there is persistent data stored in your cluster, it stops being ephemeral and you need to treat it as carefully as a NAS. It’s fine for caches and things that can be regenerated.
Check support arrangements before you commit to a product. MicroK8s developers have been responsive and helpful. However, cStor support has been useless. The product seems mature, and the website looks shiny and makes claims about it being enterprise-grade, but the recovery documentation was useless and nobody was willing to help. Most of the chatter in the Slack channel is about Mayastor, so this must be the new shiny that gets all the attention.

Next steps

The root cause of this problem was dqlite and MicroK8s quorum. At the moment, I don’t yet understand why this incident happened and I don’t know how to prevent it from happening again. I’m not the only person to have been bitten by it.

For the time being, I restored like-for-like on MicroK8s even though I don’t really trust dqlite any more. I’ve upped the frequency of my backups in the expectation that it will probably happen again at some point.

I think I’ve decided that if this happens again, I will consider rebuilding on K3s instead of MicroK8s, as it can use the more standard etcd datastore.

I’m not currently using the NVMe disks, but it seems a waste just to leave them there doing nothing. I will probably fiddle with hyperconverged storage again one day – maybe either Mayastor or Rook/Ceph, both of which seem to get more attention than cStor.

My MIDI pipe organ workflow

Background

I’ve written a couple of times about playing about with a MIDI-enabled pipe organ and I’ve shared some of my results on YouTube. Today I want to say a bit about my workflow because a few people have asked, and it is somewhat complicated but hopefully interesting.

This isn’t supposed to be instructional: this is just some notes about the way I’ve found that works for me. I’ll give some examples and demonstrate progress as we go along by working on a public domain piece, Prelude and Fugue in C major (BWV 553) by Johann Sebastian Bach.

Prerequisites

If you want to play along with this guide, you will need:

a pipe organ with MIDI ports
an installation of MuseScore
an installation of OrganAssist, configured for your organ
an installation of GrandOrgue, configured for your organ (optional)

Obtain score

The first thing I do when I decide I want to make the organ play something is obtain a score. I have three options:

Find and download a score on MuseScore

As well as being a notation editor app, MuseScore allows musicians to upload their own compositions to musescore.com, and it also contains various public domain works. There are also some copyrighted works with various licensing options.

When I’ve found an arrangement I like, as I’m a paid-up MuseScore Pro member, I can download the score directly in MuseScore format.

Here’s my score for BWV 553 on MuseScore, and for reference here’s the first line.

Enter a score from a physical book into MuseScore

If the work is only in physical form (a book or sheet score) then the only option is to manually enter it into MuseScore. There are various options for scanning it and getting MuseScore to “recognise” the notes, but I have found this inaccurate, and it takes as long to correct the mistakes as it does to just enter the music by hand.

I created my MuseScore version of the score by manually entering the notation from a physical book.

Import a plain old MIDI file into MuseScore

The last option is to import an ordinary MIDI file into MuseScore. The success of this method varies wildly depending on the quality and complexity of the original MIDI file, but you can often end up with an unreadable score that needs a lot of cleanup.

Arrange

No matter which of the three methods for getting a score you chose, you should now have a score in MuseScore. You will likely have to do some editing and arrangement to make it suitable for pipe organ.

Organ music arranged for humans would typically be written on 2 or 3 staves – right hand, left hand and optionally feet – and it is up to the organist to interpret the score and decide which manual (keyboard) to play each section on. There are often (but not always) written notes to tell the organist what to do.

Directions to the organist about choice of manual

But to a computer, an organ is several instruments – each manual (keyboard) and the pedalboard is its own instrument. So we need to arrange our score in this way – one stave for each manual, and we must pre-determine which manual each section will be played back on.

The specific organ I am arranging for has a Great manual, a Swell manual and a Pedal, so I need to arrange my score for 3 parts, the Swell and Great parts having 2 staves each and the Pedal part having 1 stave. In my own lingo I refer to this as SSGGP.

Here’s my version of BWV 553 re-arranged for SSGGP, and the first line for quick reference again.

First line of BWV 553, arranged for OrganAssist

Note that I have had to take out the convenient repeat and interpret the “1st on Sw, 2nd on Gt” direction as playing the entire section through twice, once on each manual.

Finally, I export the MuseScore project as a MIDI file, which can be consumed by OrganAssist.

Add stops

Now I import this MIDI file into the OrganAssist library. The first thing it asks me to do is map the MIDI tracks to the organ manuals. We exported as SSGGP so that’s how we’ll set the mapping for import.

Importing a MuseScore score into OrganAssist

Mapping the SSGGP staves to organ manuals in OrganAssist

If we play this back now, the organ will make no sound, because although the keys are being pressed, no stops are drawn. We need to tell OrganAssist which stops we want it to use, which is something the human organist would decide when they played the piece on a real organ. In this case, the front of the book of Eight Short Preludes and Fugues gives this advice:

BWV 553 has a direction of mf, so let’s set those stops accordingly. Following the suggested registrations in the table, and knowing what I have available on the organ at St Mary’s, I’ve chosen these stops:

Great: Claribel Flute 8′, Flute 4′
Swell: Stopped Diapason 8′, Gemshorn 4′, Oboe 8′
Pedal: Bourdon 16′, Bass Flute 8′

To add these stops, we will use the OrganAssist editor. You can see the notes in a “piano roll” style view. Right click in the upper part of the screen to add stop changes and coupler changes. This obviously depends on your specific organ.

The editor view in OrganAssist shows notes in the main part of the screen, colour coded by manual (green for Swell, blue for Great, purple for Pedal). The top area is for events such as switching on or off stops, couplers, tremulants and any other controls the organ might have. Here I’ve turned on a bunch of stops at the beginning, and about two-thirds of the way across I’ve switched off the Swell to Pedal coupler, and switched on the Great to Pedal coupler, so the pedal notes are always coupled to the manual that is being currently played with the hands.

OrganAssist score editor, showing notes in the main display and stop/coupler events at the top

This step can be done away from the actual organ, as OrganAssist has rudimentary sound output which is sufficient to check for wrong notes, etc.

Playback on organ

If everything so far has been done properly, I should be ready for a first listening. No doubt there will be snags that show up when I listen to it, and I’ll probably want to make some tweaks.

The organ may be MIDI-controlled, but the mechanical components are still made of wood and leather and operated by springs and solenoids and pressurised air, so a little bit of latency creeps in.

This video shows the score being played back on the organ at St Mary’s Church, Fishponds.

Changes to stops and small changes to durations of notes are easy to tweak in OrganAssist. Anything more usually means going back to MuseScore, editing there, and doing the export and import process again.

Playback on GrandOrgue

As I said above, OrganAssist only offers rudimentary playback when not attached to a real organ. It’s good enough for basic testing but not much good for hearing what it might sound like. Sure, I can go into the church and play the organ sometimes, but it would be nice to have an approximation of the sound at home.

This is where GrandOrgue comes in. It’s a Virtual Pipe Organ (VPO) which is a virtual recreation of a pipe organ which receives input via MIDI – just like the real thing!

GrandOrgue uses real recordings of every single pipe on a real organ. Together these are known as a sampleset. Various samplesets are available online, some free, and some commercial. I haven’t (yet) had a chance to sample the organ at St Mary’s, so for now I am using a composite sampleset with similar-sounding stops taken from two free samplesets (Friesach by Piotr Grabowski, and Skinner op. 497 by Sonus Paradisi), and a basic graphical interface created with Organ Builder.

It takes a few minutes to configure a GrandOrgue organ to map the stop on/off events etc but after this is done, OrganAssist can play back through GrandOrgue via a MIDI loopback port, and make a surprisingly realistic sound. I can now make meaningful decisions about which stops to add to my OrganAssist scores at home.

In this video, OrganAssist (in the background) is “playing” the virtual organ by sending MIDI events, which GrandOrgue (in the corner) receives and turns into sound using samples of real organ pipes.

I think this is a pretty good approximation of the real organ at St Mary’s – certainly good enough for playing around with at home.

Kubernetes Homelab Part 3: Off-Cluster Storage

Welcome to part 3 of the Kubernetes Homelab guide. In this section we’re going to look at how to provide off-cluster shared storage. If you haven’t read the other parts of this guide, I recommend you check those out too.

Out of the box, MicroK8s does provide a hostpath storage provider but this only works on a single-node cluster. It basically lets pods use storage within a subdirectory on the node’s root filesystem, so this obviously isn’t going to work in a multi-node cluster where your workload could end up on any node.

It’s important to me that any storage solution I choose is compliant with CSI, the Kubernetes framework for storage drivers. This allows you to simply tell Kubernetes that your pod requires a 10GB volume, and Kubernetes goes off and talks to its CSI driver, which provisions and mounts your volume automatically. This isn’t your typical fileserver.

TrueNAS

So I decided to go with TrueNAS SCALE (technically I started with TrueNAS CORE and then I migrated to TrueNAS SCALE). TrueNAS is a NAS operating system which uses the OpenZFS filesystem to manage its storage. By its nature, ZFS supports nested volumes and is ideal for this application.

I’m running a fairly elderly HP MicroServer N40L with 16GB memory and 4x4TB disks in a RAID-Z2 vdev, for a total of 8TB usable storage. It’s not the biggest or the fastest, but it works for me.
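
The 8TB figure follows from RAID-Z2 giving up two disks’ worth of space to parity. Here is that sum as a short Python sketch. It ignores ZFS metadata overhead and the difference between TB and TiB, so the real usable figure will be somewhat lower. The function name is just for illustration.

```python
def raidz_usable_tb(disks, disk_size_tb, parity):
    """Approximate usable space of a single RAID-Z vdev, ignoring overhead."""
    return (disks - parity) * disk_size_tb


# 4 x 4TB disks in RAID-Z2 (two disks' worth of parity)
print(raidz_usable_tb(disks=4, disk_size_tb=4, parity=2))
```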

Democratic CSI

The magic glue that connects Kubernetes and TrueNAS is a project called Democratic CSI, which is a CSI driver that supports various storage appliances, including TrueNAS.

Note: Democratic CSI packaged an older driver called freenas-nfs which required SSH access to the NAS. For users running TrueNAS SCALE, there is a newer driver called freenas-api-nfs which does not require SSH and does all its work via an HTTP API. As I am running TrueNAS SCALE, I will deploy the freenas-api-nfs driver.

There are some steps to set up the root volume on your TrueNAS appliance but I wrote about these before, and they are pretty much the same, so please refer to my TrueNAS guide. There are also some Democratic CSI prerequisites you need to install on your Kubernetes nodes before deploying.

I’m installing via Helm, and the values file needed is quite complex as it is drawn from two upstream examples: the generic values.yaml for the Helm chart, and some more specific options for the freenas-api-nfs driver.

This is the local values.yaml I have come up with for my homelab:

driver:
  config:
    driver: freenas-api-nfs
    httpConnection:
      protocol: http
      username: root
      password: mypassword
      host: 192.168.0.4
      port: 80
      allowInsecure: true
    zfs:
      datasetParentName: hdd/k8s/vols
      detachedSnapshotsDatasetParentName: hdd/k8s/snaps
      datasetEnableQuotas: true
      datasetEnableReservation: false
      datasetPermissionsMode: "0777"
      datasetPermissionsUser: 0
      datasetPermissionsGroup: 0
    nfs:
      shareCommentTemplate: "{{ parameters.[csi.storage.k8s.io/pvc/namespace] }}-{{ parameters.[csi.storage.k8s.io/pvc/name] }}"
      shareHost: 192.168.0.4
      shareAlldirs: false
      shareAllowedHosts: []
      shareAllowedNetworks: []
      shareMaprootUser: root
      shareMaprootGroup: root
      shareMapallUser: ""
      shareMapallGroup: ""

node:
  # Required for MicroK8s
  kubeletHostPath: /var/snap/microk8s/common/var/lib/kubelet

csiDriver:
  # should be globally unique for a given cluster
  name: "org.democratic-csi.nfs-api"

storageClasses:
  - name: truenas
    defaultClass: true
    reclaimPolicy: Retain
    volumeBindingMode: Immediate
    allowVolumeExpansion: true
    parameters:
      fsType: nfs
    mountOptions:
      - noatime
      - nfsvers=4

volumeSnapshotClasses:
  - name: truenas

And it is installed like this:

helm upgrade \
    --install \
    --create-namespace \
    --values values.yaml \
    --namespace democratic-csi \
    truenas democratic-csi/democratic-csi

Testing

Once deployment has finished, watch the pods until they have spun up. Expect to see one csi-node pod per node, and one csi-controller.

[jonathan@latitude ~]$ kubectl get po -n democratic-csi
NAME                                                 READY   STATUS    RESTARTS   AGE
truenas-democratic-csi-node-rkmq8                    4/4     Running   0          9d
truenas-democratic-csi-node-w5ktj                    4/4     Running   0          9d
truenas-democratic-csi-node-k88cx                    4/4     Running   0          9d
truenas-democratic-csi-node-f7zw4                    4/4     Running   0          9d
truenas-democratic-csi-controller-54db74999b-5zjv2   5/5     Running   0          9d

Check to make sure there’s a truenas StorageClass:

[jonathan@latitude ~]$ kubectl get storageclasses
NAME                PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
truenas (default)   org.democratic-csi.nfs-api   Retain          Immediate           true                   9d

Then apply a manifest to create a PersistentVolumeClaim, which should provision a volume in TrueNAS:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim-nfs
spec:
  storageClassName: truenas
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

Check to make sure it appears and is provisioned correctly:

[jonathan@latitude ~]$ kubectl get persistentvolumeclaim
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-claim-nfs   Bound    pvc-ac9940c4-29a8-4056-b0bf-d8ac0dd05beb   1Gi        RWX            truenas        15s

You should be able to see a Dataset and a corresponding Share for this volume in the TrueNAS web GUI.
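If you prefer the command line, you can also inspect the Kubernetes side of the same volume. A small sketch: the first command looks up the PersistentVolume bound to the claim, and the second prints whatever attributes the CSI driver recorded on it. I'm not assuming any particular attribute names here, since they depend on the driver.

```shell
# Look up the name of the PersistentVolume bound to the test claim.
PV_NAME=$(kubectl get pvc test-claim-nfs -o jsonpath='{.spec.volumeName}')

# Print the volume attributes the CSI driver stored on that PersistentVolume.
kubectl get pv "$PV_NAME" -o jsonpath='{.spec.csi.volumeAttributes}'
```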

Finally, we can create a Pod that mounts this volume through the PersistentVolumeClaim, to make sure we got the settings of the share right:

apiVersion: v1
kind: Pod
metadata:
  name: test-pod-nfs
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: test-claim-nfs

If this pod starts up successfully, it means it was able to mount the volume from TrueNAS. Woo!

[jonathan@latitude ~]$ kubectl get pods
NAME           READY   STATUS    RESTARTS   AGE
test-pod-nfs   1/1     Running   0          46s
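A Running pod proves the mount worked, but not that the pod can actually write to the share. As a quick extra check, here is a sketch that writes a file through the mount and reads it back. The file name test.txt is just something I made up for the test.

```shell
# Write a file into the NFS-backed mount and read it back again.
kubectl exec test-pod-nfs -- sh -c \
    'echo "hello from k8s" > /var/www/html/test.txt && cat /var/www/html/test.txt'

# Show which filesystem backs the mount point. It should be an export
# from the TrueNAS host rather than a local disk.
kubectl exec test-pod-nfs -- df -h /var/www/html

# When you are finished testing, remove the pod and the claim.
kubectl delete pod test-pod-nfs
kubectl delete persistentvolumeclaim test-claim-nfs
```

Bear in mind that the storage class uses reclaimPolicy: Retain, so deleting the claim leaves the PersistentVolume and the TrueNAS dataset behind until you remove them by hand.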

We can now start using the truenas storage class to run workloads which require persistent storage. In fact, you might already have noticed that this storage class is set as the default, so you won’t even need to explicitly specify it for many deployments.
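To illustrate relying on the default, here is a minimal sketch of a claim that names no storage class at all. The claim name and file name are made up for this example, and it assumes truenas is the only class marked as default in the cluster.

```shell
# Write a claim manifest that deliberately leaves out the storage class,
# so that Kubernetes fills in the cluster default.
cat > default-class-claim.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: default-class-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
```

Apply it with `kubectl apply -f default-class-claim.yaml`, and `kubectl get persistentvolumeclaim default-class-claim` should then list truenas in the STORAGECLASS column.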

As this storage class is backed by NFS, it inherently supports concurrent access from multiple clients, and so the storage class supports both ReadWriteOnce (aka RWO, can be mounted read-write by a single node) and ReadWriteMany (aka RWX, can be mounted read-write by many nodes, and therefore by pods spread across the cluster).
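To see ReadWriteMany in action, here is a sketch of a Deployment whose two replicas share one claim. All the names (shared-claim, shared-writers, rwx-demo.yaml) and the busybox image are my own picks for illustration, not part of the setup above.

```shell
# Write a manifest with one RWX claim and a two-replica Deployment
# in which every replica appends its hostname to the same shared file.
cat > rwx-demo.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-claim
spec:
  storageClassName: truenas
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-writers
spec:
  replicas: 2
  selector:
    matchLabels:
      app: shared-writers
  template:
    metadata:
      labels:
        app: shared-writers
    spec:
      containers:
        - name: writer
          image: busybox
          command:
            - sh
            - -c
            - "while true; do hostname >> /data/hosts.txt; sleep 10; done"
          volumeMounts:
            - name: shared
              mountPath: /data
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: shared-claim
EOF
```

After `kubectl apply -f rwx-demo.yaml` and a short wait, `kubectl exec deploy/shared-writers -- cat /data/hosts.txt` should show lines from both pods, which means both replicas are writing to the same volume.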