push sheeet

This commit is contained in:
Dark Steveneq
2025-10-09 14:15:47 +02:00
commit 646b892680
49168 changed files with 5897842 additions and 0 deletions


@@ -0,0 +1,86 @@
# K3s Upkeep for Users
General documentation for K3s users, covering cluster tasks and troubleshooting steps.
## Upkeep
### Changing K3s Token
Changing the K3s token requires resetting the cluster. To reset the cluster, do the following:
#### Stopping K3s
Disabling the K3s NixOS module won't stop K3s-related dependencies such as containerd or networking. To stop everything, either run the `k3s-killall.sh` script (available on `$PATH` at `/run/current-system/sw/bin/k3s-killall.sh`) or reboot the host.
### Syncing K3s in multiple hosts
NixOS keeps each host in sync with its own `configuration.nix`. To sync the git repository that holds `configuration.nix` and to trigger `nixos-rebuild switch` on multiple hosts, `ansible` is commonly used; it enables automation of cluster provisioning, upgrade and reset.
### Cluster Reset
As the upstream "k3s-uninstall.sh" is yet to be packaged for NixOS, resetting the cluster requires the manual steps below.
Disable K3s instances on **all** hosts:
In NixOS configuration, set:
```
services.k3s.enable = false;
```
Rebuild NixOS. This removes the K3s service files, but it won't delete K3s data.
To delete K3s files:
Unmount kubelet:
```
KUBELET_PATH=$(mount | grep kubelet | cut -d' ' -f3);
${KUBELET_PATH:+umount $KUBELET_PATH}
```
Delete k3s data:
```
rm -rf /etc/rancher/{k3s,node};
rm -rf /var/lib/{rancher/k3s,kubelet,longhorn,etcd,cni}
```
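After deleting, it can help to confirm that nothing was left behind. The following is a minimal sketch, not something shipped by K3s or nixpkgs: the function names `k3s_data_paths` and `k3s_leftovers` are hypothetical, the path list is simply the one from the `rm -rf` commands above, and the optional root prefix argument is an assumption added so the check can be tried against a scratch directory.

```bash
# Hypothetical helpers; the paths mirror the `rm -rf` commands above.
k3s_data_paths() {
  printf '%s\n' \
    /etc/rancher/k3s /etc/rancher/node \
    /var/lib/rancher/k3s /var/lib/kubelet /var/lib/longhorn \
    /var/lib/etcd /var/lib/cni
}

# Print every K3s data path that still exists below an optional root prefix.
k3s_leftovers() {
  local root="${1:-}" path
  k3s_data_paths | while read -r path; do
    if [ -e "${root}${path}" ]; then
      echo "${root}${path}"
    fi
  done
}
```

Running `k3s_leftovers` without arguments on a host should print nothing once the cleanup is complete.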
When using etcd, reset etcd:
Make sure **all** K3s instances are stopped, because a single running instance can re-seed the etcd database with the previous cryptographic key.
Disable the etcd database in the NixOS configuration:
```
services.etcd.enable = false;
```
Rebuild NixOS.
Delete etcd files:
```
rm -rf /var/lib/etcd/
```
Reboot hosts.
In NixOS configuration:
```
Re-enable etcd first. Rebuild NixOS. Verify service health. (systemctl status etcd)
Re-enable K3s second. Rebuild NixOS. Verify service health. (systemctl status k3s)
```
The etcd and K3s clusters will be provisioned anew.
Tip: Use Ansible to automate the reset routine.
## Troubleshooting
### Raspberry Pi not working
If the k3s.service/k3s server does not start and gives you the error `FATA[0000] failed to find memory cgroup (v2)`, see the GitHub issue: https://github.com/k3s-io/k3s/issues/2067.
To fix the problem, add the following to your `configuration.nix`:
```
boot.kernelParams = [
  "cgroup_enable=cpuset" "cgroup_memory=1" "cgroup_enable=memory"
];
```
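After rebuilding and rebooting, you can check whether the parameters actually reached the kernel command line. This is a minimal sketch: `missing_cgroup_params` is a hypothetical helper name, and it only does a whole-word match against the string you pass in.

```bash
# Print each of the three kernel parameters from the snippet above
# that is absent from the given kernel command line string.
missing_cgroup_params() {
  local cmdline="$1" param
  for param in cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory; do
    case " $cmdline " in
      *" $param "*) ;;          # present as a whole word
      *) echo "$param" ;;       # missing
    esac
  done
}
```

On the Raspberry Pi, `missing_cgroup_params "$(cat /proc/cmdline)"` should print nothing when all three parameters are active.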
### FailedKillPod: failed to get network "cbr0" cached result
> KillPodSandboxError: failed to get network "cbr0" cached result: decoding version from network config: unexpected end of JSON input
Workaround: https://github.com/k3s-io/k3s/issues/6185#issuecomment-1581245331


@@ -0,0 +1,45 @@
# Onboarding Maintainer
Anyone willing can become a maintainer; no prerequisite knowledge is required. Willingness to learn is enough.
A K3s maintainer maintains K3s's:
- [documentation](https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/networking/cluster/k3s/README.md)
- [issues](https://github.com/NixOS/nixpkgs/issues?q=is%3Aissue+is%3Aopen+k3s)
- [pull requests](https://github.com/NixOS/nixpkgs/pulls?q=is%3Aopen+is%3Apr+label%3A%226.topic%3A+k3s%22)
- [NixOS tests](https://github.com/NixOS/nixpkgs/tree/master/nixos/tests/k3s)
- [NixOS service module](https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/cluster/k3s/default.nix)
- [update script](https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/networking/cluster/k3s/update-script.sh) (the process of updating)
- updates (the act of updating) and [r-ryantm bot logs](https://r.ryantm.com/log/k3s/)
- deprecations
- CVEs
- NixOS releases
- dependencies (runc, containerd, ipset)
Anything that is due, basically.
As a maintainer, feel free to improve anything and everything at your discretion. Meaning, at your pace and according to your capabilities and interests.
Only consensus is required to move forward any proposal. Consensus meaning the approval of others.
If you cause a regression (we've all been there), you are responsible for fixing it, but in case you can't fix it (it happens), feel free to ask for help. That's fine, just let us know.
To merge code, you need to be a committer or use the merge-bot. Currently the merge-bot only works for packages located at `pkgs/by-name/`, which means K3s still needs to be migrated there before you can use the merge-bot for merging. As a non-committer, once you have approved a PR you need to forward the request to a committer. When deciding which committer, give preference initially to K3s committers, but any committer can commit. A committer usually has a green approval in PRs.
K3s's committers currently are: marcusramberg, Mic92.
@euank is often silent but still active and has always handled anything dreadful: internal parts of K3s/Kubernetes or architecture matters. He initially packaged K3s for nixpkgs. Think of him as a last resort: when we fail to accomplish a fix, he comes to rescue us from ourselves.
@mic92 stepped up when @superherointj stepped down some time ago. As Mic92 has broad responsibilities in nixpkgs (he is responsible for far too many things already: nixpkgs-reviews, sops-nix, release manager, bot-whatever), we avoid giving him chore work for `nixos-unstable`; only pick him as committer last. As Mic92 runs K3s in a `nixos-stable` setting, he might help in testing stable backports.
Handling requests comes down to the usual basics: when reviewing PRs and issues, be welcoming and helpful, provide hints whenever possible, try to move things forward, assume good will, ignore (as in, don't react to) any negativity (since it spirals badly), and delay and sort out any (severe) disagreement in private. Even in disagreements, be thankful to people for their dedicated time, no matter what happens. In essence, in any unfortunate event, **always put people over code**.
Dumbshit happens and we make mistakes. The CI, reviews, and fellow maintainers are there to nudge us in a better direction. There is no need to overthink interactions; if a problem happens, we'll handle it.
We should optimize for maintainer satisfaction, because it is maintainers who make the service great. The best kind of win we have is when someone new steps up to be a maintainer. This multiplies our capacity for doing meaningful work and increases our knowledge pool.
Know that your participation matters most for us. And we thank you for stepping up. It's good to have you here!
We welcome you and wish you the best in this new journey!
K3s Maintainers


@@ -0,0 +1,60 @@
# K3s Upkeep for Maintainers
General documentation for K3s maintainers and reviewers, intended to keep maintenance processes consistent.
## NixOS Release Maintenance
This process is split into two sections and adheres to the versioning policy outlined in [VERSIONING.md](VERSIONING.md).
### Pre-Release
* Prior to the breaking change window of the next release being closed:
  * `nixos-unstable`: Ensure k3s points to the latest versioned release
  * `nixos-unstable`: Ensure release notes are up to date
  * `nixos-unstable`: Remove k3s releases that will reach end-of-life upstream before the next NixOS stable release does, with a proper deprecation notice (process listed below)
### Post-Release
* For major/minor releases of k3s:
  * `nixos-unstable`: Create a new versioned k3s package
  * `nixos-unstable`: Update k3s alias to point to new versioned k3s package
  * `nixos-unstable`: Add NixOS Release note denoting:
    * Removal of deprecated K3s packages
    * Migration information from the Kubernetes and K3s projects
  * `nixos-stable`: Backport the versioned package
* For patch releases of existing packages:
  * `nixos-unstable`: Update package version (process listed below)
  * `nixos-stable`: Backport package update done to nixos-unstable
## Patch Upgrade Process
Patch upgrades can use the [update script](../update-script.sh) in the root of the package. To update k3s 1.30.x, for example, you can run the following from the root of the nixpkgs git repo:
> ./pkgs/applications/networking/cluster/k3s/update-script.sh "30"
To update another version, just replace the `"30"` with the appropriate minor revision.
If the script should fail, the first goal would be to fix the script. If you are unable to fix the script, open an issue reporting the update script failure with the exact command used and the failure observed.
RyanTM bot can automatically do patch upgrades. Update logs are available at versioned urls, e.g. for 1.30.x: https://r.ryantm.com/log/k3s_1_30
## Package Removal Process
Package removal policy and timelines follow our reasoning in the [versioning documentation](VERSIONING.md#patch-release-support-lifecycle). In order to remove a versioned k3s package, create a PR achieving the following:
* Remove the versioned folder containing the chart and package version files (e.g. `./1_30/`)
* Remove the package block from [default.nix](../default.nix) (e.g. `k3s_1_30 = ...`)
* Remove the package reference from [pkgs/top-level/all-packages.nix](/pkgs/top-level/all-packages.nix)
* Add a deprecation notice in [pkgs/top-level/aliases.nix](/pkgs/top-level/aliases.nix), such as `k3s_1_26 = throw "'k3s_1_26' has been removed from nixpkgs as it has reached end of life"; # Added 2024-05-20`.
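To keep the wording of deprecation notices uniform across removals, the alias line can be generated rather than typed. A minimal sketch, assuming the message format of the `k3s_1_26` example above is the one to reuse; `k3s_removal_alias` is a hypothetical helper name, not an existing nixpkgs script.

```bash
# Print an aliases.nix entry for a removed package.
# $1: attribute name (e.g. k3s_1_26), $2: date the alias was added (YYYY-MM-DD)
k3s_removal_alias() {
  local attr="$1" added="$2"
  printf "%s = throw \"'%s' has been removed from nixpkgs as it has reached end of life\"; # Added %s\n" \
    "$attr" "$attr" "$added"
}
```

For example, `k3s_removal_alias k3s_1_26 2024-05-20` reproduces the line quoted in the list above.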
## Change Request Review Process
Quick checklist for reviewers of the k3s package:
* Is the version of the Go compiler pinned according to the go.mod file for the release?
  * The update script will neither pin nor change the Go version.
* Do the K3s passthru.tests work for all supported architectures? (linux-x86_64, aarch64-linux)
  * For GitHub CI, [OfBorg](https://github.com/NixOS/ofborg) can be used to test all platforms.
  * For local testing, the following can be run in the nixpkgs root on the upgrade branch: `nix build .#k3s_1_29.passthru.tests.{etcd,single-node,multi-node}` (replace "29" with the version tested)
* Anything unusual in the nix build logs or test logs?
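For local testing of several versions in a row, the flake targets from the checklist can be generated per minor version. A minimal sketch: `k3s_test_targets` is a hypothetical helper name, and the three test names are the ones given in the `nix build` command above.

```bash
# Print the passthru test targets for one k3s minor version (e.g. 29).
k3s_test_targets() {
  local minor="$1" test_name
  for test_name in etcd single-node multi-node; do
    echo ".#k3s_1_${minor}.passthru.tests.${test_name}"
  done
}
```

From the nixpkgs root, `nix build $(k3s_test_targets 29)` is then equivalent to the brace-expanded command in the checklist.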


@@ -0,0 +1,100 @@
# K3s Usage
## Single Node
```
{
  networking.firewall.allowedTCPPorts = [
    6443 # k3s: required so that pods can reach the API server (running on port 6443 by default)
    # 2379 # k3s, etcd clients: required if using a "High Availability Embedded etcd" configuration
    # 2380 # k3s, etcd peers: required if using a "High Availability Embedded etcd" configuration
  ];
  networking.firewall.allowedUDPPorts = [
    # 8472 # k3s, flannel: required if using multi-node for inter-node networking
  ];
  services.k3s.enable = true;
  services.k3s.role = "server";
  services.k3s.extraFlags = toString [
    # "--debug" # Optionally add additional args to k3s
  ];
}
```
Once the above changes are active, you can access your cluster through `sudo k3s kubectl` (e.g. `sudo k3s kubectl cluster-info`) or by using the generated kubeconfig file in `/etc/rancher/k3s/k3s.yaml`.
## Multi-Node
It is simple to create a cluster of multiple nodes in a highly available setup (all nodes are in the control-plane and are a part of the etcd cluster).
The first node is configured like this:
```
{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    clusterInit = true;
  };
}
```
Any other subsequent nodes can be added with a slightly different config:
```
{
  services.k3s = {
    enable = true;
    role = "server"; # Or "agent" for worker only nodes
    token = "<randomized common secret>";
    serverAddr = "https://<ip of first node>:6443";
  };
}
```
For this to work you need to open the aforementioned API, etcd, and flannel ports in the firewall. Official documentation on what ports need to be opened for specific use cases can be found on [k3s' documentation site](https://docs.k3s.io/installation/requirements#inbound-rules-for-k3s-nodes). Note that it is [recommended](https://etcd.io/docs/v3.3/faq/#why-an-odd-number-of-cluster-members) to use an odd number of nodes in such a cluster.
Tip: If you run into connectivity issues between nodes for specific applications (e.g. ingress controller), please verify the firewall settings you have enabled (example under [Single Node](#single-node)) against the documentation for that specific application. In the ingress controller example, you may want to open 443 or 80 depending on your use case.
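The reason behind the odd-number recommendation above is quorum arithmetic: etcd needs a majority of members, floor(n/2) + 1, to agree, so going from 3 to 4 server nodes raises the quorum without raising the number of failures the cluster tolerates. A small sketch of that arithmetic; the function names are hypothetical.

```bash
# Number of members that must be healthy for an n-member etcd cluster.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that may fail while the cluster keeps quorum.
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}
```

Both 3 and 4 members tolerate a single failure (`etcd_fault_tolerance 3` and `etcd_fault_tolerance 4` both print `1`), while 5 members tolerate two.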
## Quirks
### `prefer-bundled-bin`
K3s has a config setting `prefer-bundled-bin` (and CLI flag `--prefer-bundled-bin`) that makes k3s use binaries from the `/var/lib/rancher/k3s/data/current/bin/aux/` directory, as unpacked by the k3s binary, before the system `$PATH`.
This works with the official distribution of k3s but not with the package from nixpkgs, as it does not bundle the upstream binaries from [`k3s-root`](https://github.com/k3s-io/k3s-root) into the k3s binary.
Thus the `prefer-bundled-bin` setting **cannot** be used to work around issues (like [this `mount` regression](https://github.com/util-linux/util-linux/issues/3474)) with binaries used/called by the kubelet.
### Building from a different source
Because the package is split into multiple derivations and the build process is generally more complex, it is not very obvious how to build k3s from a different source (fork or arbitrary commit).
To build k3s from a different source, you must use `.override` together with `overrideBundleAttrs` (for the k3sBundle derivation) and another `.overrideAttrs` (for the final derivation):
```nix
{ fetchgit, k3s }:
let
  k3sRepo = fetchgit {
    url = "https://github.com/k3s-io/k3s";
    rev = "99d91538b1327da933356c318dc8040335fbb66c";
    hash = "sha256-vVqZzVp0Tea27s8HDVq4SgqlbHBdZcFzNKmPFi0Yktk=";
  };
  vendorHash = "sha256-jrPVY+FVZV9wlbik/I35W8ChcLrHlYbLAwUYU16mJLM=";
in
(k3s.override {
  overrideBundleAttrs = {
    src = k3sRepo;
    inherit vendorHash;
  };
}).overrideAttrs
  {
    src = k3sRepo;
    inherit vendorHash;
  }
```
- In addition to `overrideBundleAttrs`, there are also `overrideCniPluginsAttrs` and `overrideContainerdAttrs`.
- `k3s --version` still prints the commit SHA (`k3sCommit` passed into `builder.nix`) from the "base" package instead of the actually used `rev`.
- Depending on the changes made in the fork / commit, the `k3s.override` (without the `overrideAttrs` of the final derivation) might already be enough.
- If the commit is for a different version of k3s, make sure to use the correct "base" package, e.g., `k3s_1_31.override`. Otherwise the build fails with `Tagged version 'v1.33.1+k3s1' does not match expected version 'v1.31.9[+-]*'`
- When adding an entirely new k3s version by calling `builder.nix`, keep in mind that the `k3sCommit` parameter is not used as the `k3sRepo` `rev` (it uses `v${k3sVersion}`). Therefore, you additionally must override the package, as shown above.


@@ -0,0 +1,46 @@
# Versioning
K3s, Kubernetes, and other clustered software cannot be updated atomically. Most software in nixpkgs, such as bash, can be updated as part of a "nixos-rebuild switch" without having to worry about the old and the new bash interacting in some way.
K3s/Kubernetes, on the other hand, is typically run across several NixOS machines, and each NixOS machine is updated independently. As such, different versions of the package and NixOS module must maintain compatibility with each other through temporary version skew during updates.
The upstream Kubernetes project [documents this in their version-skew policy](https://kubernetes.io/releases/version-skew-policy/#supported-component-upgrade-order).
Within nixpkgs, we strive to maintain a valid "upgrade path" that does not run
afoul of the upstream version skew policy.
## Patch Release Support Lifecycle
K3s is built on top of K8s and typically provides a similar release cadence and support window (simply by cherry-picking over k8s patches). As such, we assume k3s's support lifecycle is identical to upstream K8s. The upstream K8s release and support lifecycle, including maintenance and end-of-life dates for current releases, is documented [on their support site](https://kubernetes.io/releases/patch-releases/#support-period). A more tabular view of the current support timeline can also be found on [endoflife.date](https://endoflife.date/kubernetes).
In short, a new Kubernetes version is released roughly every 4 months and each release is supported for a little over 1 year.
## Versioning in nixpkgs
There are two package types that are maintained within nixpkgs when we are looking at the `nixos-unstable` branch. A standard `k3s` package and versioned releases such as `k3s_1_28`, `k3s_1_29`, and `k3s_1_30`.
The standard `k3s` package will be updated as new versions of k3s are released upstream. Versioned releases, on the other hand, will follow the patch release support lifecycle as detailed in the previous section and be removed from `nixos-unstable` when they are either end-of-life upstream or older than the current `k3s` package in `nixos-stable`.
## Versioning in NixOS Releases
Those same package types are also maintained on the release branches of NixOS, but have some special considerations within a release.
NixOS releases (24.05, 24.11, etc) should avoid having deprecated software or major version upgrades during the support lifecycle of that release wherever possible. As such, each NixOS release should only ever have one version of `k3s` when it is released. An example for the NixOS 24.05 release would be that `k3s` package points to `k3s_1_30` for the full lifecycle of its release with no other versions present at release.
However, this conflicts with our desire for users to be able to upgrade between stable NixOS releases without making a k3s version jump large enough to violate the skew policy mentioned previously. Given NixOS 24.05 has 1.30.x as its k3s version and the NixOS 24.11 release would have 1.32.x as its k3s version, we need to provide a way for users to upgrade k3s to 1.32.x before upgrading to the next NixOS stable release.
To be able to achieve the goal above, the k3s maintainers would backport `k3s_1_31` and `k3s_1_32` from `nixos-unstable` to NixOS 24.05 as they release. This means that when NixOS 24.11 is released with only the `k3s` package pointing to `k3s_1_32`, users will have an upgrade path on 24.05 to first upgrade locally to `k3s_1_31` and then to `k3s_1_32` (e.g. pointing `services.k3s.package` from `k3s` to `k3s_1_31`, upgrading the cluster, and repeating the process through versions).
Using the above as the example, a three NixOS release example would look like:
* NixOS 23.11
  * k3s/k3s_1_27 (Release Version, patches backported)
  * k3s_1_28 (Backported)
  * k3s_1_29 (Backported)
  * k3s_1_30 (Backported)
* NixOS 24.05
  * k3s/k3s_1_30 (Release Version, patches backported)
  * k3s_1_31 (Backported)
  * k3s_1_32 (Backported)
* NixOS 24.11
  * k3s/k3s_1_32 (Release Version, patches backported)
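The one-minor-version-at-a-time upgrade path described above can be written out mechanically. A minimal sketch, assuming package attributes follow the `k3s_1_<minor>` naming used in this document; `k3s_upgrade_path` is a hypothetical helper name.

```bash
# Print the packages to step through, in order, when moving from one
# k3s minor version to another (both given as bare minor numbers).
k3s_upgrade_path() {
  local from="$1" to="$2" minor
  minor=$(( from + 1 ))
  while [ "$minor" -le "$to" ]; do
    echo "k3s_1_${minor}"
    minor=$(( minor + 1 ))
  done
}
```

For the NixOS 24.05 example, `k3s_upgrade_path 30 32` lists `k3s_1_31` and then `k3s_1_32`: point `services.k3s.package` at each in turn and upgrade the whole cluster before moving to the next.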


@@ -0,0 +1,40 @@
# Using an external Containerd
K3s ships with its own containerd binary. However, it is sometimes necessary to use an external
containerd. This can be done in a few lines of configuration.
## Configure Containerd
```nix
{
  virtualisation.containerd = {
    enable = true;
    settings.plugins."io.containerd.grpc.v1.cri".cni = {
      bin_dir = "/var/lib/rancher/k3s/data/current/bin";
      conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d";
    };
    # Optionally, configure containerd to use the k3s pause image
    settings.plugins."io.containerd.grpc.v1.cri" = {
      sandbox_image = "docker.io/rancher/mirrored-pause:3.6";
    };
  };
}
```
## Configure k3s
```nix
{
  services.k3s = {
    enable = true;
    extraFlags = [ "--container-runtime-endpoint unix:///run/containerd/containerd.sock" ];
  };
}
```
## Importing Container Images
K3s provides the `services.k3s.images` option to import container images at startup. This option
does **not** work with an external containerd, but you can import the images via
`ctr -n=k8s.io image import /var/lib/rancher/k3s/agent/images/*`. Note that you need to set the
`k8s.io` namespace to make the images available to the cluster.
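If the images directory holds several archives, importing them one at a time makes it easier to see which file fails. A minimal sketch built on the `ctr` invocation above: `import_k3s_images` and the `CTR` override variable are hypothetical names introduced here, and the default directory is the one mentioned in the text.

```bash
# Import every regular file in the images directory into the k8s.io namespace.
# $1: directory to scan (defaults to the path used in the text above)
# CTR: optional override for the ctr command, e.g. for a dry run
import_k3s_images() {
  local dir="${1:-/var/lib/rancher/k3s/agent/images}" archive
  for archive in "$dir"/*; do
    [ -f "$archive" ] || continue   # skip directories and an unmatched glob
    ${CTR:-ctr} -n=k8s.io image import "$archive"
  done
}
```

Setting `CTR=echo` before calling the function prints the commands instead of running them.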


@@ -0,0 +1,256 @@
# Nvidia GPU Support
> Note: this article assumes `services.k3s.enable = true;` is already set
## Enable the Nvidia driver
```
hardware.nvidia = {
  open = true;
  package = config.boot.kernelPackages.nvidiaPackages.stable; # change to match your kernel
  nvidiaSettings = true;
};
# Hack for getting the nvidia driver recognized
services.xserver = {
  enable = false;
  videoDrivers = [ "nvidia" ];
};
nixpkgs.config.allowUnfreePredicate = pkg: builtins.elem (lib.getName pkg) [
  "nvidia-x11"
  "nvidia-settings"
];
```
Also, enable the Nvidia container toolkit:
```
hardware.nvidia-container-toolkit.enable = true;
hardware.nvidia-container-toolkit.mount-nvidia-executables = true;
environment.systemPackages = with pkgs; [
  nvidia-container-toolkit
];
```
Rebuild your NixOS configuration.
### Verify that the GPU is accessible
Use the following command to ensure the GPU is accessible:
```
nvidia-smi
```
If there is an error in the output, a reboot may be required for the driver to be assigned to the GPU.
Additionally, `lspci -k` can be used to ensure the driver has been assigned to the GPU:
```
# lspci -k | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2060 Rev. A] (rev a1)
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
```
## Configure k3s
You now need to create a new file at `/var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl` with the following content:
```
{{ template "base" . }}
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
privileged_without_host_devices = false
runtime_engine = ""
runtime_root = ""
runtime_type = "io.containerd.runc.v2"
```
Now apply the following runtime class to the k3s cluster:
```yaml
apiVersion: node.k8s.io/v1
handler: nvidia
kind: RuntimeClass
metadata:
  labels:
    app.kubernetes.io/component: gpu-operator
  name: nvidia
```
Restart k3s:
```
systemctl restart k3s.service
```
Ensure that the Nvidia runtime is detected by k3s:
```
grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml
```
Apply the DaemonSet in the [generic-cdi-plugin README](https://github.com/OlfillasOdikno/generic-cdi-plugin):
```
apiVersion: v1
kind: Namespace
metadata:
  name: generic-cdi-plugin
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: generic-cdi-plugin-daemonset
  namespace: generic-cdi-plugin
spec:
  selector:
    matchLabels:
      name: generic-cdi-plugin
  template:
    metadata:
      labels:
        name: generic-cdi-plugin
        app.kubernetes.io/component: generic-cdi-plugin
        app.kubernetes.io/name: generic-cdi-plugin
    spec:
      containers:
        - image: ghcr.io/olfillasodikno/generic-cdi-plugin:main
          name: generic-cdi-plugin
          command:
            - /generic-cdi-plugin
            - /var/run/cdi/nvidia-container-toolkit.json
          imagePullPolicy: Always
          securityContext:
            privileged: true
          tty: true
          volumeMounts:
            - name: kubelet
              mountPath: /var/lib/kubelet
            - name: nvidia-container-toolkit
              mountPath: /var/run/cdi/nvidia-container-toolkit.json
      volumes:
        - name: kubelet
          hostPath:
            path: /var/lib/kubelet
        - name: nvidia-container-toolkit
          hostPath:
            path: /var/run/cdi/nvidia-container-toolkit.json
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: "nixos-nvidia-cdi"
                    operator: In
                    values:
                      - "enabled"
```
Apply the following node label (replace `#CHANGEME` with your node name):
```
kind: Node
apiVersion: v1
metadata:
  name: #CHANGEME
  labels:
    nixos-nvidia-cdi: enabled
```
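An equivalent imperative form, if you prefer not to write a manifest, is `kubectl label`. A minimal sketch: `label_gpu_node` and the `KUBECTL` override variable are hypothetical names introduced here, while the label key and value are the ones from the manifest above.

```bash
# Add the nixos-nvidia-cdi=enabled label to a node.
# $1: node name
# KUBECTL: optional override for the kubectl command
label_gpu_node() {
  local node="$1"
  ${KUBECTL:-kubectl} label node "$node" nixos-nvidia-cdi=enabled
}
```

On a host where only the bundled client is available, setting `KUBECTL='sudo k3s kubectl'` before calling `label_gpu_node <node name>` should work the same way.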
Now, GPU-enabled pods can be run with this configuration:
```
spec:
  runtimeClassName: nvidia
  containers:
    - resources:
        requests:
          nvidia.com/gpu-all: "1"
        limits:
          nvidia.com/gpu-all: "1"
```
### Test pod
This is a complete pod configuration for reference/testing:
```
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
  namespace: default
spec:
  runtimeClassName: nvidia # <- THIS FOR GPU
  containers:
    - name: gpu-test
      image: nvidia/cuda:12.6.3-base-ubuntu22.04
      command: [ "/bin/bash", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      env:
        - name: NVIDIA_VISIBLE_DEVICES
          value: all
        - name: NVIDIA_DRIVER_CAPABILITIES
          value: all
      resources: # <- THIS FOR GPU
        requests:
          nvidia.com/gpu-all: "1"
        limits:
          nvidia.com/gpu-all: "1"
```
Once the pod is running, use the following command to test that the GPU was detected:
```
kubectl exec -n default -it pod/gpu-test -- nvidia-smi
```
If successful, the output will look like the following:
```
Thu Sep 25 04:17:42 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.09              Driver Version: 580.82.09      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2060        Off |   00000000:01:00.0  On |                  N/A |
|  0%   36C    P8             10W /  190W |     104MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
```


@@ -0,0 +1,108 @@
# Storage Examples
The following are some NixOS-specific considerations for particular storage mechanisms with kubernetes/k3s.
## Longhorn
NixOS configuration required for Longhorn:
```
environment.systemPackages = [ pkgs.nfs-utils ];
services.openiscsi = {
  enable = true;
  name = "${config.networking.hostName}-initiatorhost";
};
```
The Longhorn container has trouble with the NixOS `PATH`. The solution is to override the `PATH` environment variable, for example:
```
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/run/wrappers/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin
```
**Kyverno Policy for Fixing Longhorn Container for NixOS**
```
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: longhorn-nixos-path
  namespace: longhorn-system
data:
  PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/run/wrappers/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: longhorn-add-nixos-path
  annotations:
    policies.kyverno.io/title: Add Environment Variables from ConfigMap
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/category: Other
    policies.kyverno.io/description: >-
      Longhorn invokes executables on the host system, and needs
      to be aware of the host systems PATH. This modifies all
      deployments such that the PATH is explicitly set to support
      NixOS based systems.
spec:
  rules:
    - name: add-env-vars
      match:
        resources:
          kinds:
            - Pod
          namespaces:
            - longhorn-system
      mutate:
        patchStrategicMerge:
          spec:
            initContainers:
              - (name): "*"
                envFrom:
                  - configMapRef:
                      name: longhorn-nixos-path
            containers:
              - (name): "*"
                envFrom:
                  - configMapRef:
                      name: longhorn-nixos-path
---
```
## NFS
NixOS configuration required for NFS:
```
boot.supportedFilesystems = [ "nfs" ];
services.rpcbind.enable = true;
```
## Rook/Ceph
In order to support Rook/Ceph, the following NixOS kernelModule configuration is required:
```
boot.kernelModules = [ "rbd" ];
```
## ZFS ContainerD Support
The [ZFS snapshotter](https://github.com/containerd/zfs) can be enabled for k3s' embedded ContainerD though it requires mounting a dataset to a specific path used by k3s: `/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.zfs`
For example:
```bash
$ zfs create -o mountpoint=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.zfs <zpool name>/containerd
```
You can now configure k3s to use zfs by passing the `--snapshotter` flag.
```
services.k3s = {
  ...
  extraFlags = [
    "--snapshotter=zfs"
  ];
};
```