nyxd roadmap
Legend: [x] shipped in tree (still may need polish), [ ] not done, [~] partial / risky.
Architecture diagram
flowchart LR
U[User] --> C[nyx CLI]
C -->|HTTP over Unix socket| API[nyxd Control API]
subgraph D[nyxd Daemon]
API --> SUP[Supervisor]
API --> CMP[Compose Parser and Builder]
API --> IMG[Image Store and Puller]
CMP --> IMG
CMP --> SUP
SUP --> OVL[Overlay Manager]
SUP --> NET[Network Backend]
SUP --> BND[Bundle Generator]
SUP --> RT[Runtime Adapter]
SUP --> HLT[Health Subsystem]
SUP --> LOG[Log Collector]
SUP --> PER[Persist and Reconcile]
SUP --> DNS[DNS Backend]
end
RT --> CRUN[crun]
IMG --> REG[OCI Registry]
NET --> KRN[Linux kernel primitives]
NET -. optional .-> CNI[CNI plugins]
PER --> FS[nyxd data dir state]
LOG --> FS
IMG --> FS
OVL --> FS
BND --> FS
Already in the tree (high level)
- OCI runtime shell-out, `internal/runtime`: `crun` create/start/run (`--detach` helper), `run` foreground (`--pid-file` + stdio), kill/delete/state/list, `CrunContainerAbsent` (missing `status` / not-found) for wait/teardown, idempotent `Delete` when state already gone, state JSON parsing, dedicated `--root` state dir.
- Supervisor skeleton, `internal/supervisor`: `Start`/`Stop`/`Kill`/`Remove`/`Shutdown`, restart policies, overlay + network.Backend setup + `bundle.Generate` + `crun run` foreground (stdio → log collector) with `resetLogIO` + `waitStoppedOrForceDelete` on stop/kill (missing OCI state tolerated; `ListInfo` can show `absent`), supervised restart loop with backoff + jitter. On-disk JSON under `{baseDir}/supervisor/containers/*.json` plus `bundles/<id>/nyxd-meta.json` re-seed `nyx ps` after an unclean restart when crun is still running (JSON first, then bundle meta + image resolve; no Bolt/SQLite KV).
- Image pull (registry v2), `internal/image`: `Store`, anonymous token auth, manifest (+ index) fetch with configurable platform (`Store.SetPlatform`, `nyxd -pull-platform`), concurrent layer download with digest verify, atomic blob writes, resumable layer fetch (`.partial` + HTTP Range), `ParseRef`, `LoadImageMeta`, `LoadManifest`, unreferenced blob prune after `RemoveImage`.
- CNI exec path (optional), `internal/network`: when `-net-driver=cni`, conflist generation, `EnsureNetwork`, `Setup`/`Teardown` via plugin exec, netns under `/run/nyxd/netns`, portable `detachUnmount` for teardown. Default daemon mode uses native (no `/opt/cni/bin`).
- Bundle / OCI config, `internal/bundle`: `config.json` generation with default caps, masked paths, `noNewPrivileges`, cgroup resource mapping, optional `ExtraMounts` merged after default mounts.
- Compose subset, `internal/compose`: real YAML (`gopkg.in/yaml.v3`), image/restart validation, unknown `depends_on` detection, cycle detection, `TopologicalOrder` helper, default `no_new_privileges=true` when unset.
- Healthcheck library, `internal/health`: exec (via `crun exec` + `CommandContext`), HTTP, TCP, retries, `onUnhealthy` callback hook (caller must wire policy).
- Logs, `internal/logs`: JSONL append per container, `Tail` (reads whole file; see gaps).
- Telemetry types, `internal/telemetry`: counters + optional Prometheus-ish HTTP handler (`ServeMetrics`); not called from daemon today.
- Daemon entry, `cmd/nyxd`: wiring for store, overlay, `network.Backend` (`-net-driver` `native` default or `cni`), runtime, log collector, supervisor; graceful shutdown context (30s); root check; Unix socket HTTP control API (`-socket`, default `/run/nyxd/nyxd.sock`); startup log `network backend/driver`; `go.sum` maintained via `go mod tidy`.
- `nyx` CLI client, `cmd/nyx`: `ping`, `version`, `pull` (streamed progress + summary by default, `--json` for single JSON; optional `--username`/`--password`), `run` (foreground: log follow + Ctrl+C → `POST …/kill` SIGKILL; `-d`/`--detach`), `ps`, `rm`, `stop`, `logs` (`-f`/`--tail`), `image ls`/`image rm`/`image prune` (`--dry-run`), `exec`, `container` aliases, `nyx compose up|stop|down` (compose file discovery, `down -v`); `make build-nyx` → `bin/nyx`.
- Unit tests: compose parser, image ref parsing, native IPAM (Linux build); no end-to-end integration tests.
- Docs / packaging: documentation index, INSTALL / USAGE, OpenAPI, networking, kernel requirements, native network internals, QEMU Alpine guide, example service units (`Type=simple` in `packaging/nyxd.service`, `Type=notify` in repo `nyxd.service` without `sd_notify` yet).
Runtime / crun
- Zero-latency exit wait: `WaitForExit` still polls `crun state` every 200ms; prefer `crun events --format json`, pidfd, or inotify on state dir where portable.
- [~] Robust exit code: prefers `exit_code` in state JSON when present, then `<root>/<id>/exit_code`, then a second raw JSON parse for alternate keys (`exitCode`, `exit_status`, …). Still no `crun delete` stdout fallback.
- `crun exec` for one-off commands: `runtime.Runtime.Exec` + control route `POST /v1/containers/{id}/exec` + `nyx exec …`.
Done / partial
- Lifecycle via CLI: create/start/run/kill/delete/state/list implemented around one `Runtime` type.
- [~] WaitForExit: works via polling; high latency vs event-driven.
Supervisor
- [~] Shutdown fairness: each parallel `Stop` is now wrapped in its own 45s timeout (`Shutdown` context still shared); a hung `crun kill` no longer blocks others indefinitely, but there is no global “all must finish by T” budget beyond the caller’s `ctx`.
- Backoff jitter: exponential backoff adds small random jitter (`math/rand/v2`) to reduce thundering herds.
- Ordered multi-start: `supervisor.StartSequential` starts specs in slice order; `compose.BuildContainerSpecs` + `POST /v1/compose/up` + `nyx compose up` wire `TopologicalOrder` end-to-end.
- Readiness after `crun run`: optional `ContainerSpec.Healthcheck`; `health.WaitReady` runs after the workload is running and before `Start` returns, followed by a background `health.Checker`.
- Unhealthy → restart: checker `onUnhealthy` calls `KillForRestart` when `shouldRestart(..., exitCode=1)` allows it (`RestartNever` does not respawn on health alone).
Done / partial
- Restart policies: always / on-failure / unless-stopped / never + `MaxRestarts` gate.
- Backoff between restarts: exponential delay capped at 30s with jitter (`math/rand/v2`).
- `Healthcheck` on spec: `ContainerSpec.Healthcheck` + optional `healthcheck` on `POST /v1/containers/run` JSON; normalized defaults via `health.Normalize`.
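
The restart backoff described above can be sketched as follows. This is a minimal illustration, not the supervisor's code: the 30s cap and `math/rand/v2` come from this roadmap, while `baseDelay`, the 10% jitter ratio, and the function name `restartDelay` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/rand/v2"
	"time"
)

const (
	baseDelay = 500 * time.Millisecond // hypothetical starting delay
	maxDelay  = 30 * time.Second       // cap stated in the roadmap
)

// restartDelay returns the wait before restart number attempt (0-based):
// baseDelay * 2^attempt, capped at maxDelay, plus up to 10% random jitter.
func restartDelay(attempt int) time.Duration {
	if attempt < 0 {
		attempt = 0
	}
	d := maxDelay
	// Shift only for small attempts so the multiplication cannot overflow.
	if attempt < 16 {
		if shifted := baseDelay << uint(attempt); shifted < maxDelay {
			d = shifted
		}
	}
	// Add up to 10% on top so restarts of many containers do not align.
	jitter := time.Duration(rand.Int64N(int64(d)/10 + 1))
	return d + jitter
}

func main() {
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, restartDelay(attempt))
	}
}
```

In this sketch the jitter is added after the cap, so the longest wait is slightly above 30s; applying the cap last would be an equally reasonable choice.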
Image puller
- [~] Private / credentialed registries: `PullWithProgress` accepts `RegistryAuth` (Basic + Bearer token flow); `POST /v1/images/pull` / `POST /v1/containers/run` / compose per-service `registry_username` / `registry_password`; `nyx pull -u` / `--password`. No `DOCKER_CONFIG` / OAuth refresh yet.
- Image signature / policy enforcement: no cosign / Notary-style verify-before-run; pulled blobs are digest-checked for transport integrity only.
- Registry mirror for pulls: no `HTTP_PROXY`-style mirror host or `--registry-mirror` for air-gapped or restricted-network pulls.
- Resume partial downloads: incomplete layer/config fetch keeps `<blob>.partial` and resumes with `Range` when the registry returns `206`; falls back to full re-download if the server ignores ranges.
- Platform override: index picks `wantOS`/`wantArch` from `Store.SetPlatform` or `runtime.GOOS`/`GOARCH`; `nyxd -pull-platform` sets the store.
- Garbage collection (blobs): `Store.PruneUnreferencedBlobs` removes `blobs/sha256/*` not referenced by any `images/**/manifest.json`; invoked from `RemoveImage` (therefore also during `PruneImagesNotIn` / image prune).
- [~] `fetchBlob` return path: layer pulls use `fetchBlobToDisk` (stream + verify + rename, no full-blob `ReadFile` after write). Small config blobs still use `fetchBlob`, which re-reads from disk (acceptable size).
Done / partial
- Public pull + verify: digest verify on write, atomic rename, bounded concurrency, 4MiB cap on manifest response body read (not full layer in RAM during copy).
- `ParseRef` / `LoadImageMeta` / `LoadManifest`: for tooling and unpack helpers.
- `PullWithProgress` + streamed pull API: `POST /v1/images/pull` with `{"stream":true}` returns `application/x-ndjson` (phase events + throttled byte progress); `nyx pull` uses it by default with progress UI + human summary; `--json` keeps the legacy single JSON body.
- Image prune: `Store.PruneImagesNotIn`, `POST /v1/images/prune` (`dry_run`), `nyx image prune`; each removed image triggers blob prune for unreferenced digests.
Overlay
- Safe tree walk + dedup cache: overlay extraction uses `WalkDir`, skips symlink directory targets, whiteout path checks with `filepath.Rel`; per-digest extraction cache under `_cache/<sha256>/` with mutex.
- Layer deduplication (cross-image): shared `_cache/<digest>` for extracted layers.
- [~] `extractTar` via host `tar`: pragmatic, but not all OCI whiteout variants (e.g. `.wh..wh..plnk` hardlink whiteouts) are guaranteed to be handled; pure-Go or container-aware extractor still TBD.
Done / partial
- overlayfs mount lifecycle (Linux): lower/upper/work/merged layout, `Remove` unmount + cleanup; non-Linux stub returns clear error.
Native network (internal/network/native)
- Operator guide: networking (`-net-driver`, `network.Backend`, `nyxd` vs `nyx`, systemd, CNI optional path).
Design / internals
- [~] `runNft` / `addPortMappings`: uses `exec.Command` + `withTimeout` instead of `syscall.Exec` (daemon no longer loses the process). Rule syntax / nft availability may still fail at runtime; errors are logged.
- `ensureNftTable`: initial table load uses `exec.Command("/usr/sbin/nft", "-f", file)` inside `sync.Once` (no `unix.Exec`).
- `withTimeout`: used by `runNft` for each shell-out.
- `portmapState` / DNAT teardown: nft rules tagged with a per-container comment prefix; `removePortMappings` deletes matching rules on teardown.
- IPAM bounds: CIDR + gateway from `NYXD_CONTAINER_SUBNET` / `NYXD_GATEWAY_IP`; allocator walks the real prefix (no longer /16-only).
- IPv6: IPv4-only assumptions throughout bridge + NAT.
Done / partial
- Linux-only in-process bridge + netlink RTNETLINK: veth, bridge attach, basic IPAM file backend, tests for allocator on Linux.
- IPAM lazy init (CI / unprivileged): `getIPAM()` + `NYXD_IPAM_DIR`; falls back to `$TMPDIR/nyxd-ipam` when `/var/lib/nyxd/ipam` is not writable (no package-load panic).
- `!linux` stub: builds on macOS/CI without native stack.
Product gaps (call out in operator comparisons)
- Embedded DNS / in-stack service discovery: no in-daemon resolver; containers use image/host `resolv.conf` unless the workload sets DNS; `cni` mode would need an extra plugin (e.g. `dnsname`) in the conflist, which is not wired by default.
- Multi-network / Compose network isolation: native mode is effectively one cluster-wide bridge per daemon. Compose `networks:` is parsed but not mapped to separate bridges, subnets, or policy per logical network (per-network isolation in native mode is TBD).
Compose parser
- Real YAML parsing: `compose.Parse([]byte)` with `yaml.v3` (replaces old `parseYAML` stub narrative).
- `depends_on` at runtime: `ParseFile` (`.env` + `${VAR}` substitution), `BuildContainerSpecs`, `POST /v1/compose/up`, `nyx compose up`; `POST /v1/compose/stop` / `down` + `nyx compose stop|down` (default compose filename order: see `internal/compose/discover.go` and Usage).
- Variable substitution: `${VAR}`, `$VAR`, `${VAR:-default}`, `$$`; `.env` merged then overridden by process env (compose-spec order).
- Volume / bind mount model: compose `volumes:` lines → `ContainerSpec.ExtraMounts` → `bundle.Options.ExtraMounts`; named volumes require a top-level `volumes:` entry; host dirs under `{baseDir}/volumes/<project>/<name>/`. `compose down -v` removes those dirs when `supervisor.IsBindSourceInUse` reports no remaining mount reference.
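
The substitution forms and the `.env`-then-process-env precedence listed above can be sketched with the standard library. This is an illustration only, not the parser in `internal/compose`: `expand` and `mergeEnv` are hypothetical names, and the sketch leans on `os.Expand`, which also treats shell specials such as `$1` as variables.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// expand substitutes ${VAR}, $VAR, ${VAR:-default} and $$ in s using env.
func expand(s string, env map[string]string) string {
	return os.Expand(s, func(name string) string {
		if name == "$" {
			return "$" // "$$" is a literal dollar sign
		}
		// For "${VAR:-default}" os.Expand passes "VAR:-default" as the name.
		key, def, hasDefault := strings.Cut(name, ":-")
		if v, ok := env[key]; ok && v != "" {
			return v
		}
		if hasDefault {
			return def // used when VAR is unset or empty
		}
		return env[key]
	})
}

// mergeEnv layers process environment entries ("K=V") over .env entries.
func mergeEnv(dotEnv map[string]string, environ []string) map[string]string {
	out := make(map[string]string, len(dotEnv))
	for k, v := range dotEnv {
		out[k] = v
	}
	for _, kv := range environ {
		if k, v, ok := strings.Cut(kv, "="); ok {
			out[k] = v
		}
	}
	return out
}

func main() {
	dotEnv := map[string]string{"TAG": "1.0", "PORT": "8080"}
	env := mergeEnv(dotEnv, []string{"TAG=2.0"}) // process env wins over .env
	fmt.Println(expand("image: app:${TAG} port=$PORT user=${APP_USER:-nobody} cost=$$5", env))
}
```

In real use the second argument to `mergeEnv` would typically be `os.Environ()`.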
Gaps
- Implicit named volumes: Compose files that reference `name:/path` without a top-level `volumes:` entry for `name` are not treated as named volumes (bind semantics or parse error instead).
- Standalone volume UX: no `nyx volume ls` / `volume rm`; named volume lifecycle is create-on-up + optional delete on `compose down -v` only.
- Compose `networks:` vs runtime: YAML `networks:` / per-service `networks:` lists do not create isolated L2 domains or per-network DNS under native; see Networking product gaps above.
Done / partial
- Restart policy validation, cycle detection, unknown dependency errors, default `no_new_privileges`.
Health checks
- Supervisor integration: readiness `WaitReady` + `Checker.Start` from `supervisor.Start` / re-adopt path; `KillForRestart` on sustained failure when restart policy allows.
- [~] Exec hang: `checkExec` uses `exec.CommandContext` with timeout ctx; still depends on crun honoring signals/cancellation.
Done / partial
- Checker implementation: interval, timeout, retries, start period, HTTP/TCP/exec/`none`.
Log collector
- Rotation by size: `Rotate` exists but no max-size trigger; growth unbounded.
- `Tail` memory: reads/decodes entire JSONL file then truncates to last n, which is unsafe for large logs.
- Follow mode (API + client): `GET /v1/containers/{id}/logs?follow=1&tail=…&plain=…` (daemon tails JSONL file); `nyx logs -f` / `nyx container logs -f`; foreground `nyx run` streams the same endpoint.
- Alternate sinks: JSONL to disk only; no syslog / journald forwarder interface.
Done / partial
- Stream from reader to JSONL file: `Collector.Stream` with context cancellation.
Telemetry
- Wire-up: `telemetry.New` / `ServeMetrics` never referenced from `cmd/nyxd` or supervisor; counters stay at zero. Intended to aggregate cgroup + (future) eBPF counters; see eBPF section.
Done / partial
- Package scaffold: HTTP `/metrics` handler skeleton in `internal/telemetry`.
eBPF (observability and networking)
Goal: use the kernel’s eBPF subsystem where it beats userspace polling or duplicate netfilter accounting, especially on resource-constrained hosts. When the machine already runs signed, attested programs, coordinate with host-wide BPF policy (shared BTF/CO-RE expectations, loader lifecycle, pinning conventions) so nyxd does not fork a second, incompatible BPF stack on the same machine.
Kernel / packaging assumptions
- Document BPF prerequisites: kernel CONFIG_BPF, CONFIG_BPF_SYSCALL, CONFIG_DEBUG_INFO_BTF (for CO-RE), and minimum versions for chosen program types; fail closed when BTF or permissions are missing.
- Capability model: `CAP_BPF` / `CAP_PERFMON` / locked memory (`ulimit -l`) and secure boot implications documented for operators.
Networking (native path first)
- Per-container traffic accounting: byte/packet counters per cgroup or netns (tc `clsact` / cgroup skb hooks) exported for Prometheus or JSON status without scraping `/proc/net/dev` per veth.
- Published-port flow visibility: optional eBPF conntrack-adjacent or socket-level stats for DNAT paths that today rely on nft rules only (complements, not replaces, existing `nft` port maps).
- DDoS / abuse rudiments (optional): rate-limit or SYN cookies at XDP for host-facing services on selected interfaces (strictly opt-in; not default).
Observability beyond net
- Scheduler / latency profiles (optional): `runqlat`-style programs for run-queue latency debugging when cgroup CPU throttling is suspected.
- OOM / memcg visibility: BPF tracepoints or ringbuf events for memory pressure correlated with container cgroup id (pairs with Production hardening (soak / OOM) below).
Signed eBPF / host policy integration
- Single loader / object policy: where the site already uses a signed BPF object pipeline, reuse compatible pinning paths, attach/detach rules, and lifecycle tied to `nyxd` supervisor start/stop so programs are not orphaned across daemon restarts.
- Attestation / policy gate (optional): allow only operator-approved eBPF objects when local policy requires it.
API / product
- Expose metrics: wire `internal/telemetry` + eBPF counters into `GET /metrics` or `GET /v1/containers?detail=1` fields (design TBD); keep Unix socket trust model in mind.
nyx CLI & Unix control socket
`nyxd` starts an HTTP server on a Unix domain socket by default (`-socket=/run/nyxd/nyxd.sock`; override the path, or set `-socket=""` to disable). The `nyx` binary is the thin client (`cmd/nyx`).
- Socket server, `internal/control`: `GET /v1/ping`, `GET /v1/version`, `GET /v1/containers`, `POST /v1/images/pull` (optional NDJSON stream), `GET /v1/images`, `POST /v1/images/remove`, `POST /v1/images/prune`, `POST /v1/containers/run`, `POST /v1/compose/up`, `POST /v1/compose/stop`, `POST /v1/compose/down`, `POST /v1/containers/{id}/stop`, `POST /v1/containers/{id}/kill`, `GET /v1/containers/{id}/logs`, `POST /v1/containers/{id}/remove`, `POST /v1/containers/{id}/exec`.
- `nyx run` + kill/stop: `POST /v1/containers/run` then foreground CLI follows logs; Ctrl+C → `POST /v1/containers/{id}/kill` (SIGKILL); `nyx run -d` detach; `nyx stop` → `/stop` (graceful then force inside supervisor). Resolves pulled image (`ResolvePulledImage`), builds `ContainerSpec`, `supervisor.Start`. Optional `restart` in JSON.
- Auth / TLS: socket is group-writable (`0660`); no peer cred check, no token yet (local trust model only).
- Structured errors: failed `exec` still begins `200` + stream body in some cases; tighten status codes and cap output size.
cmd/nyxd / control plane
- Imports / build: compiles; `go run ./cmd/nyxd` works.
- [~] Subcommands on `nyxd` itself: still no `nyxd pull` subcommand; use `nyx pull` against the socket, or the HTTP API. (See API, clients & UI for OpenAPI, SDK samples, and UI.)
- Control socket flag: `-socket` (default `/run/nyxd/nyxd.sock`, `""` disables).
- Native network selection: `-net-driver` (default `native`: in-process `internal/network/native`); `cni` uses exec plugins under `-cni-bin-dir`.
- `systemd-notify`: `nyxd.service` uses `Type=notify` but process never sends `READY=1` / reloading state; switch to `Type=simple` or implement sd_notify.
Done / partial
- Operational flags: base dir, crun path, `-net-driver` (`native`|`cni`), CNI paths (cni mode), network name (cni mode), log level, version, socket.
API, clients & UI
- OpenAPI spec: docs/openapi.yaml documents `GET`/`POST /v1/*` including compose (`/v1/compose/up|stop|down`), pull stream, run, registry fields on pull/run, …
- SDK samples: small runnable examples (e.g. Go, Python, shell+curl) that call the API for pull, run, status, logs; live under `examples/` or docs and stay in sync with the spec.
- UI: operator-facing web (or desktop) UI for host/node view, container lifecycle, compose stacks, log tail, and metrics; consumes the same API + optional WebSocket/SSE for streaming.
Security
- Default seccomp JSON artifact: bundle supports hardening fields; no curated default profile shipped besides comments in older specs.
- AppArmor: no profile generation or integration.
- Rootless: requires root today; no user-namespace rootless path.
Done / partial
- Bundle defaults: dropped caps, masked paths, `noNewPrivileges` in generated JSON where bundle applies.
Default hardening & ergonomics
Context: gaps and defaults inside nyxd itself (bundles, compose, supervisor, pulls). Overlap with Security, eBPF, and General / engineering is intentional.
Defaults & isolation
- Read-only rootfs by default: flip policy so read-write root is opt-in; add bundle tmpfs for `/tmp`, `/run` (and any writable paths images require) when `read_only` is default.
- User-namespace UID/GID remap (rootful nyxd): map container `root` → high UIDs on host without full rootless nyxd (breaks many escape-to-host primitives); distinct from `[ ]` Rootless above.
- `restart=on-failure:N` parity: surface `MaxRestarts` (or equivalent) through `nyx run` / compose in a way that matches operator mental models (`on-failure:3`).
Service discovery without an in-daemon DNS server
- `/etc/hosts` injection: at container start, write static `name → IP` lines derived from the Compose graph (or explicit flags); no long-lived resolver process inside nyxd.
- Optional mDNS (`.local`): small multicast DNS for dynamic LAN names; opt-in binary / feature flag to avoid base image bloat.
Lifecycle vs external supervisors
- Tombstone / reattach contract: document and implement a small on-disk contract (beyond today’s `supervisor/*.json` + `nyxd-meta.json`) so an external init can detect “nyxd died but workload should survive” vs “clean stop”; today nyxd crash generally tears down supervised workloads (no containerd-style shim per container).
Image trust & air gap
- Image signature verification: see Image puller (`cosign` / policy); enforce before `run` / `compose up` when configured.
- Registry mirror: see Image puller; helps when uplinks are slow, lossy, or policy-restricted.
OOM & cgroup operator knobs
- Per-container OOM tuning: `oom_score_adj`, cgroup `memory.oom.group` / `memory.max` interaction, and documented behaviour vs kernel OOM killer (pairs with eBPF mem visibility and General / engineering soak item).
General / engineering
- [~] CI: `.github/workflows/ci.yml` runs `go test`, Trivy (fs scan), Grype (informational `continue-on-error`); expand with `golangci-lint` / cross-build as needed.
- Integration tests: no crun-in-container tests; only targeted unit tests.
- Production hardening (soak / OOM): no documented stress policy for many concurrent containers, large image pulls + overlay extract, or kernel OOM interaction with cgroup limits; behavior is mostly “kernel + crun default”.
- [~] Cgroup v2 / resource limits: `bundle` maps some `deploy.resources` limits from compose into the OCI spec where implemented; not a full audit of cgroup v2 controllers, corner cases, or full resource-limit parity with every OCI consumer.
- `go.sum`: committed / maintained via `go mod tidy` (verify in CI).
Done / partial
- Small module footprint: stdlib + `yaml.v3` + `x/sys` + OCI spec packages + `hedzr/progressbar` (nyx pull UI only) as in `go.mod`.
Last reviewed against repository layout on 2026-05-17. Update [x] / [~] / [ ] when merging features — keep this file aligned with shipped behavior (CLI flags, API routes, and daemon defaults).