feat: add valkey/redis

Clément Désiles 2025-11-11 00:02:42 +01:00
parent e7dbe470da
commit e692d4df98
9 changed files with 782 additions and 0 deletions

roles/valkey/README.md

@ -0,0 +1,564 @@
# Redis/Valkey Role
This Ansible role installs and configures Valkey (a Redis fork) for local use only. It provides a shared Redis-compatible instance that multiple services can use as a cache or message broker.
The role also performs required kernel tuning for optimal Valkey performance.
## About Valkey
Valkey is a high-performance key/value datastore and a drop-in replacement for Redis. It was created as a community-driven fork after Redis changed its license from BSD to source-available licenses (RSALv2 and SSPLv1) in March 2024.
**Key points:**
- Valkey is API-compatible with Redis OSS 7.2, the release it was forked from
- Backed by the Linux Foundation
- Uses permissive open-source license (BSD 3-Clause)
- No code changes needed in your applications
- Same commands, same protocol, same performance
**Distribution support:**
- **Arch Linux**: Installs Valkey (replaced the redis package in April 2025)
- **Debian/Ubuntu**: Installs Valkey from official repositories
## Features
- Installs Redis/Valkey
- Local-only access (localhost)
- Configurable memory limits and eviction policies
- Persistence enabled
- Systemd integration
- Automatic kernel tuning (memory overcommit, THP)
- ACL-based user authentication
- Firewall configuration (UFW)
## Requirements
- Systemd-based Linux distribution
- Root/sudo access
- `ansible.posix` collection (for sysctl module)
- `community.general` collection (for ufw module)
## Role Variables
Available variables with defaults (see `defaults/main.yml`):
```yaml
# Bind address (localhost only for security)
valkey_bind: 127.0.0.1
# Port
valkey_port: 6379
# Authentication (REQUIRED - must be set explicitly)
# valkey_admin_password: "" # Intentionally undefined - role will fail if not set
# ACL users (services register their users here)
valkey_acl_users: []
# Example:
# valkey_acl_users:
#   - username: immich
#     password: "secretpassword"
#     keypattern: "immich_bull* immich_channel*"  # Space-separated patterns (template converts to ~pattern1 ~pattern2)
#     # Rules apply left to right: keep -@dangerous after the +@category grants
#     commands: "&* +@read +@write +@pubsub -@dangerous +select +auth +ping +info +eval +evalsha"
# Max memory (0 = unlimited)
valkey_maxmemory: 256mb
# Eviction policy when max memory is reached
valkey_maxmemory_policy: noeviction
# Data directory
valkey_dir: /var/lib/valkey
# ACL file location
valkey_acl_file: /etc/valkey/users.acl
# Log level
valkey_loglevel: notice
```
**Security Note:** This role uses ACL-based authentication. You must set `valkey_admin_password` and configure service users via `valkey_acl_users`.
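**System Requirements:** This role automatically configures the kernel parameters Valkey requires (memory overcommit and Transparent Huge Pages); see the Kernel Tuning section.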
## Dependencies
None.
## Example Playbook
```yaml
---
- hosts: servers
  become: true
  roles:
    - role: valkey
    - role: immich # Will connect to system Valkey
```
### Custom Configuration with ACL Users
```yaml
---
- hosts: servers
  become: true
  roles:
    - role: valkey
      vars:
        valkey_admin_password: "{{ vault_valkey_password }}"
        valkey_maxmemory: 512mb
        valkey_maxmemory_policy: volatile-lru
        valkey_acl_users:
          - username: immich
            password: "{{ immich_valkey_password }}"
            keypattern: "immich_bull* immich_channel*"
            # -@dangerous must follow the +@category grants (rules apply left to right)
            commands: "&* +@read +@write +@pubsub -@dangerous +select +auth +ping +info +eval +evalsha"
          - username: nextcloud
            password: "{{ nextcloud_valkey_password }}"
            keypattern: "nextcloud*"
            commands: "+@read +@write -@dangerous +auth +ping +info"
```
## How Services Connect
Services running on the same host can connect to Valkey at:
- **Host**: `localhost` or `127.0.0.1`
- **Port**: `6379` (default)
### From Containers
Containers need special handling to reach the host's Valkey:
**Use `host.containers.internal`:**
```yaml
REDIS_HOSTNAME: host.containers.internal
REDIS_PORT: 6379
```
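This special DNS name resolves to the host machine from inside containers. Valkey must also listen on the address the containers reach: by default it binds to `127.0.0.1` only, so set `valkey_bind` to the loopback address plus your Podman subnet gateway (for example `"127.0.0.1 10.89.0.1"`).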
**Note:** Environment variables often still use `REDIS_*` naming for compatibility, since Valkey is API-compatible with Redis.
## Security
- **Local-only**: Valkey binds to `127.0.0.1` only (configurable for container access)
- **Protected mode**: Enabled
- **ACL authentication**: Each service gets its own user with restricted permissions
- **No remote access**: Cannot be reached from network by default
### ACL-Based Authentication
This role uses Valkey's ACL (Access Control List) system for fine-grained security. Each service gets:
- **Dedicated credentials**: Unique username and password
- **Key pattern restrictions**: Can only access specific key patterns
- **Command restrictions**: Limited to required commands only
- **Defense-in-depth**: Multiple layers of isolation
### Configuring ACL Users
Define ACL users in your inventory or host_vars:
```yaml
# inventory/host_vars/yourserver.yml
valkey_admin_password: "your-strong-admin-password"
valkey_acl_users:
  - username: immich
    password: "{{ immich_valkey_password }}"
    keypattern: "immich_bull* immich_channel*"
    # -@dangerous must follow the +@category grants (rules apply left to right)
    commands: "&* +@read +@write +@pubsub -@dangerous +select +auth +ping +info +eval +evalsha"
  - username: nextcloud
    password: "{{ nextcloud_valkey_password }}"
    keypattern: "nextcloud*"
    commands: "+@read +@write -@dangerous +auth +ping +info"
  - username: gitea
    password: "{{ gitea_valkey_password }}"
    keypattern: "gitea*"
    commands: "+@read +@write -@dangerous +auth +ping +info +select"
```
### ACL Configuration Guide
**Key Pattern (`keypattern`):**
- Single pattern: `"myservice*"` - matches keys starting with `myservice`
- Multiple patterns: `"pattern1* pattern2*"` - space-separated (automatically converted to `~pattern1* ~pattern2*` in ACL file)
- All keys: `"*"` - not recommended for security
**Note:** In the inventory, specify patterns as space-separated strings. The Ansible template automatically adds the `~` prefix to each pattern when generating the ACL file.
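The conversion described in the note can be sketched in shell. This is only an illustrative stand-in for what the Jinja template does, not the role's own code; the function name `to_acl_patterns` is hypothetical.

```bash
# Hypothetical helper: turn a space-separated keypattern string into
# the ~-prefixed form used in the ACL file.
# The body runs in a subshell so that `set -f` (which keeps '*' from
# being expanded against files in the current directory) does not leak.
to_acl_patterns() (
  set -f
  out=""
  for pattern in $1; do
    out="${out:+$out }~$pattern"
  done
  printf '%s\n' "$out"
)
```

Calling `to_acl_patterns "immich_bull* immich_channel*"` prints `~immich_bull* ~immich_channel*`, which matches the key pattern portion of the generated ACL line.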
### Kernel Tuning
The role automatically configures kernel parameters required by Valkey (see `tasks/kernel-tuning.yml`):
**1. Memory Overcommit:**
```
vm.overcommit_memory = 1
```
- Required for background saves and replication
- Configured via `/etc/sysctl.conf`
- Applied immediately and persists across reboots
**2. Transparent Huge Pages (THP):**
```
transparent_hugepage=madvise
```
- Reduces latency and memory usage issues
- Safely appended to existing GRUB kernel parameters (does not overwrite)
- Only adds parameter if `transparent_hugepage=` is not already present
- Applied at runtime immediately via `/sys/kernel/mm/transparent_hugepage/enabled`
- Persists across reboots via `/etc/default/grub`
- Automatically detects and uses `update-grub` (Debian) or `grub-mkconfig` (Arch)
These settings are required to eliminate Valkey startup warnings and ensure optimal performance.
**Note:** The role preserves existing GRUB parameters. If you have `GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet"`, it will become `GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet transparent_hugepage=madvise"`.
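To preview what the append would do to a given GRUB line before running the role, the same substitution can be sketched with `sed`. This mirrors the regular expression used by the role's `lineinfile` task, but the helper name `append_thp` is hypothetical and the function only transforms a string; it never touches `/etc/default/grub`.

```bash
# Hypothetical helper: append transparent_hugepage=madvise to a
# GRUB_CMDLINE_LINUX_DEFAULT line unless a transparent_hugepage=
# setting is already present. Reads the line from $1, prints the result.
append_thp() {
  case "$1" in
    *transparent_hugepage=*)
      # Already configured: leave the line untouched
      printf '%s\n' "$1"
      ;;
    *)
      printf '%s\n' "$1" |
        sed -E 's/^(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*)"$/\1 transparent_hugepage=madvise"/'
      ;;
  esac
}
```

For example, `append_thp "$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' /etc/default/grub)"` shows the line the role would write. Running the helper on its own output returns the line unchanged.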
### ACL Command Rules
**Commands (`commands`):**
- `&*` - Allow all pub/sub channels (required for job queues like BullMQ)
- `allchannels` - Alias for `&*` (written without a `+` prefix)
- `+@read` - Allow all read commands (GET, MGET, etc.)
- `+@write` - Allow all write commands (SET, DEL, etc.)
- `+@pubsub` - Allow pub/sub commands (SUBSCRIBE, PUBLISH, etc.)
- `-@dangerous` - Deny dangerous commands (FLUSHDB, FLUSHALL, KEYS, CONFIG, etc.). Rules apply left to right, so place it after the `+@category` grants and re-add individual commands such as `+info` after it
- `+commandname` - Allow specific command (e.g., `+select`, `+auth`, `+ping`)
- `-commandname` - Deny specific command
**Common Command Sets:**
| Service Type | Recommended Commands |
|-------------|---------------------|
| **Simple cache** | `+@read +@write -@dangerous +auth +ping +info` |
| **Session store** | `+@read +@write -@dangerous +auth +ping +info +select` |
| **Job queue (BullMQ)** | `&* +@read +@write +@pubsub -@dangerous +select +auth +ping +info +eval +evalsha` |
| **Pub/sub** | `&* +@pubsub +@read +@write -@dangerous +auth +ping +info` |
**Security Best Practices:**
- Always include `-@dangerous` to prevent accidental data loss
- Use specific key patterns to isolate services
- Only grant `+eval` and `+evalsha` if required (job queues)
- Only grant `&*` or `allchannels` if using pub/sub
- Use unique passwords for each service
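Valkey applies ACL rules from left to right, so a `+@category` grant placed after `-@dangerous` re-enables the dangerous commands in that category (for example `FLUSHDB` through `+@write`). A minimal sketch of an ordering check for a `commands` string follows; the function name `acl_order_ok` is hypothetical and the check is deliberately simple, covering only the `-@dangerous` rule.

```bash
# Hypothetical lint: succeed only if -@dangerous is present and no
# +@category grant appears after it. Runs in a subshell so `set -f`
# (needed to keep tokens like '&*' literal) does not leak.
acl_order_ok() (
  set -f
  seen_deny=0
  for token in $1; do
    case "$token" in
      -@dangerous) seen_deny=1 ;;
      +@*)
        # A category grant after the deny would re-add dangerous commands
        if [ "$seen_deny" -eq 1 ]; then
          exit 1
        fi
        ;;
    esac
  done
  [ "$seen_deny" -eq 1 ]
)
```

`acl_order_ok "+@read +@write -@dangerous +auth +ping +info"` succeeds, while `acl_order_ok "&* -@dangerous +@read +@write"` fails because the category grants come after the deny.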
### Setting Secure Passwords
Use Ansible Vault to encrypt all passwords:
```bash
# Admin password
ansible-vault encrypt_string 'your-strong-admin-password' --name 'valkey_admin_password'
# Service passwords
ansible-vault encrypt_string 'immich-password' --name 'immich_valkey_password'
ansible-vault encrypt_string 'nextcloud-password' --name 'nextcloud_valkey_password'
```
Add encrypted values to your inventory:
```yaml
valkey_admin_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  ...
immich_valkey_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  ...
```
## Service Management
```bash
# Check status
systemctl status valkey-server # Debian/Ubuntu
systemctl status valkey # Arch Linux
# Restart service
systemctl restart valkey-server # Debian/Ubuntu
systemctl restart valkey # Arch Linux
# View logs
journalctl -u valkey-server -f # Debian/Ubuntu
journalctl -u valkey -f # Arch Linux
# Connect to CLI (both systems have redis-cli compatibility)
redis-cli # Works on both
valkey-cli # Also available on Arch Linux
```
## Persistence
Valkey is configured with RDB persistence:
- Save after 900 seconds if at least 1 key changed
- Save after 300 seconds if at least 10 keys changed
- Save after 60 seconds if at least 10000 keys changed
Data is stored in `{{ valkey_dir }}` (default: `/var/lib/valkey`)
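The three save points above are alternatives: a snapshot is triggered as soon as any one of them is satisfied. A minimal sketch of that decision, assuming the thresholds from this role's configuration (the function name `should_snapshot` is hypothetical and this only models the rule, it does not query Valkey):

```bash
# Hypothetical model of the save points: $1 is seconds since the last
# snapshot, $2 is the number of keys changed since then.
# Returns 0 (true) when any save point is satisfied.
should_snapshot() {
  elapsed=$1
  changes=$2
  if [ "$elapsed" -ge 900 ] && [ "$changes" -ge 1 ]; then return 0; fi
  if [ "$elapsed" -ge 300 ] && [ "$changes" -ge 10 ]; then return 0; fi
  if [ "$elapsed" -ge 60 ] && [ "$changes" -ge 10000 ]; then return 0; fi
  return 1
}
```

For instance, `should_snapshot 120 500` is false (too few changes for the 60-second rule, too early for the others), while `should_snapshot 300 10` is true.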
## Memory Management
When `valkey_maxmemory` is reached, Valkey will behave based on `valkey_maxmemory_policy`:
- `noeviction`: Return errors when memory limit is reached (default, recommended for BullMQ/job queues)
- `allkeys-lru`: Evict least recently used keys (good for pure caching)
- `volatile-lru`: Evict LRU keys with TTL set
- `allkeys-random`: Evict random keys
- `volatile-random`: Evict random keys with TTL
**Important for Immich and BullMQ:**
Services using BullMQ for job queues (like Immich) require `noeviction` policy. Evicting job queue data can cause:
- Lost background tasks
- Failed job processing
- Data corruption
Only use eviction policies (`allkeys-lru`, etc.) for pure caching use cases where data loss is acceptable.
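When comparing `valkey_maxmemory` with the `used_memory` byte count reported by `INFO memory`, it helps to convert the configured value to bytes. Valkey follows the Redis configuration convention in which `1mb` means 1024*1024 bytes while `1m` means 1,000,000 bytes. The helper below is a sketch with a hypothetical name (`maxmemory_bytes`); it is not part of the role.

```bash
# Hypothetical helper: convert a maxmemory value such as "256mb" to bytes.
# Units ending in "b" are binary (1kb = 1024), bare k/m/g are decimal.
maxmemory_bytes() {
  value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  num=${value%%[a-z]*}      # leading digits
  unit=${value#"$num"}      # whatever follows the digits
  case "$unit" in
    "") mult=1 ;;
    k)  mult=1000 ;;
    kb) mult=1024 ;;
    m)  mult=1000000 ;;
    mb) mult=1048576 ;;
    g)  mult=1000000000 ;;
    gb) mult=1073741824 ;;
    *)  echo "unknown unit: $unit" >&2; return 1 ;;
  esac
  echo $((num * mult))
}
```

With the role default, `maxmemory_bytes 256mb` prints `268435456`.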
## Monitoring
Check Valkey info (authenticate as admin):
```bash
redis-cli
AUTH default <valkey_admin_password>
INFO
INFO memory
INFO stats
```
Check connected clients:
```bash
redis-cli
AUTH default <valkey_admin_password>
CLIENT LIST
```
View ACL configuration:
```bash
redis-cli
AUTH default <valkey_admin_password>
ACL LIST # List all users
ACL GETUSER immich # View specific user permissions
ACL GETUSER default # View admin user
ACL CAT # List all command categories
```
Check generated ACL file:
```bash
cat /etc/valkey/users.acl
# Example output:
# user default on >password ~* &* +@all
# user immich on >password ~immich_bull* ~immich_channel* &* -@dangerous +@read ...
# Note: Multiple patterns appear as separate ~pattern entries
```
## Troubleshooting
### Check if Valkey is running
```bash
systemctl status valkey # Arch Linux
systemctl status valkey-server # Debian/Ubuntu
```
### Test admin connection
```bash
# With authentication (default user)
redis-cli
AUTH default <valkey_admin_password>
PING
# Should return: PONG
```
### Test service user connection
```bash
# Test Immich user
redis-cli
AUTH immich <immich_valkey_password>
SELECT 0
PING
# Should return: PONG
# Try restricted command (should fail)
FLUSHDB
# Should return: (error) NOPERM This user has no permissions to run the 'flushdb' command
```
### View ACL configuration
```bash
# Check ACL file
cat /etc/valkey/users.acl
# Check runtime ACL
redis-cli
AUTH default <valkey_admin_password>
ACL LIST
ACL GETUSER immich
```
### Debug permission issues
```bash
# Monitor all commands (useful for debugging)
redis-cli
AUTH default <valkey_admin_password>
MONITOR
# In another terminal, run your application
# You'll see all commands being executed
```
### View configuration
```bash
redis-cli
AUTH default <valkey_admin_password>
CONFIG GET "*"
```
### Check memory usage
```bash
redis-cli
AUTH default <valkey_admin_password>
INFO memory
```
### Common ACL Errors
**"NOAUTH Authentication required"**
- Client didn't authenticate
- Service needs to set `REDIS_USERNAME` and `REDIS_PASSWORD`
**"WRONGPASS invalid username-password pair"**
- Incorrect username or password
- Verify ACL user exists: `ACL GETUSER username`
- Check password in inventory matches service configuration
**"NOPERM No permissions to run the 'command' command"**
- Command not allowed in ACL
- Check ACL: `ACL GETUSER username`
- Add required command to `commands:` in inventory
**"NOPERM No permissions to access a key"**
- Key doesn't match allowed patterns
- Check key pattern: `ACL GETUSER username`
- Verify service is using correct key prefix
**"NOPERM No permissions to access a channel"**
- Pub/sub channel not allowed
- Add `&*` or `allchannels` to ACL commands
- Required for BullMQ and other job queues
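The error list above can be turned into a small triage helper for log lines. This is a sketch only: the function name `acl_error_hint` is hypothetical, and the matching relies on the error wording quoted in this README, which may differ between Valkey versions.

```bash
# Hypothetical helper: map an ACL error message to the fix suggested above.
# Accepts the message with or without the "(error) " prefix printed by the CLI.
acl_error_hint() {
  msg=${1#"(error) "}
  case "$msg" in
    NOAUTH*)         echo "authenticate first: set the service's username and password" ;;
    WRONGPASS*)      echo "check the username and password against valkey_acl_users" ;;
    NOPERM*" run "*) echo "add the command to the user's commands rule" ;;
    NOPERM*" key"*)  echo "the key does not match the user's keypattern" ;;
    NOPERM*channel*) echo "grant channel access with &* or allchannels" ;;
    *)               echo "unrecognized error"; return 1 ;;
  esac
}
```

For example, passing the `FLUSHDB` error shown in the troubleshooting section prints the hint about the `commands` rule.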
## Performance Tuning
For high-traffic services, consider:
```yaml
valkey_maxmemory: 1gb # Increase memory limit
valkey_maxmemory_policy: noeviction # No eviction (for job queues)
# Or for pure caching:
# valkey_maxmemory_policy: allkeys-lru # LRU eviction
```
**Kernel Tuning (automatically configured):**
The role automatically sets optimal kernel parameters:
- Memory overcommit enabled (`vm.overcommit_memory=1`)
- Transparent Huge Pages set to `madvise`
To verify kernel settings:
```bash
# Check memory overcommit
sysctl vm.overcommit_memory
# Should show: vm.overcommit_memory = 1
# Check THP status
cat /sys/kernel/mm/transparent_hugepage/enabled
# Should show: always [madvise] never
```
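The THP file lists every mode and marks the active one with square brackets. A small sketch for extracting the active mode in scripts (the function name `thp_active` is hypothetical):

```bash
# Hypothetical helper: print the bracketed (active) THP mode from a
# string such as "always [madvise] never".
thp_active() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}
```

On a host, `thp_active "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"` should print `madvise` after the role has run.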
## License
MIT
## Author Information
Created for managing shared Valkey instances in NAS/homelab environments.
## Multi-Layer Isolation Strategy
This role implements **defense-in-depth** with three isolation layers:
### 1. ACL Users (Primary Isolation)
Each service gets its own user with restricted permissions:
- Unique credentials
- Key pattern restrictions
- Command restrictions
### 2. Database Numbers (Secondary Isolation)
Valkey provides 16 logical databases (0-15) for additional isolation:
| Service | Database | Key Pattern | ACL User |
|---------|----------|-------------|----------|
| Immich | 0 | `immich_bull*` `immich_channel*` | `immich` |
| Nextcloud | 1 | `nextcloud*` | `nextcloud` |
| Gitea | 2 | `gitea*` | `gitea` |
| Grafana | 3 | `grafana*` | `grafana` |
| Custom | 4-15 | Custom | Custom |
### 3. Key Prefixes (Tertiary Isolation)
Services use unique key prefixes enforced by ACL patterns.
### Testing Isolation
```bash
# Test as Immich user (database 0)
redis-cli
AUTH immich <immich_valkey_password>
SELECT 0
SET immich_bull_test "data"
# Success
# Try to access another service's keys (should fail: key patterns apply to reads and writes)
GET nextcloud_test
# Error: NOPERM No permissions to access a key
SET nextcloud_test "data"
# Error: NOPERM No permissions to access a key
# Try dangerous command (should fail)
FLUSHDB
# Error: NOPERM This user has no permissions to run the 'flushdb' command
```
### Complete Example Configuration
```yaml
# inventory/host_vars/myserver.yml
valkey_admin_password: "{{ vault_valkey_admin_password }}"
valkey_acl_users:
  # Immich - Photo management (needs BullMQ job queue)
  # -@dangerous must follow the +@category grants (rules apply left to right)
  - username: immich
    password: "{{ vault_immich_valkey_password }}"
    keypattern: "immich_bull* immich_channel*"
    commands: "&* +@read +@write +@pubsub -@dangerous +select +auth +ping +info +eval +evalsha"
  # Nextcloud - Simple caching
  - username: nextcloud
    password: "{{ vault_nextcloud_valkey_password }}"
    keypattern: "nextcloud*"
    commands: "+@read +@write -@dangerous +auth +ping +info +select"
  # Gitea - Session store
  - username: gitea
    password: "{{ vault_gitea_valkey_password }}"
    keypattern: "gitea*"
    commands: "+@read +@write -@dangerous +auth +ping +info +select"
# Service variables
immich_valkey_db: 0
nextcloud_valkey_db: 1
gitea_valkey_db: 2
```
### Best Practices
- **ACL first**: Always use ACL users with key pattern restrictions
- **Database numbers**: Use for additional logical separation
- **Key prefixes**: Enforce via ACL patterns, not trust
- **Document**: Keep a table of service assignments
- **Testing**: Reserve database 15 for testing/debugging
- **Monitor**: Use `MONITOR` to verify services stay within their patterns


@ -0,0 +1,46 @@
---
# Valkey bind address
# Default: localhost only
# To allow container access, set to "127.0.0.1 {{ podman_subnet_gateway }}" in your inventory
# Example: "127.0.0.1 10.89.0.1"
valkey_bind: 127.0.0.1
# Valkey port
valkey_port: 6379
# Valkey authentication (REQUIRED - must be set explicitly)
# Set via inventory, host_vars, or ansible-vault
# valkey_admin_password: "" # Intentionally undefined - role will fail if not set
# Valkey max memory (0 = unlimited)
valkey_maxmemory: 256mb
# Valkey max memory policy
# noeviction: Return errors when memory limit is reached (recommended for job queues like BullMQ)
# allkeys-lru: Evict least recently used keys (good for pure caching)
valkey_maxmemory_policy: noeviction
# Valkey data directory (overridden by OS-specific vars)
valkey_dir: /var/lib/valkey
# Valkey ACL file location
valkey_acl_file: /etc/valkey/users.acl
# Valkey ACL users
# Services can register their users here
# Each user should have: username, password, keypattern, commands
valkey_acl_users: []
# Example:
# valkey_acl_users:
# - username: immich
# password: "secretpassword"
# keypattern: "*" # Keys this user can access
# commands: "+@all -@dangerous" # Allowed commands
# Valkey log level (debug, verbose, notice, warning)
valkey_loglevel: notice
# Firewall configuration
valkey_firewall_allowed_sources:
  - 127.0.0.0/8 # Localhost
  - "{{ podman_subnet | default('10.88.0.0/16') }}" # Podman bridge network


@ -0,0 +1,18 @@
---
- name: Restart Valkey
  ansible.builtin.systemd:
    name: "{{ valkey_service_name }}"
    state: restarted

- name: Reload Valkey
  ansible.builtin.systemd:
    name: "{{ valkey_service_name }}"
    state: reloaded

- name: Update GRUB
  # POSIX redirection: the shell module runs /bin/sh, where "&>" is not portable
  ansible.builtin.shell: |
    if command -v update-grub > /dev/null 2>&1; then
      update-grub
    else
      grub-mkconfig -o /boot/grub/grub.cfg
    fi


@ -0,0 +1,43 @@
---
- name: Configure kernel memory overcommit
  ansible.posix.sysctl:
    name: vm.overcommit_memory
    value: "1"
    state: present
    sysctl_set: true
    reload: true

- name: Check if transparent_hugepage is set in GRUB
  ansible.builtin.shell: grep -E '^GRUB_CMDLINE_LINUX_DEFAULT=.*transparent_hugepage=' /etc/default/grub
  register: thp_check
  changed_when: false
  failed_when: false

# Appends to the existing value; backrefs keep the current parameters intact
- name: Add transparent_hugepage if not present
  ansible.builtin.lineinfile:
    path: /etc/default/grub
    regexp: '^(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*)"$'
    line: '\1 transparent_hugepage=madvise"'
    backrefs: true
  when: thp_check.rc != 0
  notify: Update GRUB
  register: grub_updated

- name: Check current THP runtime setting
  ansible.builtin.shell: cat /sys/kernel/mm/transparent_hugepage/enabled
  register: current_thp
  changed_when: false

- name: Set THP to madvise at runtime (if not already set)
  ansible.builtin.shell: |
    echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
    echo madvise > /sys/kernel/mm/transparent_hugepage/defrag
  when: "'[madvise]' not in current_thp.stdout"

- name: Warn user about reboot requirement
  ansible.builtin.debug:
    msg: |
      WARNING: GRUB configuration has been updated with transparent_hugepage=madvise
      A REBOOT IS REQUIRED for this change to take effect permanently.
      The setting has been applied at runtime temporarily.
  when: grub_updated is changed


@ -0,0 +1,58 @@
---
- name: Validate required password is set
  ansible.builtin.assert:
    that:
      - valkey_admin_password is defined
      - valkey_admin_password | length >= 12
    fail_msg: |
      valkey_admin_password is required (min 12 chars).
      See roles/valkey/defaults/main.yml for configuration instructions.
    success_msg: "Password validation passed"

- name: Configure kernel tuning for Valkey
  ansible.builtin.import_tasks: kernel-tuning.yml

- name: Load OS-specific variables
  ansible.builtin.include_vars: "{{ item }}"
  with_first_found:
    - "{{ ansible_facts['os_family'] }}.yml"
    - debian.yml

- name: Install Valkey
  ansible.builtin.package:
    name: "{{ valkey_package }}"
    state: present

- name: Deploy Valkey configuration
  ansible.builtin.template:
    src: valkey.conf.j2
    dest: "{{ valkey_config_file }}"
    owner: "{{ valkey_user }}"
    group: "{{ valkey_group }}"
    mode: "0640"
  notify: Restart Valkey

- name: Deploy Valkey ACL file
  ansible.builtin.template:
    src: users.acl.j2
    dest: "{{ valkey_acl_file }}"
    owner: "{{ valkey_user }}"
    group: "{{ valkey_group }}"
    mode: "0640"
  notify: Restart Valkey

- name: Enable and start Valkey service
  ansible.builtin.systemd:
    name: "{{ valkey_service_name }}"
    enabled: true
    state: started

- name: Setup firewall rules for Valkey
  community.general.ufw:
    rule: allow
    src: "{{ item }}"
    port: "{{ valkey_port }}"
    proto: tcp
    direction: in
    comment: "Valkey"
  loop: "{{ valkey_firewall_allowed_sources }}"


@ -0,0 +1,4 @@
user default on >{{ valkey_admin_password }} ~* &* +@all
{% for acl_user in valkey_acl_users %}
user {{ acl_user.username }} on >{{ acl_user.password }} {% for pattern in acl_user.keypattern.split() %}~{{ pattern }} {% endfor %}{{ acl_user.commands | default('+@all -@dangerous') }}
{% endfor %}


@ -0,0 +1,33 @@
# Valkey configuration managed by Ansible
# Bind to localhost only (security)
bind {{ valkey_bind }}
# Port
port {{ valkey_port }}
# Data directory
dir {{ valkey_dir }}
# Log level
loglevel {{ valkey_loglevel }}
# Memory management
maxmemory {{ valkey_maxmemory }}
maxmemory-policy {{ valkey_maxmemory_policy }}
# Persistence
save 900 1
save 300 10
save 60 10000
# Security
protected-mode yes
# ACL configuration
# Use ACL file for user management (modern approach)
aclfile {{ valkey_acl_file }}
# Daemon mode
daemonize no
supervised systemd


@ -0,0 +1,8 @@
---
# Arch Linux uses Valkey (Redis fork) since April 2025
valkey_package: valkey
valkey_service_name: valkey
valkey_config_file: /etc/valkey/valkey.conf
valkey_user: valkey
valkey_group: valkey
valkey_dir: /var/lib/valkey


@ -0,0 +1,8 @@
---
# Debian/Ubuntu install Valkey from the distribution repositories (Universe on Ubuntu)
valkey_package: valkey-server
valkey_service_name: valkey-server
valkey_config_file: /etc/valkey/valkey.conf
valkey_user: valkey
valkey_group: valkey
valkey_dir: /var/lib/valkey