565 lines
16 KiB
Markdown
565 lines
16 KiB
Markdown
# Redis/Valkey Role
|
|
|
|
This Ansible role installs and configures Valkey (a Redis fork) for local use only. It provides a shared Redis-compatible instance that multiple services can use as a cache or message broker.
|
|
|
|
The role also performs required kernel tuning for optimal Valkey performance.
|
|
|
|
## About Valkey
|
|
|
|
Valkey is a high-performance key/value datastore and a drop-in replacement for Redis. It was created as a community-driven fork after Redis changed its license from BSD to proprietary licenses (RSALv2 and SSPLv1) in March 2024.
|
|
|
|
**Key points:**
|
|
- Valkey is 100% API-compatible with Redis
|
|
- Backed by the Linux Foundation
|
|
- Uses permissive open-source license (BSD 3-Clause)
|
|
- No code changes needed in your applications
|
|
- Same commands, same protocol, same performance
|
|
|
|
**Distribution support:**
|
|
- **Arch Linux**: Installs Valkey (redis package replaced in April 2024)
|
|
- **Debian/Ubuntu**: Installs Valkey from official repositories
|
|
|
|
## Features
|
|
|
|
- Installs Redis/Valkey
|
|
- Local-only access (localhost)
|
|
- Configurable memory limits and eviction policies
|
|
- Persistence enabled
|
|
- Systemd integration
|
|
- Automatic kernel tuning (memory overcommit, THP)
|
|
- ACL-based user authentication
|
|
- Firewall configuration (UFW)
|
|
|
|
## Requirements
|
|
|
|
- Systemd-based Linux distribution
|
|
- Root/sudo access
|
|
- `ansible.posix` collection (for sysctl module)
|
|
- `community.general` collection (for ufw module)
|
|
|
|
## Role Variables
|
|
|
|
Available variables with defaults (see `defaults/main.yml`):
|
|
|
|
```yaml
|
|
# Bind address (localhost only for security)
|
|
valkey_bind: 127.0.0.1
|
|
|
|
# Port
|
|
valkey_port: 6379
|
|
|
|
# Authentication (REQUIRED - must be set explicitly)
|
|
# valkey_admin_password: "" # Intentionally undefined - role will fail if not set
|
|
|
|
# ACL users (services register their users here)
|
|
valkey_acl_users: []
|
|
# Example:
|
|
# valkey_acl_users:
|
|
# - username: immich
|
|
# password: "secretpassword"
|
|
# keypattern: "immich_bull* immich_channel*" # Space-separated patterns (template converts to ~pattern1 ~pattern2)
|
|
# commands: "&* -@dangerous +@read +@write +@pubsub +select +auth +ping +info +eval +evalsha"
|
|
|
|
# Max memory (0 = unlimited)
|
|
valkey_maxmemory: 256mb
|
|
|
|
# Eviction policy when max memory is reached
|
|
valkey_maxmemory_policy: allkeys-lru
|
|
|
|
# Data directory
|
|
valkey_dir: /var/lib/valkey
|
|
|
|
# ACL file location
|
|
valkey_acl_file: /etc/valkey/users.acl
|
|
|
|
# Log level
|
|
valkey_loglevel: notice
|
|
```
|
|
|
|
**Security Note:** This role uses ACL-based authentication. You must set `valkey_admin_password` and configure service users via `valkey_acl_users`.
|
|
|
|
**System Requirements:** This role automatically config
|
|
|
|
## Dependencies
|
|
|
|
None.
|
|
|
|
## Example Playbook
|
|
|
|
```yaml
|
|
---
|
|
- hosts: servers
|
|
become: true
|
|
roles:
|
|
- role: valkey
|
|
- role: immich # Will connect to system Valkey
|
|
```
|
|
|
|
### Custom Configuration with ACL Users
|
|
|
|
```yaml
|
|
---
|
|
- hosts: servers
|
|
become: true
|
|
roles:
|
|
- role: valkey
|
|
vars:
|
|
valkey_admin_password: "{{ vault_valkey_password }}"
|
|
valkey_maxmemory: 512mb
|
|
valkey_maxmemory_policy: volatile-lru
|
|
valkey_acl_users:
|
|
- username: immich
|
|
password: "{{ immich_valkey_password }}"
|
|
keypattern: "immich_bull* immich_channel*"
|
|
commands: "&* -@dangerous +@read +@write +@pubsub +select +auth +ping +info +eval +evalsha"
|
|
- username: nextcloud
|
|
password: "{{ nextcloud_valkey_password }}"
|
|
keypattern: "nextcloud*"
|
|
commands: "+@read +@write -@dangerous +auth +ping +info"
|
|
```
|
|
|
|
## How Services Connect
|
|
|
|
Services running on the same host can connect to Valkey at:
|
|
- **Host**: `localhost` or `127.0.0.1`
|
|
- **Port**: `6379` (default)
|
|
|
|
### From Containers
|
|
|
|
Containers need special handling to reach the host's Valkey:
|
|
|
|
**Use `host.containers.internal`:**
|
|
```yaml
|
|
REDIS_HOSTNAME: host.containers.internal
|
|
REDIS_PORT: 6379
|
|
```
|
|
|
|
This special DNS name resolves to the host machine from inside containers.
|
|
|
|
**Note:** Environment variables often still use `REDIS_*` naming for compatibility, since Valkey is API-compatible with Redis.
|
|
|
|
## Security
|
|
|
|
- **Local-only**: Valkey binds to `127.0.0.1` only (configurable for container access)
|
|
- **Protected mode**: Enabled
|
|
- **ACL authentication**: Each service gets its own user with restricted permissions
|
|
- **No remote access**: Cannot be reached from network by default
|
|
|
|
### ACL-Based Authentication
|
|
|
|
This role uses Valkey's ACL (Access Control List) system for fine-grained security. Each service gets:
|
|
- **Dedicated credentials**: Unique username and password
|
|
- **Key pattern restrictions**: Can only access specific key patterns
|
|
- **Command restrictions**: Limited to required commands only
|
|
- **Defense-in-depth**: Multiple layers of isolation
|
|
|
|
### Configuring ACL Users
|
|
|
|
Define ACL users in your inventory or host_vars:
|
|
|
|
```yaml
|
|
# inventory/host_vars/yourserver.yml
|
|
valkey_admin_password: "your-strong-admin-password"
|
|
|
|
valkey_acl_users:
|
|
- username: immich
|
|
password: "{{ immich_valkey_password }}"
|
|
keypattern: "immich_bull* immich_channel*"
|
|
commands: "&* -@dangerous +@read +@write +@pubsub +select +auth +ping +info +eval +evalsha"
|
|
|
|
- username: nextcloud
|
|
password: "{{ nextcloud_valkey_password }}"
|
|
keypattern: "nextcloud*"
|
|
commands: "+@read +@write -@dangerous +auth +ping +info"
|
|
|
|
- username: gitea
|
|
password: "{{ gitea_valkey_password }}"
|
|
keypattern: "gitea*"
|
|
commands: "+@read +@write -@dangerous +auth +ping +info +select"
|
|
```
|
|
|
|
### ACL Configuration Guide
|
|
|
|
**Key Pattern (`keypattern`):**
|
|
- Single pattern: `"myservice*"` - matches keys starting with `myservice`
|
|
- Multiple patterns: `"pattern1* pattern2*"` - space-separated (automatically converted to `~pattern1* ~pattern2*` in ACL file)
|
|
- All keys: `"*"` - not recommended for security
|
|
|
|
**Note:** In the inventory, specify patterns as space-separated strings. The Ansible template automatically adds the `~` prefix to each pattern when generating the ACL file.
|
|
|
|
### Kernel Tuning
|
|
|
|
The role automatically configures kernel parameters required by Valkey (see `tasks/kernel-tuning.yml`):
|
|
|
|
**1. Memory Overcommit:**
|
|
```
|
|
vm.overcommit_memory = 1
|
|
```
|
|
- Required for background saves and replication
|
|
- Configured via `/etc/sysctl.conf`
|
|
- Applied immediately and persists across reboots
|
|
|
|
**2. Transparent Huge Pages (THP):**
|
|
```
|
|
transparent_hugepage=madvise
|
|
```
|
|
- Reduces latency and memory usage issues
|
|
- Safely appended to existing GRUB kernel parameters (does not overwrite)
|
|
- Only adds parameter if `transparent_hugepage=` is not already present
|
|
- Applied at runtime immediately via `/sys/kernel/mm/transparent_hugepage/enabled`
|
|
- Persists across reboots via `/etc/default/grub`
|
|
- Automatically detects and uses `update-grub` (Debian) or `grub-mkconfig` (Arch)
|
|
|
|
These settings are required to eliminate Valkey startup warnings and ensure optimal performance.
|
|
|
|
**Note:** The role preserves existing GRUB parameters. If you have `GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet"`, it will become `GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet transparent_hugepage=madvise"`.
|
|
|
|
**Commands (`commands`):**
|
|
- `&*` - Allow all pub/sub channels (required for job queues like BullMQ)
|
|
- `+allchannels` - Alternative to `&*`
|
|
- `+@read` - Allow all read commands (GET, MGET, etc.)
|
|
- `+@write` - Allow all write commands (SET, DEL, etc.)
|
|
- `+@pubsub` - Allow pub/sub commands (SUBSCRIBE, PUBLISH, etc.)
|
|
- `-@dangerous` - Deny dangerous commands (FLUSHDB, FLUSHALL, KEYS, CONFIG, etc.)
|
|
- `+commandname` - Allow specific command (e.g., `+select`, `+auth`, `+ping`)
|
|
- `-commandname` - Deny specific command
|
|
|
|
**Common Command Sets:**
|
|
|
|
| Service Type | Recommended Commands |
|
|
|-------------|---------------------|
|
|
| **Simple cache** | `+@read +@write -@dangerous +auth +ping +info` |
|
|
| **Session store** | `+@read +@write -@dangerous +auth +ping +info +select` |
|
|
| **Job queue (BullMQ)** | `&* -@dangerous +@read +@write +@pubsub +select +auth +ping +info +eval +evalsha` |
|
|
| **Pub/sub** | `+@pubsub +@read +@write -@dangerous +auth +ping +info` |
|
|
|
|
**Security Best Practices:**
|
|
- Always include `-@dangerous` to prevent accidental data loss
|
|
- Use specific key patterns to isolate services
|
|
- Only grant `+eval` and `+evalsha` if required (job queues)
|
|
- Only grant `&*` or `+allchannels` if using pub/sub
|
|
- Use unique passwords for each service
|
|
|
|
### Setting Secure Passwords
|
|
|
|
Use Ansible Vault to encrypt all passwords:
|
|
|
|
```bash
|
|
# Admin password
|
|
ansible-vault encrypt_string 'your-strong-admin-password' --name 'valkey_admin_password'
|
|
|
|
# Service passwords
|
|
ansible-vault encrypt_string 'immich-password' --name 'immich_valkey_password'
|
|
ansible-vault encrypt_string 'nextcloud-password' --name 'nextcloud_valkey_password'
|
|
```
|
|
|
|
Add encrypted values to your inventory:
|
|
|
|
```yaml
|
|
valkey_admin_password: !vault |
|
|
$ANSIBLE_VAULT;1.1;AES256
|
|
...
|
|
|
|
immich_valkey_password: !vault |
|
|
$ANSIBLE_VAULT;1.1;AES256
|
|
...
|
|
```
|
|
|
|
## Service Management
|
|
|
|
```bash
|
|
# Check status
|
|
systemctl status redis # Debian/Ubuntu
|
|
systemctl status valkey # Arch Linux
|
|
|
|
# Restart service
|
|
systemctl restart redis # Debian/Ubuntu
|
|
systemctl restart valkey # Arch Linux
|
|
|
|
# View logs
|
|
journalctl -u redis -f # Debian/Ubuntu
|
|
journalctl -u valkey -f # Arch Linux
|
|
|
|
# Connect to CLI (both systems have redis-cli compatibility)
|
|
redis-cli # Works on both
|
|
valkey-cli # Also available on Arch Linux
|
|
```
|
|
|
|
## Persistence
|
|
|
|
Valkey is configured with RDB persistence:
|
|
- Save after 900 seconds if at least 1 key changed
|
|
- Save after 300 seconds if at least 10 keys changed
|
|
- Save after 60 seconds if at least 10000 keys changed
|
|
|
|
Data is stored in `{{ valkey_dir }}` (default: `/var/lib/valkey`)
|
|
|
|
## Memory Management
|
|
|
|
When `valkey_maxmemory` is reached, Valkey will behave based on `valkey_maxmemory_policy`:
|
|
|
|
- `noeviction`: Return errors when memory limit is reached (default, recommended for BullMQ/job queues)
|
|
- `allkeys-lru`: Evict least recently used keys (good for pure caching)
|
|
- `volatile-lru`: Evict LRU keys with TTL set
|
|
- `allkeys-random`: Evict random keys
|
|
- `volatile-random`: Evict random keys with TTL
|
|
|
|
**Important for Immich and BullMQ:**
|
|
Services using BullMQ for job queues (like Immich) require `noeviction` policy. Evicting job queue data can cause:
|
|
- Lost background tasks
|
|
- Failed job processing
|
|
- Data corruption
|
|
|
|
Only use eviction policies (`allkeys-lru`, etc.) for pure caching use cases where data loss is acceptable.
|
|
|
|
## Monitoring
|
|
|
|
Check Valkey info (authenticate as admin):
|
|
```bash
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
INFO
|
|
INFO memory
|
|
INFO stats
|
|
```
|
|
|
|
Check connected clients:
|
|
```bash
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
CLIENT LIST
|
|
```
|
|
|
|
View ACL configuration:
|
|
```bash
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
ACL LIST # List all users
|
|
ACL GETUSER immich # View specific user permissions
|
|
ACL GETUSER default # View admin user
|
|
ACL CAT # List all command categories
|
|
```
|
|
|
|
Check generated ACL file:
|
|
```bash
|
|
cat /etc/valkey/users.acl
|
|
# Example output:
|
|
# user default on >password ~* &* +@all
|
|
# user immich on >password ~immich_bull* ~immich_channel* &* -@dangerous +@read ...
|
|
# Note: Multiple patterns appear as separate ~pattern entries
|
|
```
|
|
|
|
## Troubleshooting
|
|
|
|
### Check if Valkey is running
|
|
```bash
|
|
systemctl status valkey # Arch Linux
|
|
systemctl status valkey-server # Debian/Ubuntu
|
|
```
|
|
|
|
### Test admin connection
|
|
```bash
|
|
# With authentication (default user)
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
PING
|
|
# Should return: PONG
|
|
```
|
|
|
|
### Test service user connection
|
|
```bash
|
|
# Test Immich user
|
|
redis-cli
|
|
AUTH immich <immich_valkey_password>
|
|
SELECT 0
|
|
PING
|
|
# Should return: PONG
|
|
|
|
# Try restricted command (should fail)
|
|
FLUSHDB
|
|
# Should return: (error) NOPERM This user has no permissions to run the 'flushdb' command
|
|
```
|
|
|
|
### View ACL configuration
|
|
```bash
|
|
# Check ACL file
|
|
cat /etc/valkey/users.acl
|
|
|
|
# Check runtime ACL
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
ACL LIST
|
|
ACL GETUSER immich
|
|
```
|
|
|
|
### Debug permission issues
|
|
```bash
|
|
# Monitor all commands (useful for debugging)
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
MONITOR
|
|
# In another terminal, run your application
|
|
# You'll see all commands being executed
|
|
```
|
|
|
|
### View configuration
|
|
```bash
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
CONFIG GET "*"
|
|
```
|
|
|
|
### Check memory usage
|
|
```bash
|
|
redis-cli
|
|
AUTH default <valkey_admin_password>
|
|
INFO memory
|
|
```
|
|
|
|
### Common ACL Errors
|
|
|
|
**"NOAUTH Authentication required"**
|
|
- Client didn't authenticate
|
|
- Service needs to set `REDIS_USERNAME` and `REDIS_PASSWORD`
|
|
|
|
**"WRONGPASS invalid username-password pair"**
|
|
- Incorrect username or password
|
|
- Verify ACL user exists: `ACL GETUSER username`
|
|
- Check password in inventory matches service configuration
|
|
|
|
**"NOPERM No permissions to run the 'command' command"**
|
|
- Command not allowed in ACL
|
|
- Check ACL: `ACL GETUSER username`
|
|
- Add required command to `commands:` in inventory
|
|
|
|
**"NOPERM No permissions to access a key"**
|
|
- Key doesn't match allowed patterns
|
|
- Check key pattern: `ACL GETUSER username`
|
|
- Verify service is using correct key prefix
|
|
|
|
**"NOPERM No permissions to access a channel"**
|
|
- Pub/sub channel not allowed
|
|
- Add `&*` or `+allchannels` to ACL commands
|
|
- Required for BullMQ and other job queues
|
|
|
|
## Performance Tuning
|
|
|
|
For high-traffic services, consider:
|
|
|
|
```yaml
|
|
valkey_maxmemory: 1gb # Increase memory limit
|
|
valkey_maxmemory_policy: noeviction # No eviction (for job queues)
|
|
# Or for pure caching:
|
|
# valkey_maxmemory_policy: allkeys-lru # LRU eviction
|
|
```
|
|
|
|
**Kernel Tuning (automatically configured):**
|
|
The role automatically sets optimal kernel parameters:
|
|
- Memory overcommit enabled (`vm.overcommit_memory=1`)
|
|
- Transparent Huge Pages set to `madvise`
|
|
|
|
To verify kernel settings:
|
|
```bash
|
|
# Check memory overcommit
|
|
sysctl vm.overcommit_memory
|
|
# Should show: vm.overcommit_memory = 1
|
|
|
|
# Check THP status
|
|
cat /sys/kernel/mm/transparent_hugepage/enabled
|
|
# Should show: always [madvise] never
|
|
```
|
|
|
|
## License
|
|
|
|
MIT
|
|
|
|
## Author Information
|
|
|
|
Created for managing shared Valkey instances in NAS/homelab environments.
|
|
|
|
## Multi-Layer Isolation Strategy
|
|
|
|
This role implements **defense-in-depth** with three isolation layers:
|
|
|
|
### 1. ACL Users (Primary Isolation)
|
|
Each service gets its own user with restricted permissions:
|
|
- Unique credentials
|
|
- Key pattern restrictions
|
|
- Command restrictions
|
|
|
|
### 2. Database Numbers (Secondary Isolation)
|
|
Valkey provides 16 logical databases (0-15) for additional isolation:
|
|
|
|
| Service | Database | Key Pattern | ACL User |
|
|
|---------|----------|-------------|----------|
|
|
| Immich | 0 | `immich_bull*` `immich_channel*` | `immich` |
|
|
| Nextcloud | 1 | `nextcloud*` | `nextcloud` |
|
|
| Gitea | 2 | `gitea*` | `gitea` |
|
|
| Grafana | 3 | `grafana*` | `grafana` |
|
|
| Custom | 4-15 | Custom | Custom |
|
|
|
|
### 3. Key Prefixes (Tertiary Isolation)
|
|
Services use unique key prefixes enforced by ACL patterns.
|
|
|
|
### Testing Isolation
|
|
|
|
```bash
|
|
# Test as Immich user (database 0)
|
|
redis-cli
|
|
AUTH immich <immich_valkey_password>
|
|
SELECT 0
|
|
SET immich_bull_test "data"
|
|
# Success
|
|
|
|
# Try to access other service's keys (should fail)
|
|
GET nextcloud_test
|
|
# Success (key doesn't exist, not a permission error)
|
|
# But ACL prevents SET on non-matching patterns:
|
|
SET nextcloud_test "data"
|
|
# Error: NOPERM No permissions to access a key
|
|
|
|
# Try dangerous command (should fail)
|
|
FLUSHDB
|
|
# Error: NOPERM This user has no permissions to run the 'flushdb' command
|
|
```
|
|
|
|
### Complete Example Configuration
|
|
|
|
```yaml
|
|
# inventory/host_vars/myserver.yml
|
|
valkey_admin_password: "{{ vault_valkey_admin_password }}"
|
|
|
|
valkey_acl_users:
|
|
# Immich - Photo management (needs BullMQ job queue)
|
|
- username: immich
|
|
password: "{{ vault_immich_valkey_password }}"
|
|
keypattern: "immich_bull* immich_channel*"
|
|
commands: "&* -@dangerous +@read +@write +@pubsub +select +auth +ping +info +eval +evalsha"
|
|
|
|
# Nextcloud - Simple caching
|
|
- username: nextcloud
|
|
password: "{{ vault_nextcloud_valkey_password }}"
|
|
keypattern: "nextcloud*"
|
|
commands: "+@read +@write -@dangerous +auth +ping +info +select"
|
|
|
|
# Gitea - Session store
|
|
- username: gitea
|
|
password: "{{ vault_gitea_valkey_password }}"
|
|
keypattern: "gitea*"
|
|
commands: "+@read +@write -@dangerous +auth +ping +info +select"
|
|
|
|
# Service variables
|
|
immich_valkey_db: 0
|
|
nextcloud_valkey_db: 1
|
|
gitea_valkey_db: 2
|
|
```
|
|
|
|
### Best Practices
|
|
|
|
- **ACL first**: Always use ACL users with key pattern restrictions
|
|
- **Database numbers**: Use for additional logical separation
|
|
- **Key prefixes**: Enforce via ACL patterns, not trust
|
|
- **Document**: Keep a table of service assignments
|
|
- **Testing**: Reserve database 15 for testing/debugging
|
|
- **Monitor**: Use `MONITOR` to verify services stay within their patterns
|