User activity data in a static web application hosted on Azure Kubernetes Service (AKS) needed fast, persistent database reads/writes.
The original 2023 solution used open-source Redis v6.x with RedisGears for write-behind to a Snowflake database. Though the configuration worked, AKS security constraints made the implementation complicated, fragile, and difficult to maintain. After Redis dropped its open-source license in 2024, security patches could no longer be kept current. Support for RedisGears dwindled, and by mid-2025 a Snowflake table autoincrement issue caused the write-behind process to fail entirely.
When open-source Redis returned in 2025, I saw an opportunity to modernize, streamline, and simplify all parts of the solution using Redis OSS v8.x. My approach used a Stream rather than a String data type, made write-behind configurable, and automatically restored data from Snowflake to Redis after a persistent volume failure.
Preserved user data during the migration by writing utilities that populate Snowflake tables from Redis dump files. Minimized downtime by loading the missing historical data into the database in advance of the migration. Uploaded 10k records from Redis to Snowflake in under 6 minutes with code that leveraged temporary tables and followed bulk-loading practices.
Wrote tests and a concise README that my future self and other team members could read in minutes.
Redis Stream to Snowflake
Instantiate a Redis OSS cache and asynchronously move user activity from Redis to Snowflake tables, all from a single Kubernetes pod with two containers:
- Redis (main - stores data in stream).
- Consumer (sidecar - reads stream and writes rows to Snowflake).
Plain Flow
- UI or API sends a JSON batch with XADD to userdata-stream.
- Redis stores each batch as a stream entry.
- Consumer reads entries, builds MERGE / UPDATE SQL, writes to Snowflake.
- Freshness window skips stale records unless overridden.
- Optional restore seeds Redis from Snowflake when empty.
Terminology
| Term | Meaning |
|---|---|
| Visited | View / open action |
| Saved | Bookmark / saved search |
| Deleted | Saved turned off (soft delete) |
| Fresh | Inside freshness window |
| Stale | Outside freshness window (skip unless override) |
Stream Shapes
Create User:
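The original payload example was not preserved here; the sketch below is a hypothetical shape only. It assumes a create-user message carries the user identity under the documented key prefix, and that the JSON sits in a field the consumer inspects (value, payload, or data per FALLBACK_JSON_FIELDS); the email and username field names are illustrative, not confirmed.

```bash
# Hypothetical create-user message; field names inside records are illustrative.
redis-cli XADD userdata-stream '*' data \
  '{"key":"userdata-events:jane.doe@example.com","records":[{"email":"jane.doe@example.com","username":"jane.doe"}]}'
```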
Track Visited / Saved batch:
Visited items use key segment /visited and action v. Saved uses key segment /saved and action s. Delete uses action d, and the database UPDATE sets saved=0.
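Putting those rules together, a hedged sketch of a tracked batch (the key format and the key, records, action, track_id, and lastplus names come from this document; any other field names are assumptions):

```bash
# Illustrative visited batch: key segment /visited, action v.
redis-cli XADD userdata-stream '*' data \
  '{"key":"userdata-events:jane.doe@example.com/visited","records":[{"track_id":"1234","title":"Some Item","url":"https://example.com/item","action":"v","lastplus":"2025-08-01 14:05:00"}]}'

# Illustrative saved entry: key segment /saved, action s (action d would soft-delete, setting saved=0).
redis-cli XADD userdata-stream '*' data \
  '{"key":"userdata-events:jane.doe@example.com/saved","records":[{"track_id":"1234","action":"s","lastplus":"2025-08-01 14:06:00"}]}'
```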
If track_id is blank and HASH_NULL_TRACK_ID=1, the consumer builds a stable auto_<md5prefix> from title|url|id. If disabled, it leaves an empty string.
Key Rules
| Part | Rule |
|---|---|
| Key path | /visited or /saved decides logic |
| Page suffix | If KEY_APPEND_PAGE=1, visited restore keys include /visited/<page> |
| Action | v visited, s saved, d delete |
| Timestamp | lastplus, UTC YYYY-MM-DD HH:MM:SS |
| Stale | Outside freshness window (skip unless override) |
Environment Variables
| Var | Default | Purpose |
|---|---|---|
| ACK_INVALID_JSON | 0 | Ack invalid JSON messages |
| ALLOW_NO_SNOWFLAKE | 0 | Force dry run if Snowflake secrets incomplete |
| ALWAYS_PROCESS_STALE_FIRST | 0 | Process first stale record in a batch |
| AUTO_QUOTE_LASTPLUS | 0 | Quote bare lastplus tokens / datetimes |
| CONFIG_MAX_ATTEMPTS | 12 | Snowflake config retry attempts |
| CONFIG_SLEEP_SECONDS | 5 | Sleep between config retries |
| CONSUMER | consumer-1 | Redis consumer name |
| DRY_RUN | 0 | Log SQL only, no execute |
| ENABLE_FALLBACK_RECORD_LIST | 1 | Treat pure list of tracked dicts as batch |
| ENABLE_FALLBACK_SINGLE_RECORD | 1 | Treat lone tracked dict as single-record batch |
| FALLBACK_JSON_FIELDS | value,payload,data | Additional JSON field names to inspect |
| GROUP | snowflake-writer | Redis consumer group |
| HASH_NULL_TRACK_ID | 0 | Hash blank track_id into auto_<hash> |
| KEY_APPEND_PAGE | 1 | Append page segment to visited restore keys |
| KEY_PREFIX | userdata-events | Prefix for stream batch keys |
| LENIENT_JSON_REPAIR | 0 | Attempt brace/bracket repair on truncated JSON |
| LIST_STREAM_KEYS_ON_START | 0 | List existing stream keys at startup |
| LOG_LEVEL | INFO | Logging level |
| LOG_RAW | 0 | Log raw Redis messages |
| LOG_SQL | 0 | Log SQL statements |
| LOG_TRACKED_EACH | 0 | Log every tracked record execution |
| LOG_TRACKED_HASH_GEN | 0 | Log generated hash track_ids |
| MAX_JSON_REPAIR_APPEND | 10 | Max appended chars during repair |
| METRICS_INTERVAL_SECONDS | 60 | Metrics logging interval |
| NEWER_RECORD_TIME_LIMIT_SECONDS | 30 | Freshness window seconds |
| PRESERVE_TRACK_TYPE_CASE | 0 | Keep original case of track_type |
| PROCESS_STALE_SAVED | 1 | Allow stale saved records |
| REDIS_DB | 0 | Redis DB index |
| REDIS_HOST | localhost | Redis host |
| REDIS_PORT | 6379 | Redis port |
| REDIS_STREAM | userdata-stream | Stream name |
| RESTORE_BATCH_SIZE | 100 | Records per emitted restore message |
| RESTORE_DAYS | 7 | Look-back days for restore |
| RESTORE_FORCE | 0 | Force restore ignoring length/marker |
| RESTORE_INCLUDE_SAVED | 1 | Include saved rows in restore |
| RESTORE_INCLUDE_VISITED | 1 | Include visited rows in restore |
| RESTORE_LIMIT_TOTAL | 2000 | Max rows queried/emitted during restore |
| RESTORE_LOG_SAMPLE | 5 | Sample count logged (size only) |
| RESTORE_MARK_KEY | userdata_stream_restored | Redis marker key to prevent repeat |
| RESTORE_MIN_ROWS_TRIGGER | 0 | Skip restore if rows < threshold (unless force) |
| RESTORE_ON_EMPTY | 1 | Enable restore logic when stream empty |
| SKIP_FRESHNESS | 0 | Ignore freshness window (process all) |
| SNOWFLAKE_AUTO_RECONNECT | 1 | Reconnect on token expiration |
| SNOWFLAKE_PING_INTERVAL_SECONDS | 300 | Ping interval for Snowflake keep-alive |
| STREAM | userdata-stream | Alias for REDIS_STREAM (if used) |
| TARGET_LOG_IDS | (blank) | IDs that force detailed logging |
| TRACE_SQL | 0 | Extra SQL trace lines |
| UNIT_TEST | 0 | Use in-memory Redis stub |
| USERNAME_SPLIT_HYPHEN | 1 | Trim at first hyphen when deriving username |
| USERNAME_USE_RECORD_ID | 0 | Use record id as username for batches |
| VERBOSE_SNOWFLAKE_TEST | 0 | Log Snowflake context on connect |
| VERIFY_TRACKED_AFTER_WRITE | 0 | Run SELECT after write for sample rows |
| VERIFY_TRACKED_LIMIT | 20 | Max verify selects |
Restore Logic Summary
Runs once when:
- RESTORE_ON_EMPTY=1
- Stream length = 0 (unless RESTORE_FORCE=1)
- Marker key absent (unless force)
Fetches recent rows, groups them by user + kind, and emits batches, respecting KEY_APPEND_PAGE for visited.
Freshness
Records older than NEWER_RECORD_TIME_LIMIT_SECONDS are skipped unless:
- Saved and PROCESS_STALE_SAVED=1
- First stale record in the batch and ALWAYS_PROCESS_STALE_FIRST=1

SKIP_FRESHNESS=1 disables the skip entirely (all records processed).
Track ID
An empty track_id is stored as an empty string. If HASH_NULL_TRACK_ID=1, a computed auto_<12 hex> prefix ensures stable grouping.
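A minimal illustration of that fallback id, assuming the md5 is taken over title|url|id as described above (the exact separator and newline handling in the consumer are assumptions):

```bash
# Derive an auto_<12 hex> id for a record with a blank track_id.
printf '%s' 'Some Item|https://example.com/item|1234' \
  | md5sum | cut -c1-12 | sed 's/^/auto_/'
# -> auto_<first 12 hex chars of the md5 digest>
```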
Quick Dry Run
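A minimal sketch of a dry run; the consumer.py entrypoint is an assumption, so substitute whatever command your consumer image actually runs:

```bash
export DRY_RUN=1          # log SQL only, never execute against Snowflake
export LOG_SQL=1          # show the MERGE / UPDATE statements
export LOG_TRACKED_EACH=1 # log every tracked record execution
export REDIS_HOST=localhost
python consumer.py
```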
Look for RECORD_EXEC and optionally auto_ hash lines.
Unit Tests
Tests cover:
- JSON field select.
- Parse and repair.
- Username derive.
- Fresh vs stale skip.
- Stale saved override.
- Null track id hashing.
- Import of string keys.
- Failover path.
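One way to run them, assuming a standard pytest layout (the test runner is an assumption; adjust to the repo's tooling):

```bash
# UNIT_TEST=1 makes the consumer use the in-memory Redis stub instead of a live server.
UNIT_TEST=1 python -m pytest -q
```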
Common Checks
| Task | Command | Expect |
|---|---|---|
| Ping Redis | redis-cli PING | PONG |
| Group info | redis-cli XINFO GROUPS userdata-stream | Group row |
| Last 1 entry | redis-cli XREVRANGE userdata-stream + - COUNT 1 | Array |
| Snowflake write | Look for SQL (TRACKED) in logs | MERGE lines |
| Failover event | Search FAILOVER in logs | Failover lines |
Troubleshooting Short List
| Issue | Fix |
|---|---|
| No stream growth | Check UI or bridge for XADD. |
| Consumer silent | Confirm REDIS_HOST and network. |
| Null track_id rows missing | Set HASH_NULL_TRACK_ID=1. |
| Saved queries skipped | Set PROCESS_STALE_SAVED=1 or fix timestamps. |
| Import not running | Set EXTRA_BRIDGE_KEYS before start. |
| Bad JSON repeats | Set ACK_INVALID_JSON=1 or fix source. |
| Failover noisy | Set STREAM_FAILOVER_TO_SNOWFLAKE=0. |
Minimal Deployment Env List
Apply these (example) settings along with Snowflake secrets:
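The exact list is environment-specific; below is a minimal sketch using the defaults documented above, shown as shell exports. In AKS these would typically live in the deployment's env block or a ConfigMap, with Snowflake credentials in a Secret. The REDIS_HOST value is an assumption (service/container name).

```bash
export REDIS_HOST=redis                     # assumption: Redis container/service name
export REDIS_PORT=6379
export REDIS_STREAM=userdata-stream
export GROUP=snowflake-writer
export CONSUMER=consumer-1
export NEWER_RECORD_TIME_LIMIT_SECONDS=30
export PROCESS_STALE_SAVED=1
export RESTORE_ON_EMPTY=1
export HASH_NULL_TRACK_ID=1
export LOG_LEVEL=INFO
```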
Snowflake Verification
Query sample:
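A hedged sample, assuming the snowsql CLI is already configured with the right connection, and that TRACKED carries LASTPLUS and SAVED columns (column names are inferred from the field names above, not confirmed):

```bash
# Most recent tracked rows, newest first.
snowsql -q "SELECT * FROM TRACKED ORDER BY LASTPLUS DESC LIMIT 10;"
```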
If no rows: check consumer logs for skip or failover markers.
Safe Manual Test in Pod
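A sketch of a harmless end-to-end check. The pod name is a placeholder to substitute; data is one of the field names the consumer inspects via FALLBACK_JSON_FIELDS, and the empty records array is assumed to be acknowledged without writing anything to Snowflake.

```bash
# Write one test entry into the stream from the redis container, then watch the consumer logs.
kubectl exec -it <redis-pod> -c redis -- \
  redis-cli XADD userdata-stream '*' data '{"key":"userdata-events:test@example.com/visited","records":[]}'
kubectl logs <redis-pod> -c consumer --tail=50
```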
Clean Exit
Stop consumer with Ctrl+C. Redis keeps data.
Persistence
Keeping data across pod restarts; if the persistent volume itself is lost, the restore feature below reseeds the stream from Snowflake.
Redis writes data to files in /data inside the container. The Kubernetes Persistent Volume (PV) is mounted at that path.
Files:
- appendonly.aof (every write is appended; stream entries survive restarts)
- dump.rdb (snapshot saved based on save rules)
- redis.conf (minimum settings)
```
appendfilename appendonly.aof
appendfsync everysec
appendonly yes
dbfilename dump.rdb
dir /data
save 300 10
save 60 10000
save 900 1
```
- appendfsync everysec: flush the AOF once per second (small data-loss window if a crash happens).
- appendonly yes: log each write (XADD) to the AOF.
- dir /data: write files on the mounted volume.
- save xx xx etc: create a snapshot if at least 1 write in 900s, 10 writes in 300s, or 10000 writes in 60s.
Tests
Basic test:
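A minimal sketch, run inside the redis container:

```bash
# Add one entry, confirm the stream length increased, and check the files on the volume.
redis-cli XADD userdata-stream '*' data '{"records":[]}'
redis-cli XLEN userdata-stream
ls -l /data        # appendonly.aof (and eventually dump.rdb) should exist and grow
```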
Restart test (in cluster):
- Record current stream length: redis-cli XLEN userdata-stream.
- Delete the pod (deployment will recreate it).
- Run length again: redis-cli XLEN userdata-stream. It should match or be larger.
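The same test expressed as kubectl commands; the deployment name and pod label are assumptions to adapt to your manifests:

```bash
kubectl exec deploy/redis -c redis -- redis-cli XLEN userdata-stream   # note the length
kubectl delete pod -l app=redis                                        # deployment recreates the pod
kubectl exec deploy/redis -c redis -- redis-cli XLEN userdata-stream   # should match or exceed the first value
```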
If length resets to 0:
- PV not mounted in the redis container.
- Wrong path in redis.conf.
- Permissions blocked (check ownership of /data).
- Different DB index used (UI wrote DB 1, you checked DB 0). Test with redis-cli -n 1 XLEN userdata-stream.
To force a snapshot now (optional):
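```bash
redis-cli BGSAVE     # start a background RDB snapshot
redis-cli LASTSAVE   # unix timestamp of the last successful snapshot
```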
For near-zero loss, set appendfsync always (higher disk cost).
If persistence is not needed in some environments (local dev), you can keep the defaults and ignore the PV.
Restore (Snowflake -> Redis Stream on Empty)
Goal: After a pod restart with an empty /data volume, the Redis stream would be blank while Snowflake still has the authoritative history. The consumer can seed the stream so downstream tools relying on recent backlog have context.
Trigger: Runs once at startup when ALL are true:
- RESTORE_ON_EMPTY=1 (default 1)
- Stream length is 0
- Redis marker key (default _userdata_stream_restored) absent
Process:
- Query Snowflake TRACKED joined to USER for rows within the last RESTORE_DAYS days (default 7).
- Apply filters for saved / visited based on flags.
- Limit total rows (RESTORE_LIMIT_TOTAL).
- Group by user + kind (visited / saved) and emit batches of size RESTORE_BATCH_SIZE as normal stream messages ({"key":"userdata-events:<email>/<visited|saved>","records":[...]}).
- Set marker key to avoid re-running.
Environment flags:
| Var | Default | Description |
|---|---|---|
| KEY_APPEND_PAGE | 1 | Append page segment to visited keys (/visited/<page>). |
| RESTORE_BATCH_SIZE | 100 | Records per emitted message. |
| RESTORE_DAYS | 7 | Look-back window (days). |
| RESTORE_FORCE | 0 | Force restore even if stream not empty or marker exists. |
| RESTORE_INCLUDE_SAVED | 1 | Include saved rows. |
| RESTORE_INCLUDE_VISITED | 1 | Include visited rows. |
| RESTORE_LIMIT_TOTAL | 2000 | Hard cap on rows queried/emitted. |
| RESTORE_LOG_SAMPLE | 5 | Count of sample rows to mention in logs (size only; rows themselves not logged). |
| RESTORE_MARK_KEY | _userdata_stream_restored | Redis string key used as idempotent marker. |
| RESTORE_MIN_ROWS_TRIGGER | 0 | If >0, skip restore when candidate row count below threshold. |
| RESTORE_ON_EMPTY | 1 | Enable feature. |
Dry run mode (DRY_RUN=1) still performs the query but just logs the would-be XADD actions.
To disable completely set RESTORE_ON_EMPTY=0.
Operational note - restore seeds only the stream. It does NOT rewrite Snowflake (source of truth). If restore runs after a true data-loss event for Redis it simply replays recent history for observability/consumers needing backlog.
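If a deliberate re-run is ever needed, one hedged approach is to clear the idempotency marker and enable RESTORE_FORCE before restarting the consumer. The deployment name below is an assumption, and the marker key is whatever RESTORE_MARK_KEY is set to (default shown).

```bash
redis-cli DEL _userdata_stream_restored          # remove the idempotency marker
kubectl set env deploy/consumer RESTORE_FORCE=1  # restart the consumer with force enabled
```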
