User activity data in a static web application hosted on Azure Kubernetes Service (AKS) needed fast, persistent database reads and writes.

The original solution, built in 2023, used open-source Redis v6.x with RedisGears for write-behind to a Snowflake database. Though the configuration worked, AKS security constraints made the implementation complicated, fragile, and difficult to maintain. After Redis dropped its open-source license in 2024, security vulnerabilities could no longer be patched. Support for RedisGears dwindled, and by mid-2025, a Snowflake table autoincrement issue caused the write-behind process to fail entirely.

When open-source Redis returned in 2025, I saw an opportunity to modernize, streamline, and simplify all parts of the solution using Redis OSS v8.x. My approach used a Stream rather than a String data type, made write-behind configurable, and automatically restored data from Snowflake to Redis after a persistent volume failure.

Preserved user data during the migration by writing utilities to populate Snowflake tables from Redis dump files. Minimized downtime by loading the missing historical data into the database ahead of the migration. Uploaded 10k records from Redis to Snowflake in under 6 minutes with code that used temporary tables and followed bulk-loading practices.

Wrote tests and a concise README that my future self and other team members could read in minutes.

Redis Stream to Snowflake

Instantiate a Redis OSS cache and asynchronously move user activity from Redis to Snowflake tables using a Kubernetes pod with two containers:

  1. Redis (main - stores data in stream).
  2. Consumer (sidecar - reads stream and writes rows to Snowflake).

Plain Flow

  1. UI or API sends a JSON batch with XADD to userdata-stream.
  2. Redis stores each batch as a stream entry.
  3. Consumer reads entries, builds MERGE / UPDATE SQL, and writes to Snowflake (see the consumer sketch after this list).
  4. Freshness window skips stale records unless an override allows them.
  5. Optional restore seeds Redis from Snowflake when the stream is empty.
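
A minimal consumer-loop sketch, not the project's actual stream_consumer.py: it assumes redis-py, the stream/group/consumer names from the environment-variable table below, and a placeholder run_sql() standing in for the real MERGE / UPDATE step.

# Consumer-loop sketch (illustrative only; run_sql() is a stand-in for the
# real Snowflake MERGE / UPDATE logic).
import json
import os

import redis

STREAM = os.getenv("REDIS_STREAM", "userdata-stream")
GROUP = os.getenv("GROUP", "snowflake-writer")
CONSUMER = os.getenv("CONSUMER", "consumer-1")

r = redis.Redis(host=os.getenv("REDIS_HOST", "localhost"),
                port=int(os.getenv("REDIS_PORT", "6379")),
                db=int(os.getenv("REDIS_DB", "0")),
                decode_responses=True)

def run_sql(batch: dict) -> None:
    """Placeholder for the Snowflake MERGE / UPDATE step."""
    print("would write", len(batch.get("records", [])), "record(s) for", batch.get("key"))

def main() -> None:
    # Create the consumer group once; ignore the error if it already exists.
    try:
        r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
    except redis.exceptions.ResponseError:
        pass
    while True:
        # Block up to 5 s waiting for new entries addressed to this consumer.
        resp = r.xreadgroup(GROUP, CONSUMER, {STREAM: ">"}, count=10, block=5000)
        for _stream, messages in resp or []:
            for msg_id, fields in messages:
                batch = json.loads(fields["value"])   # "value" matches the redis-cli examples
                run_sql(batch)
                r.xack(STREAM, GROUP, msg_id)         # acknowledge after a successful write

if __name__ == "__main__":
    main()
Python
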
[Architecture diagram: the APP (UI, Userdata Handler, API) performs synchronous CRUD against the REDIS CACHE (Stream); the CONSUMER performs asynchronous CRUD against the SNOWFLAKE TABLE; restore on demand flows from Snowflake back to Redis.]

Terminology

Term Meaning
Visited View / open action
Saved Bookmark / saved search
Deleted Saved turned off (soft delete)
Fresh Inside freshness window
Stale Outside freshness window (skip unless override)

Stream Shapes

Create User:

{"user_email":"user.name@example.com"}
JSON

Track Visited / Saved batch:

{
  "key": "userdata-events:user.name-abc123/visited",
  "records": [
    {
      "id": "user.name-abc123",
      "page_name": "search",
      "query_row_count": "25",
      "track_id": "item-01",
      "track_title": "Report 01",
      "track_type": "REPORT",
      "track_url": "/search?q=item-01",
      "lastplus": "2025-11-01 10:20:30",
      "action": "v"
    }
  ]
}
JSON

Visited items use the key segment /visited and action v. Saved items use the key segment /saved and action s. Delete uses action d, and the database UPDATE sets saved=0.
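
For reference, a batch like the one above can also be pushed programmatically. The sketch below is hypothetical producer code using redis-py; it wraps a record in the documented batch shape and sends it in the value field, matching the redis-cli XADD examples later in this README.

# Producer sketch (hypothetical helper, not the project's UI/API code).
import json
import os

import redis

r = redis.Redis(host=os.getenv("REDIS_HOST", "localhost"),
                port=int(os.getenv("REDIS_PORT", "6379")),
                decode_responses=True)

def publish_visited(user_id: str, record: dict, stream: str = "userdata-stream") -> str:
    """Wrap a single tracked record in the documented batch shape and XADD it."""
    batch = {
        "key": f"userdata-events:{user_id}/visited",
        "records": [record],
    }
    # "*" lets Redis assign the entry id; the JSON batch goes in the "value" field.
    return r.xadd(stream, {"value": json.dumps(batch)})

entry_id = publish_visited(
    "user.name-abc123",
    {
        "id": "user.name-abc123",
        "page_name": "search",
        "query_row_count": "25",
        "track_id": "item-01",
        "track_title": "Report 01",
        "track_type": "REPORT",
        "track_url": "/search?q=item-01",
        "lastplus": "2025-11-01 10:20:30",
        "action": "v",
    },
)
print("added stream entry", entry_id)
Python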

If track_id is blank and HASH_NULL_TRACK_ID=1, the consumer builds a stable auto_<md5prefix> from title|url|id. If disabled, it leaves an empty string.

Key Rules

Part Rule
Key path /visited or /saved decides logic
Page suffix If KEY_APPEND_PAGE=1, visited restore keys include /visited/<page>
Action v visited, s saved, d delete
Timestamp lastplus UTC YYYY-MM-DD HH:MM:SS
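
These parts combine into a batch key roughly as in the sketch below; the helper name and defaults are hypothetical, mirroring KEY_PREFIX and KEY_APPEND_PAGE from the environment-variable table.

# Hypothetical helper showing how the key rules combine into a batch key.
def build_key(user: str, kind: str, page: str | None = None,
              key_prefix: str = "userdata-events", key_append_page: bool = True) -> str:
    """kind is "visited" or "saved"; page is appended only for visited restore keys."""
    key = f"{key_prefix}:{user}/{kind}"
    if kind == "visited" and key_append_page and page:
        key += f"/{page}"          # e.g. userdata-events:alice-1/visited/search
    return key

print(build_key("alice-1", "visited", page="search"))   # KEY_APPEND_PAGE=1
print(build_key("alice-1", "saved"))
Python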

Environment Variables

Var Default Purpose
ACK_INVALID_JSON 0 Ack invalid JSON messages
ALLOW_NO_SNOWFLAKE 0 Force dry run if Snowflake secrets incomplete
ALWAYS_PROCESS_STALE_FIRST 0 Process first stale record in a batch
AUTO_QUOTE_LASTPLUS 0 Quote bare lastplus tokens / datetimes
CONFIG_MAX_ATTEMPTS 12 Snowflake config retry attempts
CONFIG_SLEEP_SECONDS 5 Sleep between config retries
CONSUMER consumer-1 Redis consumer name
DRY_RUN 0 Log SQL only, no execute
ENABLE_FALLBACK_RECORD_LIST 1 Treat pure list of tracked dicts as batch
ENABLE_FALLBACK_SINGLE_RECORD 1 Treat lone tracked dict as single-record batch
FALLBACK_JSON_FIELDS value,payload,data Additional JSON field names to inspect
GROUP snowflake-writer Redis consumer group
HASH_NULL_TRACK_ID 0 Hash blank track_id into auto_<hash>
KEY_APPEND_PAGE 1 Append page segment to visited restore keys
KEY_PREFIX userdata-events Prefix for stream batch keys
LENIENT_JSON_REPAIR 0 Attempt brace/bracket repair on truncated JSON
LIST_STREAM_KEYS_ON_START 0 List existing stream keys at startup
LOG_LEVEL INFO Logging level
LOG_RAW 0 Log raw Redis messages
LOG_SQL 0 Log SQL statements
LOG_TRACKED_EACH 0 Log every tracked record execution
LOG_TRACKED_HASH_GEN 0 Log generated hash track_ids
MAX_JSON_REPAIR_APPEND 10 Max appended chars during repair
METRICS_INTERVAL_SECONDS 60 Metrics logging interval
NEWER_RECORD_TIME_LIMIT_SECONDS 30 Freshness window seconds
PRESERVE_TRACK_TYPE_CASE 0 Keep original case of track_type
PROCESS_STALE_SAVED 1 Allow stale saved records
REDIS_DB 0 Redis DB index
REDIS_HOST localhost Redis host
REDIS_PORT 6379 Redis port
REDIS_STREAM userdata-stream Stream name
RESTORE_BATCH_SIZE 100 Records per emitted restore message
RESTORE_DAYS 7 Look-back days for restore
RESTORE_FORCE 0 Force restore ignoring length/marker
RESTORE_INCLUDE_SAVED 1 Include saved rows in restore
RESTORE_INCLUDE_VISITED 1 Include visited rows in restore
RESTORE_LIMIT_TOTAL 2000 Max rows queried/emitted during restore
RESTORE_LOG_SAMPLE 5 Sample count logged (size only)
RESTORE_MARK_KEY _userdata_stream_restored Redis marker key to prevent repeat
RESTORE_MIN_ROWS_TRIGGER 0 Skip restore if rows < threshold (unless force)
RESTORE_ON_EMPTY 1 Enable restore logic when stream empty
SKIP_FRESHNESS 0 Ignore freshness window (process all)
SNOWFLAKE_AUTO_RECONNECT 1 Reconnect on token expiration
SNOWFLAKE_PING_INTERVAL_SECONDS 300 Ping interval for Snowflake keep-alive
STREAM userdata-stream Alias for REDIS_STREAM (if used)
TARGET_LOG_IDS (blank) IDs that force detailed logging
TRACE_SQL 0 Extra SQL trace lines
UNIT_TEST 0 Use in-memory Redis stub
USERNAME_SPLIT_HYPHEN 1 Trim at first hyphen when deriving username
USERNAME_USE_RECORD_ID 0 Use record id as username for batches
VERBOSE_SNOWFLAKE_TEST 0 Log Snowflake context on connect
VERIFY_TRACKED_AFTER_WRITE 0 Run SELECT after write for sample rows
VERIFY_TRACKED_LIMIT 20 Max verify selects
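
All of these are plain environment strings, with 1 enabling a flag. A minimal sketch of reading a few of them with the documented defaults (helper names hypothetical):

# Hypothetical config-loading sketch using defaults from the table above.
import os

def env_int(name: str, default: int) -> int:
    return int(os.getenv(name, str(default)))

def env_flag(name: str, default: int) -> bool:
    return os.getenv(name, str(default)) == "1"

REDIS_STREAM = os.getenv("REDIS_STREAM", "userdata-stream")
GROUP = os.getenv("GROUP", "snowflake-writer")
DRY_RUN = env_flag("DRY_RUN", 0)
HASH_NULL_TRACK_ID = env_flag("HASH_NULL_TRACK_ID", 0)
NEWER_RECORD_TIME_LIMIT_SECONDS = env_int("NEWER_RECORD_TIME_LIMIT_SECONDS", 30)
RESTORE_DAYS = env_int("RESTORE_DAYS", 7)
Python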

Restore Logic Summary

Runs once when:

  • RESTORE_ON_EMPTY=1
  • Stream length = 0 (unless RESTORE_FORCE=1)
  • Marker key absent (unless force)

Fetches recent rows, groups them by user + kind, and emits batches, respecting KEY_APPEND_PAGE for visited keys.

Freshness

Records older than NEWER_RECORD_TIME_LIMIT_SECONDS are skipped unless one of the following applies (a sketch of the check follows this list):

  • The record is saved and PROCESS_STALE_SAVED=1
  • It is the first stale record in a batch and ALWAYS_PROCESS_STALE_FIRST=1
  • SKIP_FRESHNESS=1, which disables the skip entirely
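
A minimal sketch of this check, assuming the lastplus format from the Key Rules table; the function name and argument layout are hypothetical, not the consumer's actual API.

# Freshness-check sketch applying the override rules listed above.
from datetime import datetime, timezone

def is_processable(lastplus: str, is_saved: bool, is_first_stale: bool,
                   window_seconds: int = 30,            # NEWER_RECORD_TIME_LIMIT_SECONDS
                   skip_freshness: bool = False,        # SKIP_FRESHNESS=1
                   process_stale_saved: bool = True,    # PROCESS_STALE_SAVED=1
                   always_process_stale_first: bool = False) -> bool:
    if skip_freshness:
        return True                                     # freshness window ignored
    record_ts = datetime.strptime(lastplus, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    age = (datetime.now(timezone.utc) - record_ts).total_seconds()
    if age <= window_seconds:
        return True                                     # fresh: always processed
    if is_saved and process_stale_saved:
        return True                                     # stale saved override
    if is_first_stale and always_process_stale_first:
        return True                                     # first stale in batch override
    return False                                        # stale: skipped

print(is_processable("2025-11-01 10:20:30", is_saved=True, is_first_stale=False))
Python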

Track ID

An empty track_id is stored as an empty string. If HASH_NULL_TRACK_ID=1, a computed auto_<12 hex> prefix ensures stable grouping.
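
A sketch of the derivation, based on the title|url|id basis and 12-hex md5 prefix described above; the helper name and exact basis string are assumptions.

# Sketch of the auto track_id derivation (not the exact implementation):
# md5 over "title|url|id", first 12 hex chars, "auto_" prefix.
import hashlib

def auto_track_id(track_title: str, track_url: str, record_id: str) -> str:
    basis = f"{track_title}|{track_url}|{record_id}"
    return "auto_" + hashlib.md5(basis.encode("utf-8")).hexdigest()[:12]

print(auto_track_id("Sample", "/search?q=Sample", "alice-1"))  # auto_ followed by 12 hex chars
Python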

Quick Dry Run

python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
export DRY_RUN=1 REDIS_STREAM=userdata-stream HASH_NULL_TRACK_ID=1
python app/stream_consumer.py &
redis-cli XADD userdata-stream '*' value '{"user_email":"alice@example.com"}'
redis-cli XADD userdata-stream '*' value '{"key":"userdata-events:alice-1/visited","records":[{"id":"alice-1","page_name":"search","query_row_count":"5","track_id":"","track_title":"Sample","track_type":"QUERY","track_url":"/search?q=Sample","lastplus":"2025-11-03 12:00:00","action":"v"}]}'
Bash

Look for RECORD_EXEC and optionally auto_ hash lines.

Unit Tests

pytest -q
Bash

Tests cover:

  • JSON field select.
  • Parse and repair.
  • Username derive.
  • Fresh vs stale skip.
  • Stale saved override.
  • Null track id hashing.
  • Import of string keys.
  • Failover path.

Common Checks

Task Command Expect
Ping Redis redis-cli PING PONG
Group info redis-cli XINFO GROUPS userdata-stream Group row
Last 1 entry redis-cli XREVRANGE userdata-stream + - COUNT 1 Array
Snowflake write Look for SQL (TRACKED) in logs MERGE lines
Failover event Search FAILOVER in logs Failover lines

Troubleshooting Short List

Issue Fix
No stream growth Check UI or bridge for XADD.
Consumer silent Confirm REDIS_HOST and network.
Null track_id rows missing Set HASH_NULL_TRACK_ID=1.
Saved queries skipped Set PROCESS_STALE_SAVED=1 or fix timestamps.
Import not running Set EXTRA_BRIDGE_KEYS before start.
Bad JSON repeats Set ACK_INVALID_JSON=1 or fix source.
Failover noisy Set STREAM_FAILOVER_TO_SNOWFLAKE=0.

Minimal Deployment Env List

Apply these (example) settings along with Snowflake secrets:

AUTO_QUOTE_LASTPLUS=1
DRY_RUN=0
HASH_NULL_TRACK_ID=1
PROCESS_STALE_SAVED=1
REDIS_CONSUMER=consumer-1
REDIS_GROUP=snowflake-writer
REDIS_STREAM=userdata-stream
SKIP_FRESHNESS=0
STREAM_FAILOVER_TO_SNOWFLAKE=1
Bash

Snowflake Verification

Query sample:

SELECT tracked_id, track_id, saved, visited_count, deleted_dts
FROM <DB>.<SCHEMA>.TRACKED
WHERE tracked_id = 'alice-1'
ORDER BY visited_dts DESC
LIMIT 5;
SQL

If no rows: check consumer logs for skip or failover markers.

Safe Manual Test in Pod

kubectl exec <pod> -- redis-cli XADD userdata-stream '*' value '{"diag":"ping","ts":"2025-11-01 10:22:00"}'
kubectl exec <pod> -- redis-cli XLEN userdata-stream
Bash

Clean Exit

Stop consumer with Ctrl+C. Redis keeps data.

Persistence

Keeping data across pod restarts; recovery after a persistent volume loss is handled by the restore logic described below.

Redis writes data to files in /data inside the container. The Kubernetes Persistent Volume (PV) is mounted at that path.

Files:

appendonly.aof (every write is appended; stream entries survive restarts)

dump.rdb (snapshot saved based on save rules)

redis.conf (minimum settings)
appendfilename appendonly.aof
appendfsync everysec
appendonly yes
dbfilename dump.rdb
dir /data
save 300 10
save 60 10000
save 900 1
plain

  • appendfsync everysec: flush AOF once per second (small data loss window if a crash happens).
  • appendonly yes: log each write (XADD) to AOF.
  • dir /data: write files on the mounted volume.
  • save <seconds> <changes>: create a snapshot if there is at least 1 write in 900s, 10 writes in 300s, or 10000 writes in 60s.

Tests

Basic test:

# Add one test entry
TS=$(date -u +"%Y-%m-%d %H:%M:%S")
redis-cli XADD userdata-stream '*' value '{"diag":"persist","ts":"'"$TS"'"}'
redis-cli XLEN userdata-stream
# Check files
ls -lh /data
# Tail last lines of AOF
tail -n 5 /data/appendonly.aof
Bash

Restart test (in cluster):

  1. Record current stream length: redis-cli XLEN userdata-stream.
  2. Delete the pod (deployment will recreate it).
  3. Run length again: redis-cli XLEN userdata-stream.
  4. It should match or be larger.

If length resets to 0:

  • PV not mounted in the redis container.
  • Wrong path in redis.conf.
  • Permissions blocked (check ownership of /data).
  • Different DB index used (UI wrote DB 1, you checked DB 0). Test with redis-cli -n 1 XLEN userdata-stream.

To force a snapshot now (optional):

redis-cli SAVE
ls -lh /data/dump.rdb
Bash

For near-zero loss, set appendfsync always (higher disk cost).

If persistence is not needed in some environments (local dev), you can keep the defaults and ignore the PV.

Restore (Snowflake -> Redis Stream on Empty)

Goal: after a pod restart with an empty /data volume, the Redis stream would be blank while Snowflake still holds the authoritative history. The consumer can seed the stream so downstream tools that rely on recent backlog have context.

Trigger: Runs once at startup when ALL are true:

  • RESTORE_ON_EMPTY=1 (default 1)
  • Stream length is 0
  • Redis key marker (default _userdata_stream_restored) absent

Process:

  1. Query Snowflake TRACKED joined to USER for rows within last RESTORE_DAYS days (default 7).
  2. Apply filters for saved / visited based on flags.
  3. Limit total rows (RESTORE_LIMIT_TOTAL).
  4. Group by user + kind (visited / saved) and emit batches of size RESTORE_BATCH_SIZE as normal stream messages ({"key":"userdata-events:<email>/<visited|saved>","records":[...]}).
  5. Set marker key to avoid re-running.

Environment flags:

Var Default Description
KEY_APPEND_PAGE 1 Append page segment to visited keys (/visited/<page>).
RESTORE_BATCH_SIZE 100 Records per emitted message.
RESTORE_DAYS 7 Look-back window (days).
RESTORE_FORCE 0 Force restore even if stream not empty or marker exists.
RESTORE_INCLUDE_SAVED 1 Include saved rows.
RESTORE_INCLUDE_VISITED 1 Include visited rows.
RESTORE_LIMIT_TOTAL 2000 Hard cap on rows queried/emitted.
RESTORE_LOG_SAMPLE 5 Count of sample rows to mention in logs (size only; rows themselves not logged).
RESTORE_MARK_KEY _userdata_stream_restored Redis string key used as idempotent marker.
RESTORE_MIN_ROWS_TRIGGER 0 If >0 skip restore when candidate row count below threshold.
RESTORE_ON_EMPTY 1 Enable feature.

Dry run mode (DRY_RUN=1) still performs the query but just logs the would-be XADD actions.
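
A minimal sketch of the grouping and emit steps, assuming rows have already been queried from Snowflake and carry an email, a kind (visited or saved), and the record fields; the row shape and helper name are assumptions, not the consumer's actual code.

# Restore-emit sketch: group rows by user + kind, re-emit in batches of
# RESTORE_BATCH_SIZE as normal stream messages, then set the marker key.
import json
import os
from collections import defaultdict

import redis

STREAM = os.getenv("REDIS_STREAM", "userdata-stream")
BATCH_SIZE = int(os.getenv("RESTORE_BATCH_SIZE", "100"))
MARK_KEY = os.getenv("RESTORE_MARK_KEY", "_userdata_stream_restored")
DRY_RUN = os.getenv("DRY_RUN", "0") == "1"

r = redis.Redis(host=os.getenv("REDIS_HOST", "localhost"), decode_responses=True)

def emit_restore(rows: list[dict]) -> None:
    grouped: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for row in rows:
        grouped[(row["email"], row["kind"])].append(row["record"])
    for (email, kind), records in grouped.items():
        for i in range(0, len(records), BATCH_SIZE):
            batch = {"key": f"userdata-events:{email}/{kind}",
                     "records": records[i:i + BATCH_SIZE]}
            if DRY_RUN:
                print("DRY_RUN would XADD:", batch["key"], len(batch["records"]))
            else:
                r.xadd(STREAM, {"value": json.dumps(batch)})
    if not DRY_RUN:
        r.set(MARK_KEY, "1")        # idempotent marker so restore does not repeat
Python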

To disable completely set RESTORE_ON_EMPTY=0.

Operational note - restore seeds only the stream. It does NOT rewrite Snowflake (source of truth). If restore runs after a true data-loss event for Redis it simply replays recent history for observability/consumers needing backlog.