A purpose-built AI caching layer that overlays your existing infrastructure. No migration required. Four steps from request to response, measured in nanoseconds, with AI predicting what your systems need before they ask.
Watch how a data request travels through your stack. Every hop adds latency you are paying for. Then see what happens when Cachee intercepts the chain.
Every request that hits Cachee passes through a four-stage pipeline. Each stage is optimized for high-performance execution. The entire pipeline completes before most systems finish a single network hop.
Deploy Cachee in your environment in minutes. Our CLI handles configuration, connection, and optimization automatically.
Don't rip out your Redis stack. Every integration model wraps your existing infrastructure and makes it dramatically faster — in under an hour.
Every model wraps your existing infrastructure. Nothing gets ripped out.
npm install @cachee/sdk
```js
import { CacheeClient } from '@cachee/sdk'

const cache = new CacheeClient({
  apiKey: process.env.CACHEE_API_KEY,
  region: 'auto',     // nearest edge
  fallback: 'local',  // in-memory if offline
  timeout: 2000
})

// Set a key
await cache.set('user:1234', userData, { ttl: 300 })

// Get a key
const user = await cache.get('user:1234')

// Batch set
await cache.mset({ 'a': 1, 'b': 2, 'c': 3 })
```
```python
from cachee import CacheeClient

cache = CacheeClient(api_key="your_key")

await cache.set("order:99", order_data, ttl=60)
result = await cache.get("order:99")
```
Sign up at cachee.ai/start — no credit card required. Your account is active immediately.
Your API key is generated on first login. Copy it to your environment variables as CACHEE_API_KEY.
Run npm install @cachee/sdk or pip install cachee. Import the client and initialize it with your key.
Call cache.set() and cache.get() wherever you were calling Redis. The API is deliberately identical.
Your hit rate, latency, and request volume appear in the dashboard within 60 seconds of your first call.
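Because the call sites are the same regardless of which client sits behind them, swapping in Cachee is a constructor change, not a refactor. A minimal sketch of that idea, using a hypothetical in-memory stub (`InMemoryStub` is ours for illustration, not part of the SDK) in place of a live `CacheeClient`:

```python
import asyncio

class InMemoryStub:
    """Hypothetical stand-in exposing the same async get/set surface as the
    Cachee SDK shown above; a real deployment would construct CacheeClient
    instead and leave every call site untouched."""
    def __init__(self):
        self._data = {}

    async def set(self, key, value, ttl=None):
        self._data[key] = value  # TTL ignored in this toy stub

    async def get(self, key):
        return self._data.get(key)

async def cache_user(cache, user_id, payload):
    # Identical call shape whichever client backs `cache`.
    await cache.set(f"user:{user_id}", payload, ttl=300)
    return await cache.get(f"user:{user_id}")

result = asyncio.run(cache_user(InMemoryStub(), 1234, {"name": "Ada"}))
print(result)  # {'name': 'Ada'}
```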
```yaml
services:
  your-app:
    image: your-app:latest
    environment:
      # Point your Redis client here instead:
      REDIS_HOST: cachee-sidecar
      REDIS_PORT: "6379"
  cachee-sidecar:
    image: cacheeai/sidecar:latest
    environment:
      CACHEE_API_KEY: ${CACHEE_API_KEY}
    # No ports exposed — localhost only
```
```yaml
containers:
  - name: your-app
    image: your-app:latest
    env:
      - name: REDIS_HOST
        value: "localhost"
      - name: REDIS_PORT
        value: "6379"
  - name: cachee-sidecar
    image: cacheeai/sidecar:latest
    env:
      - name: CACHEE_API_KEY
        valueFrom:
          secretKeyRef:
            name: cachee-secrets
            key: api-key
```
docker pull cacheeai/sidecar:latest — it's under 50MB. No root, no surprise dependencies.
Paste the 5-line snippet alongside your existing app container. Set your CACHEE_API_KEY env var.
Point REDIS_HOST to cachee-sidecar (Compose) or localhost (Kubernetes). Your existing Redis client works without any code changes.
On startup it authenticates to the Cachee control plane, downloads your configuration, and begins serving the Redis protocol on port 6379.
SET, GET, DEL, EXISTS, MGET, MSET, EXPIRE, TTL. Any unsupported command returns a clear error — never a silent hang.
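To make the fail-fast behavior concrete, here is a toy dispatcher (our illustration, not the sidecar's actual code) for the command list above: supported commands execute against a store, and anything outside the list returns an immediate RESP-style error string rather than hanging.

```python
# Commands the sidecar serves, per the list above.
SUPPORTED = {"SET", "GET", "DEL", "EXISTS", "MGET", "MSET", "EXPIRE", "TTL"}

def handle_command(store, name, *args):
    """Toy dispatcher sketch: supported commands run against `store` (a dict
    here); unsupported commands get a clear error, never a silent hang."""
    cmd = name.upper()
    if cmd not in SUPPORTED:
        return f"-ERR unknown command '{cmd}'"  # immediate, explicit failure
    if cmd == "SET":
        store[args[0]] = args[1]
        return "+OK"
    if cmd == "GET":
        return store.get(args[0])
    if cmd == "DEL":
        return int(store.pop(args[0], None) is not None)
    if cmd == "EXISTS":
        return int(args[0] in store)
    return "-ERR not modeled in this sketch"  # MGET/MSET/EXPIRE/TTL elided

store = {}
print(handle_command(store, "SET", "user:1", "ada"))  # +OK
print(handle_command(store, "GET", "user:1"))         # ada
print(handle_command(store, "SUBSCRIBE", "events"))   # -ERR unknown command 'SUBSCRIBE'
```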
```bash
# From the Cachee dashboard → Self-Hosted → New Token
# Token expires in 24 hours, single-use
CACHEE_CONNECT_TOKEN="ct_live_xxxxxxxxxxxx"

# Run the agent on your infrastructure
docker run -d \
  -e CACHEE_CONNECT_TOKEN=$CACHEE_CONNECT_TOKEN \
  -e CACHEE_REGION="us-east-1" \
  -p 6379:6379 \
  cacheeai/agent:latest
```
```bash
# Within 60 seconds the dashboard shows CONNECTED.
# The connect token is automatically invalidated.

# Test the connection:
redis-cli -h localhost -p 6379 PING
# → PONG (served by Cachee agent)
```
All cache data — keys, values, TTLs — lives entirely on your hardware or VPC. Cachee has zero access to cache contents.
Operational metrics only: hit rate, latency percentiles, memory utilization, request count. No keys, no values, ever.
Cachee's AI models analyze usage patterns from operational metrics and push eviction and pre-warming decisions back to your agent.
The agent can operate in restricted-egress networks. Configure a proxy for control plane sync if direct outbound is not allowed.
All self-hosted accounts are assigned a Cachee solutions engineer who runs the first deployment with your team live.
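To give a feel for the kind of decision that can be driven from aggregate access patterns alone, here is an illustrative scorer: it ranks keys by exponentially decayed access frequency, a common signal for pre-warming. This is our sketch only, not Cachee's actual model; the function name, parameters, and log format are all assumptions.

```python
import math

def prewarm_candidates(access_log, now, half_life=300.0, top_n=2):
    """Illustrative only: score each key by exponentially decayed access
    frequency (half_life in seconds) and return the hottest `top_n` keys,
    the kind of ranking a pre-warming policy could act on."""
    decay = math.log(2) / half_life
    scores = {}
    for key, ts in access_log:  # (key, unix_timestamp) pairs
        scores[key] = scores.get(key, 0.0) + math.exp(-decay * (now - ts))
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# 'c' was hit most recently and most often, then 'b'; 'a' is cold.
log = [("a", 0), ("b", 900), ("b", 950), ("c", 990), ("c", 995), ("c", 999)]
print(prewarm_candidates(log, now=1000))  # ['c', 'b']
```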
```yaml
services:
  your-app:
    image: your-app:latest
    environment:
      # Point at Cachee instead of your old cache:
      REDIS_HOST: cachee-overlay
      REDIS_PORT: "6379"
  cachee-overlay:
    image: cacheeai/proxy:latest
    environment:
      CACHEE_API_KEY: ${CACHEE_API_KEY}
      # Your existing cache becomes the L2 backend:
      UPSTREAM: ${YOUR_EXISTING_CACHE_ENDPOINT}
    ports:
      - "6379:6379"
```
```bash
# ElastiCache / Redis Cloud / Azure / GCP / Upstash:
UPSTREAM=redis://your-elasticache.abc.cache.amazonaws.com:6379

# CloudFlare Workers KV (HTTP adapter):
UPSTREAM=cloudflare://ACCOUNT_ID/NAMESPACE_ID
UPSTREAM_CF_TOKEN=${CF_API_TOKEN}
```
One container, one env var for your API key, one for your existing cache endpoint. The proxy speaks Redis protocol on port 6379.
Change your app's REDIS_HOST from your existing cache to the Cachee proxy. Zero code changes needed — your existing Redis client works as-is.
Cachee's in-memory L1 (Tiny-Cachee engine) serves frequently accessed keys without touching your backend. Cache misses are forwarded transparently.
With 90%+ hit rates on the L1, you cut 90% of calls to your existing provider — lowering both latency and cost.
Your existing cache keeps all its data. Cachee only accelerates reads — writes pass through to your backend for durability.
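The L1/L2 flow above can be sketched in a few lines. This is a minimal model under our own assumptions (the class and counter are ours, and a plain dict stands in for the real backend), but it shows why a high L1 hit rate directly cuts backend calls: reads that hit L1 never reach the provider, while writes pass through for durability.

```python
class TwoTierCache:
    """Sketch of the overlay read/write path: in-memory L1 in front of an
    existing backend (L2). `backend` is a dict here; in reality it would be
    your ElastiCache / Redis / KV client."""
    def __init__(self, backend):
        self.l1 = {}               # in-memory L1
        self.backend = backend     # existing provider acts as L2
        self.backend_reads = 0     # how many reads actually hit L2

    def get(self, key):
        if key in self.l1:
            return self.l1[key]    # L1 hit: backend never touched
        self.backend_reads += 1    # L1 miss: forwarded transparently
        value = self.backend.get(key)
        if value is not None:
            self.l1[key] = value   # populate L1 for next time
        return value

    def set(self, key, value):
        self.backend[key] = value  # write-through: L2 stays durable
        self.l1[key] = value

cache = TwoTierCache({"session:1": "alice"})
reads = [cache.get("session:1") for _ in range(10)]
print(cache.backend_reads)  # 1 — nine of ten reads never reached the backend
```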
Side-by-side comparison of all four deployment models across the metrics that matter.
| Capability | Managed Cloud | Sidecar | Overlay | Self-Hosted |
|---|---|---|---|---|
| Setup time | < 1 hour | 12 minutes | 10 minutes | 45 minutes |
| p99 cache hit latency | 0.46ms | 1.5µs | ~0.01ms | ~0.01ms |
| Infrastructure to manage | ✓ None — we run it | ⊘ One container | ⊘ One container | — Your hardware |
| Existing Redis client works | ⊘ SDK change | ✓ Zero code change | ✓ Zero code change | ✓ Zero code change |
| Keep existing cache provider | — | — | ✓ Your L2 backend | — |
| Reduces provider API costs | — | — | ✓ Up to 90% | — |
| AI pre-warming & optimization | ✓ | ✓ | ✓ | ✓ |
| Automatic scaling | ✓ Fully managed | ✓ Managed | ✓ Managed | ⊘ Controlled |
| Multi-region failover | ✓ | ✓ | ✓ | ✓ |
| Dedicated solutions engineer | — | — | — | ✓ Included |
| Available on tier | Starter + | Growth + | Starter + | Enterprise |
Every deployment model uses the same control plane. Start with Overlay to accelerate your existing ElastiCache or CloudFlare KV, or go Managed for a turnkey solution.
Every feature is designed for production workloads at scale. No toy benchmarks. No asterisks. These are the capabilities running in production today.
Side-by-side with the caching solutions you already know. Same metrics, same workloads, independently verifiable.
| Metric | Redis | Memcached | CloudFront | Cachee |
|---|---|---|---|---|
| Read Latency (p50) | 0.8 - 2ms | 0.5 - 1ms | 5 - 50ms | 1.21ns |
| Read Latency (p99) | 5 - 15ms | 3 - 8ms | 50 - 200ms | 12ns |
| Throughput | 500K ops/s | 1M ops/s | N/A (CDN) | 827M ops/s |
| AI Prediction | None | None | None | 95%+ accuracy |
| Auto-Tuning | Manual TTLs | Manual config | Basic TTLs | Fully autonomous |
| Network Hops | 2-3 hops | 2-3 hops | 1-4 hops | Near-zero |
| GC Pauses | Rare (C) | None (C) | Varies | None |
| Origin Load Reduction | 60 - 80% | 60 - 75% | 40 - 70% | 95%+ |
| Deploy Complexity | Moderate | Moderate | Low (CDN) | 1 command overlay |
Input your current infrastructure metrics. See exactly what changes when Cachee deploys. All calculations use conservative estimates based on production deployments.
These numbers come from production deployments, not synthetic benchmarks. Measured on real infrastructure under real workloads. All benchmarks are independently reproducible.
Memory utilization rises because Cachee is actively using it. Everything else drops dramatically: server hits, infrastructure cost, response latency.
Representative enterprise running on a standard AWS stack. These are the line items that change when Cachee deploys.
| Line Item | Before Cachee | After Cachee | Delta |
|---|---|---|---|
| ElastiCache / Redis Cluster | $18,000/mo | $4,500/mo | −$13,500 |
| RDS / Aurora Database | $32,000/mo | $12,000/mo | −$20,000 |
| Compute (EC2 / ECS / Lambda) | $24,000/mo | $10,000/mo | −$14,000 |
| Data Transfer / CDN | $11,000/mo | $4,500/mo | −$6,500 |
| DevOps Hours (cache mgmt) | 60 hrs/mo ($12,000) | 4 hrs/mo ($800) | −$11,200 |
| Cachee Platform Cost | — | $500/mo | +$500 |
| NET MONTHLY IMPACT | $97,000/mo | $32,300/mo | −$64,700/mo |
Representative figures based on typical enterprise deployment. Actual results vary by infrastructure configuration, workload patterns, and scale.
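The table's bottom line is straightforward arithmetic over the line items. A short sketch that recomputes it (the function and data structure are ours; the dollar figures are the representative ones quoted above, not a guarantee):

```python
def net_monthly_impact(line_items):
    """Sum (before, after) $/mo pairs and return
    (total_before, total_after, delta)."""
    before = sum(b for b, _ in line_items.values())
    after = sum(a for _, a in line_items.values())
    return before, after, after - before

items = {
    "ElastiCache / Redis Cluster":  (18_000, 4_500),
    "RDS / Aurora Database":        (32_000, 12_000),
    "Compute (EC2 / ECS / Lambda)": (24_000, 10_000),
    "Data Transfer / CDN":          (11_000, 4_500),
    "DevOps hours (cache mgmt)":    (12_000, 800),
    "Cachee Platform Cost":         (0, 500),
}
print(net_monthly_impact(items))  # (97000, 32300, -64700)
```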
Deploy Cachee in under an hour. No migration. No downtime. The data your systems need is already waiting in L1 memory before they ask for it.
1.21 nanoseconds — that's the new standard.
cachee.ai