- Reduce latency and cost.
- Improve resilience and observability.
- Limit blast radius with secure defaults.
Performance: make functions fast and efficient
Focus on minimizing cold starts, reusing resources across invocations, and right-sizing compute. Key strategies:

- Minimize cold starts
  - Keep dependencies small and trim package sizes.
  - Move expensive initialization outside of the hot path. Prefer lightweight, fast startup code.
  - Lazy-load heavy modules only when used.
- Reuse resources across invocations
  - Initialize database clients, HTTP clients, or connection pools in the global scope (outside the request handler) so warm instances can reuse them.
  - If your runtime supports instance concurrency (e.g., Cloud Functions Gen 2), ensure any shared/global resources are safe for concurrent access.
- Right-size memory and CPU
  - Increasing memory often increases CPU allocation; for CPU-bound workloads this can reduce execution time.
  - Benchmark different settings to find the best cost/performance trade-off.
- Reduce outbound connections
  - Use connection pooling and shared clients to lower the number of new connections and reduce latency.
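
The client-reuse strategy can be sketched in Python as follows. This is a minimal illustration, not a framework-specific implementation: `ExpensiveClient` is a hypothetical stand-in for a real database or HTTP client, and `handler` is an assumed entry point. The client is held in module (global) scope and created on first use behind a lock, a lazy variant of global initialization that also stays safe if one instance serves concurrent requests.

```python
import threading


class ExpensiveClient:
    """Hypothetical stand-in for a database or HTTP client with costly setup."""

    instances_created = 0  # counts constructions, for demonstration only

    def __init__(self):
        ExpensiveClient.instances_created += 1

    def query(self, key):
        return f"value-for-{key}"


# Module-level state survives between invocations on a warm instance.
_client = None
_client_lock = threading.Lock()


def get_client():
    """Return the shared client, creating it once on first use."""
    global _client
    if _client is None:
        with _client_lock:
            # Re-check inside the lock so concurrent callers create only one.
            if _client is None:
                _client = ExpensiveClient()
    return _client


def handler(request):
    """Assumed entry point: reuses the shared client on every call."""
    return get_client().query(request["key"])
```

Calling `handler` repeatedly on the same instance constructs `ExpensiveClient` only once. A real client would also need to be thread-safe itself if the runtime allows concurrent requests per instance.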

| Tactic | Why it helps | Implementation tip |
|---|---|---|
| Reduce package size | Faster startup | Remove unused packages, bundle/minify code |
| Global clients | Lower connection overhead | Create DB / HTTP clients outside handlers |
| Lazy loading | Avoid heavy startup cost | Import heavy modules inside handler when needed |
| Right-size memory | Balance latency and cost | Benchmark with representative workloads |
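
The lazy-loading row can be illustrated with a deferred import. In this sketch the standard-library `csv` and `io` modules stand in for a genuinely heavy dependency, and the `format` and `rows` request fields are assumptions made for the example.

```python
import json


def handler(request: dict) -> str:
    """Return rows as JSON by default, or as CSV when explicitly requested."""
    rows = request["rows"]
    if request.get("format") == "csv":
        # Deferred import: only requests that need CSV pay the import cost.
        # csv and io are cheap stand-ins for a heavy third-party module.
        import csv
        import io

        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerows(rows)
        return buffer.getvalue()
    return json.dumps(rows)
```

Python caches imported modules, so only the first request that takes the CSV path on a given instance pays the import cost. The trade-off is that this first request is slower, so the pattern suits rarely used code paths.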
Error handling and reliability
Assume failures are normal. Design to handle transient and permanent errors, and to fail in observable ways. Best practices:

- Retries: Use exponential backoff with jitter to avoid thundering herds. Do not blindly retry non-idempotent operations.
- Dead letter queues (DLQs): Preserve messages that repeatedly fail via dead-letter topics/subscriptions for later inspection or reprocessing.
- Idempotency: Make operations idempotent where possible to prevent duplicate side effects during retries.
- Observability: Log errors, failed events, and context. Use structured logs, metrics, and traces for faster troubleshooting.
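
The retry guidance above can be sketched as a small helper. This is one possible shape, assuming a hypothetical `TransientError` class marks the failures that are safe to retry; the delay parameters are illustrative defaults, and the "full jitter" variant (a random delay between zero and the exponential cap) is one of several jitter strategies.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying TransientError with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The cap doubles on each attempt, up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: a random delay in [0, cap] de-synchronizes clients.
            sleep(random.uniform(0, cap))
```

Any other exception type propagates immediately, which keeps permanent errors from being retried. The `sleep` parameter exists only so tests can substitute a fake.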
| Pattern | Example |
|---|---|
| Retry with jitter | Use exponential backoff + randomization to reduce synchronized retries |
| DLQ handling | Configure Pub/Sub dead-letter topic or publish failed events to a dedicated topic |
| Idempotency keys | Include a deduplication ID (e.g., request ID) and persist processed IDs |
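
The idempotency-key row can be sketched like this. `InMemoryDedupStore` is a hypothetical, single-process stand-in: a real function would persist processed IDs in a shared store with an atomic check-and-set, because instances do not share memory. The `id` field on the event is also an assumption.

```python
class InMemoryDedupStore:
    """Hypothetical stand-in for a persistent store of processed IDs."""

    def __init__(self):
        self._claimed = set()

    def claim(self, key):
        """Return True if the key was new and is now claimed."""
        if key in self._claimed:
            return False
        self._claimed.add(key)
        return True

    def release(self, key):
        """Forget a key so a later retry can process it."""
        self._claimed.discard(key)


def process_once(event, store, side_effect):
    """Run side_effect(event) at most once per event ID."""
    key = event["id"]
    if not store.claim(key):
        return "duplicate"
    try:
        side_effect(event)
    except Exception:
        # Release the claim so a retry is not mistaken for a duplicate.
        store.release(key)
        raise
    return "processed"
```

Delivering the same event twice runs the side effect once and reports the second delivery as a duplicate. If the side effect fails, the claim is released so the retry can proceed.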
Security: least privilege and secret handling
Hardening your functions reduces exposure and prevents accidental data leaks. Core security practices:

- Principle of least privilege: Grant service accounts only the permissions required for the function.
- Secrets management: Never commit secrets to source code. Use a managed secret service and grant functions minimal access to fetch secrets at runtime.
- Network controls: Use VPC connectors, private IPs, and firewall rules when accessing internal resources.
- Input validation and sanitization: Validate inputs to reduce attack surface and avoid injection attacks.
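
As an illustration of the input-validation point, the sketch below checks a request payload against an allow-list before any other code uses it. The `user_id` and `limit` fields and their constraints are hypothetical; the idea is to accept only known-good shapes rather than trying to filter out bad ones.

```python
import re

# Allow-list pattern: letters, digits, underscore, and hyphen only.
_USER_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def validate_request(payload):
    """Return a cleaned copy of the payload or raise ValueError."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    user_id = payload.get("user_id")
    if not isinstance(user_id, str) or not _USER_ID.fullmatch(user_id):
        raise ValueError("user_id must be 1-64 letters, digits, '_' or '-'")

    limit = payload.get("limit", 10)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int):
        raise ValueError("limit must be an integer")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")

    # Return only the validated fields; unknown keys are dropped.
    return {"user_id": user_id, "limit": limit}
```

Validation of this kind complements, and does not replace, parameterized queries and output encoding in downstream code.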
Store secrets in a managed secret service (for example, Secret Manager) and fetch them at runtime. Avoid embedding credentials in source code or storing API keys in plaintext environment variables—use secret-manager integrations where possible.
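
A hedged sketch of fetching a secret at runtime and caching it per instance. It assumes the `google-cloud-secret-manager` package is installed and that the function's service account is allowed to access the secret; `get_secret` and its `fetch` parameter are illustrative names, and the resource name in the usage note is a placeholder.

```python
_secret_cache = {}


def _fetch_from_secret_manager(name):
    """Fetch one secret version (needs google-cloud-secret-manager)."""
    # Imported lazily so this module loads even without the dependency.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")


def get_secret(name, fetch=_fetch_from_secret_manager):
    """Return the secret value, fetching it at most once per instance."""
    if name not in _secret_cache:
        _secret_cache[name] = fetch(name)
    return _secret_cache[name]
```

A call might look like `get_secret("projects/PROJECT_ID/secrets/SECRET_ID/versions/latest")`. Caching avoids a network call on every invocation, but a rotated secret is not picked up until the instance restarts, so a time-based expiry may be worth adding.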
Advanced tips for power users
- Use global variables for shared clients and connection pools to avoid repeated connection setup.
- When implementing retries, combine exponential backoff with jitter to avoid causing spikes that overload downstream services.
- Monitor functions with Cloud Monitoring and send logs to Cloud Logging. Use distributed tracing (Cloud Trace) to analyze end-to-end latency and hotspots.
- Benchmark different memory allocations—more memory can yield better performance for CPU-bound tasks, but increases cost. Measure tail latency, not just averages.

Be mindful of the cost/performance trade-off when increasing memory. Benchmark different settings because allocating more memory increases cost and may also change CPU allocation.
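
To compare memory settings on tail latency rather than averages, a benchmark run can be summarized with percentiles. The sketch below uses only the standard library; `latency_summary` is an illustrative helper, and the sample values are made up for the example.

```python
import statistics


def latency_summary(samples_ms):
    """Summarize latency samples (milliseconds) by mean and percentiles."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(samples_ms),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Made-up run: 95 fast requests and 5 slow ones (for example, cold starts).
summary = latency_summary([10] * 95 + [500] * 5)
```

For this sample the mean is 34.5 ms while p99 is 500 ms, which is why an average alone can hide the requests users notice most.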
Observability checklist
- Emit structured logs with sufficient context (request IDs, trace IDs).
- Create metrics for invocation count, error rate, latency, and cold-start frequency.
- Set alerts on error rate spikes, high latencies, or sudden cost increases.
- Correlate traces across services using Cloud Trace or OpenTelemetry.
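
The first checklist item can be sketched as a helper that writes one JSON object per line to standard output. It assumes the platform forwards stdout to Cloud Logging, which treats `severity`, `message`, and `logging.googleapis.com/trace` as special fields in JSON payloads; `log_structured`, `request_id`, and the example values are illustrative.

```python
import json
import sys


def log_structured(severity, message, *, request_id=None, trace=None,
                   stream=sys.stdout, **fields):
    """Write one JSON log line and return the entry that was written."""
    entry = {"severity": severity, "message": message, **fields}
    if request_id is not None:
        entry["request_id"] = request_id
    if trace is not None:
        # Expected form: projects/PROJECT_ID/traces/TRACE_ID
        entry["logging.googleapis.com/trace"] = trace
    stream.write(json.dumps(entry) + "\n")
    return entry
```

For example, `log_structured("ERROR", "payment failed", request_id="req-42", order_id=7)` emits a single line that log queries can filter by `request_id` or `order_id`.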