Real User Monitoring
Monitor actual user experiences with Core Web Vitals, session replay, and performance analytics to identify real-world bottlenecks.
Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.
Overview
Real User Monitoring (RUM) captures performance data from actual browser sessions — not synthetic tests or server-side metrics. It reveals how Core Web Vitals, JavaScript errors, and API latencies vary across devices, networks, and geographies. Unlike lab tests that run in ideal conditions, RUM exposes the experience of users on 3G networks, low-end devices, and older browsers.
When to Use
Use this resource when:
- Lab-based Lighthouse scores don’t match real-world performance complaints
- You need to correlate business metrics (conversion, bounce rate) with page speed
- Debugging performance issues that only affect specific browsers or regions
- Prioritizing optimization efforts based on actual user impact, not assumptions
Solution
Web Vitals Library (JavaScript)
import { getCLS, getFID, getFCP, getLCP, getTTFB } from 'web-vitals';
function sendToAnalytics(metric) {
const body = JSON.stringify({
name: metric.name,
value: metric.value,
id: metric.id,
rating: metric.rating, // 'good', 'needs-improvement', 'poor'
delta: metric.delta,
navigationType: metric.navigationType,
page: window.location.pathname
});
// Use navigator.sendBeacon for reliability during page unload
(navigator.sendBeacon && navigator.sendBeacon('/analytics/vitals', body)) ||
fetch('/analytics/vitals', { body, method: 'POST', keepalive: true });
}
getCLS(sendToAnalytics);
getFID(sendToAnalytics);
getLCP(sendToAnalytics);
getTTFB(sendToAnalytics);
Session Replay Integration (Sentry)
import * as Sentry from '@sentry/browser';
import { Replay } from '@sentry/replay';
Sentry.init({
dsn: 'https://abc@sentry.io/1',
integrations: [
new Replay({
maskAllText: true, // Mask sensitive text
blockAllMedia: true, // Block images/videos
})
],
tracesSampleRate: 0.1, // 10% of transactions
replaysSessionSampleRate: 0.01, // 1% of sessions
replaysOnErrorSampleRate: 1.0 // 100% of error sessions
});
Custom Performance Observer
const observer = new PerformanceObserver((list) => {
for (const entry of list.getEntries()) {
if (entry.entryType === 'navigation') {
console.log('TTFB:', entry.responseStart);
console.log('FCP:', entry.responseEnd);
console.log('DOM Ready:', entry.domContentLoadedEventEnd);
}
if (entry.entryType === 'resource') {
if (entry.duration > 1000) {
console.warn('Slow resource:', entry.name, entry.duration);
}
}
}
});
observer.observe({ entryTypes: ['navigation', 'resource', 'paint'] });
Explanation
RUM vs. synthetic monitoring:
| Aspect | RUM | Synthetic |
|---|---|---|
| Data source | Real users | Scheduled bots |
| Network | Variable (3G to 5G) | Controlled (fast) |
| Device diversity | Full range | Usually desktop |
| Geographic | Actual user locations | Data center |
| Use case | Understand reality | Baseline regression |
Key metrics:
- LCP: Largest visible element — hero image, heading
- INP: Interaction latency — button click to visual update
- CLS: Layout shifts — ads, images, fonts causing jumps
- TTFB: Server response time — hosting + backend performance. See server-side rendering.
- FCP: First content paint — first text or image visible
Variants
| Tool | Type | Notable Features |
|---|---|---|
| Google CrUX | Chrome-only | Largest real-world dataset; field data |
| New Relic Browser | Commercial | Session traces; JS error tracking |
| Datadog RUM | Commercial | Correlation with APM; session replay |
| Sentry | Open source | Error + performance + replay combined |
| SpeedCurve | Commercial | Competitive benchmarking; filmstrips |
| web-vitals.js | Open source | Google’s reference implementation |
Best Practices
- Sample intelligently: 100% sampling overwhelms backends; 5-10% is usually sufficient
- Capture context: Device type, connection speed, and country explain variation
- Alert on percentiles, not averages: P95 performance is what frustrated users experience
- Correlate with business metrics: Plot conversion rate vs. LCP to justify optimization budgets. See metrics collection.
- Respect privacy: Mask PII in session replay; comply with GDPR/CCPA for telemetry
Common Mistakes
- Only monitoring homepage: Product pages and checkout often have worse performance
- Ignoring SPA navigations: Single-page apps need custom LCP/FID measurement for route changes. Consider server-side rendering.
- No correlation with errors: A slow page that also throws JS errors needs different prioritization. See error handling.
- Alerting on averages: Mean LCP of 2s hides that 20% of users see 8s+ loads
- No action on data: Collecting RUM without optimization sprints wastes the instrumentation effort
Frequently Asked Questions
Q: Does RUM slow down my site? A: Negligibly. The web-vitals library is <1KB. Beacons are sent after the page is interactive.
Q: Should I use RUM or synthetic monitoring? A: Both. Synthetic for baseline regression detection. RUM for understanding actual user experience.
Q: How do I handle ad blockers? A: Serve RUM from your own domain (first-party), not third-party. Ad blockers target known analytics domains.
Related Resources
Observability Dashboards with Grafana and Prometheus
Build interactive Grafana dashboards that visualize Prometheus metrics with panels, variables, and alerts for comprehensive service observability
RecipeMetrics Collection and Alerting with Prometheus
Instrument applications and infrastructure with Prometheus metrics, configure alerting rules, and set up recording rules for efficient monitoring of service health
RecipeDistributed Tracing
Trace requests across distributed microservices with OpenTelemetry, Jaeger, and Zipkin for latency debugging and performance optimization.
RecipeLog Aggregation
Centralize logs from distributed services with ELK, Fluentd, and Loki for search, alerting, and troubleshooting in production.
RecipeMetrics Collection
Collect, aggregate, and expose application and infrastructure metrics with Prometheus, StatsD, and OpenTelemetry for monitoring and alerting.