advanced By StackPractices Jun 21, 2026

gRPC in Microservices — High-Performance RPC Guide

Q: Should I replace all my REST APIs with gRPC?

No. Keep public-facing APIs as REST (broad compatibility, easy debugging with curl) and use gRPC for internal service-to-service communication. gRPC's tooling ecosystem is weaker for external developers, and browsers cannot call gRPC directly without a proxy. The migration path is: identify high-traffic internal APIs, create .proto contracts, deploy gRPC services behind your existing gateway, and gradually shift traffic.

Q: How do I handle errors in gRPC?

gRPC uses status codes (NOT_FOUND, INVALID_ARGUMENT, UNAVAILABLE, etc.) instead of HTTP status codes. Always map business errors to appropriate gRPC codes; do not return OK with an error message in the payload. Implement retry logic on the client for UNAVAILABLE and DEADLINE_EXCEEDED, but not for INVALID_ARGUMENT (retrying a bad request is wasted work). Use deadline propagation: if a client gives you 500ms, forward a shorter deadline to downstream services.

Q: What about gRPC vs GraphQL for microservices?

gRPC and GraphQL serve different layers. gRPC is for service-to-service communication: fast, typed, binary. GraphQL is for client-facing aggregation: flexible, discoverable, JSON. A common architecture is: browser/mobile → GraphQL gateway → gRPC microservices → databases. Do not use GraphQL between services; the flexibility is unnecessary overhead and the N+1 problem is worse at the service layer. Use gRPC for the "last mile" between your gateway and backend services.

A practical guide to gRPC for microservices: Protocol Buffers, streaming, load balancing, and migration from REST for high-performance RPC.

grpc microservices protobuf rpc streaming load-balancing performance guide

Topics: architecture api

Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.

Overview

REST APIs are the lingua franca of the web, but they are not designed for high-throughput, low-latency service-to-service communication. JSON is verbose, HTTP/1.1 is head-of-line blocked, and REST’s uniform interface is a poor fit for RPC-style operations. gRPC addresses all three problems: Protocol Buffers for compact serialization, HTTP/2 for multiplexed streams, and a strict contract-first approach via .proto files. This guide covers gRPC for microservices: when to adopt it, how to implement it, and how to migrate from REST without breaking existing clients.

When to Use

Use this guide when:

Your internal microservices are experiencing high latency or serialization overhead with REST/JSON
You need streaming, bidirectional communication, or strong typing between services
You are designing a polyglot microservices architecture and need language-neutral service contracts

Solution

Protocol Buffers Definition

syntax = "proto3";
package userservice;

service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (stream User);
  rpc StreamUpdates(stream UserEvent) returns (stream UpdateResponse);
}

message GetUserRequest { string user_id = 1; }
message User {
  string user_id = 1;
  string name = 2;
  string email = 3;
  int64 created_at = 4;
}
message ListUsersRequest { int32 page_size = 1; }
message UserEvent { string user_id = 1; EventType type = 2; enum EventType { CREATED = 0; UPDATED = 1; DELETED = 2; } }
message UpdateResponse { bool success = 1; string message = 2; }

Server Implementation (Python)

from concurrent import futures
import grpc
import user_service_pb2, user_service_pb2_grpc

class UserServiceServicer(user_service_pb2_grpc.UserServiceServicer):
    def GetUser(self, request, context):
        user = db.get_user(request.user_id)
        if not user:
            context.set_code(grpc.StatusCode.NOT_FOUND)
            return user_service_pb2.User()
        return user_service_pb2.User(user_id=user.id, name=user.name, email=user.email)

    def ListUsers(self, request, context):
        for user in db.list_users(limit=request.page_size):
            yield user_service_pb2.User(user_id=user.id, name=user.name)

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    user_service_pb2_grpc.add_UserServiceServicer_to_server(UserServiceServicer(), server)
    server.add_insecure_port('[::]:50051')
    server.start()
    server.wait_for_termination()

Client Implementation (Python)

import grpc
import user_service_pb2, user_service_pb2_grpc

channel = grpc.insecure_channel('user-service:50051')
stub = user_service_pb2_grpc.UserServiceStub(channel)

# Unary call
response = stub.GetUser(user_service_pb2.GetUserRequest(user_id="123"))
print(f"User: {response.name}")

# Server streaming
for user in stub.ListUsers(user_service_pb2.ListUsersRequest(page_size=10)):
    print(f"User: {user.name}")

Client Implementation (Go)

package main

import (
    "context"
    "log"
    "google.golang.org/grpc"
    pb "user-service/proto"
)

func main() {
    conn, err := grpc.Dial("user-service:50051", grpc.WithInsecure())
    if err != nil { log.Fatalf("did not connect: %v", err) }
    defer conn.Close()
    client := pb.NewUserServiceClient(conn)

    resp, err := client.GetUser(context.Background(), &pb.GetUserRequest{UserId: "123"})
    if err != nil { log.Fatalf("could not get user: %v", err) }
    log.Printf("User: %s", resp.Name)
}

Load Balancing with gRPC

# Envoy gRPC load balancing
clusters:
  - name: user_service_grpc
    connect_timeout: 0.25s
    type: EDS
    lb_policy: ROUND_ROBIN
    protocol_selection: USE_DOWNSTREAM_PROTOCOL
    load_assignment:
      cluster_name: user_service_grpc
      endpoints:
        - lb_endpoints:
            - endpoint:
                address: { socket_address: { address: user-service-1, port_value: 50051 } }
            - endpoint:
                address: { socket_address: { address: user-service-2, port_value: 50051 } }

Explanation

gRPC uses HTTP/2 as its transport, which gives it three advantages over REST/HTTP/1.1: multiplexing (many requests share one TCP connection), server push (for streaming), and binary framing (more compact than text). Protocol Buffers serialize to binary, typically 3–10x smaller than JSON for the same data. The contract-first approach forces API designers to define messages and services explicitly, which eliminates drift between client and server.

The biggest operational challenge is load balancing. HTTP/2 connections are long-lived, so naive L4 load balancers route all requests from one client to one server. You need an L7 (application) load balancer or a service mesh that understands gRPC and can route individual RPCs. Envoy, Linkerd, and NGINX with the gRPC module all support this.

Streaming is where gRPC truly differentiates itself. Unlike REST, where streaming requires WebSockets or Server-Sent Events, gRPC has four built-in patterns: unary (request/response), server streaming (server sends a sequence), client streaming (client sends a sequence), and bidirectional streaming (both send sequences). Bidirectional streaming is ideal for real-time features like chat, collaborative editing, or live dashboards.

Variants

Pattern	Use Case	gRPC Feature
Unary RPC	Simple CRUD operations	Standard request/response
Server Streaming	Sending large datasets, live updates	`returns (stream Message)`
Client Streaming	Uploading large files, batch operations	`stream Message` in request
Bidirectional Streaming	Chat, real-time games, collaborative editing	Both sides stream
gRPC-Web	Browser clients	Proxy via Envoy; limited streaming support
gRPC-Gateway	REST compatibility	Auto-generate REST API from `.proto`

Best Practices

Use protoc plugins to generate client/server stubs in every language you support
Version .proto files explicitly; never break field numbers in existing messages
Use context with timeouts in every RPC call to prevent hanging connections
Enable reflection in development for grpcurl debugging, but disable in production
Instrument with OpenTelemetry; gRPC metrics are harder to observe than HTTP without distributed tracing

Common Mistakes

Exposing gRPC directly to browsers; use gRPC-Web or a REST gateway instead
Not handling backpressure; unbounded streaming can exhaust memory on either side
Using insecure_channel in production; always use TLS/mTLS for service-to-service gRPC
Forgetting to set max_message_length; default limits can silently truncate large payloads
Changing field numbers in .proto files; this breaks wire compatibility for all clients

Frequently Asked Questions

Should I replace all my REST APIs with gRPC?

No. Keep public-facing APIs as REST (broad compatibility, easy debugging with curl) and use gRPC for internal service-to-service communication. gRPC’s tooling ecosystem is weaker for external developers, and browsers cannot call gRPC directly without a proxy. The migration path is: identify high-traffic internal APIs, create .proto contracts, deploy gRPC services behind your existing gateway, and gradually shift traffic.

How do I handle errors in gRPC?

gRPC uses status codes (NOT_FOUND, INVALID_ARGUMENT, UNAVAILABLE, etc.) instead of HTTP status codes. Always map business errors to appropriate gRPC codes; do not return OK with an error message in the payload. Implement retry logic on the client for UNAVAILABLE and DEADLINE_EXCEEDED, but not for INVALID_ARGUMENT (retrying a bad request is wasted work). Use deadline propagation: if a client gives you 500ms, forward a shorter deadline to downstream services.

What about gRPC vs GraphQL for microservices?

gRPC and GraphQL serve different layers. gRPC is for service-to-service communication: fast, typed, binary. GraphQL is for client-facing aggregation: flexible, discoverable, JSON. A common architecture is: browser/mobile → GraphQL gateway → gRPC microservices → databases. Do not use GraphQL between services; the flexibility is unnecessary overhead and the N+1 problem is worse at the service layer. Use gRPC for the “last mile” between your gateway and backend services.

Guide

gRPC in Microservices — High-Performance RPC Guide

Overview

When to Use

Solution

Protocol Buffers Definition

Server Implementation (Python)

Client Implementation (Python)

Client Implementation (Go)

Load Balancing with gRPC

Explanation

Variants

Best Practices

Common Mistakes

Frequently Asked Questions

Should I replace all my REST APIs with gRPC?

How do I handle errors in gRPC?

What about gRPC vs GraphQL for microservices?

API Gateway Design — Resilience, Routing, and Security

Microservices Architecture — When to Use and When Not To

Monolith to Microservices — Migration Strategies

System Design Interview Guide — Key Concepts

REST API Design Guide

Overview

When to Use

Solution

Protocol Buffers Definition

Server Implementation (Python)

Client Implementation (Python)

Client Implementation (Go)

Load Balancing with gRPC

Explanation

Variants

Best Practices

Common Mistakes

Frequently Asked Questions

Should I replace all my REST APIs with gRPC?

How do I handle errors in gRPC?

What about gRPC vs GraphQL for microservices?

Related Resources

API Gateway Design — Resilience, Routing, and Security

Microservices Architecture — When to Use and When Not To

Monolith to Microservices — Migration Strategies

System Design Interview Guide — Key Concepts

REST API Design Guide