Skip to content
SP StackPractices
advanced By StackPractices

gRPC in Microservices — High-Performance RPC Guide

A practical guide to gRPC for microservices: Protocol Buffers, streaming, load balancing, and migration from REST for high-performance RPC.

Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.

Overview

REST APIs are the lingua franca of the web, but they are not designed for high-throughput, low-latency service-to-service communication. JSON is verbose, HTTP/1.1 is head-of-line blocked, and REST’s uniform interface is a poor fit for RPC-style operations. gRPC addresses all three problems: Protocol Buffers for compact serialization, HTTP/2 for multiplexed streams, and a strict contract-first approach via .proto files. This guide covers gRPC for microservices: when to adopt it, how to implement it, and how to migrate from REST without breaking existing clients.

When to Use

Use this guide when:

  • Your internal microservices are experiencing high latency or serialization overhead with REST/JSON
  • You need streaming, bidirectional communication, or strong typing between services
  • You are designing a polyglot microservices architecture and need language-neutral service contracts

Solution

Protocol Buffers Definition

syntax = "proto3";
package userservice;

service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (stream User);
  rpc StreamUpdates(stream UserEvent) returns (stream UpdateResponse);
}

message GetUserRequest { string user_id = 1; }
message User {
  string user_id = 1;
  string name = 2;
  string email = 3;
  int64 created_at = 4;
}
message ListUsersRequest { int32 page_size = 1; }
message UserEvent { string user_id = 1; EventType type = 2; enum EventType { CREATED = 0; UPDATED = 1; DELETED = 2; } }
message UpdateResponse { bool success = 1; string message = 2; }

Server Implementation (Python)

from concurrent import futures
import grpc
import user_service_pb2, user_service_pb2_grpc

class UserServiceServicer(user_service_pb2_grpc.UserServiceServicer):
    def GetUser(self, request, context):
        user = db.get_user(request.user_id)
        if not user:
            context.set_code(grpc.StatusCode.NOT_FOUND)
            return user_service_pb2.User()
        return user_service_pb2.User(user_id=user.id, name=user.name, email=user.email)

    def ListUsers(self, request, context):
        for user in db.list_users(limit=request.page_size):
            yield user_service_pb2.User(user_id=user.id, name=user.name)

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    user_service_pb2_grpc.add_UserServiceServicer_to_server(UserServiceServicer(), server)
    server.add_insecure_port('[::]:50051')
    server.start()
    server.wait_for_termination()

Client Implementation (Python)

import grpc
import user_service_pb2, user_service_pb2_grpc

channel = grpc.insecure_channel('user-service:50051')
stub = user_service_pb2_grpc.UserServiceStub(channel)

# Unary call
response = stub.GetUser(user_service_pb2.GetUserRequest(user_id="123"))
print(f"User: {response.name}")

# Server streaming
for user in stub.ListUsers(user_service_pb2.ListUsersRequest(page_size=10)):
    print(f"User: {user.name}")

Client Implementation (Go)

package main

import (
    "context"
    "log"
    "google.golang.org/grpc"
    pb "user-service/proto"
)

func main() {
    conn, err := grpc.Dial("user-service:50051", grpc.WithInsecure())
    if err != nil { log.Fatalf("did not connect: %v", err) }
    defer conn.Close()
    client := pb.NewUserServiceClient(conn)

    resp, err := client.GetUser(context.Background(), &pb.GetUserRequest{UserId: "123"})
    if err != nil { log.Fatalf("could not get user: %v", err) }
    log.Printf("User: %s", resp.Name)
}

Load Balancing with gRPC

# Envoy gRPC load balancing
clusters:
  - name: user_service_grpc
    connect_timeout: 0.25s
    type: EDS
    lb_policy: ROUND_ROBIN
    protocol_selection: USE_DOWNSTREAM_PROTOCOL
    load_assignment:
      cluster_name: user_service_grpc
      endpoints:
        - lb_endpoints:
            - endpoint:
                address: { socket_address: { address: user-service-1, port_value: 50051 } }
            - endpoint:
                address: { socket_address: { address: user-service-2, port_value: 50051 } }

Explanation

gRPC uses HTTP/2 as its transport, which gives it three advantages over REST/HTTP/1.1: multiplexing (many requests share one TCP connection), server push (for streaming), and binary framing (more compact than text). Protocol Buffers serialize to binary, typically 3–10x smaller than JSON for the same data. The contract-first approach forces API designers to define messages and services explicitly, which eliminates drift between client and server.

The biggest operational challenge is load balancing. HTTP/2 connections are long-lived, so naive L4 load balancers route all requests from one client to one server. You need an L7 (application) load balancer or a service mesh that understands gRPC and can route individual RPCs. Envoy, Linkerd, and NGINX with the gRPC module all support this.

Streaming is where gRPC truly differentiates itself. Unlike REST, where streaming requires WebSockets or Server-Sent Events, gRPC has four built-in patterns: unary (request/response), server streaming (server sends a sequence), client streaming (client sends a sequence), and bidirectional streaming (both send sequences). Bidirectional streaming is ideal for real-time features like chat, collaborative editing, or live dashboards.

Variants

PatternUse CasegRPC Feature
Unary RPCSimple CRUD operationsStandard request/response
Server StreamingSending large datasets, live updatesreturns (stream Message)
Client StreamingUploading large files, batch operationsstream Message in request
Bidirectional StreamingChat, real-time games, collaborative editingBoth sides stream
gRPC-WebBrowser clientsProxy via Envoy; limited streaming support
gRPC-GatewayREST compatibilityAuto-generate REST API from .proto

Best Practices

  1. Use protoc plugins to generate client/server stubs in every language you support
  2. Version .proto files explicitly; never break field numbers in existing messages
  3. Use context with timeouts in every RPC call to prevent hanging connections
  4. Enable reflection in development for grpcurl debugging, but disable in production
  5. Instrument with OpenTelemetry; gRPC metrics are harder to observe than HTTP without distributed tracing

Common Mistakes

  1. Exposing gRPC directly to browsers; use gRPC-Web or a REST gateway instead
  2. Not handling backpressure; unbounded streaming can exhaust memory on either side
  3. Using insecure_channel in production; always use TLS/mTLS for service-to-service gRPC
  4. Forgetting to set max_message_length; default limits can silently truncate large payloads
  5. Changing field numbers in .proto files; this breaks wire compatibility for all clients

Frequently Asked Questions

Should I replace all my REST APIs with gRPC?

No. Keep public-facing APIs as REST (broad compatibility, easy debugging with curl) and use gRPC for internal service-to-service communication. gRPC’s tooling ecosystem is weaker for external developers, and browsers cannot call gRPC directly without a proxy. The migration path is: identify high-traffic internal APIs, create .proto contracts, deploy gRPC services behind your existing gateway, and gradually shift traffic.

How do I handle errors in gRPC?

gRPC uses status codes (NOT_FOUND, INVALID_ARGUMENT, UNAVAILABLE, etc.) instead of HTTP status codes. Always map business errors to appropriate gRPC codes; do not return OK with an error message in the payload. Implement retry logic on the client for UNAVAILABLE and DEADLINE_EXCEEDED, but not for INVALID_ARGUMENT (retrying a bad request is wasted work). Use deadline propagation: if a client gives you 500ms, forward a shorter deadline to downstream services.

What about gRPC vs GraphQL for microservices?

gRPC and GraphQL serve different layers. gRPC is for service-to-service communication: fast, typed, binary. GraphQL is for client-facing aggregation: flexible, discoverable, JSON. A common architecture is: browser/mobile → GraphQL gateway → gRPC microservices → databases. Do not use GraphQL between services; the flexibility is unnecessary overhead and the N+1 problem is worse at the service layer. Use gRPC for the “last mile” between your gateway and backend services.