Understanding FastAPI: Building Production-Grade Asynchronous Applications with MCP

Riley Learning

17 May 2025 • 12 min read

As the demand for real-time, responsive, and scalable AI applications grows, building robust asynchronous APIs becomes essential. In this guide, we explore FastAPI, a high-performance web framework for Python, and how it can power production-grade asynchronous applications—particularly those integrating with AI orchestration protocols like the Model Context Protocol (MCP). The code below is based on mcp-client-python-example.

FastAPI: The Modern Framework for Async Web Applications

FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.7+ based on standard Python type hints. It is designed to be:

Fast to code: Developer-friendly with automatic docs
Asynchronous: Supports async / await syntax for non-blocking operations
Production-ready: Easily scalable and suitable for real deployments
Fast to run: Built on top of Starlette and Pydantic

Type Hints in Python?

Type hints, introduced in Python 3.5, let you explicitly declare the expected data types of variables, function parameters, and return values.

They don’t affect how the code runs, but they help:

Improve code readability
Enable static type checking (e.g., with tools like mypy)
Power frameworks like FastAPI and Pydantic to validate data automatically

def add(a: int, b: int) -> int:
    return a + b

This means:

a should be an int
b should be an int
The function will return an int

Even if you pass a string, Python won’t stop you at runtime — but tools or frameworks can warn you or reject invalid input.
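To see this concretely, the short sketch below calls the same add function with strings. Nothing in it is specific to FastAPI; it only illustrates that annotations are stored as metadata and are not checked when the function is called.

```python
def add(a: int, b: int) -> int:
    return a + b

# Annotations are kept on the function object as plain metadata...
print(add.__annotations__)

# ...but Python does not check them at call time.
# str + str is a valid operation, so this concatenates instead of failing.
result = add("a", "b")
print(result)
```

A static checker such as mypy would flag the second call before the code runs, which is the kind of early feedback type hints are meant to enable.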

In FastAPI, type hints are used to:

Validate request data automatically (via Pydantic)
Generate OpenAPI docs dynamically
Parse query/path/body parameters with correct types
Enable auto-completion in editors (e.g., VS Code, PyCharm)

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Item(BaseModel):
    name: str
    price: float

@app.post("/items/")
async def create_item(item: Item) -> dict:
    return {"message": f"{item.name} created with price {item.price}"}

FastAPI uses type hints (item: Item) to know it should parse the request body as JSON, validate it against the Item model, and reject bad data.
The return type -> dict is used to auto-document the response schema.

Starlette?

Starlette is the web framework foundation that FastAPI is built upon. It provides the core features for building asynchronous web services.

Key Features:

Async support: Designed with async/await for high-performance I/O
Routing: Manages HTTP endpoints and path matching
Middleware support: Lets you intercept and modify requests/responses
WebSocket support
Background tasks: Allows tasks to run in the background after sending responses

In short: FastAPI = Starlette (web layer) + Pydantic (data layer) + API docs

Pydantic?

Pydantic is a library used for data validation and parsing based on Python type annotations.

In FastAPI, it powers the request/response models and automatically handles validation, parsing, and error reporting.

Key Features:

Type-based validation: Validates incoming data using Python type hints
Automatic parsing: Converts JSON into Python objects
Error reporting: Returns detailed validation errors automatically
Fast execution: The core validation logic of Pydantic v2 is written in Rust (v1 could optionally be compiled with Cython)

from pydantic import BaseModel

class Item(BaseModel):
    name: str
    price: float
    in_stock: bool

With this Item model:

FastAPI validates incoming JSON against the model
Automatically parses the JSON into a Python object
Returns a 422 Unprocessable Entity error if validation fails
Uses the model to auto-generate interactive API docs (e.g., Swagger UI)

Tool	Role
Type Hints	Define expected types, improve code clarity, and enable validation
Starlette	Web engine behind FastAPI (async routes, requests, middleware)
Pydantic	Data validation and parsing (powered by type hints)

What is async/await syntax?

async and await are keywords in Python (3.5+) used to write asynchronous, non-blocking code in a clean and readable way.

They allow you to define and run coroutines—functions that can pause and resume without blocking the rest of the program.

How it works

async def defines a coroutine function (like a normal function, but can pause).
await is used to pause execution until an asynchronous task is complete.

Why is this useful?

Traditional (synchronous) code waits for each operation to finish before moving to the next.

Asynchronous code using async/await can:

Pause when waiting for I/O (e.g., database, network request)
Let other tasks run in the meantime
Improve performance, scalability, and responsiveness

Example

Synchronous (blocking)

import time

def fetch_data():
    time.sleep(3)  # Blocks the program for 3 seconds
    return "Data"

print(fetch_data())
print("Next task")  # Runs *after* 3 seconds

Asynchronous (non-blocking)

import asyncio

async def fetch_data():
    await asyncio.sleep(3)  # Non-blocking pause
    return "Data"

async def main():
    data = await fetch_data()
    print(data)
    print("Next task")  # Runs immediately after fetch

asyncio.run(main())

With await, only that coroutine pauses, not the whole app. While it waits, the event loop is free to run other coroutines.
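The example above awaits a single coroutine, so on its own it takes the same three seconds as the blocking version. The benefit shows up when several waits overlap. Here is a minimal sketch using asyncio.gather; the names and the 0.2-second delays are illustrative assumptions, with asyncio.sleep standing in for a real network or database call.

```python
import asyncio
import time

async def fetch_data(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for a network or database call
    return f"{name} done"

async def main() -> tuple[list[str], float]:
    start = time.perf_counter()
    # Run three waits concurrently; total time is roughly the longest wait,
    # not the sum of all three.
    results = await asyncio.gather(
        fetch_data("a", 0.2),
        fetch_data("b", 0.2),
        fetch_data("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return list(results), elapsed

results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

Run sequentially, the three calls would take about 0.6 seconds; run concurrently, they finish in roughly 0.2 seconds.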

Why FastAPI uses async/await

FastAPI is built for high-concurrency environments:

Handles many requests simultaneously
Uses async/await to avoid blocking the server
Ideal for I/O-heavy tasks like:
- Calling LLM APIs (e.g., OpenAI, Anthropic)
- Talking to databases
- Calling external APIs

FastAPI Basic Syntax & Terminology

FastAPI is built around Python’s modern async features and type annotations. Here are some fundamental terms and how they’re used:

async def

Defines an asynchronous function (coroutine) that allows non-blocking operations. These are essential for I/O-bound tasks.

from fastapi import FastAPI

app = FastAPI()

@app.get("/hello")
async def hello():
    return {"message": "Hello, World!"}

The line @app.get("/hello") is called a decorator, and its role is:

Registering the Route

This decorator tells FastAPI:

“When an HTTP GET request is made to the path /hello, run the hello() function and return its response.”

It binds the function directly below it (hello) to a GET request handler.
"/hello" is the URL path for that endpoint.
FastAPI automatically:
- Registers this function as an endpoint
- Handles request parsing
- Converts the return value (dict) to JSON
- Generates OpenAPI documentation

Other common FastAPI route decorators:

HTTP Method	Decorator Example
GET	@app.get("/items")
POST	@app.post("/items")
PUT	@app.put("/items/{id}")
DELETE	@app.delete("/items/{id}")

await

Used inside an async def function to pause execution until an asynchronous task completes. It does not block the entire application.

import asyncio

@app.get("/delay")
async def wait_example():
    await asyncio.sleep(2)
    return {"done": True}

Pydantic Models

Pydantic is used for defining data validation schemas.

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

Dependency Injection

FastAPI uses Depends to handle shared logic or reusable components.

from fastapi import Depends

def get_db():
    db = connect_to_database()
    try:
        yield db
    finally:
        db.close()

@app.get("/items")
async def read_items(db=Depends(get_db)):
    return db.query_items()

These features make FastAPI powerful, concise, and suitable for production environments.

What Makes an Application "Production-Grade"?

“Production-grade” refers to software that is ready for real-world deployment: reliable, robust, and capable of serving real users at scale with minimal issues. Such applications exhibit several key characteristics:

Stability: Consistent performance with minimal crashes or unexpected behaviors
Scalability: Ability to handle increasing loads without degradation
Observability: Comprehensive logging, metrics, and tracing capabilities
Security: Protection against common vulnerabilities and exploits
Resilience: Ability to recover from errors and failures gracefully, that is, to handle problems or shutdowns without crashing or leaving resources in a broken state
Maintainability: Clean, well-structured code that's easy to update and extend
Resource Management: Efficient use of CPU, memory, and network resources

The Role of Asynchronous Programming

Asynchronous programming is a paradigm that allows operations to be performed concurrently without blocking the execution flow. This is particularly valuable for I/O-bound applications (like web services) that spend significant time waiting for external resources.

Key benefits include:

Improved Throughput: Handling more requests with the same resources
Better Responsiveness: Preventing long-running operations from blocking others
Efficient Resource Utilization: Making optimal use of available system resources

How FastAPI Facilitates Production-Grade Applications

FastAPI makes it easier to build production-ready applications by providing:

Structured Error Handling: Comprehensive exception handling with HTTP status codes
Request Validation: Automatic validation of request parameters and body
Response Models: Defined response structures with validation
Background Tasks: Support for asynchronous background operations
Middleware Support: Pre-processing and post-processing of requests
Testing Utilities: Simplified testing of asynchronous endpoints

Integrating FastAPI with Model Context Protocol (MCP)

The Model Context Protocol (MCP) client example demonstrates many aspects of building production-grade async applications. While the repository doesn't directly use FastAPI, it implements similar patterns that could be easily integrated with FastAPI to create a robust, production-ready AI service.

Understanding the MCP Client Code

Looking at the code below, we can identify several production-grade patterns:

from contextlib import AsyncExitStack
from typing import Optional

from anthropic import Anthropic
from mcp import ClientSession

class MCPClient:
    def __init__(self):
        # Initialize session and client objects
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        self.anthropic = Anthropic()

    async def connect_to_server(self, server_script_path: str):
        # Connection logic with proper error handling
        ...

    async def process_query(self, query: str) -> str:
        # Process queries using Claude and available tools
        ...

    async def chat_loop(self):
        # Interactive chat loop with error handling
        ...

    async def cleanup(self):
        """Clean up resources"""
        await self.exit_stack.aclose()

The code demonstrates:

Proper Resource Management: Using AsyncExitStack for managing async resources
Error Handling: Try-except blocks for graceful error recovery
Type Annotations: Using Python's type hints for better code clarity
Asynchronous Operations: Using async/await for non-blocking operations
Clean Separation of Concerns: Different methods for different responsibilities

How This Could Be Integrated with FastAPI

To transform this MCP client into a production-grade FastAPI application, we could:

from fastapi import FastAPI, BackgroundTasks, HTTPException, Depends
from pydantic import BaseModel

app = FastAPI(title="MCP API Service")

class Query(BaseModel):
    text: str

# Dependency to get MCP client
async def get_mcp_client():
    client = MCPClient()
    try:
        await client.connect_to_server("path/to/server_script.py")
        yield client
    finally:
        await client.cleanup()

@app.post("/query", response_model=dict)
async def process_query(query: Query, client: MCPClient = Depends(get_mcp_client)):
    try:
        result = await client.process_query(query.text)
        return {"response": result}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Processing error: {str(e)}")

This integration would provide:

API Endpoints: RESTful interface to the MCP functionality
Request Validation: Automatic validation via Pydantic models
Dependency Injection: Managed lifecycle of the MCP client
Error Handling: Proper HTTP errors with informative messages
Documentation: Automatic API documentation via Swagger UI

Let’s break down the purpose and meaning of the following components from the FastAPI example above:

async def get_mcp_client()

This is a FastAPI dependency function. It’s designed to:

Create and yield an instance of your MCPClient class (used to communicate with the MCP server)
Ensure proper resource cleanup after use

async def get_mcp_client():
    client = MCPClient()  # Create an instance
    try:
        await client.connect_to_server("path/to/server_script.py")  # Connect to server
        yield client  # Pass this client to any route that needs it
    finally:
        await client.cleanup()  # Clean up resources after the route is done

async def process_query(query: Query, client: MCPClient = Depends(get_mcp_client))

This is your route handler function for POST requests to /query.

query: Query: Accepts a request body that matches the Query Pydantic model (with a text: str field).
client: MCPClient = Depends(get_mcp_client): Tells FastAPI to inject the result of get_mcp_client() into this parameter. It will:
- Run get_mcp_client()
- Yield the client to use
- Clean up afterward

@app.post("/query")
async def process_query(query: Query, client: MCPClient = Depends(get_mcp_client)):

Using this code, you’re:

Receiving a JSON payload like { "text": "your query" }
Passing that to client.process_query(...)
Returning the result as a JSON response

@app.post("/query", response_model=dict)

This is a FastAPI route decorator, meaning that:

@app.post("/query") → Registers this function as an HTTP POST handler for the /query endpoint.
response_model=dict → FastAPI will:
- Validate that the return value is a dict
- Document the response format in the OpenAPI docs (Swagger UI)

Component	Purpose
async def get_mcp_client()	Creates and manages the lifecycle of an MCPClient instance
Depends(get_mcp_client)	Injects the MCPClient into the route handler
async def process_query(...)	Main logic for processing a POST request using the client
@app.post("/query", response_model=dict)	Registers the route and defines response type

Key Components for Production-Grade Async Applications

Whether using FastAPI, the MCP client, or any other async framework, several patterns are essential for production-grade applications:

1. Proper Resource Management

The MCP client demonstrates good resource management with AsyncExitStack:

self.exit_stack = AsyncExitStack()
# ...
await self.exit_stack.aclose()  # Proper cleanup

AsyncExitStack() is a utility provided by Python’s contextlib module. It helps manage multiple asynchronous context managers (things used with async with) in a clean, organized way, especially when you need to enter and exit them dynamically.
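As a rough illustration of that behavior, the sketch below enters two async context managers one at a time and then closes them with a single aclose() call. The fake_resource helper and the resource names are hypothetical stand-ins for the transport and session objects in the MCP client.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events: list[str] = []

@asynccontextmanager
async def fake_resource(name: str):
    # Hypothetical resource; stands in for a transport or session.
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")

async def main() -> None:
    stack = AsyncExitStack()
    # Enter context managers one by one, registering each on the stack.
    await stack.enter_async_context(fake_resource("transport"))
    await stack.enter_async_context(fake_resource("session"))
    events.append("working")
    # A single aclose() exits everything in reverse order of entry.
    await stack.aclose()

asyncio.run(main())
print(events)
```

The reverse-order teardown matters when resources depend on each other: the session is closed before the transport it runs on.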

In FastAPI, this would be handled through dependencies:

async def get_resource():
    resource = await create_resource()
    try:
        yield resource
    finally:
        await resource.close()

yield resource

yield is used here to pause the function and “return” the resource to FastAPI so it can be used in your endpoint.
This is part of a “context-managed dependency” pattern.
After the request is handled, execution continues after the yield.
Why it’s used in FastAPI: This allows setup before yield, use during the request, and teardown after.

finally:

The finally block is always executed, even if an error occurs in the request handler.
It ensures that the resource is cleaned up properly, no matter what.

await resource.close()

This calls the resource’s close() method (usually to release memory, close connections, etc.).
Because the resource is asynchronous (e.g., an async DB or API client), await ensures the cleanup is done properly.

Therefore, the lifecycle can be summarized:

1.	resource = await create_resource() — Asynchronously create the resource
2.	yield resource — Temporarily “return” the resource to be used in an endpoint
3.	After the request finishes, jump to finally:
4.	await resource.close() — Clean up the resource asynchronously
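The lifecycle above can be observed without a web server. The sketch below is not FastAPI's internal code; it uses contextlib.asynccontextmanager from the standard library to drive the same generator in a similar way, and FakeResource, create_resource, and handle_request are hypothetical names added for illustration.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []

class FakeResource:
    # Hypothetical stand-in for an async DB or API client.
    async def close(self) -> None:
        events.append("closed")

async def create_resource() -> FakeResource:
    events.append("created")
    return FakeResource()

async def get_resource():
    resource = await create_resource()
    try:
        yield resource
    finally:
        await resource.close()

async def handle_request(fail: bool) -> None:
    # Run the generator up to its yield, use the value, then resume it
    # after the block so the finally clause executes.
    async with asynccontextmanager(get_resource)() as resource:
        events.append("handling")
        if fail:
            raise RuntimeError("handler failed")

async def main() -> None:
    await handle_request(fail=False)
    try:
        await handle_request(fail=True)
    except RuntimeError:
        events.append("error propagated")

asyncio.run(main())
print(events)
```

In both runs "closed" is recorded after "handling", including the run where the handler raises, which is the guarantee the finally block provides.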

2. Graceful Error Handling

The MCP client handles errors in its chat loop:

try:
    response = await self.process_query(query)
    print("\n" + response)
except Exception as e:
    print(f"\nError: {str(e)}")

In FastAPI, this translates to exception handlers:

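from fastapi.responses import JSONResponse

class CustomException(Exception):
    """Application-specific error (hypothetical, for illustration)."""

@app.exception_handler(CustomException)
async def custom_exception_handler(request, exc):
    return JSONResponse(
        status_code=418,
        content={"message": f"Error: {str(exc)}"},
    )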

3. Asynchronous Operations

Both the MCP client and FastAPI use Python's async/await for non-blocking operations:

# MCP client
async def process_query(self, query: str) -> str:
    ...  # Async processing

# FastAPI
@app.get("/items/{item_id}")
async def read_item(item_id: int):
    ...  # Async endpoint

4. Structured Logging

A production-grade application should include proper logging:

import logging

logger = logging.getLogger("app")

async def process_query(self, query: str) -> str:
    logger.info(f"Processing query: {query[:30]}...")
    try:
        result = await self._internal_process(query)
        logger.info("Query processed successfully")
        return result
    except Exception as e:
        logger.error(f"Error processing query: {str(e)}", exc_info=True)
        raise
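The snippet above logs plain text. If "structured" is taken to mean machine-parseable output, one possible approach with only the standard library is a custom formatter that emits one JSON object per line. The JsonFormatter class and the query_preview field are assumptions made for this sketch, not part of the MCP example.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        if hasattr(record, "query_preview"):
            payload["query_preview"] = record.query_preview
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

query = "What is the weather in Paris today?"
logger.info("Processing query", extra={"query_preview": query[:30]})
```

Keeping variable data in separate fields rather than interpolating it into the message makes the logs easier to filter and aggregate in a log pipeline.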

5. Robust Connection Management

The MCP client manages connections carefully:

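async def connect_to_server(self, server_script_path: str):
    # Validate input
    is_python = server_script_path.endswith(".py")
    is_js = server_script_path.endswith(".js")
    if not (is_python or is_js):
        raise ValueError("Server script must be a .py or .js file")

    # Create connection (server_params is built from the script path; elided here)
    stdio_transport = await self.exit_stack.enter_async_context(
        stdio_client(server_params)
    )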

In FastAPI, this would be implemented through a lifespan handler (the successor to the older startup/shutdown events) and dependencies.
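One way to connect once when the application starts and clean up once when it stops is an async context manager that wraps the whole serving period. The sketch below shows only the shape of that pattern using the standard library; FakeMCPClient and FakeApp are hypothetical placeholders so the example runs without FastAPI or an MCP server.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []

class FakeMCPClient:
    # Hypothetical stand-in for the MCPClient class shown earlier.
    async def connect_to_server(self, path: str) -> None:
        events.append(f"connected to {path}")

    async def cleanup(self) -> None:
        events.append("cleaned up")

class FakeApp:
    # Minimal placeholder for the application object; only carries state.
    def __init__(self) -> None:
        self.state: dict = {}

@asynccontextmanager
async def lifespan(app):
    # Startup: connect once and keep the client on the app.
    client = FakeMCPClient()
    await client.connect_to_server("path/to/server_script.py")
    app.state["mcp_client"] = client
    try:
        yield
    finally:
        # Shutdown: release the connection exactly once.
        await client.cleanup()

async def main() -> None:
    app = FakeApp()
    async with lifespan(app):
        events.append("serving requests")

asyncio.run(main())
print(events)
```

With FastAPI, a function of this shape is passed as FastAPI(lifespan=lifespan), and the real app.state is an attribute namespace (app.state.mcp_client) rather than a dict. Compared with a per-request dependency, this avoids reconnecting to the MCP server on every call.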

Practical Implementation Steps

To build a production-grade async application integrating FastAPI with MCP:

Structure of the project:

project/
├── app/
│   ├── __init__.py
│   ├── main.py         # FastAPI application
│   ├── mcp_client.py   # MCP client implementation
│   ├── models.py       # Pydantic data models
│   ├── dependencies.py # FastAPI dependencies
│   └── routers/        # API endpoints
├── tests/              # Test suite
├── requirements.txt    # Dependencies
└── Dockerfile          # Container definition

Implement Core Functionality:

Port the MCP client logic to a service class
Create FastAPI endpoints that utilize the MCP service
Implement proper error handling and validation

Add Production Features:

Logging with structured output
Health check endpoints
Metrics collection
Rate limiting
Authentication and authorization

Containerize the Application:

Containerizing an application means packaging it—along with all its dependencies, libraries, configuration files, and runtime—into a single, portable image that can run reliably on any system that has Docker installed.

Think of a container as a lightweight, standalone box that ensures:

Your app works the same in dev, test, and prod environments.
You avoid dependency conflicts and “it works on my machine” issues.
You can easily deploy the app to servers, cloud, or orchestration systems like Kubernetes.

The Dockerfile Breakdown

FROM python:3.10

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY ./app ./app

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Here’s what this Dockerfile does, line by line.

Base image: Use the official Python 3.10 environment from Docker Hub.

FROM python:3.10

Working directory inside the container: Everything from here on will happen in /app.

WORKDIR /app

Copy requirements.txt from your local machine into the container.

COPY requirements.txt .

Install dependencies listed in requirements.txt.

RUN pip install --no-cache-dir -r requirements.txt

Copy your application code (e.g., FastAPI app) into the container.

COPY ./app ./app

Run the app using Uvicorn (the ASGI server), exposing it on port 8000.

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

uvicorn?

uvicorn is an ASGI server (Asynchronous Server Gateway Interface).
It runs your FastAPI app (or any ASGI app).
It’s lightweight, fast, and supports async I/O.
Think of uvicorn as the engine that takes your Python API code and serves it as a real web server.

app.main:app

This refers to where your FastAPI app is defined.
Format: <module path>:<FastAPI instance>. Here, app.main is the module app/main.py and app is the FastAPI() object defined in it.

--host 0.0.0.0

Tells the server to listen on all network interfaces.
This is required in Docker, because the app must be accessible outside the container.
Without 0.0.0.0, your app would only be accessible from inside the container itself.

--port 8000

Tells uvicorn to serve the app on port 8000 inside the container.
You can map this to your local machine with docker run -p 8000:8000.

Set Up CI/CD:

CI/CD (Continuous Integration and Continuous Deployment/Delivery) is a DevOps practice that automates the building, testing, and deployment of code so that updates can be delivered quickly, safely, and reliably. A typical pipeline includes:

Automated testing
Linting and code quality checks
Deployment pipelines

Conclusion

Building production-grade async applications requires attention to many details beyond just making the core functionality work. FastAPI provides an excellent foundation for creating such applications, with built-in support for async operations, validation, documentation, and more.

The Model Context Protocol client example demonstrates many of these production-grade patterns, focusing on resource management, error handling, and clean async code. By integrating these approaches with FastAPI, you can create robust, scalable services that leverage AI models through MCP.

Whether you're building an AI service with MCP or any other async web application, following these patterns will help ensure your application is truly production-ready: stable, scalable, observable, secure, and maintainable.

Remember that the journey to a production-grade application doesn't end with deployment—continuous monitoring, refinement, and improvement are essential parts of maintaining a high-quality service in production.

Happy coding!