Integrating MongoDB with FastAPI using the Motor driver


The next step is to configure our project to use MongoDB, which benefits greatly from the automated CRUD implementation we created earlier with FastAPI. We'll use the same database project structure we established in the previous section.

We will practice using a NoSQL database, specifically MongoDB, in FastAPI. We will cover how to install MongoDB on Windows and on macOS (for macOS we assume you use Homebrew, which is simply a package manager for macOS and Linux). Before starting, we will compare MongoDB with traditional relational (SQL) databases.

SQL vs. NoSQL

  • Relational Databases (SQL): As we have seen so far, they are structured. They function like spreadsheet tables linked to each other. They have a fixed schema; for example, if you have a "tasks" table with ID and Name, and then you want to save a Description, you must first modify the database schema.
  • NoSQL Databases (MongoDB): These are unstructured (or semi-structured) databases. Data is stored in a flexible way, generally in formats similar to JSON. This allows the structure to change from one document to the next without a migration. For example, we can add a "category" field directly to a "task" document.
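
To make the contrast concrete, here is a minimal sketch in plain Python, where dictionaries stand in for MongoDB documents. The task names and the "Billing" category are illustrative values, not data from the project:

```python
# Plain dictionaries stand in for documents in a "tasks" collection.
tasks = [
    {"name": "Write report"},                         # original shape
    {"name": "Send invoice", "category": "Billing"},  # extra field, no schema change
]

# Documents in the same collection may carry different sets of fields.
fields_per_task = [sorted(task.keys()) for task in tasks]
print(fields_per_task)
```

In a relational table, the second row would not be accepted until the column had been added to the schema.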

Advantages and Disadvantages

Advantages:

  • Flexibility: Ideal for rapid prototyping and changing schemas.
  • Massive scalability: Designed to handle gigantic volumes of data.
  • Speed: They tend to be more efficient for simple read/write operations.

Disadvantages:

  • Less consistency: By not having a fixed schema, they can become a mess if not managed well.
  • Complex queries: It is more difficult to perform joins or very intricate queries.
  • Maturity: Although they are popular, the SQL ecosystem has decades more of support and stability.

High Performance: FastAPI + MongoDB

The main focus is to help you break away from the relational SQL mindset and enter the NoSQL world.

  • FastAPI is a framework recognized for its extremely high performance and speed.
  • MongoDB is a database designed precisely for high performance and scalability.

This combination is ideal for high-demand projects where response speed is critical.

Flexibility vs. Consistency

One of the points I will repeat most often is the trade-off of lower data consistency in favor of greater flexibility. In MongoDB, everything is a JSON (technically BSON), which gives us total freedom, but also risks.

The Consistency Problem

Imagine we have a task with a category_id field. Due to MongoDB's flexibility, you could encounter:

  1. A task that has the category correctly defined.
  2. Another task that, due to a CRUD error, does not have the category field.
  3. A task with directly embedded tags, without an external relational table.

If you don't manage your logic well from the code, when trying to query the category of a record that doesn't have it, your application could throw a 500 error. In MongoDB, the responsibility for maintaining data integrity falls much more on the developer and how they program their CRUD.
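
One way to avoid that kind of failure is to read optional fields defensively. The following is only a sketch with plain dictionaries; the function name category_label and the "uncategorized" fallback are assumptions made for the example:

```python
def category_label(task: dict) -> str:
    """Return a printable category, tolerating documents without the field."""
    category_id = task.get("category_id")  # None instead of a KeyError
    if category_id is None:
        return "uncategorized"
    return f"category {category_id}"


with_category = {"name": "Task 1", "category_id": 1}
without_category = {"name": "Task 2"}  # e.g. saved by a faulty CRUD path

print(category_label(with_category))
print(category_label(without_category))
```

Accessing task["category_id"] directly on the second document would raise a KeyError, which an unhandled endpoint turns into a 500 response.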

Key Concepts and Equivalencies

Before starting our small project, it is essential that we speak the same language. If you come from the SQL world, here are the basic equivalencies:

SQL Concept    MongoDB Equivalent
-----------    ------------------
Table          Collection
Record/Row     Document
Column         Field

Installation on Windows

Installation on Windows is very simple:

  1. Search Google for MongoDB Community Server.
  2. Download the installer and follow the typical steps (Next, Next, Finish).
  3. Environment variables configuration: You will likely need to add the installation path (usually the bin folder) to the system environment variables.
    1. Tip: Right-click on "This PC" -> Properties -> Advanced system settings -> Environment Variables -> Path -> Add the bin folder path.
  4. Restart your computer and you will be able to use the mongosh command.

Installation on macOS (using Homebrew)

Installing MongoDB on macOS might seem complicated the first time, but using Homebrew makes the process much simpler and cleaner. In this guide, I explain step-by-step how to install MongoDB on macOS with Homebrew, how to start it correctly, and how to begin working with the database by performing basic CRUD operations.

This flow is the one I always use whenever I set up a new development environment on Mac, and it avoids most of the typical errors that usually appear when starting MongoDB for the first time.

With our package manager ready, nothing could be easier; the first thing we need to do is add the MongoDB repository to our package manager.

Prerequisites

Before installing MongoDB, it is important to ensure that the system has everything it needs.

Compatible macOS

MongoDB works correctly on modern versions of macOS (Catalina onwards). If you are using a very old version, it is recommended to update the system or install a compatible version of MongoDB.

Homebrew Installation

macOS does not include Homebrew by default, and it is one of the most important tools for development on Mac. Homebrew is a package manager that allows you to install software from the terminal easily.

To install it, follow the official instructions from their website:

https://brew.sh/#install

Necessary software to install MongoDB

Install Command Line Tools for Xcode

It is very likely that, the first time you run a brew command, macOS will ask you to install the Command Line Tools for Xcode. Accept the message and let them install, as they are necessary to compile and run many dependencies.

Install the MongoDB Homebrew Tap

What is a Homebrew tap?

A tap is simply an additional repository that Homebrew uses to find packages. MongoDB maintains its official tap for MongoDB software (https://github.com/mongodb/homebrew-brew), which is important to avoid unsupported installations.

Command to add the tap

$ brew tap mongodb/brew

This step is key; many errors come from trying to install MongoDB without using the official tap.

Choosing the MongoDB version

MongoDB publishes several versions. After adding the tap, we install the latest version available at the time of writing:

$ brew install mongodb-community@8.2

Or you can install a specific version:

$ brew install mongodb-community@8.0
$ brew install mongodb-community@7.0

Installing a specific version avoids incompatibilities with libraries or the operating system, something that has already saved me more than one headache.

Graphical Interface: MongoDB Compass

To work in a more pleasant way and not depend only on the terminal, we will install MongoDB Compass, the official graphical interface tool.

  • On Windows: It can be selected during the server installation or downloaded separately from the official website.
  • On macOS (via Homebrew):
    • $ brew install --cask mongodb-compass
    • (The --cask parameter indicates that we are installing an application with a graphical interface).

Once installed, you will find it in your Applications folder. Open it, connect to the local server, and you will be ready to manage your data collections.

You can also install it using the installer on macOS and Windows:

https://www.mongodb.com/try/download/compass

Checking the MongoDB Installation

Once the process is finished, MongoDB will be installed on your computer, but it will not be running yet.

Starting the MongoDB Process

Now that we have MongoDB on our computer, the next thing we are going to do is start the process. In the terminal, run:

$ brew services start mongodb-community

This command is fundamental. If you don't start the service and open the shell directly (mongo in older releases, mongosh in current ones), you will see a connection error like the following:

MongoDB shell version v8.0.2
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused :
connect@src/mongo/shell/mongo.js:372:17

This happens because MongoDB is not listening on port 27017.

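Once the service has started, the shell will connect without problems. Keep in mind that mongo is the legacy shell: it was removed in MongoDB 6.0, so with the versions installed above you must use mongosh instead:

$ mongosh

You can also check the installed server version:

$ mongod --version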

Starting the Service

Just as with services like MySQL, we must start the service before we can use it. If we try to open the MongoDB shell without starting the service:

$ mongosh

We will see an error like the following:

Current Mongosh Log ID: 699c28e47c1b4855cf41cae5
Connecting to:          mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.7.0
MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017

The message means the shell tried to connect, but no MongoDB server was listening. To fix it, we start the service:

$ brew services start mongodb-community@8.2

And now, if you run:

$ mongosh

It should greet you with a:

test>

Stopping or Restarting MongoDB

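Some useful commands I usually use (if you installed a versioned formula such as mongodb-community@8.2, use that full name as the service name):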

$ brew services stop mongodb-community
$ brew services restart mongodb-community

Common Errors and How to Resolve Them

Connection error on localhost

It is almost always because the service is not started. Verify with:

$ brew services list

Incompatible Version Issues

If you changed your macOS version or updated MongoDB, it might be necessary to reinstall the correct version or clean up old services.

Motor: The Asynchronous Driver for MongoDB

To work with MongoDB in Python, many connectors exist, but we are going to use Motor.

Motor is a driver specifically designed to work asynchronously. As we mentioned previously, MongoDB is designed for large data volumes and high concurrency. Combined with FastAPI's asynchronous model, this lets the application keep serving other requests while it waits for the database to answer. It makes perfect sense to use an asynchronous driver for high-traffic applications, which is precisely the scenario MongoDB targets.

Installation of Dependencies

To install the tool, the process is the standard one using pip. Remember to have your virtual environment active before running the command.

Keep in mind that as we move forward, we will be deleting the dependencies we no longer need, such as everything related to SQLite and SQLAlchemy, since MongoDB does not require these libraries.

To install Motor, run:

$ pip install motor

First Connection to MongoDB with Motor

We are going to make our first connection to MongoDB. We create a file named db_connection.py. In it, we import the Motor client to manage the asynchronous connection:

db_connection.py

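import logging

from motor.motor_asyncio import AsyncIOMotorClient

# Reuse uvicorn's logger so the messages appear in the server console
logger = logging.getLogger("uvicorn.error")

# A single client shared by the whole application
mongo_client = AsyncIOMotorClient("mongodb://localhost:27017")


async def ping_mongo_db_server():
    """Ping the server and fail fast if MongoDB is unreachable."""
    try:
        await mongo_client.admin.command("ping")
        logger.info("Connected to MongoDB")
    except Exception as e:
        logger.error(f"Error connecting to MongoDB: {e}")
        # Re-raise with the original traceback so the startup is aborted
        raise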

Client Configuration

Before starting, make sure the MongoDB service is active (as we saw in the first video). If typing mongosh in your terminal gives you the welcome message, everything is in order; otherwise, you must start it.

We define the connection URL, which by default uses port 27017 (similar to how MySQL uses 3306).

We create a function to "ping" the server. If the attempt is successful, we will show a message indicating that we are connected; otherwise, we will throw an exception to warn that there are connection problems.
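
In the listing above the connection URL is hard-coded. A common alternative is to read it from the environment; the sketch below assumes a hypothetical variable name, MONGODB_URL, which is not part of the original project:

```python
import os

DEFAULT_MONGODB_URL = "mongodb://localhost:27017"


def get_mongodb_url() -> str:
    """Read the connection URL from the environment, falling back to localhost."""
    # MONGODB_URL is a hypothetical variable name chosen for this sketch.
    return os.getenv("MONGODB_URL", DEFAULT_MONGODB_URL)


print(get_mongodb_url())
```

The returned string could then be passed to AsyncIOMotorClient instead of the literal URL, so that development and production can point at different servers without code changes.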

Cleaning up SQLAlchemy in FastAPI

In the main file of your API, you should comment out or delete everything related to SQLAlchemy and relational databases, since we will now use MongoDB.

  • Imports: Comment out the SQLAlchemy lines and the routes that depend on it.
  • Dependencies: You can delete the function that managed the relational database session.
  • Models: We will no longer need to create tables at startup, as MongoDB does not require a fixed predefined schema.

Life Cycle with Lifespan in FastAPI

To manage the connection efficiently, we will use the Lifespan event handler. If you weren't familiar with it, it is a simple way to control application life cycles.

  • Before the yield: Everything you place here will run before the application starts receiving requests. It is the ideal place to initialize the MongoDB client.
  • After the yield: Here we will place the logic to close the connection or clean up resources when the application stops.

Finally, we configure this lifespan when creating the FastAPI instance:

api.py

from fastapi import FastAPI, Depends, APIRouter, Query, Path
from contextlib import asynccontextmanager

from db_connection import ping_mongo_db_server


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: runs before the application accepts requests
    await ping_mongo_db_server()
    yield
    # Shutdown: cleanup logic (for example, closing the client) goes here


app = FastAPI(lifespan=lifespan)

Upon starting the server, you should see the message: "Connected to MongoDB" in the console. This confirms that the operation was successful and we are ready to start working with collections and documents.
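
The order in which the code before and after the yield runs can be checked without FastAPI or MongoDB. This is a minimal sketch using only the standard library; FakeClient is a hypothetical stand-in for the real client, and the events list exists only to record the order:

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


class FakeClient:
    """Hypothetical stand-in for the MongoDB client, used only in this sketch."""

    def close(self) -> None:
        events.append("client closed")


@asynccontextmanager
async def lifespan(app):
    client = FakeClient()
    events.append("startup: before yield")
    yield
    # Runs once, when the application shuts down.
    client.close()


async def main() -> None:
    # FastAPI enters and exits the context manager for us; here we do it by hand.
    async with lifespan(app=None):
        events.append("serving requests")


asyncio.run(main())
print(events)
```

In the real application, the line after the yield would typically be a call to mongo_client.close().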

Obtaining the MongoDB Client

Now, we are going to implement a service responsible for managing the connection. This service returns the database instance ready to use:

mongo_db.py

from db_connection import mongo_client

# Define the database that will contain all the collections of our application.
# MongoDB creates it lazily, on the first write, if it does not exist.
database = mongo_client.task_manager


def get_mongo_database():
    """Returns the database to be used as a dependency."""
    return database

Router and Schema Implementation

In the API file, we configure the routing under the Mongo Tasks tag:

api.py

from mongo_task import mongo_task_router
***
app.include_router(mongo_task_router, prefix="/mongo/tasks", tags=["Mongo Tasks"])

You will notice that, although the structure is similar to the one used with SQLAlchemy, there are key differences in the methods and how we handle data.

From ORM to Native Driver

Previously, we used an ORM for relational databases. In MongoDB, being a document-oriented database, the nomenclature changes:

  • Instead of traditional SQL methods, we use functions like insert_one, find, update_one, or delete_one.
  • Data structure: MongoDB works natively with JSON-like structures. In the case of Python, this translates into the constant use of dictionaries.

CRUD Operations Step by Step

Let's start with the initial imports:

from fastapi import APIRouter, Body, Depends, HTTPException, status, Path
from pymongo.database import Database
from bson import ObjectId
from mongo_db import get_mongo_database
from schemes import TaskWrite
mongo_task_router = APIRouter()

1. Create Task (POST)

We convert the model to a dictionary and insert the record. It is an asynchronous operation that returns the generated ID.

mongo_task.py

# CREATE
@mongo_task_router.post("/", status_code=status.HTTP_201_CREATED, summary="Create a new task")
async def add_task(
    task: TaskWrite = Body(...),
    db: Database = Depends(get_mongo_database),
):
    """
    Creates a new task in the database.
    """
    # task_dict = task.dict()
    task_dict = task.model_dump()
    insert_result = await db.tasks.insert_one(task_dict)
    return {
        "message": "Task added successfully",
        "id": str(insert_result.inserted_id),
    }

_id and the ObjectId

When inserting your first task, you will notice that the identifier is not an incremental number (1, 2, 3...), but a 24-character hexadecimal string called ObjectId.

Why isn't it a sequential number?

Relational databases are usually centralized, which makes it easy to keep an exact count. However, MongoDB is designed to be distributed.

If we had several MongoDB servers running in parallel, two servers might try to assign the ID "5" at the same time, generating a conflict. The ObjectId solves this by combining three parts into 12 bytes:

  • Timestamp (4 bytes): The creation time in seconds, which also makes the IDs roughly sortable by creation date.
  • Random value (5 bytes): Generated once per process, so different machines and processes do not collide.
  • Counter (3 bytes): An incrementing counter that keeps IDs unique even if several documents are created in the same second.
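
An ObjectId is 12 bytes: a 4-byte timestamp in seconds, a 5-byte random value, and a 3-byte counter. The sketch below splits one apart using only the standard library; the function name split_object_id is an assumption for this example, and the ID is one of the sample values shown later in this article:

```python
from datetime import datetime, timezone


def split_object_id(hex_id: str) -> dict:
    """Split a 24-character ObjectId string into its three parts."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    seconds = int.from_bytes(raw[0:4], "big")  # Unix time of creation
    return {
        "created_at": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "random": raw[4:9].hex(),
        "counter": int.from_bytes(raw[9:12], "big"),
    }


parts = split_object_id("699c6024ba20f652d828f93c")
print(parts)
```

With the real driver you do not need to do this by hand: bson.ObjectId exposes the creation time through its generation_time attribute.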

2. Read All Tasks (GET)

We use the find() method. It is important to convert the cursor returned by Mongo to a list using to_list() so that FastAPI can return it as a JSON.

mongo_task.py

# READ ALL
@mongo_task_router.get("/", summary="Get all tasks")
async def get_all_tasks(db: Database = Depends(get_mongo_database)):
    """
    Gets all tasks from the 'tasks' collection.
    """
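    tasks_cursor = db.tasks.find()
    # length=None loads every document into memory; pass a limit for large collections
    tasks = await tasks_cursor.to_list(length=None)
    # ObjectId is not JSON serializable, so expose each _id as a string
    for task in tasks:
        task["_id"] = str(task["_id"])
    return tasks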

3. Read a Specific Task (GET by ID)

Here we apply a double validation:

  • Format validation: We check if the received string is a valid ObjectId. If it is not, we return a 400 error immediately to save resources.
  • Search: If the format is correct but the record does not exist, we return a 404.

mongo_task.py

# READ ONE
@mongo_task_router.get("/{task_id}", summary="Get a task")
async def get_task(
    task_id: str = Path(..., description="The ID of the task to retrieve"),
    db: Database = Depends(get_mongo_database),
):
    """
    Gets a single task by its ID.
    """
    if not ObjectId.is_valid(task_id):
        raise HTTPException(status_code=400, detail=f"Invalid ObjectId: {task_id}")
    
    task = await db.tasks.find_one({"_id": ObjectId(task_id)})
    
    if not task:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"Task with id {task_id} not found"
        )
    # ObjectId is not JSON serializable, so expose it as a string
    task["_id"] = str(task["_id"])
    return task
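
The double validation can be reasoned about in isolation. In this sketch a regular expression approximates what ObjectId.is_valid checks for string input (24 hexadecimal characters), and a set of strings stands in for the collection; lookup_status and stored are hypothetical names used only here:

```python
import re

OBJECT_ID_PATTERN = re.compile(r"[0-9a-fA-F]{24}")


def lookup_status(task_id: str, stored_ids: set) -> int:
    """Return the HTTP status the endpoint would answer with."""
    if not OBJECT_ID_PATTERN.fullmatch(task_id):
        return 400  # malformed id: rejected before touching the database
    if task_id.lower() not in stored_ids:
        return 404  # well-formed id, but no such document
    return 200


stored = {"699c6024ba20f652d828f93c"}
print(lookup_status("abc", stored))
print(lookup_status("0" * 24, stored))
print(lookup_status("699c6024ba20f652d828f93c", stored))
```

Checking the format first also matters for correctness: passing a malformed string to ObjectId(...) raises an exception, which would otherwise surface as a 500 error.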

4. Update and Delete (PUT / DELETE)

In the update, we send only the fields we want to modify. For deletion, we simply search by the _id and execute delete_one. If the affected document count is zero, we report that no action was taken.

mongo_task.py

# UPDATE
@mongo_task_router.put("/{task_id}", summary="Update a task")
async def update_task(
    task_id: str = Path(..., description="The ID of the task to update"),
    task: TaskWrite = Body(...),
    db: Database = Depends(get_mongo_database),
):
    """
    Updates the fields of a task.
    """
    if not ObjectId.is_valid(task_id):
        raise HTTPException(status_code=400, detail=f"Invalid ObjectId: {task_id}")
    # update_data = task.dict(exclude_unset=True)
    update_data = task.model_dump(exclude_none=True)
    if not update_data:
        raise HTTPException(status_code=400, detail="No data provided for update")
    result = await db.tasks.update_one({"_id": ObjectId(task_id)}, {"$set": update_data})
    if result.matched_count == 0:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"Task with id {task_id} not found")
    
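    if result.modified_count == 1:
        updated_task = await db.tasks.find_one({"_id": ObjectId(task_id)})
        # ObjectId is not JSON serializable, so expose it as a string
        updated_task["_id"] = str(updated_task["_id"])
        return updated_task
    
    return {"message": "The task data was the same, no update was performed."}

# DELETE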
@mongo_task_router.delete("/{task_id}", status_code=status.HTTP_204_NO_CONTENT, summary="Delete a task")
async def delete_task(
    task_id: str = Path(..., description="The ID of the task to delete"),
    db: Database = Depends(get_mongo_database),
):
    """
    Deletes a task from the database.
    """
    if not ObjectId.is_valid(task_id):
        raise HTTPException(status_code=400, detail=f"Invalid ObjectId: {task_id}")
    result = await db.tasks.delete_one({"_id": ObjectId(task_id)})
    if result.deleted_count == 0:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"Task with id {task_id} not found")
    
    return

Verification in Mongo Compass

Once the tests are executed from the FastAPI interactive documentation (Swagger UI), you can refresh Mongo Compass. You will see how the documents are stored with their flexible JSON-type structure.

This workflow demonstrates how transparent switching from a relational schema to a NoSQL one can be if you have a good architecture.

Example of the Relational Schema in MongoDB

What do we want to do? Currently, we have the Task entity, but now I want to add a list of tags to it. In this case, it is a list of strings, although it could be anything else. Here is where the "weird" part begins: if this were a pure relational schema, the relationship wouldn't be so direct. Usually, we would define a property equal to an ID found in another entity. However, I didn't do it that way here because I want the tags to be simply embedded text.

Regarding the data model (Pydantic), we simply add the tags field, which is a list. We don't have an additional table for tags, and this is where I want you to reflect:

schemes.py

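# Imports needed by this excerpt
from typing import List, Optional

from pydantic import BaseModel, Field

# StatusType comes from the existing project code (not shown here)
class Task(BaseModel):
    name: str
    description: Optional[str] = Field("No description", min_length=5)
    status: StatusType
    tags: List[str] = []
***
class TagsUpdate(BaseModel):
    tags: List[str]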

Where are we going to store those tags if there isn't an independent table like in a relational schema?

In a relational model, we would necessarily have a table for tasks and another for tags, probably with an intermediate table.

In MongoDB, no. Here we break away from that traditional schema.

MongoDB works with JSON documents. And a JSON can contain an array. That array is precisely the tags field.

Adding and Removing Tags

In the mongo_task.py file is where the main changes are. The initial part (getting data and inserting) remains the same. To make it easier to read, I will focus on the tag manipulation part, which is a bit more abstract.

mongo_task.py

# ADD TAGS
@mongo_task_router.put("/{task_id}/tags/add", summary="Add tags to a task")
async def add_tags_to_task(
    task_id: str = Path(..., description="The ID of the task to update"),
    tags_update: TagsUpdate = Body(..., example={"tags": ["new_tag_1", "new_tag_2"]}),
    db: Database = Depends(get_mongo_database),
):
    """
    Adds one or more tags to an existing task.
    Uses $addToSet to avoid duplicates in the tags array.
    """
    if not ObjectId.is_valid(task_id):
        raise HTTPException(status_code=400, detail=f"Invalid ObjectId: {task_id}")
    result = await db.tasks.update_one(
        {"_id": ObjectId(task_id)},
        {"$addToSet": {"tags": {"$each": tags_update.tags}}}
    )
    if result.matched_count == 0:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"Task with id {task_id} not found")
    
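    updated_task = await db.tasks.find_one({"_id": ObjectId(task_id)})
    # ObjectId is not JSON serializable, so expose it as a string
    updated_task["_id"] = str(updated_task["_id"])
    return updated_task

# REMOVE TAGS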
@mongo_task_router.put("/{task_id}/tags/remove", summary="Remove tags from a task")
async def remove_tags_from_task(
    task_id: str = Path(..., description="The ID of the task to update"),
    tags_update: TagsUpdate = Body(..., example={"tags": ["tag_to_remove_1", "tag_to_remove_2"]}),
    db: Database = Depends(get_mongo_database),
):
    """
    Removes one or more tags from an existing task.
    Uses $pull to remove instances of the specified tags.
    """
    if not ObjectId.is_valid(task_id):
        raise HTTPException(status_code=400, detail=f"Invalid ObjectId: {task_id}")
    result = await db.tasks.update_one(
        {"_id": ObjectId(task_id)},
        {"$pull": {"tags": {"$in": tags_update.tags}}}
    )
    if result.matched_count == 0:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"Task with id {task_id} not found")
    
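    updated_task = await db.tasks.find_one({"_id": ObjectId(task_id)})
    # ObjectId is not JSON serializable, so expose it as a string
    updated_task["_id"] = str(updated_task["_id"])
    return updated_task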


1. Adding Tags ($addToSet Operator)

I have added methods to manipulate the tags of a task using its task_id. Note that we receive an array of data all at once. In a relational schema, if you wanted to add 10 tags, you would have to do 10 insertions or a complex operation between tables. Not here; everything is done in a single operation.

To update, we use update_one with an operator called $addToSet.

Why $addToSet? Because MongoDB works with JSON-like documents, and a document can contain an embedded array. This operator allows us to add elements to that array while ensuring they are not repeated, and it does so atomically within the document.

To avoid manually iterating with a 'for' loop in our code (which would be inefficient), we use the $each modifier. This allows MongoDB to iterate internally and add all values at once.

$addToSet + $each

We use the $addToSet operator along with the $each modifier.

  • $addToSet adds values without duplicating them.
  • $each allows internal iteration over the received array.

This avoids having to:

  • Make 10 API requests.
  • Manually iterate through values.
  • Execute multiple operations in the database.
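
To see what the combination produces, the following plain-Python function emulates the result of $addToSet with $each on a single array. It is only an illustration of the semantics, not how MongoDB implements the operator, and add_to_set_each is a name invented for this sketch:

```python
def add_to_set_each(current: list, new_values: list) -> list:
    """Return current plus every new value that is not already present."""
    result = list(current)
    for value in new_values:
        if value not in result:
            result.append(value)
    return result


merged = add_to_set_each(["Tag 1", "Tag 3"], ["Tag 3", "Tag 4", "Tag 4"])
print(merged)  # ['Tag 1', 'Tag 3', 'Tag 4']
```

"Tag 3" is skipped because it already exists, and the repeated "Tag 4" is added only once.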

2. Removing Tags ($pull Operator)

To remove tags, the logic is similar but we use the $pull operator.

  • How does it work? We receive an array with the elements to remove.
  • The $in operator: Matches every element of the tags array that equals any of the values we sent.
  • The $pull operator: Removes all the matching elements from the array.

It is very flexible: if you send a tag that does not exist in the task, it simply skips it without throwing errors, like a silent conditional.
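
The effect of $pull combined with $in can be emulated in plain Python as well. As before, this only illustrates the resulting array, and pull_in is a hypothetical helper name:

```python
def pull_in(current: list, to_remove: list) -> list:
    """Return current without any element listed in to_remove."""
    return [value for value in current if value not in to_remove]


remaining = pull_in(["Tag 1", "Tag 2", "Tag 3"], ["Tag 2", "Missing tag"])
print(remaining)  # ['Tag 1', 'Tag 3']
```

"Missing tag" is not in the array, so it is ignored, which matches the silent behavior described above.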

And if we check the database after performing some operations:

tasks (collection)

{
  "_id": {
    "$oid": "699c6024ba20f652d828f93c"
  },
  "name": "Task 1",
  "description": "No description",
  "status": "done",
  "category_id": 1,
  "user_id": 0,
  "id": 0,
  "tags": [
    "Tag 1",
    "Tag 3",
    "Tag 4"
  ]
}
{
  "_id": {
    "$oid": "699d86e1701f772d79b49f03"
  },
  "name": "string",
  "description": "No description",
  "status": "done",
  "tags": [
    "Tag 2",
    "Tag 3"
  ],
  "id": "string"
}

Schema Flexibility: Where is the tags table?

This is where I want you to ask yourself: Where are the tags? In the relational world, you would have a Tags table and perhaps a pivot table. Here they don't exist. The tags are embedded within the task's JSON itself.

This has advantages and disadvantages:

  • The good: When you query a task, you already bring its tags "in one go" without needing to perform a JOIN. This avoids a second query, which is usually faster for read-heavy workloads.
  • The bad: There is no strict migration system. As you saw in the exercise, I modified the structure by adding the tags field and MongoDB didn't care; it simply started saving the new field in new or updated documents.

How tags are stored in MongoDB

Remember that MongoDB stores documents in a JSON-like format (BSON internally):

{
 "id": 1,
 "title": "Task 1",
 "tags": ["tag1", "tag2"]
}

Here the tags are embedded within the task document itself. There is no separate table.

That has a big advantage: when we query the task, we already get all its tags in a single operation, without the need to JOIN.

This can be good or bad, depending on the use case, but in terms of performance and simplicity, it is quite efficient.

Relationships

Let's summarize quickly to reinforce the most important points. In the previous class we saw how to handle relationships in MongoDB and, although I didn't mention it explicitly, we are working with a Many-to-Many (N:N) relationship.

Why is it Many-to-Many? Let's see it in practice:

We have a task with tags 1, 3, and 4, and another task that has nothing. If we add "Tag 3" to that second task, now both share the same tag.

I know you might be wondering: "Isn't this a mess? There are no foreign keys or relational indexes." This is where you must open your mind: MongoDB is not a relational database. You have to let go of the "table" mindset and accept that everything is a JSON document. MongoDB is, essentially, a manager that allows us to manipulate those documents with great flexibility. Even if the link is just text (a string), if the value "Tag 3" is identical in both documents, a functional relationship exists.

Normalized vs. Denormalized Schemas

In MongoDB we can follow two paths to structure data:

1. Denormalized Schema (Embedded)

It is the one we are using. We save the value directly (the tag text) inside the task.

  • Advantage: You don't need pivot tables or joins. When bringing the task, you already have all the information "in one fell swoop."
  • Link: The value itself is the link. It is ideal if the data does not change frequently.

2. Normalized Schema (Referenced)

It is the equivalent of the relational schema. Instead of saving the text "Tag 3", we save the ID (or ObjectId) that references a document in a separate collection called tags.

  • Structure: You would have an array of IDs called tag_ids.
  • Usage: It is recommended when the related entity (the tag or the user) undergoes many updates. If you change the name of a tag in its own collection, the change is reflected everywhere because the tasks only point to the ID.
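
The difference becomes visible when a tag is renamed. In this sketch a dictionary plays the role of a separate tags collection; the names tags_by_id and tag_names, the integer IDs, and the "Urgent" value are all assumptions made for the example:

```python
# Hypothetical "tags" collection, keyed by id for the sake of the sketch.
tags_by_id = {1: {"name": "Tag 1"}, 3: {"name": "Tag 3"}}

# Referenced (normalized) task: it stores ids, not the tag text.
task = {"name": "Task 1", "tag_ids": [1, 3]}


def tag_names(task: dict) -> list:
    """Resolve the referenced ids into tag names (a second lookup)."""
    return [tags_by_id[tag_id]["name"] for tag_id in task["tag_ids"]]


before = tag_names(task)
tags_by_id[3]["name"] = "Urgent"  # rename once, in the tags collection
after = tag_names(task)
print(before, after)
```

The task document never changes, yet it shows the new name. The price is the extra lookup needed to resolve the IDs every time the task is read.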

When to use each? (1:1, 1:N and N:N)

Everything depends on your business logic and update frequency:

  • One-to-Many Relationship (1:N): Like categories. If a task belongs to a single category, instead of an array you would simply have a category field, which can be the name or a reference.
  • One-to-One Relationship (1:1): Example: Users and Addresses. Since an address is usually unique to a user, the most logical thing is for the address schema to live embedded (within) the user object. It doesn't make sense to create a separate collection for something that won't be shared.
  • Many-to-Many Relationship (N:N): Like our tags. Each task holds an array, either of embedded values (tags) or of references (tag_ids), and the same tag can appear in many tasks.
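
As a rough illustration of these shapes, here are plain dictionaries standing in for documents. All names and values (Ana, Madrid, Work) are invented for the sketch:

```python
# 1:1, embedded: the address lives inside the user document.
user = {
    "name": "Ana",
    "address": {"street": "Main St 1", "city": "Madrid"},
}

# 1:N, referenced: many tasks can point at the same category id.
category = {"_id": 1, "name": "Work"}
tasks = [
    {"name": "Task 1", "category_id": category["_id"]},
    {"name": "Task 2", "category_id": category["_id"]},
]

city = user["address"]["city"]  # no second query needed
tasks_in_category = [t["name"] for t in tasks if t["category_id"] == 1]
print(city, tasks_in_category)
```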

Mass Operations: update_one vs update_many

If you use the denormalized schema and need to rename a tag in all tasks, you cannot use update_one. For that, update_many exists.

MongoDB offers these methods precisely because of its flexible nature. If you have 1,000 tasks with the tag "Old" and you want them to now say "New", you launch an update_many that looks for that value and replaces it in the entire collection at once.
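
One possible shape for that operation is sketched below. The helper rename_tag_operation is hypothetical; it only builds the filter and update documents, and the actual call to update_many is left as a comment because it needs a running database:

```python
def rename_tag_operation(old: str, new: str) -> tuple:
    """Build the (filter, update) pair for renaming a tag everywhere."""
    query = {"tags": old}               # documents whose array contains old
    update = {"$set": {"tags.$": new}}  # positional $: first matching element
    return query, update


query, update = rename_tag_operation("Old", "New")
print(query, update)

# Inside an async endpoint, with the db dependency used in this project:
# result = await db.tasks.update_many(query, update)
# result.modified_count would then report how many tasks changed.
```

The positional $ operator replaces only the first matching element in each document, which is enough here because $addToSet keeps tags unique. A task that already contains "New" would end up with a duplicate, so this sketch assumes that case does not occur.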

Conclusion and Practice

The best way to understand this is by breaking the mental schema of Excel tables. I leave it to you as a task to research or ask your assistant to generate the code for an update_many following our schema. You have the source code in the repository to compare.

Try creating a model where addresses are an embedded object or try to simulate a 1:N relationship with categories. Only by practicing will you understand when the speed of denormalized data or the integrity of normalized data suits you better.

Source code:

https://github.com/libredesarrollo/fastapi-book-course-mongodb 

Learn how to connect MongoDB with FastAPI using Motor: installation, CRUD, relationships, and recommendations.


Andrés Cruz
