Python API Development with FastAPI

Advanced FastAPI

Joins Sqlalchemy

In this article, you'll learn how to retrieve related data from two tables by using SQLAlchemy joins. We start with a quick reminder of how raw SQL joins work, then build an equivalent query in SQLAlchemy, and finally troubleshoot common pitfalls with response schemas.

Building the Basic Query

Below is an initial version of a path operation that retrieves posts. Notice how the query joins the posts with the votes table and counts the votes using SQLAlchemy's built-in func.count function:

def get_posts(
    db: Session = Depends(get_db),
    current_user: int = Depends(oauth2.get_current_user),
    limit: int = 10,
    skip: int = 0,
    search: Optional[str] = ""
):
    posts = db.query(
        models.Post, func.count(models.Vote.post_id).label("votes")
    ).join(
        models.Vote, models.Vote.post_id == models.Post.id, isouter=True
    ).group_by(models.Post.id)

    posts = db.query(models.Post).filter(
        models.Post.title.contains(search)
    ).limit(limit).offset(skip).all()

    return posts

@router.post("/", status_code=status.HTTP_201_CREATED, response_model=schemas.Post)

The query begins with db.query(models.Post, ...) to select both the post fields and the vote count, grouped by the post ID. We can temporarily pause execution (by commenting out or avoiding the .all() call) to inspect the raw SQL generated by SQLAlchemy.

Tip

Printing the query object before execution is a useful debugging step to verify the correctness of your join conditions.

Inspecting the Generated SQL

To see the SQL that SQLAlchemy produces, store the query in a variable (e.g., results) and print it:

def get_posts(
    db: Session = Depends(get_db),
    current_user: int = Depends(oauth2.get_current_user),
    limit: int = 10,
    skip: int = 0,
    search: Optional[str] = ""
):
    posts = db.query(
        models.Post, func.count(models.Vote.post_id).label("votes")
    ).join(
        models.Vote, models.Vote.post_id == models.Post.id, isouter=True
    ).group_by(models.Post.id)
    
    posts = db.query(models.Post).filter(
        models.Post.title.contains(search)
    ).limit(limit).offset(skip).all()
    
    results = db.query(models.Post)
    print(results)
    
    return posts

A typical log output might look like:

INFO: Application startup complete.
INFO: 127.0.0.1:60707 - "GET /posts HTTP/1.1" 307 Temporary Redirect
INFO: 127.0.0.1:60707 - "GET /posts/ HTTP/1.1" 200 OK

Establishing the Join with Votes

The next step is to perform the join on the votes table. The equivalent SQL for this operation is:

SELECT t.*, COUNT(votes.post_id) as votes 
FROM posts 
LEFT JOIN votes ON posts.id = votes.post_id 
GROUP BY posts.id;

In SQLAlchemy, the join condition is specified as models.Vote.post_id == models.Post.id. By default, SQLAlchemy uses an inner join. To use a left outer join, pass the parameter isouter=True.

Consider this code snippet:

posts = db.query(models.Post).filter(
    models.Post.title.contains(search)
).limit(limit).offset(skip).all()

results = db.query(models.Post).join(
    models.Vote, models.Vote.post_id == models.Post.id
).all()
print(results)

The log output will show a basic inner join. To explicitly change this to a left outer join, adjust the join as follows:

posts = db.query(models.Post).filter(
    models.Post.title.contains(search)
).limit(limit).offset(skip).all()

results = db.query(models.Post).join(
    models.Vote, models.Vote.post_id == models.Post.id, isouter=True
)
print(results)

The log should confirm that a left outer join is applied.

Grouping Results and Counting Votes

Now, modify the query to group the posts and count the associated votes:

posts = db.query(models.Post).filter(
    models.Post.title.contains(search)
).limit(limit).offset(skip).all()

results = db.query(
    models.Post, func.count(models.Vote.post_id).label("votes")
).join(
    models.Vote, models.Vote.post_id == models.Post.id, isouter=True
).group_by(models.Post.id)

print(results)

This statement mimics the following SQL:

SELECT posts.*, COUNT(votes.post_id) as votes 
FROM posts 
LEFT JOIN votes ON posts.id = votes.post_id 
GROUP BY posts.id;

Make sure to import the func method from SQLAlchemy at the top of your file:

from sqlalchemy import func

Handling Pydantic Response Validation

When running these queries, you might encounter Pydantic validation errors like:

pydantic.error_wrappers.ValidationError: 78 validation errors for Post
response -> 0 -> title
  field required (type=value_error.missing)
...

These errors occur because the Pydantic schema expects fields like title, content, and id at the top level. However, the join returns a dictionary with keys such as "Post" (capital "P") and "votes". To resolve this, update your Pydantic response models. For example, if your original post model is:

class Post(PostBase):
    id: int
    created_at: datetime
    owner_id: int
    owner: UserOut

    class Config:
        orm_mode = True

Define a new schema that matches the output of your join query:

class PostOut(BaseModel):
    Post: Post
    votes: int

    class Config:
        orm_mode = True

Note

Ensure that the field name "Post" in the new schema matches exactly with the key returned by your join query. Inconsistent capitalization can lead to validation errors.

Finally, update your path operation to use the new response model:

@router.get("/", response_model=List[schemas.PostOut])
def get_posts(
    db: Session = Depends(get_db),
    current_user: int = Depends(oauth2.get_current_user),
    limit: int = 10,
    skip: int = 0,
    search: Optional[str] = ""
):
    results = db.query(
        models.Post, func.count(models.Vote.post_id).label("votes")
    ).join(
        models.Vote, models.Vote.post_id == models.Post.id, isouter=True
    ).group_by(models.Post.id).all()

    # Optionally, if you want to run a raw query:
    # posts = db.execute(
    #     'SELECT posts.*, COUNT(votes.post_id) as votes FROM posts LEFT JOIN votes ON posts.id = votes.post_id GROUP BY posts.id'
    # )
    # results = posts.fetchall()

    return results

Testing Your Query

After deploying your changes, test the query by adding query parameters. For instance:

  • Filter posts by a search term: ?search=beaches
  • Limit the number of posts: ?limit=2

The JSON response should include both the post details and the aggregated vote count:

[
    {
        "Post": {
            "id": 11,
            "title": "top beaches in florida",
            "content": "something something beaches",
            "published": true,
            "created_at": "2021-08-28T22:28:01.010598-04:00",
            "owner_id": 21,
            "owner": {
                "id": 21,
                "email": "[email protected]",
                "created_at": "2021-08-28T21:09:08.032636-04:00"
            }
        },
        "votes": 0
    }
]

This setup returns posts along with vote counts and owner details, while retaining filtering and pagination functionality.

Happy coding!

Watch Video

Watch video content

Previous
Sql Joins