Demo Upgrade Python Flask App to Connect to BentoML for Online Serving

Welcome to this lesson on integrating a Flask application with BentoML for online prediction serving. In this tutorial, you will set up a simple Flask application that accepts CSV file uploads, processes the data, calls the BentoML predict endpoint, and displays the prediction results in a table. The steps below cover the Flask code, the HTML templates, running and debugging the app, and testing the integration.

Step 1: Setting Up the Flask Application

Begin by opening VS Code and a new terminal. In a separate terminal window, start the BentoML service; it must be running before the Flask app can request predictions. Create a file named flaskapp.py and add the following code. It defines a /predict endpoint that receives a POST request whose file form field holds a Base64-encoded CSV file. The endpoint decodes the CSV content and loads it into a DataFrame. If the DataFrame includes a claim_id column, that column is set aside so it is not sent to the model. The remaining data is sent as JSON to the BentoML predict endpoint. The returned predictions are added to the DataFrame as a Prediction column, the claim_id column is reattached, and the results are rendered through an HTML template.

from flask import Flask, render_template, request
import pandas as pd
import requests
import base64
import io

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/predict', methods=['POST'])
def predict():
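    file_data = request.form.get('file')
    # Reject requests that carry no data URL instead of failing with an AttributeError
    if not file_data or ',' not in file_data:
        return 'No CSV file was uploaded.', 400
    # Decode the Base64-encoded CSV content (the part after the data URL prefix)
    decoded_file = base64.b64decode(file_data.split(',', 1)[1])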
    # Read CSV content into a DataFrame
    df = pd.read_csv(io.StringIO(decoded_file.decode('utf-8')))
    
    # Separate the 'claim_id' column if it exists
    if 'claim_id' in df.columns:
        claim_ids = df['claim_id']
        df = df.drop(columns=['claim_id'])
    else:
        claim_ids = None
    
    # Send the DataFrame to the BentoML service as JSON
    response = requests.post(
        'http://127.0.0.1:3000/predict/',  # BentoML endpoint
        json=df.to_dict(orient='records'),
        timeout=30,
    )
    # Fail early if the BentoML service returned an HTTP error status
    response.raise_for_status()

    # Retrieve predictions from the response
    predictions = response.json()['predictions']
    df['Prediction'] = predictions
    
    # Reattach the 'claim_id' column as the first column if it was present
    if claim_ids is not None:
        df.insert(0, 'claim_id', claim_ids)
    
    # Render the results in the results.html template
    return render_template('results.html', tables=[df.to_html(classes='data')])
    
if __name__ == '__main__':
    app.run(port=5005)

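Note: Ensure that Flask, pandas, and requests are installed in the Python environment that runs flaskapp.py to avoid import errors. BentoML itself is needed in the environment that runs the prediction service.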
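
The decoding step in the endpoint above can be tried in isolation. The following is a minimal sketch, not the lesson's implementation: it uses the standard library csv module instead of pandas so it runs without extra dependencies, and the helper names decode_data_url_csv and split_claim_ids are made up for illustration. Unlike pandas, csv.DictReader leaves every value as a string.

```python
import base64
import csv
import io


def decode_data_url_csv(data_url):
    """Decode a 'data:<type>;base64,<payload>' string into a list of row dicts."""
    if not data_url or "," not in data_url:
        raise ValueError("expected a data URL of the form 'data:<type>;base64,<payload>'")
    header, payload = data_url.split(",", 1)
    if not header.endswith(";base64"):
        raise ValueError("data URL is not Base64-encoded")
    text = base64.b64decode(payload).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


def split_claim_ids(rows):
    """Return (claim_ids, rows_without_claim_id); claim_ids is None if the column is absent."""
    if not rows or "claim_id" not in rows[0]:
        return None, rows
    claim_ids = [row["claim_id"] for row in rows]
    features = [{k: v for k, v in row.items() if k != "claim_id"} for row in rows]
    return claim_ids, features


# Hypothetical two-row upload, encoded the way the browser would send it
sample = "claim_id,amount\nC1,100\nC2,250\n"
url = "data:text/csv;base64," + base64.b64encode(sample.encode("utf-8")).decode("ascii")
rows = decode_data_url_csv(url)
claim_ids, features = split_claim_ids(rows)
print(claim_ids)  # ['C1', 'C2']
print(features)   # [{'amount': '100'}, {'amount': '250'}]
```

Validating the data URL up front means a malformed upload produces a clear error instead of an exception deep inside the decoding call.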

Step 2: Creating HTML Templates

To provide a straightforward user interface, create a folder named templates in your project directory. Inside this folder, establish the following HTML templates:

1. index.html

This template displays the file upload interface for users. (Customize this file according to your design preferences.)

2. results.html

The results.html file displays prediction results formatted in a table. Below is an example template with basic table styling:

<html lang="en">
<head>
    <style>
        table, th, td {
            border: 1px solid black;
        }
        th, td {
            padding: 10px;
            text-align: left;
        }
    </style>
</head>
<body>
    <h1>Prediction Results</h1>

    {% for table in tables %}
    {{ table|safe }}
    {% endfor %}

    <a href="/">Go Back</a>
</body>
</html>

3. Additional HTML for File Upload Handling

Below is a minimal HTML snippet that includes client-side logic for handling CSV file uploads. You can either incorporate this snippet within your index.html or save it as a separate file (for example, visualize.html):

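<html lang="en">
<body>
<!-- The form posts the file, encoded as a data URL, to the Flask /predict endpoint -->
<form id="uploadForm" action="/predict" method="post">
    <!-- A plain file input is used here; replace it with your own upload area -->
    <input type="file" id="fileInput" accept=".csv">
    <input type="hidden" id="hiddenFileInput" name="file">
</form>
<script>
const uploadForm = document.getElementById('uploadForm');
const hiddenFileInput = document.getElementById('hiddenFileInput');

function handleFile(file) {
    if (file && file.type === 'text/csv') {
        const reader = new FileReader();
        // Register the handler before starting the read
        reader.onload = () => {
            // reader.result is a data URL: "data:text/csv;base64,<payload>"
            hiddenFileInput.value = reader.result;
            uploadForm.submit();
        };
        reader.readAsDataURL(file);
    } else {
        alert('Please upload a valid CSV file.');
    }
}

document.getElementById('fileInput').addEventListener('change', (event) => {
    handleFile(event.target.files[0]);
});
</script>
</body>
</html>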

These templates serve as a starting point for your application’s user interface. In real-world projects, front-end engineers might further enhance or style these templates.
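
The client-side script above submits the file as a data URL in the file form field. To see what that string looks like, or to exercise the /predict endpoint without a browser, the same shape can be produced in Python. This is a minimal sketch; build_data_url is a hypothetical helper name, and the CSV content is made up.

```python
import base64


def build_data_url(csv_text, mime_type="text/csv"):
    """Return a data URL shaped like the result of FileReader.readAsDataURL."""
    payload = base64.b64encode(csv_text.encode("utf-8")).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


data_url = build_data_url("claim_id,amount\nC1,100\n")
print(data_url[:21])  # data:text/csv;base64,
```

Posting this string as the file form field to http://127.0.0.1:5005/predict (for example with requests.post(url, data={'file': data_url})) should follow the same path as a browser upload.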

Step 3: Running the Flask Application

After saving all your files, run your Flask application with the following command in your terminal:

python3 flaskapp.py

The terminal output should resemble the following. The reloader and debugger lines at the end appear only when the app runs in debug mode:

* Serving Flask app "flaskapp"
* Debug mode: off
WARNING: This is a development server. Do not use it in production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5005/ (Press CTRL+C to quit)
* Restarting with stat
* Debugger is active!
* Debugger PIN: 119-280-234

Open your browser and navigate to http://127.0.0.1:5005/ to access the application.
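
If the BentoML service is not running yet, the upload flow can still be exercised against a stand-in. The sketch below is an assumption-laden placeholder, not BentoML: it uses only the standard library, accepts a JSON list of records, and answers with one placeholder prediction per record in the {"predictions": [...]} shape that flaskapp.py reads. The names StubPredictHandler, start_stub, and post_records are hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class StubPredictHandler(BaseHTTPRequestHandler):
    """Answers every POST with one placeholder prediction per record."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        records = json.loads(self.rfile.read(length) or b"[]")
        # Placeholder logic only: every record is reported as +1
        body = json.dumps({"predictions": [1] * len(records)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_stub(port=0):
    """Start the stub in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), StubPredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_records(url, records):
    """POST records as JSON and return the decoded JSON response."""
    request = Request(
        url,
        data=json.dumps(records).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    opener = build_opener(ProxyHandler({}))  # ignore any configured proxy for localhost
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())


server = start_stub()
stub_url = f"http://127.0.0.1:{server.server_address[1]}/predict/"
result = post_records(stub_url, [{"amount": 100}, {"amount": 250}])
print(result)  # {'predictions': [1, 1]}
server.shutdown()
server.server_close()
```

Calling start_stub(3000) instead would make the stub listen on the port that flaskapp.py posts to; the script then needs to stay alive (for example by calling server.serve_forever() directly) while the Flask app is tested.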

Step 4: Debugging Template Errors

If uploading a CSV file produces a TemplateNotFound error for result.html, the template name passed to render_template does not match any file in the templates folder. Verify that your template filenames agree with the names referenced in your Flask code. In this project the template is results.html (plural), not result.html.

[Image: a web page showing a Flask TemplateNotFound error for result.html, with the traceback.]

Review your project files in VS Code to confirm that the templates folder contains the correctly named file:

[Image: VS Code with the project open, showing an HTML file with CSS styling and a terminal with error logs from the Flask application.]
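
The filename check described in this step can also be scripted. The sketch below is one possible approach under stated assumptions: it scans Python source for render_template calls whose first argument is a string literal and reports names with no matching file. find_missing_templates is a hypothetical helper, and the demo uses a throwaway folder rather than the real project.

```python
import re
import tempfile
from pathlib import Path

# Matches the first string argument of render_template('name.html', ...)
RENDER_CALL = re.compile(r"render_template\(\s*['\"]([^'\"]+)['\"]")


def find_missing_templates(app_source, templates_dir):
    """Return referenced template names that have no file in templates_dir."""
    referenced = set(RENDER_CALL.findall(app_source))
    existing = {p.name for p in Path(templates_dir).iterdir() if p.is_file()}
    return sorted(referenced - existing)


# Demo: the code asks for result.html while the folder holds results.html
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "index.html").write_text("<html></html>")
    (Path(folder) / "results.html").write_text("<html></html>")
    source = "render_template('index.html')\nrender_template('result.html', tables=[])"
    missing = find_missing_templates(source, folder)

print(missing)  # ['result.html']
```

In the project directory, the equivalent call would be find_missing_templates(Path("flaskapp.py").read_text(), "templates").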

Step 5: Testing the BentoML Integration

After resolving any template errors, refresh the page and drag and drop a CSV file containing claim data into the upload area. The application calls the BentoML predict endpoint, and the ML model returns a prediction for each claim. For example, a prediction of -1 may indicate that a claim requires further investigation, whereas +1 suggests that the claim can be approved automatically. The results are rendered as a table showing the claim details (claim ID, amount, number of services, patient age, provider ID, and days since the last claim) alongside the prediction value.

[Image: a "Prediction Results" table with columns for claim ID, amount, number of services, patient age, provider ID, days since last claim, and the prediction value.]

This organized presentation enables insurance agents to quickly pinpoint claims that require further review (i.e., those with a prediction of -1) while streamlining the approvals for other claims.
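
To try the upload without real data, a small test file can be generated. This is a sketch with made-up values: only the claim_id column name is confirmed by the Flask code, and the other column names below are hypothetical stand-ins for the fields listed above. The deployed model expects whatever feature names it was trained on, so adjust COLUMNS accordingly.

```python
import csv
import io
import random

# Hypothetical column names; only 'claim_id' appears in the Flask code
COLUMNS = ["claim_id", "claim_amount", "num_services", "patient_age",
           "provider_id", "days_since_last_claim"]


def make_sample_csv(n_rows, seed=0):
    """Return CSV text with n_rows of made-up claim records."""
    rng = random.Random(seed)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for i in range(1, n_rows + 1):
        writer.writerow({
            "claim_id": f"C{i:04d}",
            "claim_amount": round(rng.uniform(50, 5000), 2),
            "num_services": rng.randint(1, 10),
            "patient_age": rng.randint(18, 90),
            "provider_id": f"P{rng.randint(1, 50):03d}",
            "days_since_last_claim": rng.randint(0, 365),
        })
    return buffer.getvalue()


sample_text = make_sample_csv(3)
print(sample_text.splitlines()[0])  # the header row
```

Writing the result to disk, for example with open("sample_claims.csv", "w").write(make_sample_csv(20)), gives a file that can be dropped into the upload area.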

Step 6: Architectural Overview

To summarize the entire workflow, consider the following steps:

1. Users submit claims through a dedicated portal.
2. The uploaded CSV file containing claim data is processed by the Flask app.
3. The processed data is sent as JSON to the BentoML predict endpoint.
4. The ML model hosted with BentoML analyzes the data and returns predictions.
5. Claims with a prediction of -1 are flagged for review, while those with +1 are automatically approved.
6. A Data Lake containing historical claim data supports the ML model training process, further enhancing prediction accuracy.
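
The routing rule in the workflow above (-1 goes to review, +1 is approved) can be sketched as a small function. This is an illustration only; triage_claims is a hypothetical name, and the claim IDs and predictions are made up.

```python
def triage_claims(claim_ids, predictions):
    """Split claim IDs into (needs_review, auto_approved) by prediction value."""
    if len(claim_ids) != len(predictions):
        raise ValueError("expected exactly one prediction per claim")
    needs_review = [c for c, p in zip(claim_ids, predictions) if p == -1]
    auto_approved = [c for c, p in zip(claim_ids, predictions) if p == 1]
    return needs_review, auto_approved


review, approved = triage_claims(["C1", "C2", "C3"], [1, -1, 1])
print(review)    # ['C2']
print(approved)  # ['C1', 'C3']
```

The length check guards against silently misaligned rows if the service ever returns fewer predictions than the number of claims sent.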

The complete architecture is illustrated in the following flowchart, which outlines the workflow from claim submission to final approval or review:

[Image: flowchart of the ML-assisted insurance claims process, from claim submission to approval and payout.]

Conclusion

This end-to-end project demonstrates how to modernize the insurance claims process using an ML model served via BentoML and integrated with a Flask-based user interface. By automating the detection of problematic claims and streamlining automatic approvals, this solution lets insurance agents focus on the claims that genuinely require attention. That concludes this lesson. Thank you for joining us, and see you in the next tutorial!

Note: For additional details on Flask, BentoML, and integrating machine learning with web applications, visit the Flask Documentation and BentoML Docs.
