
Chunked & Resumable Uploads in Python (Django & Flask)

Build a chunk receiver in Python with Django and Flask — temporary storage, merge logic, validation, and handling Resumable.js upload requests on the server side.

Guides · Updated 2026-04-11

Python is one of the most common backend choices for web applications, and both Django and Flask handle chunked uploads from Resumable.js well — once you understand the protocol. Resumable.js doesn't use anything exotic. It sends standard multipart POST requests with extra query parameters identifying the chunk. Your server needs two endpoints: one to receive chunks (POST) and one to check whether a chunk already exists (GET, used when testChunks is enabled). The rest is file I/O and a merge step.

This guide implements both endpoints in Flask and Django, covers the merge logic, and addresses the server-side configuration and security concerns that trip people up in production.

How Resumable.js Sends Chunks

When Resumable.js uploads a file, it splits the file into chunks on the client side and sends each chunk as a separate HTTP request. Each request includes these parameters (as query parameters on GET, form fields on POST):

resumableChunkNumber: The index of the chunk (1-based)
resumableChunkSize: The configured chunk size in bytes
resumableCurrentChunkSize: The actual size of this chunk (the last chunk may be smaller)
resumableTotalSize: Total file size in bytes
resumableType: The file's MIME type
resumableIdentifier: A unique identifier for the file (typically size + filename hash)
resumableFilename: The original filename
resumableRelativePath: The file's path relative to the selected directory
resumableTotalChunks: Total number of chunks

The file data itself is sent as a multipart file field named file in the POST request.

The GET request (for testChunks) sends the same parameters as query strings. Your server checks whether it already has that chunk and responds 200 (chunk present, skip it) or any non-200 status, conventionally 204 or 404 (chunk missing, please send it).
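As a sketch of what that test GET looks like on the wire, here is the query string assembled by hand with the standard library (all parameter values below are made-up examples):

```python
from urllib.parse import urlencode

# Hypothetical values for a 12 MB file uploaded in 5 MB chunks
params = {
    'resumableChunkNumber': 1,
    'resumableChunkSize': 5 * 1024 * 1024,
    'resumableCurrentChunkSize': 5 * 1024 * 1024,
    'resumableTotalSize': 12 * 1024 * 1024,
    'resumableType': 'video/mp4',
    'resumableIdentifier': '12582912-demomp4',  # size + stripped filename
    'resumableFilename': 'demo.mp4',
    'resumableRelativePath': 'demo.mp4',
    'resumableTotalChunks': 3,
}

url = '/api/upload?' + urlencode(params)
print(url)
```

A 200 response to this request tells the client to skip chunk 1; anything else makes it upload the chunk.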

Flask Implementation

Flask's lightweight routing makes the implementation straightforward. Two routes, one helper function.

# app.py
import os
from pathlib import Path
from flask import Flask, request, jsonify

app = Flask(__name__)

UPLOAD_DIR = Path('./uploads')
CHUNK_DIR = Path('./chunks')
CHUNK_DIR.mkdir(parents=True, exist_ok=True)
UPLOAD_DIR.mkdir(parents=True, exist_ok=True)

# Max request size: must accommodate your chunk size + overhead
app.config['MAX_CONTENT_LENGTH'] = 20 * 1024 * 1024  # 20 MB


def chunk_path(identifier, chunk_number):
    """Path for a specific chunk file."""
    # Sanitize identifier to prevent directory traversal
    safe_id = "".join(c for c in identifier if c.isalnum() or c in '-_')
    return CHUNK_DIR / f"{safe_id}.part{int(chunk_number):06d}"


@app.route('/api/upload', methods=['GET'])
def check_chunk():
    """Check if a chunk already exists (testChunks support)."""
    chunk_number = request.args.get('resumableChunkNumber', type=int)
    identifier = request.args.get('resumableIdentifier', '')

    if not chunk_number or not identifier:
        return 'Missing parameters', 400

    path = chunk_path(identifier, chunk_number)
    if path.exists():
        return 'Chunk exists', 200
    return '', 204  # 204 responses carry no body


@app.route('/api/upload', methods=['POST'])
def upload_chunk():
    """Receive and store a single chunk."""
    chunk_number = request.form.get('resumableChunkNumber', type=int)
    total_chunks = request.form.get('resumableTotalChunks', type=int)
    identifier = request.form.get('resumableIdentifier', '')
    filename = request.form.get('resumableFilename', '')

    if not all([chunk_number, total_chunks, identifier, filename]):
        return 'Missing parameters', 400

    # Validate chunk number range
    if chunk_number < 1 or chunk_number > total_chunks:
        return 'Invalid chunk number', 400

    file = request.files.get('file')
    if not file:
        return 'No file data', 400

    # Save the chunk
    path = chunk_path(identifier, chunk_number)
    file.save(str(path))

    # Check if all chunks are present
    if all_chunks_received(identifier, total_chunks):
        merge_chunks(identifier, total_chunks, filename)
        return jsonify({'status': 'complete', 'filename': filename}), 200

    return jsonify({'status': 'chunk_received'}), 200


def all_chunks_received(identifier, total_chunks):
    """Check if every chunk for this upload has been saved."""
    for i in range(1, total_chunks + 1):
        if not chunk_path(identifier, i).exists():
            return False
    return True


def merge_chunks(identifier, total_chunks, filename):
    """Concatenate all chunks into the final file."""
    # Sanitize filename
    safe_filename = "".join(
        c for c in filename if c.isalnum() or c in '-_.'
    )
    output_path = UPLOAD_DIR / safe_filename

    with open(output_path, 'wb') as outfile:
        for i in range(1, total_chunks + 1):
            path = chunk_path(identifier, i)
            with open(path, 'rb') as chunk_file:
                while True:
                    data = chunk_file.read(8192)
                    if not data:
                        break
                    outfile.write(data)

    # Clean up chunk files
    for i in range(1, total_chunks + 1):
        path = chunk_path(identifier, i)
        path.unlink(missing_ok=True)


if __name__ == '__main__':
    app.run(debug=True, port=5000)

The merge step reads each chunk sequentially in 8 KB buffered increments rather than loading whole chunks into memory, so memory use stays constant no matter how large the chunks are. One caveat: with simultaneousUploads greater than 1, two chunk requests can observe the complete set at nearly the same moment and both trigger the merge. If duplicate merges would be a problem, guard the merge with a per-identifier lock, or write to a temporary name and rename atomically.
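If you prefer the standard library to a hand-written read loop, shutil.copyfileobj performs the same buffered copy. A minimal sketch of a merge helper using it (the function name is illustrative, not part of the examples above):

```python
import shutil


def merge_chunk_files(chunk_paths, output_path):
    """Concatenate chunk files into output_path using buffered copies."""
    with open(output_path, 'wb') as outfile:
        for path in chunk_paths:
            with open(path, 'rb') as chunk_file:
                # copyfileobj streams in fixed-size buffers, so memory
                # stays constant regardless of chunk size
                shutil.copyfileobj(chunk_file, outfile)
```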

Django Implementation

Django's view layer handles the same logic with slightly different request parsing. The parameters arrive via request.GET for the check endpoint and request.POST / request.FILES for the upload.

# views.py
import os
from pathlib import Path
from django.http import JsonResponse, HttpResponse
from django.views.decorators.csrf import csrf_exempt
from django.conf import settings

CHUNK_DIR = Path(settings.BASE_DIR) / 'chunks'
UPLOAD_DIR = Path(settings.BASE_DIR) / 'uploads'
CHUNK_DIR.mkdir(parents=True, exist_ok=True)
UPLOAD_DIR.mkdir(parents=True, exist_ok=True)


def chunk_path(identifier, chunk_number):
    safe_id = "".join(c for c in identifier if c.isalnum() or c in '-_')
    return CHUNK_DIR / f"{safe_id}.part{int(chunk_number):06d}"


@csrf_exempt
def upload_chunk(request):
    if request.method == 'GET':
        return check_chunk(request)
    elif request.method == 'POST':
        return receive_chunk(request)
    return HttpResponse(status=405)


def check_chunk(request):
    chunk_number = request.GET.get('resumableChunkNumber')
    identifier = request.GET.get('resumableIdentifier', '')

    if not chunk_number or not identifier:
        return HttpResponse('Missing parameters', status=400)

    try:
        path = chunk_path(identifier, int(chunk_number))
    except ValueError:
        return HttpResponse('Invalid chunk number', status=400)
    if path.exists():
        return HttpResponse('Chunk exists', status=200)
    return HttpResponse(status=204)  # 204 responses carry no body


def receive_chunk(request):
    try:
        chunk_number = int(request.POST.get('resumableChunkNumber', 0))
        total_chunks = int(request.POST.get('resumableTotalChunks', 0))
    except ValueError:
        return HttpResponse('Invalid chunk parameters', status=400)
    identifier = request.POST.get('resumableIdentifier', '')
    filename = request.POST.get('resumableFilename', '')

    if not all([chunk_number, total_chunks, identifier, filename]):
        return HttpResponse('Missing parameters', status=400)

    if chunk_number < 1 or chunk_number > total_chunks:
        return HttpResponse('Invalid chunk number', status=400)

    file = request.FILES.get('file')
    if not file:
        return HttpResponse('No file data', status=400)

    # Save chunk using Django's file handling
    path = chunk_path(identifier, chunk_number)
    with open(path, 'wb') as dest:
        for chunk in file.chunks():
            dest.write(chunk)

    # Check for completion
    all_present = all(
        chunk_path(identifier, i).exists()
        for i in range(1, total_chunks + 1)
    )

    if all_present:
        merge_chunks(identifier, total_chunks, filename)
        return JsonResponse({'status': 'complete', 'filename': filename})

    return JsonResponse({'status': 'chunk_received'})


def merge_chunks(identifier, total_chunks, filename):
    safe_filename = "".join(
        c for c in filename if c.isalnum() or c in '-_.'
    )
    output_path = UPLOAD_DIR / safe_filename

    with open(output_path, 'wb') as outfile:
        for i in range(1, total_chunks + 1):
            path = chunk_path(identifier, i)
            with open(path, 'rb') as chunk_file:
                while True:
                    data = chunk_file.read(8192)
                    if not data:
                        break
                    outfile.write(data)

    for i in range(1, total_chunks + 1):
        chunk_path(identifier, i).unlink(missing_ok=True)

And the URL configuration:

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path('api/upload', views.upload_chunk, name='upload_chunk'),
]

Note the @csrf_exempt decorator. Resumable.js sends XHR requests that won't include Django's CSRF token by default. In production, you should either pass the CSRF token via Resumable.js headers (configuration reference) or use a separate authentication mechanism like token-based auth. Don't leave CSRF protection disabled on production endpoints.

Django uses file.chunks() instead of file.read() for the uploaded chunk data. chunks() copies the upload in pieces, and files larger than FILE_UPLOAD_MAX_MEMORY_SIZE are already spooled to a temporary file during request parsing, so the full chunk never needs to sit in memory at once.

Server Limits

Three layers of configuration gate how large a chunk your server will accept:

Reverse proxy (nginx)

# nginx.conf
client_max_body_size 20m;  # Must be >= your chunk size

The nginx default is 1 MB. If your chunks are 5 MB, you'll get 413 Request Entity Too Large before your Python code ever sees the request. This is the most common "it works in development but fails in production" issue.

Django

# settings.py
DATA_UPLOAD_MAX_MEMORY_SIZE = 20 * 1024 * 1024  # 20 MB
FILE_UPLOAD_MAX_MEMORY_SIZE = 20 * 1024 * 1024

Django's DATA_UPLOAD_MAX_MEMORY_SIZE defaults to 2.5 MB and is checked against the request body excluding file upload data; exceeding it raises a RequestDataTooBig exception. FILE_UPLOAD_MAX_MEMORY_SIZE is a different knob: uploads larger than it are streamed to a temporary file on disk rather than held in memory, so raising it trades disk I/O for RAM instead of changing what Django accepts. The values above simply give both settings comfortable headroom for 5 MB chunks plus multipart framing.

Flask

app.config['MAX_CONTENT_LENGTH'] = 20 * 1024 * 1024  # 20 MB

Flask rejects requests exceeding MAX_CONTENT_LENGTH with a 413 response. If unset, Flask has no limit — which is a security risk. Always set it explicitly.

For guidance on choosing the right chunk size for these limits, see the chunk size optimization guide.

Temporary Storage Cleanup

Incomplete uploads leave orphaned chunk files on disk. A user might start uploading, close the browser, and never return. Without cleanup, your chunk directory grows indefinitely.

A simple approach: a periodic cleanup task that deletes chunk files older than a threshold.

# cleanup.py
import time
from pathlib import Path

CHUNK_DIR = Path('./chunks')
MAX_AGE_HOURS = 24


def cleanup_stale_chunks():
    now = time.time()
    cutoff = now - (MAX_AGE_HOURS * 3600)

    for path in CHUNK_DIR.iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            print(f"Removed stale chunk: {path.name}")

Run this via cron (0 * * * * python cleanup.py), a Django management command, a Celery beat task, or whatever scheduling mechanism your deployment uses. Twenty-four hours is a reasonable default — long enough to accommodate overnight uploads on terrible connections, short enough to prevent disk bloat.

For larger deployments, consider using a database table to track active uploads. When an upload completes or is explicitly cancelled, mark it done and clean up immediately. The cron job then only handles the truly abandoned uploads.
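As an illustration of that idea, a minimal SQLite-backed tracker might look like the sketch below (table name, columns, and function names are all assumptions, not part of the handlers above):

```python
import sqlite3
import time


def init_db(conn):
    """Create a minimal table tracking in-flight uploads."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS uploads (
            identifier TEXT PRIMARY KEY,
            started_at REAL NOT NULL,
            completed  INTEGER NOT NULL DEFAULT 0
        )
    """)


def track_upload(conn, identifier):
    """Record an upload when its first chunk arrives."""
    conn.execute(
        "INSERT OR IGNORE INTO uploads (identifier, started_at) VALUES (?, ?)",
        (identifier, time.time()),
    )


def mark_complete(conn, identifier):
    """Mark an upload finished after a successful merge."""
    conn.execute("UPDATE uploads SET completed = 1 WHERE identifier = ?",
                 (identifier,))


def abandoned_identifiers(conn, max_age_hours=24):
    """Uploads that never completed and have aged past the cutoff."""
    cutoff = time.time() - max_age_hours * 3600
    rows = conn.execute(
        "SELECT identifier FROM uploads WHERE completed = 0 AND started_at < ?",
        (cutoff,),
    )
    return [row[0] for row in rows]
```

The cron job then asks abandoned_identifiers() which chunk files to delete, while completed uploads are cleaned up inline at merge time.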

Security Considerations

Accepting file uploads from the internet is inherently risky. Several things to validate on every chunk request:

Chunk number bounds. Verify that resumableChunkNumber is between 1 and resumableTotalChunks. The code above checks this. Without it, an attacker could send arbitrary chunk numbers, scattering part files on disk that no merge step will ever consume or clean up.

Filename sanitization. Never use the client-provided filename directly in file paths. The examples above strip everything except alphanumeric characters, hyphens, underscores, and dots. A filename like ../../etc/passwd must not resolve to a path outside your upload directory. The security guide covers this in depth.
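To make the behavior of that whitelist concrete, here it is as a standalone helper; the upload.bin fallback for empty or dot-only names is an extra safeguard not present in the handlers above:

```python
def sanitize_filename(filename):
    """Keep only alphanumerics, hyphens, underscores, and dots."""
    safe = "".join(c for c in filename if c.isalnum() or c in '-_.')
    # '.' and '..' are directory names, and an all-stripped name is
    # empty; substitute a fixed fallback in those degenerate cases
    if not safe or set(safe) <= {'.'}:
        safe = 'upload.bin'
    return safe


print(sanitize_filename('../../etc/passwd'))  # → '....etcpasswd', no slashes
```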

MIME type validation. The resumableType parameter reports the client-side MIME type, but it's trivially spoofable. For real validation, check the file's magic bytes after the merge step. Python's python-magic library handles this:

import magic

def validate_file_type(filepath, allowed_types):
    mime = magic.from_file(str(filepath), mime=True)
    if mime not in allowed_types:
        filepath.unlink()
        raise ValueError(f"Invalid file type: {mime}")

The file validation guide covers client-side and server-side validation strategies in detail.

File size enforcement. Even though each chunk is within your size limit, validate that resumableTotalSize doesn't exceed your maximum allowed file size. Check this on the first chunk and reject early if it's too large.
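A sketch of that early check, where the 2 GB cap is an arbitrary example value rather than a recommendation:

```python
MAX_FILE_SIZE = 2 * 1024 * 1024 * 1024  # 2 GB, example cap


def total_size_allowed(total_size):
    """Validate the client-reported resumableTotalSize before saving chunks."""
    try:
        size = int(total_size)
    except (TypeError, ValueError):
        return False
    return 0 < size <= MAX_FILE_SIZE
```

In either framework, call this when chunk 1 arrives and return 413 on failure, so an oversized upload dies after one request instead of hundreds.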

Identifier collisions. The resumableIdentifier is generated client-side, typically as a combination of file size, filename, and a relative path. Two different users uploading the same file could generate the same identifier. In a multi-user system, scope chunks by user ID — use a path like chunks/{user_id}/{identifier}/ instead of a flat directory.
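A user-scoped variant of the chunk_path helper could look like this; the user_id argument is assumed to come from your authentication layer:

```python
from pathlib import Path

CHUNK_DIR = Path('./chunks')


def user_chunk_path(user_id, identifier, chunk_number):
    """Store chunks under a per-user directory to avoid identifier collisions."""
    safe_id = "".join(c for c in identifier if c.isalnum() or c in '-_')
    user_dir = CHUNK_DIR / str(int(user_id))  # numeric IDs only, no traversal
    user_dir.mkdir(parents=True, exist_ok=True)
    return user_dir / f"{safe_id}.part{int(chunk_number):06d}"
```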

Wiring It Up

On the client side, point Resumable.js at your endpoint:

const r = new Resumable({
  target: '/api/upload',
  chunkSize: 5 * 1024 * 1024,
  testChunks: true,
  simultaneousUploads: 3,
});

The testChunks: true setting enables the GET check before each chunk upload. This is what makes resumption work across page reloads and reconnections — the client asks the server which chunks it already has and only sends the missing ones. The server receivers guide covers additional server-side patterns including Node.js and PHP implementations alongside the Python approaches shown here.