How to Intelligently Split Multi-Document Files Using AI in Python

4/21/2026 - Brian O'Neill

When enterprises process large volumes of physical paperwork (think stacks of scanned ID cards or mixed document packets), they often end up digitizing everything into a single file. The result? A sprawling, multi-document PDF that technically contains all the right information, but in a format that’s unusable for downstream automation. Someone (or something) still has to figure out where one document ends and the next begins.

Manually splitting these files is tedious, and it doesn’t scale. It’s also very error-prone, and mistakes at this stage cascade into new problems down the road. Building that logic yourself in code means wrestling with things like layout heuristics and header detection, which are no fun at all. There’s also the problem of visual boundary recognition, which is a genuinely difficult issue that has nothing to do with whatever your application is actually supposed to do.

Splitting Documents with Cloudmersive Document AI

That’s the exact problem the Cloudmersive AI Document Splitting API is built to solve. It accepts a multi-document file as input, analyzes its contents using AI, and then returns each identified sub-document as its own discrete chunk, complete with page range metadata and PDF bytes you can work with directly.

It detects boundaries based on visual content and document-type recognition, which means it handles messy, real-world input far better than any rule-based approach could. It’s built for enterprise environments, so it handles a variety of file formats ranging from DOCX and PDF to XLSX, PPTX, JPG, PNG, and WEBP.

Implementing the AI Document Splitting API in Python

In this article, we’ll walk through an example API call using Python 3 in Google Colab, and then we’ll look at what the response contains. Code examples are pulled directly from the Cloudmersive Swagger page.

To follow along with this walkthrough, you’ll need a Cloudmersive API key, which you can get by signing up for an account on our website. You can get a free API key with 800 API calls/month and no commitments, and that’s more than enough to work through this example. Just bear in mind that this API consumes 100 calls per page in the input document, so budget accordingly when testing with larger files.
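To make the per-page cost concrete, here is a minimal sketch of the arithmetic. The two constants come from the figures quoted above; the function names are hypothetical helpers written for illustration and are not part of the Cloudmersive SDK.

```python
# Figures quoted in this article; adjust them if your plan differs.
CALLS_PER_PAGE = 100       # API calls consumed per input page
FREE_MONTHLY_CALLS = 800   # free-tier monthly quota


def estimate_calls(page_count: int) -> int:
    """Return the number of API calls a document with page_count pages consumes."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return page_count * CALLS_PER_PAGE


def max_pages_within_quota(quota: int = FREE_MONTHLY_CALLS) -> int:
    """Return how many whole pages fit inside the given call quota."""
    return quota // CALLS_PER_PAGE


print(estimate_calls(3))         # 300 calls for a three-page file
print(max_pages_within_quota())  # 8 pages per month on the free tier
```

In other words, a three-page test file fits comfortably in the free tier, but only a couple of runs per month do.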

Installing the SDK

First things first, let’s get the SDK installed. We can run the below command in our terminal (in a Colab cell, prefix it with ! so it runs as a shell command):

pip install cloudmersive-documentai-api-client

Importing Resources

With SDK installation out of the way, we’ll pull in the resources we need:

from __future__ import print_function
import time
import cloudmersive_documentai_api_client
from cloudmersive_documentai_api_client.rest import ApiException
from pprint import pprint

Note that we don't actually need the from __future__ or time imports.

Structuring the Request

The Document Splitting API uses a multipart form data request, so the structure looks a little different from a typical Cloudmersive API call. We’ll start from the raw example code on the Swagger page:

# Configure API key authorization: Apikey
configuration = cloudmersive_documentai_api_client.Configuration()
configuration.api_key['Apikey'] = 'YOUR_API_KEY'



# create an instance of the API class
api_instance = cloudmersive_documentai_api_client.ExtractApi(cloudmersive_documentai_api_client.ApiClient(configuration))
recognition_mode = 'recognition_mode_example' # str | Optional; Recognition mode - Advanced (default) provides the highest accuracy but slower speed, while Normal provides faster response but lower accuracy for low quality images (optional)
input_file = '/path/to/inputfile' # file | Input document containing multiple sub-documents to split (optional)

try:
    # Intelligently Split a Combined Document into Sub-Documents using AI
    api_response = api_instance.extract_split(recognition_mode=recognition_mode, input_file=input_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling ExtractApi->extract_split: %s\n" % e)

We have a few things to fill in before this is functional. Most obviously, we’ll need to replace the 'YOUR_API_KEY' placeholder with our actual key. We’ll also need to specify our input file and, optionally, enter a recognition mode (it’s best to leave this at the default value for testing purposes).

First, let’s handle the API key and host configuration:

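from google.colab import userdata  # Colab's secrets helper

configuration.api_key['Apikey'] = userdata.get("freekey")  # "freekey" is the name of our Colab secret
configuration.host = "https://api.cloudmersive.com"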
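Since userdata only exists inside Colab, the same notebook code fails when run elsewhere. As a rough sketch, a small helper could look for the key in an environment variable first and fall back to the Colab secret. The helper name and the CLOUDMERSIVE_API_KEY variable name are assumptions made for this example, not conventions defined by the SDK.

```python
import os


def resolve_api_key(env_var: str = "CLOUDMERSIVE_API_KEY",
                    colab_secret: str = "freekey") -> str:
    """Return the API key from an environment variable, else from a Colab secret.

    Both names are illustrative defaults, not SDK conventions.
    """
    key = os.environ.get(env_var)
    if key:
        return key
    try:
        # This module is only importable inside a Google Colab runtime.
        from google.colab import userdata
    except ImportError:
        raise RuntimeError(
            f"Set the {env_var} environment variable or run inside Colab"
        ) from None
    return userdata.get(colab_secret)
```

With this in place, the configuration line would become configuration.api_key['Apikey'] = resolve_api_key(), and the key itself stays out of the notebook source.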

For our input file, we’ll use a sample multi-document PDF that combines a few different document types (specifically an invoice, a contract, and a form).

Here’s the completed request:

from google.colab import userdata  # Colab secrets helper used to read the API key

# Configure API key authorization: Apikey
configuration = cloudmersive_documentai_api_client.Configuration()
configuration.api_key['Apikey'] = userdata.get('freekey')
configuration.host = "https://api.cloudmersive.com"

# create an instance of the API class
api_instance = cloudmersive_documentai_api_client.ExtractApi(cloudmersive_documentai_api_client.ApiClient(configuration))
recognition_mode = '' # str | Optional; Recognition mode - Advanced (default) provides the highest accuracy but slower speed, while Normal provides faster response but lower accuracy for low quality images (optional)
input_file = 'mixed_documents.pdf' # file | Input document containing multiple sub-documents to split (optional)

try:
    # Intelligently Split a Combined Document into Sub-Documents using AI
    api_response = api_instance.extract_split(recognition_mode=recognition_mode, input_file=input_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling ExtractApi->extract_split: %s\n" % e)
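Because every page costs API calls, it can be worth catching obvious input mistakes before the request goes out. The sketch below is a hypothetical pre-flight check: it confirms that the file exists and that its extension is one of the formats named earlier in this article. That list is an assumption drawn from the article text, and the API may well accept other formats.

```python
from pathlib import Path
from typing import List

# Formats named earlier in this article; not an official or exhaustive list.
KNOWN_EXTENSIONS = {".docx", ".pdf", ".xlsx", ".pptx", ".jpg", ".png", ".webp"}


def preflight(path: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    candidate = Path(path)
    if not candidate.is_file():
        problems.append(f"{path} does not exist or is not a file")
    if candidate.suffix.lower() not in KNOWN_EXTENSIONS:
        problems.append(f"{candidate.suffix or '(no extension)'} is not a known format")
    return problems


# Check the input file before spending any API calls on it.
print(preflight("mixed_documents.pdf"))
```

If the returned list is non-empty, we can skip the call to extract_split and fix the input first.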

Interpreting the API response

After a moment, the API will return a structured response.

The top-level Successful flag gives us a quick sanity check before we start iterating over our results. The real substance lives within the SubDocuments array, where each entry contains four fields.

StartPage and EndPage define the page range within the original document where that sub-document was found. DocumentDescription gives a plain-language label for what the AI identified, something like “Financial Document” or “Tax Form”. And FileBytes contains the base64-encoded PDF bytes for that sub-document, which you can decode and write directly to disk or pipe into another processing step.

So if the original file contained three distinct documents, you’d get three entries in SubDocuments (like this example), each independently extractable and labeled. From there, routing each sub-document to the right downstream workflow becomes straightforward.
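As an illustration of that routing step, here is a minimal sketch that checks the success flag and then dispatches each sub-document by its description. It assumes the response has been converted to a plain dictionary with snake_case keys (successful, sub_documents, document_description, and so on), which is how the Python SDK prints it. The function name, the handlers, and the sample data are all hypothetical.

```python
def route_sub_documents(response: dict, handlers: dict, default=None) -> list:
    """Call the handler registered for each sub-document's description.

    Sub-documents with no matching handler go to `default` (skipped if None).
    """
    if not response.get("successful"):
        raise RuntimeError("Split request was not successful")
    results = []
    for doc in response.get("sub_documents") or []:
        handler = handlers.get(doc["document_description"], default)
        if handler is not None:
            results.append(handler(doc))
    return results


# Hypothetical sample shaped like the SDK's printed response.
sample = {
    "successful": True,
    "sub_documents": [
        {"document_description": "Financial Document", "start_page": 1, "end_page": 1},
        {"document_description": "Legal Document", "start_page": 2, "end_page": 2},
        {"document_description": "Form", "start_page": 3, "end_page": 3},
    ],
}

# Hypothetical handlers that just report where each document would be sent.
handlers = {
    "Financial Document": lambda d: f"accounting: page {d['start_page']}",
    "Legal Document": lambda d: f"legal: page {d['start_page']}",
}

print(route_sub_documents(sample, handlers, default=lambda d: "manual review"))
```

Keying on the description string is the simplest option, but the labels come from the AI model, so a default handler for unexpected labels is a sensible safety net.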

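Here's an example response from the three-page multi-document file used in this example (file byte strings are shortened for readability). Note that the Python SDK exposes the fields in snake_case, so SubDocuments appears as sub_documents, FileBytes as file_bytes, and so on: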

{'sub_documents': [{'document_description': 'Financial Document',
                    'end_page': 1,
                    'file_bytes': 'JVBERi0xLjcKJanN...',
                    'start_page': 1},
                   {'document_description': 'Legal Document',
                    'end_page': 2,
                    'file_bytes': 'JVBERi0xLjcKJanN...',
                    'start_page': 2},
                   {'document_description': 'Form',
                    'end_page': 3,
                    'file_bytes': 'JVBERi0xLjcKJanN...',
                    'start_page': 3}],
 'successful': True}
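Decoding those file bytes and writing them to disk can be sketched as follows. The helper name and the output file naming scheme are choices made for this example, and the sample entry is a stand-in payload rather than real API output. The code assumes each sub-document is a dictionary with the keys shown above (for a live response, a conversion step such as the model's to_dict() method, which Swagger-generated clients typically provide, would be needed first).

```python
import base64
import tempfile
from pathlib import Path


def save_sub_documents(sub_documents: list, out_dir: str) -> list:
    """Decode each sub-document's base64 payload and write it as a PDF file.

    Returns the list of paths written, in the same order as the input.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, doc in enumerate(sub_documents, start=1):
        pdf_bytes = base64.b64decode(doc["file_bytes"])
        # Example naming scheme: position plus original page range.
        name = f"{index:02d}_pages_{doc['start_page']}-{doc['end_page']}.pdf"
        target = out / name
        target.write_bytes(pdf_bytes)
        written.append(target)
    return written


# Stand-in entry: a PDF header only, encoded the way the API encodes real files.
sample_docs = [
    {
        "document_description": "Financial Document",
        "start_page": 1,
        "end_page": 1,
        "file_bytes": base64.b64encode(b"%PDF-1.7\n").decode("ascii"),
    }
]

with tempfile.TemporaryDirectory() as tmp:
    for path in save_sub_documents(sample_docs, tmp):
        print(path.name, path.stat().st_size, "bytes")
```

Including the page range in each file name keeps the link back to the original document, which helps when auditing how a packet was split.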

Conclusion

The Cloudmersive AI Document Splitting API takes a difficult document processing problem and reduces it to a single API call. Whether you’re dealing with mixed intake forms or batched scanned records, plugging this into a Python application is a low-lift way to unlock reliable, AI-driven document boundary detection without building any of that logic yourself.

If you want guidance on fitting this API into a larger document processing pipeline, we encourage you to reach out to the Cloudmersive sales team for expert advice.
