|
Edit Excel XLSX Documents Efficiently with Cloudmersive APIs |
1/22/2025 - Brian O'Neill |
Creating a smooth, automated file-processing workflow with efficient memory usage can be a bit challenging. That’s especially true when we’re working with complex files like Excel XLSX documents. How are Excel files normally handled in programmatic workflows?Before we call local methods or external APIs to modify Excel file contents in some way, we’ll typically load our Excel files into local memory and store that content in a variable - just like any other data. Handling our files this way is straightforward and simple, but it creates a considerable amount of overhead, stretching our (often limited) resources thin. Complex files tend to occupy a good chunk of space in local memory, and this inherently limits the efficiency of our program. The inefficiency of the Excel file-loading & file-storage process isn’t a huge problem for small-scale operations (e.g., editing a single Excel file in one quick workflow), but it becomes increasingly significant in large-scale operations (e.g., editing dozens of Excel files over the course of an hour or two). This inefficiency is also greatly magnified when we chain multiple Excel document edits together across a series of unique methods or web API calls; each new iteration of our modified Excel content must be written to a new variable. How do we circumvent inefficiencies caused by repeatedly assigning file bytes to local memory?There are a few ways to resolve this inefficiency when we’re dealing with local file processing methods - such as prioritizing working in-memory within certain programming libraries, or assembling batch edit operations where multiple edits are combined in a single method. It is, however, a bit more challenging to solve this problem when we leverage web API calls in place of local method calls. Web APIs return file content from some external server to our code environment each time they perform their value-add service (in this case, editing an Excel file in some way). But what if they didn’t have to actually return the file? What if they could return a reference to the file instead? Chaining Excel Edits with Cloudmersive Convert APIsMany Cloudmersive Convert/Edit/ (EditDocument) APIs solve this overhead issue by using temporary URLs to improve the efficiency of chained document editing actions. When an Excel file is loaded into memory and passed to the Convert API endpoint (e.g., for operations like inserting or deleting new worksheets & setting the contents of specific cells/rows), that file’s contents are cached on the endpoint server temporarily. For security, this cache is permanently deleted after 30 minutes, and it’s assigned a secure temporary URL for use before the expiration time (e.g., https://tempstash.cloudmersive.com/api/SecureFileManager/DownloadFile/ would be the URL format for Cloudmersive public cloud API calls). The secure temporary URL is returned to the code environment in place of the modified file bytes themselves. This means we can store the lightweight temporary URL string in a variable in place of the more resource-intensive Excel file bytes. We can pass that URL variable back to the endpoint when we use a subsequent Excel editing API; the endpoint can begin another file-processing operation with the modified Excel content by reaching into the cache with the temporary URL. When we’re ready to finish our document editing workflow, we can simply download the fully modified content from the final temporary URL and write that content to a new Excel XLSX file. Example: Creating an Excel worksheet & filling in rows/cellsLet’s imagine, for example, that we’re using EditDocument APIs to 1) insert a new worksheet into an Excel document and 2) update cell contents within that worksheet by cell index. We can assume, in this case, that our Convert API endpoint is deployed on Cloudmersive servers locally in our region of the globe. We would start by calling the /convert/edit/begin-editing document API. After loading our Excel file into memory just one time, we would pass our file content to the begin-editing method. This would cache our Excel content on the endpoint server & return a tempstash.[our API endpoint] URL to our code environment. We would then store this URL in a new variable. Next, we would call the /convert/edit/xlsx/insert-worksheet API, passing our URL variable to the insert-worksheet method as an “InputFileUrl”: argument within a more complex set of request parameters (see below):
We would then store the insert-worksheet response URL as a variable before calling the /convert/edit/xlsx/set-cell/by-index API. We would once again pass our URL variable as an “InputFileUrl”: argument, this time within the set-cell/by-index method parameters (see below):
Finally, we would call the /convert/edit/finish-editing API to download the fully modified Excel file bytes to our code environment. To do that, we would simply pass the set-cell/by-index URL to the finish-editing method as a single “InputFileUrl”: argument (see below):
In seconds, our workflow would return a set of fully modified Excel file bytes. We would simply write the modified content to a new XLSX file (or overwrite the original file). This new file would have a new worksheet with custom cells and rows as defined in our insert-worksheet and set-cell/by-index API calls. ConclusionBy retaining Excel file bytes on the endpoint server & returning a temporary URL to reference that content, Cloudmersive EditDocument APIs reduce overhead from inefficient local memory usage. This speeds up programs which chain complex document editing actions together in a single workflow. For expert advice on leveraging Cloudmersive Convert/EditDocument APIs in your file processing workflows, please reach out to a member of our team. |