Technical Articles

Review Cloudmersive's technical library.

What are Executables and Why Are They a Threat?
6/1/2023 - Brian O'Neill


Executables are a vital file type, required to install many of the desktop applications and tools we rely on day to day. They can also be used to inject malware into our system, however, which makes them a potent security threat. Below, we’ll review how legitimate executables work and discuss how malicious executable threats can be mitigated with straightforward security policies.

code running on a screen

What are Executables?

Executables are files containing binary machine code which is explicitly intended to run (execute) on a certain computer operating system. These files make it extremely easy to distribute software programs across multiple systems through simple file sharing mechanisms. They’re commonly used to initiate the installation of commercial products like licensed business applications, computer games, and many more examples.

An executable file can be thought of like a list of instructions. After an executable file is launched on a compatible system, the system will load the file’s contents into memory and work its way down the file’s list of instructions, ultimately running the executable file’s program.

Why are Executables Dangerous?

Since executable files are designed to run programs directly after file opening, they present an ideal method for threat actors to install malicious code – including viruses, malware, and a variety of other threat types – onto a victim’s operating system. Attackers can email executable files from compromised external devices and convince victims to open (and therefore execute) their malicious files using social engineering tactics. They can also share these files directly through vulnerable file upload portals for distribution to a wide range of victims who have access to their target application’s database.

Executable files can also be distributed through far more insidious means to avoid detection from poorly configured security policies. For example, attackers can give executable files misleading names, so they appear to be an entirely different file format than they are. A file with a seemingly innocent name like “newemployeeheadshot.jpg” might contain executable contents, and this file might then successfully avoid detection from system users who assume the file contents safely match the simple, descriptive title. Additionally, attackers can bury executable files in compressed archives alongside a number of valid documents. Once files are extracted from these folders, the executable file can lay dormant until it’s opened by a downstream system user at some point in the future.

How can Executable File Threats be Prevented?

Preventing the distribution of malware within any system always begins with user accountability. System users should be regularly trained to identify illegitimate files, and they should know to avoid clicking on suspicious attachments shared from any external source.

It’s equally critical to scan files for malicious content as they enter a system at the network edge, before they enter specific applications, and while they reside in cloud storage instances. The Cloudmersive Advanced Virus Scan API can be deployed in any of those locations, and API administrators can configure custom policies in its request body (or from their Cloudmersive account page) to block executable files from entering their system. This API ignores superficial information like file names, extensions, and headers, instead performing in-depth content verification to detect and report each file’s actual encoding.

This API also scans files for viruses and malware, referencing a continuously updated list of more than 17 million virus and malware signatures in the process. Further, it provides additional request policies for blocking more non-malware threats than just executables, with coverage including scripts, macros, XML external entities, and much more. Any files found to contain viruses, malware, or non-malware threats blocked by custom policies in the API request body will receive a CleanResult: False response. This uniform threat value makes it easy to categorically delete (or quarantine) malicious files in a single API request.

For more information on the Cloudmersive Advanced Virus Scan API, please do not hesitate to reach out to a member of our sales team.



What is Remote Code Execution?
5/26/2023 - Brian O'Neill


Remote Code Execution (RCE) is an extremely dangerous attack vector which aims to exploit vulnerabilities in an application’s input validation measures to execute arbitrary code. While RCE is conceptually straightforward, it’s also an extremely broad threat category. There are multiple ways RCE can be accomplished, and the outcomes can vary greatly depending on a threat actor’s individual goals.

hacker writing code

What are Common Remote Code Execution Outcomes?

Threat actors can use Remote Code Execution attacks to accomplish a variety of malicious outcomes. A few of those outcomes are outlined below.

One common goal of an RCE attack is to disrupt regular activity within a system or network, often with the intention of causing Denial of Service. By gaining unrestricted access to a system, a threat actor can arbitrarily change important system information, remove critical files, and deny external users access to the system’s internal resources. This type of RCE attack severely damages the victim’s reputation, often causing irreparable harm to their current and future business prospects.

Another common goal of RCE is Data Theft. Most companies thriving in the modern, digital landscape view data as their most valued asset, including financial data, customer data (e.g., personal contact information), and much more. Motivated threat actors are aware of this fact, and as a result, RCE attacks often aim to retrieve or modify data housed in sensitive storage locations. Once a threat actor gains access to internal data, they can remove (steal) it, corrupt it, or threaten to delete it unless the victim provides financial compensation (i.e., ransomware).

How is Remote Code Execution Accomplished?

At a high level, to accomplish any RCE attack, a threat actor must first find weak points in their target’s system or network. After that, they can begin sharing malicious code by exploiting the vulnerability they found. Once they succeed in the first two steps, the threat actor can hijack their target system/network or compromise data for a variety of purposes.

One common system vulnerability is found in an application’s data deserialization process. By sending malicious objects through a poorly sanitized data parser, a threat actor can trick an application into executing code of their choosing. Several high-profile attacks in the last decade have utilized this method, and in one case, this resulted in theft of customer data on an unprecedented scale.

Another common vulnerability is found in a system’s external file upload process. In recent years, adoption of direct file upload workflows has grown in lockstep with businesses gaining increased access to affordable cloud storage solutions, and this development has encouraged threat actors to seek out file upload vulnerabilities much more actively. If file uploads aren’t scanned and validated thoroughly, threat actors can carry out RCE attacks by loading files with malicious code and delivering them through a vulnerable upload portal.

Protect your System with the Cloudmersive Advanced Virus Scan API

The Cloudmersive Advanced Virus Scan API provides 360-degree content protection for your system and/or network (depending on where you choose to deploy it). Its anti-virus and anti-malware policies reference a continuously updated list of more than 17 million virus and malware threats, and its non-malware threat detection capabilities can be customized to block common RCE file upload attack vectors, including executables, invalid files, scripts, password-protected files, insecure deserialization, and more.

For more information on the Cloudmersive Advanced Virus Scan API, please do not hesitate to contact a member of our sales team.



Why are Macros a Security Threat?
5/26/2023 - Brian O'Neill


typing one hand with document

What is a Macro?

Like most non-malware file upload threats, there’s nothing inherently malicious about a macro. At a high level, macros are just lines of Visual Basic (VBA) code embedded within a Microsoft document, and they’re generally intended to expedite some specific tasks performed within that document through process automation. For example, a macro might be created to automatically format text entered within specific cells in an Excel spreadsheet, or it might be used to create formulas that change based on certain evolving input criteria.

Why are Macros a Security Threat?

Despite their time-saving benefits, macros unfortunately represent a significant attack vector which threat actors exploit with relative ease. Since the introduction of macros in the early 90’s, threat actors have consistently (and quite often successfully) used them to inject malware onto client devices, often compromising systems completely while also gaining exponential access to new victims through those systems. Since macros can be primed to run based on simple document events (e.g., when a document is opened), all it takes is one errant double-click from a system user to trigger a full-blown malware crisis.

Contributing to the challenge of mitigating macro threats is the fact that Microsoft document formats are ubiquitous in professional systems. Files like .DOCX and .XLSX, for example, are used widely in professional environments all around the world, which means system users naturally expect to see them attached to emails and saved within important shared folders. By applying basic social engineering techniques, threat actors can share documents containing malicious macros (or encrypted .ZIP archives containing multiple infected documents) from compromised devices, assuring the recipient that enabling the macro provides access to necessary content. They can also share their infected document through an insecure file upload portal, waiting patiently for a downstream internal or external user to access that file and unleash its contents.

How are Macro Threats Mitigated?

Recent versions of Microsoft Office disable macros by default, so keeping systems up to date is an important first step. Additionally, regular user training is critical to prevent the spread of any type of malware through social engineering techniques; system users should always know how to identify suspicious emails and subsequently avoid clicking on their attachments.

A comprehensive anti-macro policy can also be configured within the Cloudmersive Advanced Virus Scan API request body. Setting the allowMacros Boolean to “False” will return a CleanResult: False response for any files containing macros, making it easy to delete or quarantine these files alongside other virus, malware, and non-malware content threats. This API references a continuously updated list of more than 17 million virus and malware signatures, including ransomware, spyware, trojans and more, and it delivers high-speed, sub-second typical response times.

For more information on Cloudmersive Virus Scanning APIs, please feel free to contact a member of our sales team.



Why are ZIP File Uploads Dangerous?
5/18/2023 - Brian O'Neill


Sharing groups of files in .ZIP archive format is common practice for users working within most professional networks. Compressing bulky files (including .PPTX, .XLSX, or .DOCX for example) drastically accelerates file upload processes and facilitates faster file sharing over communication platforms. The trouble with .ZIP files, however, is that they’re just as expedient for client-side threat actors as they are convenient for trustworthy users. As a result, it’s often best to avoid allowing .ZIP files move through our network entirely.

zip file

What makes .ZIP archives dangerous?

Files containing viruses, malware and other malicious content can be compressed together into .ZIP archives and jointly bypass weakly configured upload security policies. In some cases, polymorphic code can be used to further disguise these threats, ensuring any traceable signatures change frequently enough that basic anti-virus & anti-malware policies are thrown off their trail. Left undetected, these unsafe archives can remain dormant in file storage for extended periods of time before trusted users unwittingly open and activate their contents.

Further, even without using malicious code, threat actors can weaponize .ZIP files by filling them with immense quantities of data. Known as “ZIP Bombs,” these overloaded archives are intended to rapidly overwhelm and crash a system once opened, triggering Denial of Service (DoS) and sometimes opening the door to subsequent cyber-attacks.

How can the Cloudmersive Advanced Virus Scan API protect a system against unsafe archives?

Deployable in multiple critical locations around a network – including at the file storage layer, at the network edge, and in defense of any specific application with custom code integration – the Cloudmersive Advanced Virus Scan API can be used to scan inbound & outbound archives for millions of virus and malware signatures. In addition, a custom policy can be configured within the Advanced Virus Scan API request body to specifically detect and weed out unsafe archive contents. Once configured, all unsafe archives will receive a CleanResult: False Boolean within the API response body, making it easy to delete or quarantine these files before they can reach their intended destination.

For more information on Cloudmersive Virus Scanning APIs, please do not hesitate to reach out to a member of our sales team.



How do I protect my system from password-protected file threats?
5/10/2023 - Brian O'Neill


file system

New file upload security policies are typically workshopped in response to threats which successfully breached (or came close to breaching) some reputable organization's systems. Once brought to fruition, these advancements allow our own systems to inch closer towards becoming havens for valuable and sensitive customer data.

The challenge is that determined threat actors work equally hard to overcome new attempts to ward them off, and they’re quite often successful. Right when we least expect it, clandestine file upload attacks are carried out through even more convoluted pathways than we first imagined possible, often catching us off guard completely.

Why are password-protected file threats so dangerous?

Password-protected files are a particularly subtle threat type, designed to slip malicious code past poorly configured input validation/content verification policies while relying on us – the internal (or external) user accessing that file – to eventually execute that code within the targeted system. One of the most challenging aspects of threats involving password-protected files is that they tend to leverage simple & effective social engineering concepts, goading us into unlocking a threat which we aren’t alert to.

A threat actor can, for example, infiltrate systems adjacent to ours – i.e., a trusted client or partner – and then email or upload password-protected files to our system from a seemingly legitimate source. If we normally trust content originating from that source, we’re quite unlikely to suspect a potential threat, and we might blindly follow instructions to unlock that file with a password supplied alongside it. Many commonly shared file types like Excel (XLSX) and PDF are capable of housing malicious code behind password protection, and that code is designed to execute immediately once the file’s password is supplied.

How can I stop password-protected file threats from harming my system?

First and foremost, if password-protected files are a recurring and immutable form of content our job necessarily deals with, our own vigilance is key. Malicious password-protected files are sometimes identifiable by their incongruous names and suspect purposes; if we see a file that doesn’t look right, there’s a good chance it isn’t. It’s important to exercise extreme caution if we’re suddenly equipped with the means to unlock content which we don’t fully understand the origin or purpose of. We should always feel entitled to question where a file came from, who the original creator is, what the intended use-case for that file is, and more.

If password-protected files are not a necessary part of our workflow, however – which is often the case in scenarios involving external client-side file uploads – blocking them altogether is a sensible step to take.

The Cloudmersive Advanced Virus Scan API can be used to block password-protected files from a file upload process entirely. By setting the allowPasswordProtectedFiles Boolean to “false” in our request (configurable via our Cloudmersive Account page), we can ensure that all files with password protection will categorically receive a CleanResult: False response from the API, allowing us to easily delete (recommended) or quarantine these files and protect our underlying system from potential harm.

For more information on how the Cloudmersive Advanced Virus Scan API can protect any system from non-malware content threats, please feel free to reach out to a member of our sales team.



How do I protect my application against file upload threats?
5/5/2023 - Brian O'Neill


uploading mobile and laptop

What are file upload threats?

File upload threats refer to deliberate attempts, orchestrated by malicious client-side actors (i.e., hackers), to exploit our systems from the inside out by weaponizing common files in various ways. The incentive to execute such attacks is largely driven by an increase in the adoption of file upload processes by professional websites around the world.

File upload threats can be used to target a wide range of vulnerabilities in a file upload process. JSON or XML files can, for example, contain malicious code designed to be executed by poorly configured data parsers, allowing attackers to subsequently retrieve or delete sensitive data from within a system. PDF files can house dormant viruses and malware which infect client-side user devices once downloaded, leaving the website they originated from directly accountable and seriously damaging their reputation. Zip files can be rigged with exceptionally high volumes of data in order to crash (or severely limit) our system once they're opened, resulting in a sudden Denial of Service (DoS). The list goes on.

How do I protect my application against file upload threats?

There isn’t any one-track answer to this question, unfortunately. There are, however, a variety of sensible security policies which, applied together, can collectively create a formidable content security phalanx for any application. Below, we’ll look at a few of the most critical, rudimentary policies which can help secure any file upload process.

First and foremost, it’s critical to validate file extensions immediately after file names are decoded, which is a process made simpler by restricting accepted file extensions in the first place. While myriad file types are used around the world, most businesses only need to accept a few common file types to fulfill the needs of the service they’re providing. A recruitment website, for example, can comfortably limit CV (resume) uploads to .DOCX or .PDF file extensions without inconveniencing their user base, and a social media site can narrow its image uploads to .JPG or .PNG with similar effect.

It's equally important to validate file contents, too. Files with valid, expected extensions can quietly contain a very wide range of malicious content, including anything from viruses and malware to illicit (i.e., pornographic) material. As outlined above, poorly validated files can also contain massive quantities of inert data intended to bypass limited security checks and crash a system. Without digging into the contents of every new file upload, an application is at severe risk of releasing a virus internally (or infecting an external user device), distributing unsolicited illicit content, or triggering widespread service outages which might violate its SLA (Service Level Agreement).

How can the Cloudmersive Advanced Virus Scanning API Protect my system against file upload threats?

By combining multiple critical file upload security policies in a single API call with high-speed, in-memory scanning and sub-second response times, the Advanced Virus Scanning API protects applications against file uploads with unparalleled efficiency. This API is designed to provide 360-degree content protection against a wide range of file upload threats at once - including viruses and malware, executables, invalid files, HTML and scripts, password protected files, macros, XML external entities, JSON insecure deserialization, and OLE (Object linking and embedding). In the core virus scanning process, a perpetually cloud-updated list of more than 17 million virus and malware signatures is referenced to rapidly identify any potential viruses, malware, trojans, ransomware, or spyware embedded within a file’s encoding.

Apart from virus and malware checks, each built-in threat restriction stated above can be lifted by configuring the API’s request parameters with custom Boolean values (for example, to allow executables, the allowExecutables parameter can be set to “True”). Additionally, restrictions on unwanted file types can be placed upon any file upload process by providing a list of acceptable file types in comma-separated format. Once a list is designated in the API request, all file extensions AND contents will be verified against this list; any files which violate the file restriction policy will categorically receive a CleanResult: False response from the API.

For more information about Cloudmersive Virus Scan APIs (low-code and no-code products), please do not hesitate to reach out to a member of our sales team.



How to Scan SharePoint List Item Attachments using the Cloudmersive Virus Scanning Connector in Power Automate
4/13/2023 - Brian O'Neill


Seemingly harmless, run-of-the-mill document uploads in SharePoint can hide dangerous viruses and malware threats under the guise of valid file extensions. The longer these infected files remain undetected, the more likely it is that a member of our organization unwittingly opens one and releases its malicious content into our system.

Thankfully, through Power Automate, we can create RPA file security flows which automatically run when new files are uploaded to specific SharePoint locations. Below, I’ll demonstrate how we can use the Cloudmersive Virus Scanning Connector to ensure new SharePoint List Item Attachments are clean the moment they’re uploaded to a specific list.

Demonstration

The goal of this demonstration is to show how the Cloudmersive Virus Scanning Connector can be used in a Power Automate flow to automatically scan SharePoint list item attachments the moment they’re uploaded.

1 - To accomplish this, we’re going to start on the Power Automate home page and select the Create option on the lefthand side of the page. Doing so will provide us with the following flow options:

select flow type

For this flow, we’re going to select the Automated Cloud Flow option on the far lefthand side.

2 - Once we select this option, we’ll jump into the Build an Automated Cloud Flow tab which allows us to give our flow a name and choose the way our flow will be triggered. At this stage, we can give our flow a relevant name of our choosing, and then we can select the When an Item is Created SharePoint trigger (which should appear high on the list of initial trigger options).

2-name flow

At this point, we can click Create at the bottom of the window. Doing so will bring us to the Flow Diagram Page where we can begin assembling our flow logic.

3 - On the Flow Diagram Page, we’ll find our When an Item is Created trigger already opened. We can begin the flow design process by configuring our SharePoint Site Address within the request body and selecting the List Name we want to attach this flow to (for this demonstration, I’ve created a Site Address called “List Site” with a List called “List”).

3- configure trigger step

Once we’ve configured these details, we can click New Step.

4 - In the second step of our flow, we’re going to implement a second SharePoint action which will automatically retrieve the attachments from our new SharePoint list items. To find this action, we can type “Get attachments” into the Choose an Operation search bar and select that option when it comes up below.

4 - get attachments

Once we select this option, we’ll have a few important fields to configure.

5 - Just like we did in our trigger step, we’ll need to configure our SharePoint Site Address and List Name once again. After we do that, we’ll need to supply the Id of the List Item which we’re pulling the attachments from. We can accomplish this easily by simply clicking on the Id field and selecting the “Id” option from the Dynamic Content window.

5 - get attachments fill ID

Since we chose to use an automated flow, the List Item Id will always represent the most recently uploaded file to our SharePoint list.

This step is finished, so we can go ahead and click New Step once again.

6 - Now that our new list items’ attachment IDs are available as Dynamic Content within our flow, we can set up a subsequent step which retrieves the actual contents (file encodings) of those attachments.

To do so, we’ll use the SharePoint Get Attachment Content action. To find this action, we can type “get attachment content” into the Choose an Operation search bar and select the correct option when it comes up below.

6 - get attachment content

7 - Once we select this action, we’ll need to again choose our Site Address and List Name from their respective dropdowns. After that, we can again select the ID of our List Item using Dynamic Content from our trigger step.

7 - get attachment content fill item

Once we’ve configured the first three fields of our Get Attachment Content action, we can turn our attention to the final field which asks for the File Identifier. We can retrieve this ID by clicking on the File Identifier field and selecting Dynamic Content labeled “Id” from the Get Attachment step.

8 - get attachment content fill file ID

The moment we select this option, Power Automate will automatically create a Control around our SharePoint action called Apply to Each. This action will ensure our attachment content retrieval operation is executed for each item in an array of uploads.

9 - apply to each

8 - With our file attachment content now available in our flow, we can introduce the Cloudmersive Virus Scanning Connector and task it with scanning each new attachment’s contents.

Within the Apply to Each operation, let’s select the option to Add an Action, which will once again open the Choose an Operation search bar. From here, we can type “Cloudmersive” and allow the full list of Cloudmersive Connectors to populate below. From that list, we can select the Cloudmersive Virus Scanning Connector (written as Cloudmersive Virus Scan) with the dark, blue-green logo.

10 - searching for cloudmersive

Upon selecting this Connector, we’ll be prompted to name our connection and provide an API key for authorization. We can retrieve our API key by visiting our Cloudmersive Account Page, clicking on View and Manage API Keys, and finally clicking Copy next to our API key to add it to our clipboard.

Once we’ve successfully established our Cloudmersive connection, we can now view both Cloudmersive Virus Scanning Connector actions on the actions list. From this list, we can click on the Scan a File for Viruses option.

11 - selecting the right action

9 - Within the Scan a File for Viruses action, we only have one request field to satisfy, and this simply asks us for our Input File. To satisfy this request field, let’s click on the Input File bar and select the Attachment Content option – which is available from our Get Attachment Content action – from the Dynamic Content window.

12 - adding dynamic content to action

10 - At this point, we can save our flow and test it. Power Automate will prompt us to test our flow manually by creating a new SharePoint list item and adding a file attachment to it.

Once we do so, we should shortly see green checkmarks appear on each operation in our flow to indicate that they ran smoothly.

13 - green checkmarks

If we take a closer look at the Scan a File for Viruses output body, we can see that this action provides a simple Boolean response called CleanResult.

14 - scan file for viruses

This response makes it easy to take subsequent steps in our flow depending on the results of the virus scan. For example, if our new file receives a CleanResult: True response, we can prompt our flow to automatically save the attachment in a specific folder of our choosing. If our file receives a CleanResult: False response, we can take steps to automatically delete or quarantine the infected file through various means, and we can set up alert mechanisms (such as through Outlook, Slack, Teams, etc.) to notify relevant stakeholders within our organization about the flow’s concerning results.

11 - To automatically delete SharePoint List Items whose attachments received a CleanResult: False response, let’s click Add an Action once again within our Apply to Each control and type “Condition” into the Choose an Operation search bar. Let’s then select the Condition action once it comes up below.

15 - search for condition control

Once we’ve opened the Condition control body, we can use the Choose a Value request field to trigger subsequent flow actions based on previous flow results.

In this case, we want to use Dynamic Content from the Scan a File for Viruses action. To do so, let’s first click on the first (lefthand) Choose a Value field and select CleanResult, and then let’s type “false” into the second (righthand) Choose a Value field.

16 - set cleanresult equal to false

Our Condition is now configured such that the If Yes response will trigger based on a CleanResult: False result, and the If No response will trigger if any other value is provided (in this case, that means CleanResult: True).

Within the If Yes body, let’s click Add an Action and type “delete item” into the Choose an Operation search bar. Let’s then select the SharePoint Delete Item action when it comes up below.

17 - search delete item action

Within this action body, we can quickly provide our Site Address and List Name once again. After that, we can select ID from the Dynamic Content window to specify that the original flow item should be deleted upon receiving a CleanResult: False response.

18 - fill delete item body

In this example, we’ll leave the If No body blank so that our flow takes no action when it receives a CleanResult: True response.

Our flow should look like this when we’ve finished testing:

19 - final flow with green checkmarks

Final Thoughts

If you have any additional questions about using Cloudmersive Virus Scanning Connectors to scan your SharePoint file uploads, please do not hesitate to contact a member of our sales team (you may also contact a member of our support team through your account page).



What is Natural Language Processing (NLP)?
3/31/2023 - Brian O'Neill


What is Natural Language Processing?

Natural Language Processing (NLP) is a field of artificial intelligence which aims to improve the relationship between human language and computer language. Its origins date back to the period directly following the end of the Second World War, during which time the prospect of accomplishing some means of mechanical language translation was placed at a premium.

The field of NLP persisted into its expansive modern form due to the emergence of computers as everyday tools and the studiously recorded dissonance between human and computer language which came along with that. The challenge is that we humans chiefly learn to interpret our native languages/dialects (English, Spanish, French, Mandarin, etc.) in an unstructured way through years of socialization. We may initially learn the rules of our language through academia, but as we grow, we rely predominantly on our surrounding cultures to assign meanings to specific words and phrases, which then slowly change shape over time. Since computers conversely rely on rigidly structured, quantifiable data to communicate with one another, they cannot natively interpret asynchronous, evolving human languages on their own, and must be rigorously trained to do so as a result.

Why is NLP Important, and How is it achieved?

Natural Language Processing combines a variety of scientific fields – including computer science, linguistics, mathematics, and psychology – with the goal of structuring linguistic data in such a way that computers can successfully and efficiently predict the meaning of human sentences (in any given language; NLP is highly language-specific). Successfully predicting the meaning of written language allows computers to both draw insights from and reproduce human language on their own, increasing the range of useful services computer applications can provide to human users. Given that, in the modern digital era, an increasingly high volume of human interactions occurs online in an automated fashion, NLP capabilities have quickly transformed into a necessity rather than a luxury.

The high-level NLP training process starts with tokenizing vast quantities of reference text into individual words and tagging each token with the part-of-speech category it belongs to (i.e., verb, noun, adjective, etc.). Once language has been tokenized and tagged, parsing algorithms can be applied to establish relationships between different tokens, enabling computers to discern how words tend to occur (in what order) in any given language’s phrases or clauses. These basic building blocks make it possible for applications to perform several rudimentary NLP services – many of which we have benefited from for a long time (like spellcheck, for example).

From these building blocks, various complex subfields of NLP branch out to train computers with specific contextual understandings which are pivotal to a well-rounded understanding of any human language. Some subfields include Sentiment Analysis (identifying the meaning of text as positive, negative, or neutral), Subjectivity/Objectivity Analysis (identifying the bias inherent in a text string), Semantic Similarity Comparison (identifying the degree to which two sentences with different words mean the same thing semantically), and more. Techniques such as these effectively empower our applications to draw meaningful insights from text on their own, rather than simply predict and reproduce words in the correct grammatical order.

Cloudmersive NLP APIs

Creating any NLP service from scratch is a significant challenge given the vast quantities of data involved (bringing overhead storage cost into play) and the hands-on training required to bestow meaning upon that data. As a result, it is highly efficient and advantageous to incorporate NLP services into any application through external, RESTful API calls. This method loosely couples a business’ text data with a powerful, pre-existing reference dataset & codebase without the need to update or maintain those systems whatsoever.

Cloudmersive NLP APIs are low-code solutions designed to easily integrate and scale with your business’ needs. There are dozens of useful NLP services available through the Cloudmersive NLP API endpoint, including NLP Analytics, Language Translation, Spellcheck, and more.

For more information on our NLP APIs, please do not hesitate to contact a member of our sales or support teams.



What is a File Upload Security Threat?
3/28/2023 - Brian O'Neill


Should I be Worried About File Upload Threats?

Cyber criminals are always on the lookout for new vulnerabilities in our applications. If they can effectively identify and exploit a hidden weakness in our system, they can damage or steal information from our servers – often before we have a chance to stall or counter their efforts. Since a variety of successful, high-profile web breaches have stolen headlines in the last decade (including several in which poorly documented vulnerabilities were exploited in data parsers, user input sanitization workflows, and more), it’s easy to forget about the more obvious paths we leave open into our system which are often less protected and easier to take advantage of.

As more and more businesses move towards implementing user file upload workflows – accepting images, videos, and documents with myriad purposes and origins – the potential to exploit systems through the file upload process has grown all the more attractive to attackers. As a result, it's important that we stay vigilant and apply appropriate security policies to protect our storage infrastructure.

What do File Upload Threats Entail, and How Can We Protect our System Against Them?

Without adequate protection, file upload workflows can be exploited more insidiously than we might first imagine. For example, files in ubiquitous formats (like PDF) can quietly carry malware or executable content into our cloud storage instances, tricking poorly configured security policies by presenting valid file extensions. Compressed zip files containing gigantic payloads of data can slip into our system and remain undetected for days, weeks or even months before they’re opened by an unsuspecting user, suddenly crashing our system from the inside. The list of hidden threats goes on and on.

Protecting a system against file upload threats requires a dynamic, multi-pronged solution, and that can only begin with a comprehensive evaluation of the present state of file upload security. Is our virus scanning software up to date? Are we assuming files contain content directly aligned with their file extensions, or are we digging through each file’s contents thoroughly? Do we pay close enough attention to the complexity of file names and file paths in our storage architecture, or are we unwittingly leaving a trail of breadcrumbs which attackers can easily follow to our sensitive data? Do we have clear-cut quarantine protocols in place when malicious files are detected, or are we really unprepared to handle threats when we find them?

How can Cloudmersive Virus Scanning APIs Improve My Threat Profile?

Implementing a powerful file scanning security solution is a critical piece of the puzzle – and that’s where Cloudmersive Virus Scanning APIs can make a big difference.

Our Virus Scanning APIs reference a continuously updated list of more than 17 million virus and malware signatures in an effort uncover malicious content in each file entering your system, digging deep into each file’s encoding to ensure disguised threats won’t slip through the cracks. Advanced scanning features can be configured to block a variety of threatening file types, including executables, invalid files, scripts, password protected files, macros, and more; in addition, custom restrictions can be applied to allow/disallow specific file extensions.

These APIs can be deployed flexibly as low-code or no-code solutions, and they can occupy several strategic footholds in your upload architecture, kicking into gear at the network edge or between cloud storage instances.

For more information on how our Virus Scanning APIs can impact your business, please feel free to contact a member of our sales team.



What is a Content Delivery Network (CDN)?
3/17/2023 - Brian O'Neill


routers and switches

What is a Content Delivery Network?

A Content Delivery Network (CDN) refers to a group of servers strategically distributed across widespread geographical regions with the goal of reducing content loading speeds (latency) for client devices in those regions.

It’s important to note that servers in a CDN don’t host content directly, however; rather, they simply cache content from an origin server at the network edge. In this way, they reduce the physical distance content must travel to reach a particular group of client applications/users.

What are the main benefits of CDNs?

By reducing the distance content must travel to reach a client application, CDNs greatly improve web content loading times in their locale, thereby improving the user experience.

If, for example, a popular video streaming service based in North America is preparing to launch an episode of a TV show for a global audience, it can leverage its CDN to cache the episode data in Europe-based servers so it can be accessed easily by European consumers upon release. Without a CDN in place, viewers based in Europe would be forced to wait much longer than North American viewers to buffer and watch the new episode. Each individual request to load that content would need to travel across the Atlantic Ocean and back.

While CDNs are primarily deployed to benefit content consumers, they also benefit the content provider in a few noteworthy ways. For example, caching content at the network edge reduces the volume of traffic making requests to a company’s origin servers, improving local performance and, by extension, cheapening web-hosting costs. A widely distributed content delivery architecture also passively decreases the likelihood of hardware failures and provides extra redundancy when failures do occur.

What other utility do CDNs offer?

Websites which rely on user-generated content (UGC) uploads can also leverage their CDN in reverse, increasing the speed at which their global diaspora of users can upload content from their device. This not only improves the user experience, but it also presents a new chokepoint for security redundancy. Similarly to reverse proxies, CDN servers can be configured with Virus Scanning software to weed out threats while new content uploads are conveniently cached at the network edge.

Can I Secure My Website's CDN Uploads with Cloudmersive Virus Scanning APIs?

Cloudmersive Virus Scanning APIs can be deployed within a CDN to quickly identify any viruses and malware present within your cached content. Advanced custom rules can be configured to block additional content threats including invalid files, executables, scripts, password protected files, macros, and many more.

For more information on how you can leverage our Virus Scanning APIs to protect your network, please feel free to contact a member of our sales team.



What is an API?
2/14/2023 - Brian O'Neill


What does the acronym “API” stand for?

The term API Stands for “Application Programming Interface.” Despite its association with modern web architecture, the term originated in the 1970’s.

use APIs

What are the different categories of APIs, and how are they different from one another?

There are two basic, high-level categories of APIs: OS (operating system) APIs and Web APIs. The key difference between these categories is that OS APIs make it possible for developers to access resources within a specific operating system (such as Mac, Windows, Linux, etc.), while Web APIs make it possible to access resources from web servers.

Some common examples of OS APIs include System Call APIs, GUI (Graphical User Interface) APIs, and Network APIs, all of which define access to their namesake resources within a given OS architecture. Web APIs, on the other hand, are defined quite differently; they most often either adhere to REST (Representational State Transfer) architecture or follow SOAP (Simple Object Access Protocol) protocol. Please note that there are several other types of OS and Web APIs beyond those listed above; these examples are simply among the most common.

How do Web APIs work, and how do they benefit us day to day?

Simply put, Web APIs make it possible for two separate web resources to communicate with ease. Many of the web applications and tools we use day-to-day benefit enormously from sharing data with each other, and for that relationship to be possible, these applications need to access each other’s code efficiently. A Web API serves to establish the way in which its underlying web resource can be accessed, allowing an external web developer to send, modify, and receive important data by structuring their request to that resource in a specific way.

To provide a basic example, if a social media application wants to display daily weather reports for its site visitors, it can do so efficiently by accessing an external Web API created by an independent weather reporting application. The weather application’s Web API will likely follow a common architectural guideline (like REST), making it easy for the external web developer to understand and create a connection with the underlying weather reporting resource. The Web APIs various rules and protocols will also serve to ensure that the social media application receives a secure and timely response after information is requested.

Without a Web API to facilitate this connection, the web developer working on the social media application would be forced to either develop their own weather reporting functionality – a very significant undertaking in time and resources – or abandon the inclusion of this feature all together, leaving their users with a less enticing product. In this way, APIs benefit everyone involved. The social media application can use the weather API to efficiently expand its features, the weather application can monetize its API service (compensating for what it might consider a marginal drop-off in website traffic), and the client-side user can gain access to more information in one place, increasing the likelihood they’ll enjoy and recommend this app to their friends.

What are Cloudmersive APIs?

The Cloudmersive API platform allows individual web developers and enterprises to add dozens of unique and powerful services to their applications quickly and at low cost. These APIs enable developers to add scalable data and file format conversions, advanced security policies, media processing services, optical character recognition (OCR) capabilities to their applications through a single, all-encompassing account. With a suite of scalable APIs in their arsenal, a developer can more easily create a multidimensional application and bring it to market quickly or flesh out an existing application with useful new features.

For more information on Cloudmersive API products, please feel free to contact a member of our sales team.



How to Protect File Uploads against Viruses and Malware
2/6/2023 - Brian O'Neill


caution on keyboard

Why is file security so important, and what is the risk in failing to secure file uploads?

There’s a lot at stake when external files are uploaded to our websites. Poorly vetted files containing viruses and malware can lay dormant in storage for indefinite periods of time before executing attacks from within, compromising our systems and stealing or corrupting invaluable data.

Once our systems are exploited, the resulting damage can range from catastrophic to completely irreparable, resulting in a major blow to our bottom line and to our credibility as a secure and reliable business. Even worse, our website patrons can be directly affected by our security failure when they unwittingly download infected files made available through our servers. All told, it’s essential that we take numerous steps to protect our systems from hidden file threats.

What are the steps an online business should take to protect their systems against malicious file uploads?

There are a variety of ways we can actively protect our systems against malicious file uploads. To begin with the most obvious solution, deploying a virus scanning service is essential. From a high-level security architecture point of view, there are many ways to do this; for example, such services are often incorporated into a reverse proxy or ICAP server to take advantage of the strategic bottleneck (receiving each incoming server-bound HTTP message) each option occupies.

New files originating from a client-side visitor should be thoroughly vetted through our Virus Scanning service for viruses and malware, and certain file types (such as executables and password-protected files) should trigger alarm bells even when they don’t overtly contain virus or malware signatures.

Further, after files are thoroughly vetted through these initial policies, they should be scanned regularly in storage (especially cloud storage). Taking this extra step adds an important layer of redundancy to our security architecture, smoking out malicious files which may have snuck into storage from a typically trustworthy source.

In addition to the above, checking the IP addresses of certain client-side users attempting file uploads can sometimes help identify whether a malicious file upload is imminent. Files originating from known cybercriminal IP address should never be trusted, and files originating from generally suspicious IP addresses (such as bot clients or Tor servers) should receive an additional layer of scrutiny. This layer of security can also help protect against various network security threats.

Finally, once our file security solutions are in place, they need to be regularly updated and reviewed. There’s no such thing as a static security solution; cybercriminals are constantly at work developing new ways to advance and disguise their threats, so our security policies need to mirror those efforts as threats take on new forms.

How does the Cloudmersive Virus Scanning API protect files uploads?

The Cloudmersive Virus Scanning API offers 360-degree content protection, referencing a continuously updated list of more than 17 million virus and malware signatures to find threats hidden within file uploads. Advanced Virus Scanning API features can be customized to allow or block various inherently dangerous file types such as executables, invalid files, scripts, password protected files and much more. This API boasts high-speed, in-memory scanning and delivers a sub-second average response time.

The Virus Scanning API can be custom integrated within your systems, and it can also be leveraged as the underlying service in a Virus Scanning Reverse Proxy server or Virus Scanning ICAP server. Further, in product form (Cloudmersive Storage Protect), this API can be deployed in conjunction with Azure Blob, AWS S3, SharePoint Online and GCP to scan files in cloud storage.

For more information on the Cloudmersive Virus Scanning API and its various applications, please contact our sales team.



What is a Virus Scanning ICAP Server?
1/19/2023 - Brian O'Neill


light blue servers

What is ICAP?

ICAP (Internet Content Adaptation Protocol) is a lightweight, transparent internet protocol capable of modifying HTTP messages. ICAP was initially proposed at the turn of the millennium as the need for scalable internet services became increasingly apparent. Global internet traffic was growing exponentially, and ICAP presented an efficient way of reducing the burden on monolithic servers to process high volumes of HTTP messages on their own.

How are ICAP Servers used?

ICAP servers are typically used to extend the services of proxy servers. They do so by performing specific content adaptation services on HTTP messages as they flow through a network. In this case, “content adaptation services” refers to many important, value-add operations such as virus scanning, language translation, ad insertion, content filtering, and more.

In effect, ICAP servers lighten the load experienced by proxy servers on a high-traffic network. A company can deploy an ICAP server to access its proxy server caches and perform its adjacent value-add services seamlessly.

What is a Virus Scanning ICAP Server, and What are its Advantages?

A virus scanning ICAP server is one which is deployed specifically to scan HTTP message contents for viruses (and other forms of malicious content) as those messages flow through a network.

Leveraging ICAP for virus scanning purposes is highly advantageous for several reasons, and it’s an extremely common practice as a result. These servers fit naturally into network chokepoints, which are ideal positions for any security policy. At the same time, they also increase the efficiency of high-traffic networks by reducing the need for proxy servers (which typically perform several different tasks at once) to scan each HTTP message for viruses by themselves. Their relative ease of deployment and maintenance makes them a particularly cost-effective security policy as well.

What is the Cloudmersive Virus Scanning ICAP Server?

The Cloudmersive Virus Scanning ICAP Server performs its value-add service by calling the Cloudmersive Virus Scanning API as HTTP messages pass through your network. The Virus Scanning API offers distinct security advantages, including continuously updated signatures for millions of threats and advanced, high-performance scanning capabilities. In addition, the Virus Scanning API provides customizable content security policies, allowing you to block or allow content including executables, invalid files, scripts, password-protected files, and more.

For more information on the Cloudmersive Virus Scanning ICAP server, please contact our sales team.



What is a Reverse Proxy Server
1/6/2023 - Brian O'Neill


servers

What is a Proxy Server?

Before we dive into a closer look at reverse proxy servers, we should first understand proxy servers from a birds-eye view. In short, the purpose of any proxy server is to act as an intermediary between client-side users browsing the internet and the various external web servers they request information from. It’s in the name: the word “proxy” refers to any entity which has been given the power to act on another entity’s behalf. There are two major categories of proxy servers: forward and reverse.

What is the Difference between Forward and Reverse Proxy Servers?

Both forward and reverse proxy servers are extremely useful, and they perform very different functions in practice. While forward proxies work on behalf of client-side users, acting as a buffer between them and the various web resources they request information from, reverse proxies behave in the exact opposite way, acting as a buffer between a particular web server (or group of web servers) and the inbound client-side requests they receive. In practice, forward proxies typically protect client-side user identities while they browse the internet from a particular network, and they can also provide a means for restricting access to specified web servers.

What is a Reverse Proxy Server, and How does it Benefit a Backend Server (or Group of Backend Servers?)

At a basic level, a reverse proxy server funnels client-side requests towards the backend web servers they seek resources from. Once those requests are satisfied, the reverse proxy server receives responses back from the web server and subsequently returns those responses to the client-side user.

The excellent strategic positioning occupied by a reverse proxy server allows it to benefit backend web servers in a variety of useful ways. For example, reverse proxy servers are generally used to load balance inbound requests across their group of assigned web servers, ensuring no one server is ever overloaded. This is accomplished using one of several load balancing algorithms, such as a Round Robin algorithm, which balances requests to a group of servers in a specific order (one request per server). In addition, reverse proxy servers are often asked to carry out important functions such as terminating SSL connections, decrypting incoming requests, and temporarily caching frequently requested content for convenience. Functions such as these effectively help lighten the burden for backend web servers when they experience high volumes of requests.

Perhaps most beneficially of all, reverse proxy servers can be leveraged to deploy basic and advanced security policies which greatly help to protect the backend web servers they support. On the simplest end of that spectrum, they can be configured to process authentication details for various types of information – such as personal account details – only redirecting client requests to the backend server(s) when valid authentication details are supplied. On the more complex end, they can be configured to scan incoming client-side data for viruses and screen out various forms of otherwise untrustworthy content (such as macros and executables), acting as a first line of defense against a wide variety of potentially dangerous external requests.

Cloudmersive Virus Scanning Reverse Proxy Server

The Cloudmersive Virus Scanning Reverse Proxy Server represents a multidimensional reverse proxy security solution, calling the Cloudmersive Virus Scanning API into action when inbound requests are funneled through the reverse proxy stage. This API placement protects backend web servers from malicious traffic - such as virus and malware uploads - before that traffic has a chance to reach its intended destination.

The Cloudmersive Virus Scanning API has access to a growing list of more than 17 million virus and malware signatures, and various custom settings can be configured to either block or allow myriad dangerous forms of content, such as scripts, executables, invalid files, and more. For more information, please feel free to check out our product page, or make an inquiry to one of our sales representatives.



What is a Forward Proxy?
12/27/2022 - Cloudmersive Technical Writing Team


A forward proxy is a type of proxy server that acts as an intermediary between a client and a server. It is called a "forward" proxy because it forwards client requests to the appropriate server on behalf of the client.

Forward proxies are often used in corporate networks to enforce Internet usage policies, block unwanted websites, and provide additional security. They can also be used to improve Internet performance by caching frequently requested content, such as popular videos or images.

Forward proxies can be configured in a number of ways. They can be configured to allow all client requests to pass through, or they can be configured to block certain types of requests based on criteria such as the destination website or the client's IP address. They can also be configured to encrypt client requests for added security.



800 free API calls/month, with no expiration

Get started now! or Sign in with Google

Questions? We'll be your guide.

Contact Sales