Express.js XML External Entity (XXE) via xml2json

High Risk XML External Entity (XXE)
express, xxe, xml, xml2json, javascript, external-entities

What it is

The Express.js application uses the xml2json library to parse XML input without properly disabling external entity processing, making it vulnerable to XXE attacks. Attackers can exploit this to read local files, perform SSRF attacks, or cause denial of service through entity expansion.

// Vulnerable: untrusted XML passed straight to xml2json
const express = require('express');
const xml2json = require('xml2json');

const app = express();
app.use(express.json()); // the XML arrives as a string field in a JSON body

app.post('/convert', (req, res) => {
  const xmlData = req.body.xml;

  try {
    // Dangerous: nothing rejects DOCTYPE or entity declarations before parsing
    const json = xml2json.toJson(xmlData, {
      object: true
    });
    res.json(json);
  } catch (error) {
    res.status(400).json({ error: 'Invalid XML' });
  }
});

// Secure: parse with libxmljs first, with entity and DTD features disabled
const express = require('express');
const libxml = require('libxmljs');
const xml2json = require('xml2json');

const app = express();
app.use(express.json()); // the XML arrives as a string field in a JSON body

app.post('/convert', (req, res) => {
  const xmlData = req.body.xml;

  try {
    // First parse with libxmljs, with entity substitution and DTD handling off
    const doc = libxml.parseXml(xmlData, {
      noent: false,   // Do not substitute entities
      nonet: true,    // No network access while parsing
      dtdload: false, // Do not load the external DTD
      dtdattr: false, // Do not apply default attributes from the DTD
      dtdvalid: false // Do not validate against the DTD
    });

    // Then convert the re-serialized document to JSON
    const json = xml2json.toJson(doc.toString(), {
      object: true
    });

    res.json(json);
  } catch (error) {
    res.status(400).json({ error: 'Invalid XML format' });
  }
});

💡 Why This Fix Works

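The secure version no longer hands raw request data to xml2json. It parses the input with libxmljs first, with entity substitution (noent: false), DTD loading, attribute defaulting and validation (dtdload, dtdattr, dtdvalid: false), and network access (nonet: true) all turned off, so entity references are not expanded into file or network content at that stage. Only the re-serialized document is converted to JSON, and input that fails to parse is rejected with a 400 response.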

Why it happens

Express applications typically call xml2json.toJson(xmlString) on request bodies, uploads, or webhook payloads with nothing in front of the call to reject DOCTYPE or entity declarations. Any entity that the underlying parser resolves or expands then becomes an attacker-controlled primitive for file disclosure, SSRF, or denial of service. The root causes below break this down.

Root causes

Using xml2json with Default XML Parser Settings

Express applications call xml2json with its default configuration: xml2json.toJson(xmlString) or xml2json.toJson(xmlString, {}). The library's options, such as object, coerce, sanitize, and trim, shape the JSON output; none of them control DOCTYPE or entity handling. xml2json delegates parsing to node-expat, so what happens to a declaration such as <!ENTITY xxe SYSTEM "file:///etc/passwd"> depends on the parser build and version rather than on anything the application configured. Where entities are resolved or expanded, the results are file disclosure, SSRF to internal services such as http://169.254.169.254/, or billion laughs denial of service through nested entity expansion.
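For reference, the two payload shapes named above look roughly like the fixtures below. This is a sketch for building test inputs only: the constant and function names are illustrative, nothing is parsed, and the entity names e0, e1, and so on are arbitrary.

```javascript
// Illustrative XXE test fixtures (names are hypothetical; nothing is parsed here).

// External entity: asks the parser to inline a local file into <data>.
const FILE_DISCLOSURE_PAYLOAD = [
  '<?xml version="1.0"?>',
  '<!DOCTYPE data [',
  '  <!ENTITY xxe SYSTEM "file:///etc/passwd">',
  ']>',
  '<data>&xxe;</data>',
].join('\n');

// Nested internal entities ("billion laughs"): each level repeats the
// previous level `fanout` times.
function buildEntityExpansionPayload(levels = 4, fanout = 10) {
  const decls = ['<!ENTITY e0 "lol">'];
  for (let i = 1; i <= levels; i++) {
    const refs = `&e${i - 1};`.repeat(fanout);
    decls.push(`<!ENTITY e${i} "${refs}">`);
  }
  return [
    '<?xml version="1.0"?>',
    '<!DOCTYPE data [',
    ...decls,
    ']>',
    `<data>&e${levels};</data>`,
  ].join('\n');
}

// Length of the text a fully expanding parser would produce for that payload.
function expandedLength(levels = 4, fanout = 10) {
  return 'lol'.length * fanout ** levels;
}
```

The second payload stays small on the wire while its expanded size grows exponentially with the number of levels, which is why a size limit on the raw request alone does not contain it.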

Processing Untrusted XML Without Disabling External Entities

Applications accept XML from user uploads, API POST requests, webhooks, or third-party integrations and never reject external entity declarations before conversion. XML input arrives through Express middleware, such as a text body parser configured for application/xml, or through custom XML parsing middleware, and code passes req.body or uploaded file contents directly to xml2json.toJson() without hardening. Developers treat XML as a simple data format like JSON without understanding XXE risks. When SOAP messages, RSS feeds, SVG files, or configuration uploads are processed this way, attackers can inject malicious entity declarations that the parser may resolve.

Missing XML Content Structure and Pattern Validation

Applications neither validate XML structure nor scan for malicious patterns before passing input to the xml2json parser. There are no pre-parse checks for DOCTYPE declarations, ENTITY keywords, SYSTEM/PUBLIC references, or file:// URLs in XML payloads. There are no size limits to contain DoS through large entity expansions, no validation of root element names, and no check for compliance with an expected XML schema. XML passes directly from external sources to the parser without inspection or filtering, so crafted XXE payloads bypass application logic and reach the parser unchanged.

Explicitly Enabling DOCTYPE Processing in XML Conversion

Applications explicitly enable DOCTYPE and DTD processing in the belief that it is required for proper XML parsing or data conversion. Code may configure a libxmljs parser used alongside xml2json with DTD features turned on for validation purposes, or enable DOCTYPE to support legacy XML formats, entity-based templating, or document types defined in DTDs. Even security-aware teams may enable DOCTYPE on the assumption that input validation or network restrictions provide adequate protection, which leaves exploitable gaps. Parser options such as libxmljs's noent: true turn on entity substitution without any prominent security warning.

Insufficient Security Configuration Across XML Processing Stack

Applications use xml2json alongside other XML libraries (fast-xml-parser, xmldom, sax) with inconsistent security configurations. Some parsers have XXE protections configured while xml2json call sites have none, leaving specific code paths vulnerable. Different routes or microservices handle XML differently, some securely and some not. Library versions are upgraded, but configurations are not revisited to use new protection features. No centralized XML parsing factory enforces a consistent security posture across the application, and ad-hoc xml2json.toJson() calls scattered through the codebase make comprehensive security review difficult.
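One way to start a review of scattered call sites is a small text scan. The helper below is a hypothetical sketch, not part of any library: it only pattern-matches source text, so it will miss aliased imports such as const { toJson } = require('xml2json') and should be treated as a first pass.

```javascript
// Hypothetical audit helper: list the lines of a source file that call
// xml2json.toJson() directly. Purely textual, so aliased imports are missed.
function findToJsonCalls(sourceText) {
  const hits = [];
  sourceText.split(/\r?\n/).forEach((line, index) => {
    if (/\bxml2json\s*\.\s*toJson\s*\(/.test(line)) {
      hits.push({ line: index + 1, text: line.trim() });
    }
  });
  return hits;
}
```

Running it over every file except the shared wrapper module gives a list of call sites that still bypass the centralized configuration.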

Fixes

1. Configure xml2json with Explicit Security Options

xml2json.toJson() accepts output-shaping options such as object, coerce, sanitize, and trim; it does not document a parseOptions object or parser flags such as noent, dtdload, or nonet, so passing them should not be relied on for protection. Make the security configuration explicit in code you control instead: create one wrapper, for example secureXmlToJson(xml), that rejects any input containing a DOCTYPE declaration, enforces a size limit, and only then calls xml2json.toJson() with a fixed option set. Route every call that handles untrusted input through that wrapper, and apply the libxmljs flags (noent: false, dtdload: false, dtdattr: false, dtdvalid: false, nonet: true) at the point where libxmljs itself does the parsing.
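The wrapper idea can be sketched as follows. The factory name createSecureXmlToJson and the injected convert parameter are assumptions made so the example runs without xml2json installed; the DOCTYPE check is a guard added by this sketch, not an xml2json feature.

```javascript
// Sketch of a shared wrapper. `convert` is injected, so the module that owns
// the real converter is the only place that needs to import it.
function createSecureXmlToJson(convert) {
  return function secureXmlToJson(xml) {
    if (typeof xml !== 'string') {
      throw new TypeError('XML input must be a string');
    }
    // Entities can only be declared inside a DOCTYPE, so refusing DOCTYPE
    // refuses external entities and entity-expansion payloads together.
    if (/<!DOCTYPE/i.test(xml)) {
      throw new Error('DOCTYPE declarations are not accepted');
    }
    // Fixed option set: callers cannot loosen the configuration.
    return convert(xml, { object: true });
  };
}
```

In an application this would be wired up once, for example as createSecureXmlToJson((xml, options) => xml2json.toJson(xml, options)), and the resulting function exported for every route to use. The check is deliberately conservative: a document that merely mentions <!DOCTYPE inside a comment or CDATA section is also rejected.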

2. Disable External Entity Processing in libxmljs Parser

When libxmljs parses XML before or alongside xml2json, configure it to leave entities and DTDs alone: libxml.parseXmlString(xmlString, {noent: false, dtdload: false, dtdattr: false, dtdvalid: false, nonet: true}). noent is the critical flag: when it is true the parser substitutes entities, which is what turns an external entity declaration into a file read or a network fetch, so set it to false explicitly rather than relying on the default. Set nonet: true so the parser cannot make network requests to fetch external entities or DTDs. Test the configuration by parsing XML that declares an external entity and verifying that the entity is not resolved or that parsing fails safely.
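The test described above can be sketched as a canary check. The function below is a hypothetical helper that uses only Node's standard library; parseToString is an assumed adapter of the form (xml) => string that wraps whichever parser configuration is under test.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Returns true when the parser leaves an external entity unresolved (or
// rejects the document), false when the canary file's content leaks out.
function ignoresExternalEntities(parseToString) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'xxe-check-'));
  const canaryFile = path.join(dir, 'canary.txt');
  const canary = `canary-${Date.now()}-${Math.random().toString(36).slice(2)}`;
  fs.writeFileSync(canaryFile, canary);

  // The XML names the file but never contains the canary value itself.
  const xml =
    '<?xml version="1.0"?>\n' +
    `<!DOCTYPE data [<!ENTITY xxe SYSTEM "${pathToFileURL(canaryFile).href}">]>\n` +
    '<data>&xxe;</data>';

  try {
    const output = String(parseToString(xml));
    return !output.includes(canary);
  } catch {
    return true; // rejecting the document outright also counts as safe
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

Assuming libxmljs is installed, a call such as ignoresExternalEntities((xml) => libxml.parseXml(xml, { noent: false, nonet: true }).toString()) would exercise the configuration from the paragraph above. Using a freshly written temporary file avoids depending on /etc/passwd existing on the test machine.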

3. Implement Pre-Parse XML Validation and Sanitization

Validate XML before it reaches the xml2json parser. Reject declarations outright with a targeted pattern: if (/<!(DOCTYPE|ENTITY)\b/i.test(xmlString)) throw new Error('XML contains DOCTYPE or ENTITY declarations'). Matching the bare words SYSTEM or PUBLIC would also reject ordinary text that happens to contain them, and entities cannot be declared without a DOCTYPE anyway. Prefer rejecting to stripping: a replacement such as xmlString.replace(/<!DOCTYPE[^>]*>/gi, '') stops at the first > and so leaves part of an internal subset like <!DOCTYPE a [<!ENTITY x "y">]> behind. Enforce size limits that match business requirements (e.g., 1MB max for API requests), and check that the Content-Type header is the expected application/xml or text/xml. Use XML Schema (XSD) validation to enforce the expected structure and reject unexpected elements. Log rejected XML to security monitoring systems for attack detection and pattern analysis.
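The checks above can be combined into a single pre-parse gate. This is a minimal sketch: the function name, the result shape, and the reason strings are assumptions, and the pattern matches declarations only rather than the bare words SYSTEM or PUBLIC so that ordinary text is not rejected.

```javascript
// Pre-parse gate (a sketch; names and the result shape are assumptions).
const MAX_XML_BYTES = 1024 * 1024; // 1 MB, the example limit mentioned above
const XML_CONTENT_TYPES = new Set(['application/xml', 'text/xml']);

function checkXmlRequest(contentTypeHeader, xmlString) {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mediaType = String(contentTypeHeader || '')
    .split(';')[0]
    .trim()
    .toLowerCase();
  if (!XML_CONTENT_TYPES.has(mediaType)) {
    return { ok: false, reason: 'unexpected content type' };
  }
  if (Buffer.byteLength(xmlString, 'utf8') > MAX_XML_BYTES) {
    return { ok: false, reason: 'payload too large' };
  }
  // Reject the declaration itself instead of trying to strip it out.
  if (/<!(DOCTYPE|ENTITY)\b/i.test(xmlString)) {
    return { ok: false, reason: 'DOCTYPE or ENTITY declaration present' };
  }
  return { ok: true };
}
```

A route handler would call this before any parser sees the body, respond with a 4xx status when ok is false, and log the reason for monitoring.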

4. Enforce Strict XML Schema Validation

Define XML Schema (XSD) documents for every expected XML format and validate against them before converting to JSON. With libxmljs or libxmljs2: const schema = libxmljs.parseXmlString(xsdString); const doc = libxmljs.parseXmlString(xmlString, {noent: false}); if (!doc.validate(schema)) reject the request. A schema acts as an allowlist of permitted elements, attributes, and data types, and rejects everything else. Because validation runs on a document that has already been parsed, it is defense in depth that catches unexpected structure before conversion; it does not replace disabling entity processing in the parser. Keep XSD schemas in version control alongside application code. For third-party XML integrations, generate schemas from sample data and enforce them strictly.

5. Migrate to JSON for Modern Data Exchange

Evaluate whether XML is essential or whether JSON can replace it for data exchange and API communication. JSON has no DTDs or entities, is simpler to parse and validate, and is what most modern web services already use. Migrate XML-based APIs to JSON endpoints: replace Content-Type: application/xml with application/json and convert XML request and response formats to JSON equivalents. For external integrations that require XML (legacy SOAP services, RSS feeds), isolate XML processing in an API gateway layer with enhanced security controls, monitoring, and rate limiting. Endpoints that accept only JSON have no XXE attack surface.
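On a migrated endpoint, the Content-Type switch can be enforced so that XML bodies never reach a parser. The function below is a hypothetical sketch that works on a raw header and body; its name and the status-code mapping are assumptions, not part of Express.

```javascript
// Sketch of a JSON-only request gate (name and status mapping are assumptions).
function parseJsonOnly(contentTypeHeader, rawBody) {
  const mediaType = String(contentTypeHeader || '')
    .split(';')[0]
    .trim()
    .toLowerCase();
  if (mediaType !== 'application/json') {
    // 415 Unsupported Media Type: XML bodies are refused before any parsing.
    return { status: 415, error: 'Only application/json is accepted' };
  }
  try {
    return { status: 200, body: JSON.parse(rawBody) };
  } catch {
    return { status: 400, error: 'Invalid JSON' };
  }
}
```

In an Express application the built-in express.json() middleware already parses only JSON bodies by default, so a gate like this mainly matters where raw bodies are handled by hand.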

6. Use Alternative XML Libraries with Secure Defaults

Replace xml2json with an XML parsing library whose entity handling can be controlled explicitly. With fast-xml-parser, disable entity processing: const { XMLParser } = require('fast-xml-parser'); const parser = new XMLParser({ignoreAttributes: false, parseAttributeValue: false, allowBooleanAttributes: true, processEntities: false}); const jsonObj = parser.parse(xmlString). The processEntities: false option prevents entity expansion attacks. For applications that need xml2json specifically, create a secure wrapper module that imports xml2json, applies the security checks, and exports the secured parsing function. Monitor library security advisories and keep dependencies updated. For DOM-style processing, consider xmldom (maintained as @xmldom/xmldom), whose DOMParser does not resolve external entities by default.
