PHP base_convert() Precision Loss Vulnerability

Medium Risk Data Integrity
Tags: php, base-convert, precision-loss, floating-point, data-integrity

What it is

The PHP application uses the base_convert() function with large numbers, which can lead to precision loss due to floating-point limitations. This vulnerability can cause incorrect calculations, authentication bypasses, or data corruption when dealing with cryptographic tokens, user IDs, or financial calculations.

// Vulnerable: base_convert() with large numbers
function convertToken($hexToken) {
    // Dangerous: Precision loss with large hex values
    $decimalToken = base_convert($hexToken, 16, 10);
    
    // This may lose precision for large tokens
    return $decimalToken;
}

// Example with user ID conversion
function generateUserId($sessionId) {
    $hexId = hash('sha256', $sessionId);
    
    // Dangerous: 32 hex digits (128 bits) exceed PHP_INT_MAX, so base_convert()
    // falls back to a float and the low-order digits are lost
    $numericId = base_convert(substr($hexId, 0, 32), 16, 10);
    
    return $numericId;
}

$userId = generateUserId($_SESSION['id']); // May have collisions

// Secure: Using BCMath for precise conversions
function convertTokenSafe($hexToken) {
    // Validate input
    if (!ctype_xdigit($hexToken)) {
        throw new InvalidArgumentException('Invalid hexadecimal token');
    }
    
    // Use BCMath for arbitrary precision
    $decimal = '0';
    $base = '1';
    
    for ($i = strlen($hexToken) - 1; $i >= 0; $i--) {
        // BCMath functions expect string operands, so cast the digit explicitly
        $digit = (string) hexdec($hexToken[$i]);
        $decimal = bcadd($decimal, bcmul($digit, $base));
        $base = bcmul($base, '16');
    }
    
    return $decimal;
}

// Alternative: Use GMP extension
function convertTokenGMP($hexToken) {
    if (!ctype_xdigit($hexToken)) {
        throw new InvalidArgumentException('Invalid hexadecimal token');
    }
    
    // GMP handles arbitrary precision
    return gmp_strval(gmp_init($hexToken, 16), 10);
}

// Secure user ID generation
function generateUserIdSafe($sessionId) {
    $hexId = hash('sha256', $sessionId);
    
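    // Use the first 15 hex chars (60 bits) so the value fits in a 64-bit integer;
    // 16 chars (64 bits) can exceed PHP_INT_MAX, which makes hexdec() return a float
    $safeHex = substr($hexId, 0, 15);
    
    // hexdec() returns an exact int here (assumes a 64-bit PHP build)
    return hexdec($safeHex);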
}

try {
    $userId = generateUserIdSafe($_SESSION['id']);
} catch (Exception $e) {
    error_log('User ID generation failed: ' . $e->getMessage());
    $userId = null;
}

💡 Why This Fix Works

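The fixed versions convert tokens with BCMath or GMP, which operate on arbitrary-precision integers rather than floats, so every digit of the input survives the conversion. Input is validated with ctype_xdigit() first, so malformed tokens are rejected instead of being silently converted. Where a native integer is sufficient, generateUserIdSafe() limits the input to a fixed-length prefix intended to fit in a native integer before calling hexdec().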

Why it happens

base_convert() parses its input into a native integer and silently falls back to a double-precision float once the value exceeds PHP_INT_MAX. A double carries only 53 bits of mantissa, so the low-order digits of larger values are rounded away, and the function returns a plausible-looking but wrong result without any error or warning. The root causes below describe the coding patterns that expose applications to this behavior.

Root causes

Using base_convert() with Numbers Larger than PHP_INT_MAX

PHP applications call base_convert() to convert large numeric strings between bases (for example, hexadecimal to decimal) without considering PHP's integer precision limits. base_convert() supports only bases 2 through 36, and it internally switches to floating-point arithmetic when a number exceeds PHP_INT_MAX (2^63-1 on 64-bit systems, 2^31-1 on 32-bit systems). When converting a value such as base_convert('FFFFFFFFFFFFFFFF', 16, 10) (16 F's, a 64-bit value), PHP promotes the calculation to a double, which has only 53 bits of mantissa precision under IEEE 754. The least significant bits are rounded or lost, and the result is incorrect.

A typical vulnerable pattern converts cryptographic hash output to decimal: $hashValue = base_convert(hash('sha256', $data), 16, 10). SHA-256's 256-bit output far exceeds floating-point precision, so the result looks numeric but has lost most of its significant digits. This creates security issues in comparisons, authentication tokens, and unique identifiers, because different inputs can produce identical rounded output and therefore collide.
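The effect is easy to reproduce. The following sketch, which assumes a 64-bit PHP build, converts two 128-bit hexadecimal values that differ only in their last digit and compares the results with the exact decimal value of 2^128 - 1. The variable names are illustrative.

```php
<?php
// Two distinct 128-bit values that differ only in the final hex digit.
$a = str_repeat('f', 32);        // 2^128 - 1
$b = str_repeat('f', 31) . 'e';  // 2^128 - 2

$decA = base_convert($a, 16, 10);
$decB = base_convert($b, 16, 10);

// Exact decimal value of 2^128 - 1, for reference.
$exactA = '340282366920938463463374607431768211455';

// Both inputs round to the same float, so the outputs collide,
// and neither matches the exact value.
var_dump($decA === $decB);   // bool(true)
var_dump($decA === $exactA); // bool(false)
```

No warning or notice is emitted in either call, which is why the problem tends to surface only when downstream comparisons start matching values that should differ.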

Converting Large Hexadecimal or Base64 Values Without Considering Precision

Developers convert API tokens, session identifiers, cryptographic keys, or database identifiers from hexadecimal to decimal without understanding precision constraints. A common pattern converts UUIDs to numeric IDs: $numericId = base_convert(str_replace('-', '', $uuid), 16, 10), but UUIDs are 128-bit values that cannot be represented accurately in a 64-bit float. Blockchain applications that convert Ethereum addresses (160-bit hexadecimal) to decimal for storage or comparison lose precision in the same way. Values in encodings that base_convert() does not support, such as base64url OAuth tokens or Base58 Bitcoin addresses, are often decoded to hexadecimal first and then passed through base_convert() for database indexing, with the same result.

The issue is particularly insidious because base_convert() raises no error or warning when precision is lost; it silently returns an approximation that looks numerically valid. Developers who test with small values never see the problem, but production data with larger values behaves unexpectedly: authentication fails when tokens do not match expected values, user ID collisions attribute data to the wrong users, and systems that convert large monetary amounts between bases produce financial calculation errors.
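As an illustration of the UUID pattern, the sketch below derives numeric IDs from two different UUIDs. The helper name and the UUID values are made up for the example, and a 64-bit PHP build is assumed.

```php
<?php
// Hypothetical helper mirroring the vulnerable pattern described above.
function uuidToNumericIdUnsafe(string $uuid): string
{
    return base_convert(str_replace('-', '', $uuid), 16, 10);
}

// Two different UUIDs (example values) that differ only in the last digit.
$first  = '123e4567-e89b-12d3-a456-426614174000';
$second = '123e4567-e89b-12d3-a456-426614174001';

$idFirst  = uuidToNumericIdUnsafe($first);
$idSecond = uuidToNumericIdUnsafe($second);

// The UUIDs are distinct, yet the derived numeric IDs are identical.
var_dump($first === $second);     // bool(false)
var_dump($idFirst === $idSecond); // bool(true)
```

Any table keyed on the derived ID would treat these two records as the same row.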

Relying on base_convert() for Cryptographic or Security-Sensitive Operations

Security-sensitive code uses base_convert() in authentication systems, token generation, password reset mechanisms, or cryptographic key derivation, where precision loss creates authentication bypasses or other vulnerabilities. Password reset token generation might use $resetToken = base_convert(bin2hex(random_bytes(16)), 16, 36) to create URL-safe tokens (base_convert() accepts only bases 2 through 36, so the raw bytes must be hex-encoded first). Precision loss means many different random values produce identical tokens, so attackers can guess a valid reset token with much higher probability than intended. Session ID generation using base_convert(bin2hex(random_bytes(32)), 16, 10) to create numeric session identifiers suffers the same loss, making session IDs more predictable and creating collisions where different sessions share the same ID.

HMAC comparisons that convert hexadecimal HMAC output to decimal, such as if (base_convert($computedHmac, 16, 10) == base_convert($receivedHmac, 16, 10)), can validate forged signatures: both values lose precision in the same way, so different HMACs compare as equal. Time-based one-time password (TOTP) implementations that pass full HMAC digests or other large intermediate values through base conversions can likewise fail authentication or accept incorrect codes. The security impact is severe: authentication bypasses through token or ID collisions, and reduced entropy in supposedly random identifiers that makes brute-force attacks feasible. Comparing the converted values with == instead of hash_equals() also forfeits constant-time comparison.
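The HMAC case can be demonstrated with a short sketch. The key and message are placeholders, and the "forged" value is simply the valid HMAC with its last hex digit changed; a 64-bit PHP build is assumed.

```php
<?php
$key = 'example-secret-key'; // placeholder key for illustration only

$validHmac = hash_hmac('sha256', 'amount=100', $key);

// Build a different HMAC string by changing only the last hex digit.
$lastDigit  = substr($validHmac, -1);
$forgedHmac = substr($validHmac, 0, -1) . ($lastDigit === '0' ? '1' : '0');

// Vulnerable: both values are rounded to the same float before formatting.
$looseMatch = base_convert($validHmac, 16, 10) == base_convert($forgedHmac, 16, 10);

// Safe: constant-time comparison of the original hex strings.
$strictMatch = hash_equals($validHmac, $forgedHmac);

var_dump($looseMatch);  // bool(true): the altered HMAC is accepted
var_dump($strictMatch); // bool(false)
```

Only the leading bits of the digest influence the converted value, so an attacker needs to match far fewer bits than the 256 the HMAC nominally provides.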

Missing Validation of Input Size Before Conversion

Applications accept user-controlled or external input for base conversion without validating the numeric magnitude or string length that would cause precision loss. API endpoints receive hexadecimal product IDs, transaction references, or cryptographic signatures and pass them directly to base_convert() without a check such as if (strlen($hexInput) > 13) { /* too large for safe conversion */ }. The threshold depends on the source base: a hexadecimal string of up to 13 characters (52 bits) always fits in the 53-bit mantissa of a double, so 13 is a conservative limit that holds regardless of the platform's integer width, while longer strings may lose precision. Code like $decimal = base_convert($_GET['id'], 16, 10) processes arbitrary-length user input without constraints and returns wrong results when users supply long identifiers. Third-party integration code that converts external identifiers (Stripe payment IDs, AWS resource IDs, primary keys from other databases) assumes they fit within the safe range without checking.

Dynamic conversion that takes the value and both bases from user input, base_convert($_POST['value'], $_POST['from_base'], $_POST['to_base']), is particularly dangerous because it lets attackers deliberately trigger precision loss or exploit edge cases. Proper validation checks both the input length and the numeric range after conversion, but developers typically validate only the format (hexadecimal characters). Without validation, precision loss depends on input size, which causes intermittent bugs that are difficult to diagnose in production.
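A minimal sketch of the missing check might look like the following. The function and constant names are hypothetical, the 13-character limit is the conservative threshold from the text, and a 64-bit PHP build is assumed.

```php
<?php
// Hypothetical validator; the limit follows the 13-character rule above.
const MAX_SAFE_HEX_LENGTH = 13;

function parseSafeHexId(string $input): int
{
    // Format check: hexadecimal characters only (also rejects the empty string).
    if (!ctype_xdigit($input)) {
        throw new InvalidArgumentException('ID must be hexadecimal');
    }

    // Magnitude check: ignore leading zeros, then bound the length.
    $significant = ltrim($input, '0');
    if (strlen($significant) > MAX_SAFE_HEX_LENGTH) {
        throw new OverflowException('ID too large for safe conversion');
    }

    // At most 52 bits, so hexdec() returns an exact integer on 64-bit builds.
    return (int) hexdec($input);
}
```

An API handler could catch both exception types and map them to an HTTP 400 response, so oversized identifiers are rejected explicitly rather than converted approximately.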

Using Floating-Point Arithmetic for Precise Calculations

PHP code combines base_convert() with subsequent floating-point mathematical operations, compounding precision errors beyond what base_convert() alone would cause. Financial applications calculating transaction amounts stored in different bases might use: $amount = base_convert($hexAmount, 16, 10) * $exchangeRate, where both the conversion and multiplication use floating-point, accumulating rounding errors. E-commerce pricing engines converting cryptocurrency amounts (often represented in hexadecimal wei or satoshi units) to decimal prices perform: $price = floatval(base_convert($cryptoAmount, 16, 10)) / 1e18, introducing precision loss at both the conversion and division steps. Percentage calculations on converted values: $tax = (base_convert($value, 16, 10) * 0.075) suffer cumulative floating-point errors. Developers incorrectly assume that once a value is converted to decimal string by base_convert(), it can be safely used in floating-point arithmetic without understanding that base_convert() already returned an imprecise floating-point value internally converted to string. The compounding effect is particularly problematic in iterative calculations where precision loss accumulates: summing thousands of base_convert() results, calculating running averages, or performing compound interest calculations. These patterns appear in blockchain analytics tools processing transaction values, scientific computing applications converting between number bases for calculations, and data migration scripts transforming legacy data formats, where precision errors can compound to produce significantly incorrect results affecting financial integrity, scientific accuracy, or data consistency.
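One way to avoid the float division in the wei-to-ether pattern is to move the decimal point with string operations. The sketch below assumes the amount is already available as an exact decimal string (for example from GMP or BCMath); shiftDecimalPoint() is a hypothetical helper, not a built-in.

```php
<?php
// Hypothetical helper: divide a non-negative decimal string by 10^$places
// using string operations only, so no float is ever involved.
function shiftDecimalPoint(string $digits, int $places): string
{
    if (!ctype_digit($digits) || $places < 1) {
        throw new InvalidArgumentException('Expected a digit string and places >= 1');
    }

    // Left-pad so there is at least one digit before the decimal point.
    $padded   = str_pad($digits, $places + 1, '0', STR_PAD_LEFT);
    $whole    = substr($padded, 0, -$places);
    $fraction = substr($padded, -$places);

    // Normalize: drop leading zeros of the whole part, trailing zeros of the fraction.
    $whole    = ltrim($whole, '0');
    $fraction = rtrim($fraction, '0');

    return ($whole === '' ? '0' : $whole) . ($fraction === '' ? '' : '.' . $fraction);
}

$wei = '1234567890123456789012'; // example amount in wei, as a decimal string

echo shiftDecimalPoint($wei, 18), PHP_EOL; // 1234.567890123456789012
```

Multiplication by rates or percentages would still need BCMath or GMP; the point here is only that scaling by a power of ten never requires a float.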

Fixes

1. Use BCMath or GMP Extensions for Arbitrary Precision Arithmetic

Replace base_convert() with PHP's BCMath or GMP extensions which provide arbitrary precision arithmetic without floating-point limitations. For hexadecimal to decimal conversion with GMP: $decimal = gmp_strval(gmp_init($hexString, 16), 10); handles arbitrarily large values maintaining full precision. GMP supports base conversion with gmp_init($value, $fromBase) to parse input and gmp_strval($gmpNumber, $toBase) to format output, supporting bases 2-62. For BCMath (when GMP unavailable), implement manual base conversion: function hex2dec($hex) { $dec = '0'; $len = strlen($hex); for ($i = 0; $i < $len; $i++) { $dec = bcmul($dec, '16'); $dec = bcadd($dec, hexdec($hex[$i])); } return $dec; }. BCMath provides bcadd(), bcsub(), bcmul(), bcdiv(), bcpow() for arithmetic operations maintaining precision through string-based calculations. Check extension availability with extension_loaded('gmp') or extension_loaded('bcmath') and implement fallback: if (extension_loaded('gmp')) { return gmp_strval(gmp_init($hex, 16), 10); } else { /* BCMath implementation */ }. Set BCMath scale for decimal precision: bcscale(20); before calculations requiring fractional accuracy. For optimal performance, prefer GMP over BCMath as GMP uses native C integer arithmetic while BCMath uses string operations, making GMP significantly faster for large number operations.
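Put together, the GMP-first approach with a BCMath fallback could be sketched as follows. hexToDecimal() is a hypothetical name; the extension functions used (gmp_init, gmp_strval, bcadd, bcmul) are the ones named above.

```php
<?php
// Sketch of the fallback chain described above.
function hexToDecimal(string $hex): string
{
    if (!ctype_xdigit($hex)) {
        throw new InvalidArgumentException('Invalid hexadecimal string');
    }

    if (extension_loaded('gmp')) {
        return gmp_strval(gmp_init($hex, 16), 10);
    }

    if (extension_loaded('bcmath')) {
        $dec = '0';
        foreach (str_split($hex) as $char) {
            // Horner's method: dec = dec * 16 + digit, all on decimal strings.
            $dec = bcadd(bcmul($dec, '16', 0), (string) hexdec($char), 0);
        }
        return $dec;
    }

    throw new RuntimeException('Neither GMP nor BCMath is available');
}
```

Failing loudly when neither extension is present is a deliberate choice in this sketch: silently falling back to base_convert() would reintroduce the precision loss.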

2. Validate Input Size and Range Before Using base_convert()

Implement strict validation so that input values stay within base_convert()'s safe precision range before conversion. For hexadecimal to decimal conversions, limit input length to 13 characters (52 bits) to stay within the IEEE 754 double-precision mantissa: if (strlen($hexValue) > 13) { throw new OverflowException('Value too large for safe conversion'); }. Safe thresholds for other bases follow the same rule: up to 15 digits in base 10, 52 digits in base 2, and 10 characters in base 36. Where GMP is available, compare the magnitude directly instead of relying on length: if (function_exists('gmp_cmp') && gmp_cmp(gmp_init($hexValue, 16), gmp_init(PHP_INT_MAX)) > 0) { /* use GMP instead */ }.

For API inputs, document the maximum accepted value size in the API specification and return HTTP 400 Bad Request with a clear error message when an oversized value is submitted. Make the limit configurable: define('MAX_SAFE_HEX_LENGTH', 13); if (strlen($input) > MAX_SAFE_HEX_LENGTH) { /* reject or use alternative method */ }. Centralize the logic in a helper such as function canSafelyConvert($value, $fromBase, $toBase): bool { return strlen($value) <= getMaxSafeLength($fromBase); }. Log inputs that exceed the safe range to identify patterns that require architectural changes. Validation can also be combined with graceful degradation: use base_convert() for small values, where it is faster, and switch automatically to GMP or BCMath for large ones.
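The per-base thresholds can be computed rather than hard-coded. The sketch below is one possible implementation of the getMaxSafeLength() helper named in the text; it uses a 52-bit bound to match the quoted limits, takes only the source base (the target base does not affect the limit), and assumes a 64-bit PHP build.

```php
<?php
// Largest digit count n such that every n-digit value in $base is below 2^52.
// Assumes a 64-bit PHP build so the integer arithmetic below is exact.
function getMaxSafeLength(int $base): int
{
    if ($base < 2 || $base > 36) {
        throw new InvalidArgumentException('Base must be between 2 and 36');
    }

    $limit  = 2 ** 52;
    $power  = 1;
    $length = 0;
    // Grow $power = $base^$length while the next power still fits under the limit.
    while ($power <= intdiv($limit, $base)) {
        $power *= $base;
        $length++;
    }

    return $length;
}

function canSafelyConvert(string $value, int $fromBase): bool
{
    // Leading zeros do not add magnitude, so they are ignored.
    return strlen(ltrim($value, '0')) <= getMaxSafeLength($fromBase);
}
```

Integer arithmetic is used instead of logarithms so the result cannot be thrown off by floating-point rounding at exact powers such as 16^13 = 2^52.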

3. Implement Custom Conversion Functions for Large Numbers

Create application-specific conversion functions that handle large numbers correctly using GMP/BCMath internally with consistent API. Implement a conversion utility class: class BaseConverter { public static function convert(string $value, int $fromBase, int $toBase): string { if (self::canUseBaseConvert($value, $fromBase)) { return base_convert($value, $fromBase, $toBase); } if (extension_loaded('gmp')) { return gmp_strval(gmp_init($value, $fromBase), $toBase); } return self::bcConvert($value, $fromBase, $toBase); } private static function canUseBaseConvert(string $value, int $fromBase): bool { $maxLengths = [2 => 52, 10 => 15, 16 => 13, 36 => 10]; return strlen($value) <= ($maxLengths[$fromBase] ?? 10); } }. Implement specialized converters for common use cases: class UUIDConverter { public static function toDecimal(string $uuid): string { $hex = str_replace('-', '', $uuid); return gmp_strval(gmp_init($hex, 16), 10); } public static function fromDecimal(string $decimal): string { $hex = str_pad(gmp_strval(gmp_init($decimal, 10), 16), 32, '0', STR_PAD_LEFT); return sprintf('%s-%s-%s-%s-%s', substr($hex, 0, 8), substr($hex, 8, 4), substr($hex, 12, 4), substr($hex, 16, 4), substr($hex, 20)); } }. Document precision guarantees in PHPDoc: /** @param string $value Arbitrary precision number * @return string Exact conversion without precision loss */. Write unit tests verifying large number handling: assertEquals('340282366920938463463374607431768211455', BaseConverter::convert('FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF', 16, 10));.
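Where neither GMP nor BCMath can be relied on, an exact conversion can still be written in plain PHP. The sketch below is a hypothetical extension-free fallback (convertBaseExact() is not part of the BaseConverter class in the text) that performs schoolbook long division on an array of digits; it is slower than GMP but never touches a float.

```php
<?php
// Hypothetical extension-free fallback: exact base conversion (bases 2-36)
// by repeated long division on an array of digit values.
function convertBaseExact(string $value, int $fromBase, int $toBase): string
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyz';
    if ($fromBase < 2 || $fromBase > 36 || $toBase < 2 || $toBase > 36) {
        throw new InvalidArgumentException('Bases must be between 2 and 36');
    }
    if ($value === '') {
        throw new InvalidArgumentException('Value must not be empty');
    }

    // Parse the input into digit values, most significant first.
    $digits = [];
    foreach (str_split(strtolower($value)) as $char) {
        $digit = strpos($alphabet, $char);
        if ($digit === false || $digit >= $fromBase) {
            throw new InvalidArgumentException("Invalid digit '$char'");
        }
        $digits[] = $digit;
    }

    $output = '';
    // Each pass divides the whole number by $toBase; the remainder is the
    // next output digit, least significant first.
    while ($digits !== []) {
        $remainder = 0;
        $quotient  = [];
        foreach ($digits as $digit) {
            $accumulator = $remainder * $fromBase + $digit;
            $q           = intdiv($accumulator, $toBase);
            $remainder   = $accumulator % $toBase;
            if ($quotient !== [] || $q !== 0) {
                $quotient[] = $q; // skip leading zeros of the quotient
            }
        }
        $output = $alphabet[$remainder] . $output;
        $digits = $quotient;
    }

    return $output;
}
```

The intermediate accumulator never exceeds 36 * 36, so the function behaves identically on 32-bit and 64-bit builds.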

4. Use String-Based Operations for Precise Number Handling

Keep numeric values as strings throughout processing pipelines to avoid implicit floating-point conversions. Store large identifiers, tokens, and cryptographic values as VARCHAR/TEXT rather than BIGINT so the database preserves every digit: CREATE TABLE tokens (id VARCHAR(64) PRIMARY KEY, value VARCHAR(128));. Compare converted values as strings, if (hash_equals($expectedToken, $actualToken)), instead of with the loose comparison if ($expectedToken == $actualToken), which compares numeric strings as numbers and can cast them to float. Do arithmetic on strings through GMP rather than intermediate numeric types: function addHex($hex1, $hex2): string { return gmp_strval(gmp_add(gmp_init($hex1, 16), gmp_init($hex2, 16)), 16); }.

For JSON, encode large numbers as strings: json_encode(['id' => (string)$largeId]) keeps the value a string in the output, and json_decode($json, true, 512, JSON_BIGINT_AS_STRING) keeps oversized integers as strings when decoding. Configure the database layer to return large integers as strings: $pdo->setAttribute(PDO::ATTR_STRINGIFY_FETCHES, true); for MySQL BIGINT columns. Value objects can encapsulate string-based number handling and enforce type safety: class LargeInteger { private string $value; public function __construct(string $value) { if (!ctype_digit($value)) throw new InvalidArgumentException(); $this->value = $value; } public function toString(): string { return $this->value; } }. Use BCMath and GMP string outputs directly without casting to int or float: $sum = bcadd($value1, $value2); /* returns string */ preserves precision through chains of calculations.
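The value-object idea can be extended so that JSON serialization and equality checks also stay string-based. The following is a sketch only; the equals() and jsonSerialize() methods are additions for illustration and are not part of the class outlined in the text.

```php
<?php
// Hypothetical value object that keeps a large integer as a string end to end.
final class LargeInteger implements JsonSerializable
{
    private string $value;

    public function __construct(string $value)
    {
        if (!ctype_digit($value)) {
            throw new InvalidArgumentException('Expected a string of decimal digits');
        }
        $this->value = $value;
    }

    public function equals(LargeInteger $other): bool
    {
        // Strict string comparison after dropping leading zeros; no numeric juggling.
        return ltrim($this->value, '0') === ltrim($other->value, '0');
    }

    // Serialize as a JSON string so consumers never parse it as a float.
    public function jsonSerialize(): string
    {
        return $this->value;
    }
}

$id = new LargeInteger('340282366920938463463374607431768211455');
echo json_encode(['id' => $id]), PHP_EOL;
// {"id":"340282366920938463463374607431768211455"}
```

Emitting a JSON string rather than a JSON number matters for clients too, since JavaScript consumers would otherwise parse the value into a double.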

5. Consider Alternative Libraries for Cryptographic Operations

Replace manual base conversion in cryptographic contexts with libraries designed for security-sensitive operations. Generate tokens with random_bytes() and bin2hex() instead of base conversion: $token = bin2hex(random_bytes(32)); produces a 64-character hexadecimal string with full 256-bit entropy and no precision concerns. For UUID generation, use the ramsey/uuid library: $uuid = Uuid::uuid4(); handles encoding and representation without base conversion. Compute HMACs with hash_hmac(), which returns a hexadecimal string directly: $hmac = hash_hmac('sha256', $data, $key); /* 64-char hex string */, and compare with hash_equals($expectedHmac, $computedHmac) so the values stay strings throughout.

For JWT handling, use firebase/php-jwt or lcobucci/jwt, which manage token encoding and decoding internally without exposing base conversion, for example $token = (new Builder())->issuedBy('issuer')->getToken($signer, $key); (the exact builder API depends on the library version). Use phpseclib for cryptographic operations that need big-integer arithmetic: use phpseclib3\Math\BigInteger; $bigInt = new BigInteger($hexValue, 16); $decimal = $bigInt->toString();. For blockchain integrations, use web3.php or a similar library with proper big-number handling: use Web3\Utils; $weiValue = Utils::toWei('1.5', 'ether'); converts cryptocurrency amounts with arbitrary precision. Audit cryptographic dependencies to confirm they use GMP or BCMath internally rather than base_convert() by checking the library source or documentation for large-number support. Well-maintained libraries of this kind help avoid common pitfalls such as precision loss, timing attacks, and insufficient entropy.
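For the parts that need no third-party library, a minimal sketch using only PHP built-ins could look like this. The function names are illustrative.

```php
<?php
// Generate a 256-bit token as 64 hex characters; no base conversion involved.
function generateToken(): string
{
    return bin2hex(random_bytes(32));
}

// Sign and verify with hex strings throughout.
function signPayload(string $payload, string $key): string
{
    return hash_hmac('sha256', $payload, $key);
}

function verifyPayload(string $payload, string $signature, string $key): bool
{
    // hash_equals() compares in constant time and never converts to a number.
    return hash_equals(signPayload($payload, $key), $signature);
}
```

Because the token and the signature remain hexadecimal strings from creation to comparison, there is no point in the flow where a float could be introduced.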

6. Test Conversion Accuracy with Boundary Values

Implement test suites that verify conversion accuracy at precision boundaries so precision loss is detected before production deployment. Check round-trip accuracy in unit tests: $original = 'FFFFFFFFFFFFFFFF'; $decimal = convert($original, 16, 10); $back = convert($decimal, 10, 16); assertEquals($original, strtoupper($back)); fails when precision is lost. Test values around the floating-point limit: 0x1FFFFFFFFFFFFF (2^53 - 1, the largest value with every bit inside the mantissa), 0x20000000000001 (2^53 + 1, the first integer a double cannot represent), and 0xFFFFFFFFFFFFFFFF (64 bits, beyond PHP_INT_MAX). Generate test cases programmatically, using values with a low bit set because exact powers of two survive float rounding: for ($bits = 50; $bits <= 70; $bits++) { $value = gmp_strval(gmp_add(gmp_pow(2, $bits), 1), 16); /* test conversion */ } identifies the exact breakpoint on the platform under test. Compare base_convert() against GMP: assertEquals(gmp_strval(gmp_init($hex, 16), 10), base_convert($hex, 16, 10), 'Precision loss detected');.

Property-based testing (for example with php-quickcheck) can generate random large numbers; in pseudocode: forAll(Generator::largeHex(), function($hex) { /* verify conversion accuracy */ });. Add integration tests that use real production samples, such as actual UUIDs, API tokens, and transaction IDs from production logs, so conversions are exercised with realistic inputs. Cover edge cases: zero, maximum safe values, values just beyond the safe range, and extremely large values (256-bit, 512-bit). In production, log inputs that exceed safe conversion ranges, compare GMP and base_convert() results (using GMP for the actual operation and base_convert() for verification), and alert on discrepancies. Run the accuracy tests in CI/CD pipelines and fail the build when precision loss is detected.
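A round-trip probe needs no test framework or extension. The sketch below uses a hypothetical helper, roundTripsExactly(), to report whether all-ones values of various widths survive base_convert() on the current platform.

```php
<?php
// Returns true when a hex string survives a hex -> decimal -> hex round trip
// through base_convert() unchanged.
function roundTripsExactly(string $hex): bool
{
    $normalized = ltrim(strtolower($hex), '0');
    if ($normalized === '') {
        $normalized = '0';
    }
    $decimal = base_convert($normalized, 16, 10);
    return base_convert($decimal, 10, 16) === $normalized;
}

// Probe all-ones values of increasing width to see where exactness ends
// on the platform running the test.
foreach ([48, 52, 64, 128, 256] as $bits) {
    $hex = str_repeat('f', intdiv($bits, 4));
    printf("%3d bits: %s\n", $bits, roundTripsExactly($hex) ? 'exact' : 'precision lost');
}
```

All-ones values are a convenient probe because every bit is significant, so any rounding changes the result and the round trip fails.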
