杰瑞科技汇

How to Handle Floats in JSON with Python

Working with JSON floats in Python is a common task, but it has some important nuances, especially regarding precision. This guide covers everything from basic parsing to precise handling with decimal.Decimal.

The Basic Case: Parsing JSON with Floats

When you parse a JSON string into a Python object, each JSON number is automatically converted to a Python float or int, depending on how the number is written.

JSON Standard: The JSON standard does not have a distinct "float" or "int" type. It has a single number type. The distinction between integer and floating-point is made by the presence or absence of a decimal point and/or an exponent.

Python's Behavior: Python's json module decides from the way the number is written:

  • If the JSON number has a decimal point or an exponent (e.g., 3.14, 1.0, 1e2), it becomes a Python float.
  • If the JSON number is a simple integer without a decimal point (e.g., 42, -10), it becomes a Python int.

Example: Basic Parsing

import json
# Note: JSON itself does not allow comments, so the string contains none
json_string = '''
{
  "pi": 3.14159,
  "gravity": 9.8,
  "population": 8024374,
  "scientific_notation": 6.022e23
}
'''
# Parse the JSON string into a Python dictionary
data = json.loads(json_string)
# Check the types of the parsed values
print(f"Type of 'pi': {type(data['pi'])}")
print(f"Value of 'pi': {data['pi']}")
print(f"Type of 'gravity': {type(data['gravity'])}")
print(f"Value of 'gravity': {data['gravity']}")
print(f"Type of 'population': {type(data['population'])}") # This will be an int
print(f"Value of 'population': {data['population']}")
print(f"Type of 'scientific_notation': {type(data['scientific_notation'])}")
print(f"Value of 'scientific_notation': {data['scientific_notation']}")

Output:

Type of 'pi': <class 'float'>
Value of 'pi': 3.14159
Type of 'gravity': <class 'float'>
Value of 'gravity': 9.8
Type of 'population': <class 'int'>
Value of 'population': 8024374
Type of 'scientific_notation': <class 'float'>
Value of 'scientific_notation': 6.022e+23

The Precision Problem: float vs. decimal.Decimal

This is the most critical point to understand about JSON floats in Python.

The Issue: Python's float type is implemented using IEEE 754 double-precision binary floating-point. This means it has limited precision and cannot represent some decimal numbers exactly.

When you parse a JSON float like 1.1, Python doesn't store the exact value of one and one-tenth. It stores the closest value representable in binary, which is slightly more than 1.1. This can lead to unexpected results.

Example: The Precision Problem

import json
from decimal import Decimal
# A JSON number that looks simple but can't be represented exactly as a binary float
json_string_with_float = '{"price": 123.45}'
data = json.loads(json_string_with_float)
price_float = data['price']
print(f"The parsed float is: {price_float}")
print(f"Is it exactly 123.45? {price_float == 123.45}")
# Converting the float to Decimal shows the exact value that is stored
print(f"Exact stored value: {Decimal(price_float)}")

Output:

The parsed float is: 123.45
Is it exactly 123.45? True
Exact stored value: 123.4500000000000028421709430404007434844970703125

The comparison is True only because the literal 123.45 in the source code is rounded to the same binary value. print() and repr() show the shortest decimal string that round-trips to that value, so they hide the discrepancy; Decimal(price_float) reveals it. This tiny error can accumulate and cause problems in financial, scientific, or any application where exact decimal precision is required.
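To see how such errors accumulate, the following minimal sketch parses ten copies of 0.1 from a JSON array and adds them up. The payload and variable names are made up for illustration.

```python
import json

# Hypothetical payload: ten amounts of 0.1 each
payload = "[" + ", ".join(["0.1"] * 10) + "]"
amounts = json.loads(payload)

# A plain loop is used on purpose: the built-in sum() in recent Python
# versions compensates for some of the rounding error.
total = 0.0
for amount in amounts:
    total += amount

print(total)         # 0.9999999999999999
print(total == 1.0)  # False
```

Each 0.1 is stored with a small binary rounding error, and ten additions are enough for the result to miss 1.0.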


Solution: Using parse_float for decimal.Decimal

The json.loads() function accepts an argument called parse_float. It lets you specify a callable that receives the text of each JSON float and is used instead of the default float constructor. This is the perfect place to use Python's decimal module for precise decimal arithmetic.

The decimal module provides a Decimal type that stores numbers in base 10 rather than as binary floats, so a value such as 123.45 is kept exactly as written.
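A short sketch of the difference, assuming the Decimal values are built from strings. A Decimal built from a float copies the float's binary rounding error, which is why parse_float passes the original JSON text to the callable.

```python
from decimal import Decimal

# Binary floats: 0.1 + 0.2 is not exactly 0.3
float_sum_matches = (0.1 + 0.2 == 0.3)
print(float_sum_matches)      # False

# Decimals built from strings keep the written value exactly
decimal_sum_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
print(decimal_sum_matches)    # True

# A Decimal built from a float inherits the float's rounding error
from_float_matches = (Decimal(0.1) == Decimal("0.1"))
print(from_float_matches)     # False
```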

Example: Parsing with decimal.Decimal

import json
from decimal import Decimal
json_string_with_float = '{"price": 123.45, "weight": 5.5}'
data_string = '{"pi": 3.14159, "e": 2.71828}'
# Define the custom parser function
def parse_decimal(json_float_str):
    """Converts a JSON float string to a Python Decimal."""
    return Decimal(json_float_str)
# Use the parse_float argument
data_precise = json.loads(json_string_with_float, parse_float=parse_decimal)
data_pi = json.loads(data_string, parse_float=parse_decimal)
print("--- Precise Data ---")
print(f"Type of 'price': {type(data_precise['price'])}")
print(f"Value of 'price': {data_precise['price']}") # Exact value
print(f"Is it exactly 123.45? {data_precise['price'] == Decimal('123.45')}")
print(f"Type of 'weight': {type(data_precise['weight'])}")
print(f"Value of 'weight': {data_precise['weight']}")
print("--- Pi and e Data ---")
print(f"Type of 'pi': {type(data_pi['pi'])}")
print(f"Value of 'pi': {data_pi['pi']}")

Output:

--- Precise Data ---
Type of 'price': <class 'decimal.Decimal'>
Value of 'price': 123.45
Is it exactly 123.45? True
Type of 'weight': <class 'decimal.Decimal'>
Value of 'weight': 5.5
--- Pi and e Data ---
Type of 'pi': <class 'decimal.Decimal'>
Value of 'pi': 3.14159

Notice that integer values in the JSON (for example, "count": 10) are still parsed as Python ints, which is correct and efficient. parse_float is only called for numbers written with a decimal point or an exponent.
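As a sketch of how the two types mix, the hypothetical order payload below contains an integer quantity and a float price. The field names are invented for illustration. If every number should become a Decimal, json.loads() also accepts a parse_int argument.

```python
import json
from decimal import Decimal

order_json = '{"quantity": 3, "unit_price": 19.99}'  # hypothetical payload

# Only the float is routed through Decimal; the integer stays an int
order = json.loads(order_json, parse_float=Decimal)
print(type(order["quantity"]).__name__)    # int
print(type(order["unit_price"]).__name__)  # Decimal

# int and Decimal can be multiplied directly, and the result is exact
total = order["quantity"] * order["unit_price"]
print(total)  # 59.97

# Optional: parse integers as Decimal too
all_decimal = json.loads(order_json, parse_float=Decimal, parse_int=Decimal)
print(type(all_decimal["quantity"]).__name__)  # Decimal
```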


Serialization: Python float to JSON

When you serialize Python objects back to a JSON string using json.dumps(), Python's float objects are automatically converted to JSON numbers. Each float is written using its repr(), the shortest decimal string that converts back to exactly the same float.

Example: Basic Serialization

import json
data = {
    "measurement": 98.6,
    "score": 95.9999999999999, # A float with many decimal places
    "id": 123
}
# Basic dumps
json_output = json.dumps(data)
print(f"Default JSON output: {json_output}")

Output:

Default JSON output: {"measurement": 98.6, "score": 95.9999999999999, "id": 123}

Notice that the score is not rounded to 96.0. json.dumps() writes each float using repr(), the shortest string that converts back to exactly the same float, so every digit needed to round-trip the value is kept.

Example: Pretty-Printing with indent

The indent argument makes the output human-readable by pretty-printing it, but it doesn't change the default number formatting.

# Using indent for pretty printing
pretty_json_output = json.dumps(data, indent=2)
print(f"Pretty-printed JSON:\n{pretty_json_output}")

Output:

Pretty-printed JSON:
{
  "measurement": 98.6,
  "score": 95.9999999999999,
  "id": 123
}

Example: Controlling Precision

json.dumps() has no option for limiting the number of decimal places. If the numbers end up in a template, a filter such as Django's floatformat can format them at display time; otherwise, you can preprocess your data, for example by rounding the floats before dumping.

Formatting the floats as strings before dumping also works, but it changes the JSON type from number to string, which is often not what you want. When the values are Decimal objects, a custom serializer is the better fit.
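One way to preprocess the data is to round every float before dumping. The round_floats helper below is a hypothetical sketch, not part of the json module; it walks dicts and lists recursively and leaves ints and other values untouched.

```python
import json

def round_floats(value, ndigits=2):
    """Return a copy of value with every float rounded to ndigits places."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {key: round_floats(item, ndigits) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [round_floats(item, ndigits) for item in value]
    return value  # ints, strings, None, etc. pass through unchanged

data = {
    "measurement": 98.6,
    "score": 95.9999999999999,
    "readings": [1.23456, 2.0],
    "id": 123,
}
rounded = round_floats(data)
print(json.dumps(rounded))
# {"measurement": 98.6, "score": 96.0, "readings": [1.23, 2.0], "id": 123}
```

The rounded values are still binary floats, so this controls how many digits appear in the output but does not make the values exact.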

Using a Custom Serializer for decimal.Decimal:

If you used Decimal for parsing, you'll need to serialize it back. The json library can't do this by default, so you provide a default function.

import json
from decimal import Decimal
data = {
    "price": Decimal("123.45"),
    "weight": Decimal("5.5")
}
def serialize_decimal(obj):
    """Custom JSON serializer for Decimal objects."""
    if isinstance(obj, Decimal):
        return float(obj)  # Convert back to float for JSON
    # Raise a TypeError for other unserializable types
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")
json_output = json.dumps(data, default=serialize_decimal)
print(f"Serialized Decimal data: {json_output}")

Output:

Serialized Decimal data: {"price": 123.45, "weight": 5.5}

This works, but it brings you back to the original float precision issue, because the Decimal is converted to a binary float before it is written. If you need to maintain precision in the JSON output itself (e.g., for a config file), you might serialize the Decimal as a string and convert it back to a decimal type on the receiving side.
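A minimal sketch of the string-based approach, assuming the receiving side knows which fields hold decimals. The decimal_as_str function and the record fields are illustrative names only.

```python
import json
from decimal import Decimal

def decimal_as_str(obj):
    """Serialize Decimal objects as JSON strings to keep every digit."""
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")

record = {"price": Decimal("123.45"), "weight": Decimal("5.50")}
encoded = json.dumps(record, default=decimal_as_str)
print(encoded)  # {"price": "123.45", "weight": "5.50"}

# The receiver converts the known decimal fields back explicitly
decoded = json.loads(encoded)
restored = {key: Decimal(value) for key, value in decoded.items()}
print(restored == record)  # True
```

The trade-off is that the JSON now carries strings rather than numbers, so consumers must agree on which fields to convert.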


Summary and Best Practices

| Task | The "Easy" Way | The "Precise" Way (Recommended for finance/science) |
| --- | --- | --- |
| Parsing JSON to Python | data = json.loads(json_string) | data = json.loads(json_string, parse_float=Decimal) |
| Resulting Type | float or int | decimal.Decimal or int |
| Pros | Simple, fast, works for most general use cases. | Exact decimal precision. No floating-point errors. |
| Cons | Limited precision. Can lead to rounding errors. | Slightly slower, requires from decimal import Decimal. |
| Serializing Python to JSON | json_string = json.dumps(python_dict) | json_string = json.dumps(python_dict, default=serialize_func) |
| Handling Decimal on Serialize | N/A | Provide a default function that converts Decimal to float or str. |

Recommendation:

  • For most web APIs and general data interchange: The default float behavior is usually fine. The precision errors are often negligible.
  • For financial, accounting, or scientific applications: Always use parse_float=Decimal. The cost of potential floating-point errors is far greater than the minor performance overhead of using decimal.Decimal.