How Does Python's defaultdict Automatically Initialize Default Values?

This article takes a close look at collections.defaultdict in Python.


What is a `defaultdict`?

A defaultdict is a subclass of Python's built-in dict. It behaves almost exactly like a regular dictionary, with one key difference: looking up a missing key with square brackets (d[key]) does not raise a KeyError, as long as a default factory was supplied.

Instead, it creates a default value for that key the first time it's accessed and stores it in the dictionary. This is useful for avoiding repetitive if key in dict: or try...except KeyError: blocks.

The Problem: Why Do We Need `defaultdict`?

Imagine you want to count the frequency of each word in a sentence. With a standard dictionary, you might write code like this:

# The "manual" way with a regular dict
text = "the quick brown fox jumps over the lazy dog"
word_counts = {}
for word in text.split():
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1
print(word_counts)
# Output: {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1}

This works, but it's a bit clunky. You have to check for the key's existence every time. A try...except block is another common but verbose pattern.
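
For comparison, the try...except variant mentioned above might look like the following sketch. It reuses the same sentence and variable names as the previous snippet and is only meant to show how much scaffolding the pattern needs.

```python
text = "the quick brown fox jumps over the lazy dog"
word_counts = {}
for word in text.split():
    try:
        word_counts[word] += 1
    except KeyError:
        # First time we see this word: start its count at 1
        word_counts[word] = 1
print(word_counts["the"])  # 2
```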


The Solution: Using `defaultdict`

Now, let's solve the same problem using defaultdict. The magic happens when you initialize it. You provide a "factory function" that will be called to create a default value whenever a new key is accessed.

The most common factory is list, which creates an empty list []. Another is int, which creates the integer 0.
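
To see why these types work as factories, it may help to call them directly with no arguments. The short sketch below also shows a lambda used as a factory; the `scores` dictionary and its starting value of 100 are made up for illustration.

```python
from collections import defaultdict

# Calling a built-in type with no arguments yields its "empty" value.
print(int())   # 0
print(list())  # []
print(set())   # set()

# Any zero-argument callable works as a factory, including a lambda.
scores = defaultdict(lambda: 100)  # hypothetical starting score
print(scores["alice"])  # 100
```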

Example 1: Counting Words (using `int`)

We can initialize our defaultdict with int. When a new key is accessed, int() is called, which returns 0.

from collections import defaultdict
text = "the quick brown fox jumps over the lazy dog"
# Initialize defaultdict with int, which returns 0 for new keys
word_counts = defaultdict(int)
for word in text.split():
    # If 'word' is not in word_counts, it's automatically added with a value of 0.
    # Then, we add 1 to it. No 'if' or 'try' needed!
    word_counts[word] += 1
print(word_counts)
# Output: defaultdict(<class 'int'>, {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1})
# You can access it just like a regular dict
print(word_counts['the'])
# Output: 2
# Accessing a non-existent key won't raise an error
print(word_counts['non_existent_word'])
# Output: 0

This is much cleaner and more readable!


Example 2: Grouping Items (using `list`)

A very common use case is grouping items from a list into categories. Let's say we have a list of pets and we want to group them by their species.

from collections import defaultdict
pets = [
    {'name': 'Fido', 'species': 'dog'},
    {'name': 'Whiskers', 'species': 'cat'},
    {'name': 'Rex', 'species': 'dog'},
    {'name': 'Garfield', 'species': 'cat'},
    {'name': 'Goldie', 'species': 'fish'}
]
# Initialize defaultdict with list, which returns [] for new keys
pets_by_species = defaultdict(list)
for pet in pets:
    # Append the pet's name to the list for its species.
    # If the species key doesn't exist, an empty list is created first.
    pets_by_species[pet['species']].append(pet['name'])
print(pets_by_species)
# Output:
# defaultdict(<class 'list'>,
#             {'dog': ['Fido', 'Rex'],
#              'cat': ['Whiskers', 'Garfield'],
#              'fish': ['Goldie']})

How Does It Work? The `default_factory`

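The default_factory is the core of defaultdict. It's stored as an attribute and must be a callable that takes no arguments, or None. When it is None, a missing key raises KeyError just as in a regular dict.

Initialization: defaultdict(list) sets default_factory to the list type, which is later called as list().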
Access: When you do my_dict['new_key'], defaultdict checks if 'new_key' exists.
- If it exists, it returns the associated value.
- If it does not exist, it calls default_factory() (e.g., list()), assigns the result (an empty list []) to 'new_key', and then returns that new value.
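
The steps above can be approximated with a small dict subclass. This is a simplified sketch, not the real implementation: `MiniDefaultDict` is a hypothetical name, and it relies on the `__missing__` hook that dict's subscript lookup calls when a key is absent.

```python
class MiniDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict's subscript lookup calls this hook only when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value  # store it so later lookups find the key
        return value


groups = MiniDefaultDict(list)
groups["new_key"].append("item")
print(groups)  # {'new_key': ['item']}
```

Because the value is stored before it is returned, the second access to the same key never reaches `__missing__`.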

`defaultdict` vs. `dict.setdefault()`

You might be familiar with the dict.setdefault() method, which also provides a way to handle missing keys. Let's compare it to our first example.

Using `setdefault()`

text = "the quick brown fox jumps over the lazy dog"
word_counts = {}
for word in text.split():
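    # setdefault returns the value for the key if it exists.
    # If the key doesn't exist, it sets the key to the default value
    # and returns that default value. A method call cannot be the target
    # of +=, so the incremented value is assigned back explicitly.
    word_counts[word] = word_counts.setdefault(word, 0) + 1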
print(word_counts)
# Output: {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1}

Comparison:

Feature	`defaultdict`	`dict.setdefault()`
Readability	Excellent. The logic is clear: `counts[word] += 1`.	Good, but slightly more verbose. The `setdefault` call is mixed with the update logic.
Performance	The factory is called only when a key is actually missing.	The default argument is evaluated on every call, even when the key already exists, which can be wasteful for defaults such as `[]`.
Best For	When you are frequently accessing and mutating keys that may not exist.	When you need to set a default value only once for a specific key and then perform other operations.

Conclusion: For the common use cases of counting and grouping, defaultdict is generally preferred for its cleaner syntax and because it builds a default value only when a key is missing.
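
One concrete difference between the two approaches is when the default value gets built. The sketch below counts how often a default is constructed in each case; `make_list` and the `calls` counter are hypothetical helpers introduced only for this measurement.

```python
from collections import defaultdict

calls = {"factory": 0}

def make_list():
    # Hypothetical factory that records how often it runs
    calls["factory"] += 1
    return []

words = ["a", "b", "a", "a"]

# setdefault: the default argument is built on every iteration
plain = {}
for w in words:
    plain.setdefault(w, make_list()).append(w)
setdefault_calls = calls["factory"]   # 4: once per word

# defaultdict: the factory runs only for keys that are missing
calls["factory"] = 0
dd = defaultdict(make_list)
for w in words:
    dd[w].append(w)
defaultdict_calls = calls["factory"]  # 2: once per distinct key

print(setdefault_calls, defaultdict_calls)  # 4 2
```

Both dictionaries end up with the same contents; only the number of throwaway default objects differs.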

Common Factory Functions

Here are the most common factory functions you'll use with defaultdict:

Factory	Default Value	Use Case
`int`	`0`	Counting, summing, or accumulating numerical values.
`list`	`[]` (empty list)	Grouping items into lists.
`set`	`set()` (empty set)	Grouping unique items.
`dict`	`{}` (empty dict)	Creating nested dictionaries.
`lambda: None`	`None`	Any situation where a simple `None` placeholder is sufficient. Note that the factory takes no arguments.
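
As an illustration of the nested-dictionary use case, here is a minimal sketch. The `inventory` and `sales` names and their contents are invented for the example.

```python
from collections import defaultdict

# defaultdict(dict): one level of automatic nesting
inventory = defaultdict(dict)
inventory["fruit"]["apple"] = 3   # inventory["fruit"] is created as {}
inventory["fruit"]["pear"] = 5

# A lambda lets the inner level be a defaultdict too
sales = defaultdict(lambda: defaultdict(int))
sales["2024"]["apple"] += 2       # both levels are created on demand
sales["2024"]["apple"] += 1

print(dict(inventory))            # {'fruit': {'apple': 3, 'pear': 5}}
print(sales["2024"]["apple"])     # 3
```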

Example: Grouping Unique Items with `set`

from collections import defaultdict
data = ['a', 'b', 'a', 'c', 'b', 'a', 'd']
# Use set to automatically collect unique items
unique_items = defaultdict(set)
for item in data:
    unique_items[item].add(item)  # .add() ignores values already in the set
print(unique_items)
# Output: defaultdict(<class 'set'>, {'a': {'a'}, 'b': {'b'}, 'c': {'c'}, 'd': {'d'}})
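
A grouping where the key and the collected value differ may make the benefit of set clearer. The following sketch assumes a made-up visit log of (user, page) pairs in which one visit is repeated.

```python
from collections import defaultdict

# Hypothetical (user, page) visit log with repeated visits
visits = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/docs"),
    ("alice", "/home"),   # duplicate, ignored by the set
]

pages_by_user = defaultdict(set)
for user, page in visits:
    pages_by_user[user].add(page)

print(sorted(pages_by_user["alice"]))  # ['/docs', '/home']
print(len(pages_by_user["bob"]))       # 1
```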

Important Caveats

The default_factory is only for missing keys. It is not called if the key exists and its value is None, 0, [], or any other "falsy" value.

dd = defaultdict(list)
dd['existing_key'] = [] # The key exists
dd['existing_key'].append('item') # This works fine
print(dd)
# Output: defaultdict(<class 'list'>, {'existing_key': ['item']})

defaultdict can still have missing keys. Just because it creates a value on access doesn't mean all keys are pre-populated. It only creates the value when you try to access it.

dd = defaultdict(int)
# The key 'missing' does not exist in the dictionary yet.
print('missing' in dd) # False
print(dd['missing']) # This access creates the key with value 0
print('missing' in dd) # Now it's True
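
It can also be useful to know which operations do not create keys. The sketch below shows that .get() and the in operator leave the dictionary untouched, and that setting default_factory to None brings back the usual KeyError.

```python
from collections import defaultdict

dd = defaultdict(int)

# These lookups do not use subscript access, so no key is created
print(dd.get("x"))        # None
print("x" in dd)          # False
print(len(dd))            # 0

# Only subscript access triggers the factory
dd["x"]
print(len(dd))            # 1

# Setting default_factory to None restores the usual KeyError
dd.default_factory = None
try:
    dd["y"]
except KeyError:
    print("KeyError for 'y'")
```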

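Be careful with mutable defaults. defaultdict(list) itself is safe, because the factory builds a new list for every missing key. The classic Python pitfall appears when a mutable object, including a defaultdict, is used as a default argument of a function: the default is created once, when the function is defined, and is then shared across calls.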
```
from collections import defaultdict

# DANGEROUS: Using a mutable default argument in a function
def add_to_list(key, item, my_dict=defaultdict(list)):
    my_dict[key].append(item)
    return my_dict
d1 = add_to_list('a', 1)
d2 = add_to_list('b', 2)
# The default dictionary is shared between calls!
print(d2) # You might expect {'b': [2]}
# Output: defaultdict(<class 'list'>, {'a': [1], 'b': [2]})
```
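A similar problem arises if you initialize a defaultdict with a lambda that returns the same pre-built mutable object: every missing key then shares that one object. This is rarely what you want for grouping, so pass a factory such as list that builds a fresh object on each call.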
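
A small sketch of what "the same mutable object" means in practice. The `shared`, `bad`, and `good` names are illustrative only.

```python
from collections import defaultdict

shared = []  # one list object, created once

# Risky: every missing key receives the very same list
bad = defaultdict(lambda: shared)
bad["a"].append(1)
bad["b"].append(2)
print(bad["a"])               # [1, 2]
print(bad["a"] is bad["b"])   # True

# Usual choice: the factory builds a fresh list for each key
good = defaultdict(list)
good["a"].append(1)
good["b"].append(2)
print(good["a"])              # [1]
```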

Summary

What it is: A dict subclass that provides a default value for missing keys.
Why use it: To write cleaner, more readable code when dealing with keys that may not exist, especially for counting and grouping, without building a default value on every access.
How it works: You initialize it with a default_factory callable (like int, list, or set). It is called with no arguments to create a default value when a missing key is first accessed with square brackets.
Key Takeaway: If you find yourself repeatedly writing if key in my_dict: or my_dict.setdefault(key, default) in a loop, consider using a defaultdict instead.
