Contents | Previous (2.4 Sequences) | [Next (2.6 List Comprehensions)]()

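2.5 The collections Module

The collections module provides a number of useful objects for data handling. This section briefly introduces some of them.

Example: Counting Things

Suppose you want to tabulate the total shares of each stock.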

portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]

This list has two IBM entries and two GOOG entries. Their shares need to be combined somehow.

Counters

Solution: use a Counter.

from collections import Counter
total_shares = Counter()
for name, shares, price in portfolio:
    total_shares[name] += shares

total_shares['IBM']     # 150
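
To see what Counter saves, here is a minimal sketch of the same tabulation done with a plain dict. The name `totals` is introduced only for this illustration. A plain dict raises KeyError for a name it has not seen, so the loop has to supply the starting value itself; a Counter treats a missing key as 0.

```python
from collections import Counter

portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15),
]

# Plain dict: supply a starting value of 0 for names not seen yet.
totals = {}
for name, shares, price in portfolio:
    totals[name] = totals.get(name, 0) + shares

# Counter: missing keys count as 0, so += works directly.
total_shares = Counter()
for name, shares, price in portfolio:
    total_shares[name] += shares

print(totals == dict(total_shares))   # True
print(total_shares['MSFT'])           # 0 (no KeyError, and no entry is created)
```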

Example: One-Many Mappings

Problem: you want to map a key to multiple values.

portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]

As in the previous example, the key IBM should end up with two different tuples.

Solution: use a defaultdict.

from collections import defaultdict
holdings = defaultdict(list)
for name, shares, price in portfolio:
    holdings[name].append((shares, price))
holdings['IBM'] # [ (50, 91.1), (100, 45.23) ]

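A defaultdict ensures that every time you access a key you get a default value. Here, a key that is not yet present starts out as an empty list.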
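
The default-value behavior can be checked directly. The short sketch below is only an illustration: it shows that reading a missing key with square brackets creates the entry, while a membership test or `.get()` does not.

```python
from collections import defaultdict

holdings = defaultdict(list)

# Reading a missing key calls list() and stores the result under that key.
print(holdings['IBM'])      # []
print('IBM' in holdings)    # True

# Membership tests and .get() do not create entries.
print('AA' in holdings)     # False
print(holdings.get('AA'))   # None
print(len(holdings))        # 1
```

One consequence is that merely looking up a key with `holdings[name]` adds it to the dictionary, so use `in` or `.get()` when you only want to check.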

Example: Keeping a History

Problem: we want a history of the last N things.

Solution: use a deque.

from collections import deque

history = deque(maxlen=N)
with open(filename) as f:
    for line in f:
        history.append(line)
        ...
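
The fragment above leaves `N` and `filename` undefined. A self-contained sketch of the same idea follows; the list `lines` is a hypothetical stand-in for the lines of a file, and `maxlen=3` is an arbitrary choice for the demonstration.

```python
from collections import deque

# Hypothetical input standing in for the lines of a file.
lines = ['line 1', 'line 2', 'line 3', 'line 4', 'line 5']

history = deque(maxlen=3)   # keep only the last 3 items
for line in lines:
    history.append(line)    # once full, the oldest item is discarded

print(list(history))        # ['line 3', 'line 4', 'line 5']
```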

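Exercises

The collections module might be one of the most useful library modules for special-purpose data handling problems such as tabulating and indexing.

In this exercise, we'll look at a few simple examples. Start by running your report.py program so that you have the portfolio of stocks loaded in interactive mode.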

bash % python3 -i report.py

Exercise 2.18: Tabulating with Counters

Suppose you want to tabulate the total number of shares of each stock. This is easy with a Counter object. Try it:

>>> portfolio = read_portfolio('Data/portfolio.csv')
>>> from collections import Counter
>>> holdings = Counter()
>>> for s in portfolio:
        holdings[s['name']] += s['shares']

>>> holdings
Counter({'MSFT': 250, 'IBM': 150, 'CAT': 150, 'AA': 100, 'GE': 95})
>>>

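Carefully observe how the multiple entries for MSFT and IBM in portfolio are combined into a single entry.

You can use a Counter just like a dictionary to retrieve individual values: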

>>> holdings['IBM']
150
>>> holdings['MSFT']
250
>>>

If you want to rank the values, do this:

>>> # Get three most held stocks
>>> holdings.most_common(3)
[('MSFT', 250), ('IBM', 150), ('CAT', 150)]
>>>
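
As a rough sketch of what `most_common()` does, the ranking can be reproduced by sorting the items by count. The Counter below is built from the output shown earlier rather than from `Data/portfolio.csv`, and the name `ranked` is introduced only for this illustration.

```python
from collections import Counter

# Counts copied from the output shown in the exercise.
holdings = Counter({'MSFT': 250, 'IBM': 150, 'CAT': 150, 'AA': 100, 'GE': 95})

# With no argument, most_common() returns every entry, largest count first.
print(holdings.most_common())

# The same ranking written out by hand: sort the items by count, descending.
ranked = sorted(holdings.items(), key=lambda item: item[1], reverse=True)
print(ranked == holdings.most_common())   # True
print(ranked[0])                          # ('MSFT', 250)
```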

Let's grab another portfolio of stocks and make a new Counter:

>>> portfolio2 = read_portfolio('Data/portfolio2.csv')
>>> holdings2 = Counter()
>>> for s in portfolio2:
        holdings2[s['name']] += s['shares']

>>> holdings2
Counter({'HPQ': 250, 'GE': 125, 'AA': 50, 'MSFT': 25})
>>>

Finally, let's combine all of the holdings with one simple operation:

>>> holdings
Counter({'MSFT': 250, 'IBM': 150, 'CAT': 150, 'AA': 100, 'GE': 95})
>>> holdings2
Counter({'HPQ': 250, 'GE': 125, 'AA': 50, 'MSFT': 25})
>>> combined = holdings + holdings2
>>> combined
Counter({'MSFT': 275, 'HPQ': 250, 'GE': 220, 'AA': 150, 'IBM': 150, 'CAT': 150})
>>>
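
The `+` operator builds a new Counter and leaves both operands unchanged. If you would rather accumulate into an existing Counter, `update()` adds counts in place. The sketch below uses the counts printed above instead of reading the CSV files, and the name `running` is introduced only for this illustration.

```python
from collections import Counter

# Counts copied from the output shown in the exercise.
holdings = Counter({'MSFT': 250, 'IBM': 150, 'CAT': 150, 'AA': 100, 'GE': 95})
holdings2 = Counter({'HPQ': 250, 'GE': 125, 'AA': 50, 'MSFT': 25})

combined = holdings + holdings2     # new Counter; operands are unchanged
print(combined['GE'])               # 220
print(holdings['GE'])               # 95

running = Counter(holdings)         # start from a copy of the first Counter
running.update(holdings2)           # add the second Counter's counts in place
print(running == combined)          # True
```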

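This is only a small taste of what counters provide. If you ever find yourself needing to tabulate values, you should consider using one.

Commentary: collections module

The collections module is one of the most useful library modules in all of Python. In fact, we could do an extended tutorial on just that, but doing so now would be a distraction. For now, put collections on your list of bedtime reading for later.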


Note: the complete translation is available at https://github.com/codists/practical-python-zh
