[Share] Learning Python Pandas (Lesson_2)
Original poster | Posted on 2018-04-15

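Next, we count the time zones. There are two ways to do this: one uses only the Python standard library, the other uses pandas. With the standard library, one approach is to iterate over the time zones and keep the running counts in a dictionary: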

import json

path = './Python Data Analysis Example_1.txt'

# Each line of the file is parsed as one JSON record
with open(path) as f:
    records = [json.loads(line) for line in f]

# print(records[0]['tz'])

# Keep only the records that actually have a 'tz' field
time_zones = [rec['tz'] for rec in records if 'tz' in rec]

# print(time_zones[:10])


def get_counts(sequence):
    """Count how many times each item appears in sequence."""
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts


counts = get_counts(time_zones)
print(counts['America/New_York'])
print(len(time_zones))

Output:

1251
3440
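
As a side note, the if/else branch in get_counts can be avoided with collections.defaultdict from the standard library. The following is only a minimal sketch: the name get_counts2 and the short sample list are made up for illustration and are not the data from the file used in this post.

```python
from collections import defaultdict


def get_counts2(sequence):
    """Count occurrences; keys that are not present yet start at 0."""
    counts = defaultdict(int)  # int() returns 0, used as the default value
    for x in sequence:
        counts[x] += 1
    return counts


# Hypothetical sample data standing in for the real time_zones list
sample_time_zones = ['America/New_York', 'Europe/London', 'America/New_York', '']
sample_counts = get_counts2(sample_time_zones)
print(sample_counts['America/New_York'])
print(len(sample_time_zones))
```

The result behaves like a normal dictionary, so it can be passed to any code that expects the output of get_counts.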

To get the top ten time zones and their counts, extend the program above as follows:

import json

path = './Python Data Analysis Example_1.txt'

# Each line of the file is parsed as one JSON record
with open(path) as f:
    records = [json.loads(line) for line in f]

# print(records[0]['tz'])

# Keep only the records that actually have a 'tz' field
time_zones = [rec['tz'] for rec in records if 'tz' in rec]

# print(time_zones[:10])


def get_counts(sequence):
    """Count how many times each item appears in sequence."""
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts


def top_counts(count_dict, n=10):
    """Return the n largest (count, tz) pairs, in ascending order of count."""
    value_key_pairs = [(count, tz) for tz, count in count_dict.items()]
    value_key_pairs.sort()
    return value_key_pairs[-n:]


counts = get_counts(time_zones)
print(counts['America/New_York'])
print(len(time_zones))
print(top_counts(counts))

Output:

1251
3440

[(33, 'America/Sao_Paulo'), (35, 'Europe/Madrid'), (36, 'Pacific/Honolulu'), (37, 'Asia/Tokyo'), (74, 'Europe/London'), (191, 'America/Denver'), (382, 'America/Los_Angeles'), (400, 'America/Chicago'), (521, ''), (1251, 'America/New_York')]
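
top_counts works because tuples are compared element by element, so putting the count first makes sort() order the pairs by count. The result is in ascending order, which is why America/New_York comes last. If largest-first order is preferred, one possible alternative is heapq.nlargest from the standard library. This is just a sketch: top_counts_desc is a made-up name, and the small dictionary reuses four of the counts printed above rather than the full data.

```python
import heapq


def top_counts_desc(count_dict, n=10):
    """Return the n (tz, count) pairs with the largest counts, largest first."""
    return heapq.nlargest(n, count_dict.items(), key=lambda item: item[1])


# Hypothetical subset of the counts, copied from the output above
sample_counts = {
    'America/Chicago': 400,
    '': 521,
    'America/New_York': 1251,
    'Europe/London': 74,
}
top3 = top_counts_desc(sample_counts, n=3)
print(top3)
# [('America/New_York', 1251), ('', 521), ('America/Chicago', 400)]
```

The '' entry comes from records whose tz field is present but is an empty string; they pass the 'tz' in rec check and are counted under the empty key.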

We can also use collections.Counter from the Python standard library. The program is as follows:

from collections import Counter

# time_zones is the list built in the program above
counts = Counter(time_zones)
print(counts.most_common(10))

Output:

[('America/New_York', 1251), ('', 521), ('America/Chicago', 400), ('America/Los_Angeles', 382), ('America/Denver', 191), ('Europe/London', 74), ('Asia/Tokyo', 37), ('Pacific/Honolulu', 36), ('Europe/Madrid', 35), ('America/Sao_Paulo', 33)]
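
The pandas way of counting mentioned at the start is not shown in this lesson. As a rough sketch of what it could look like, assuming pandas is installed, Series.value_counts tallies the values and sorts them by count in descending order. The sample list below is hypothetical and only stands in for the real time_zones list.

```python
import pandas as pd

# Hypothetical sample data standing in for the real time_zones list
sample_time_zones = [
    'America/New_York', 'America/Chicago', 'America/New_York',
    '', 'America/New_York', 'America/Chicago',
]

# value_counts returns a Series indexed by time zone, largest count first
tz_counts = pd.Series(sample_time_zones).value_counts()
print(tz_counts.head(10))
```

Individual counts can then be looked up by label, for example tz_counts['America/New_York'].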


NEXT...


Replied on 2018-04-15 1#

Thanks for sharing.

Replied on 2018-04-16 2#

Thanks for sharing.

Replied on 2018-04-16 3#

Many thanks for sharing!

Replied on 2018-04-17 4#

This is all pretty advanced. Thanks for sharing!