前言
Weekly Contest 142的 大样本统计:
我们对
0
到255
之间的整数进行采样,并将结果存储在数组count
中:count[k]
就是整数k
的采样个数。我们以 浮点数 数组的形式,分别返回样本的最小值、最大值、平均值、中位数和众数。其中,众数是保证唯一的。
我们先来回顾一下中位数的知识:
- 如果样本中的元素有序,并且元素数量为奇数时,中位数为最中间的那个元素;
- 如果样本中的元素有序,并且元素数量为偶数时,中位数为中间的两个元素的平均值。
示例1:
输入:count = [0,1,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]输出:[1.00000,3.00000,2.37500,2.50000,3.00000]
示例2:
输入:count = [0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]输出:[1.00000,4.00000,2.18182,2.00000,1.00000]
提示:
count.length == 256
1 <= sum(count) <= 10^9
计数表示的众数是唯一的
- 答案与真实值误差在
10^-5
以内就会被视为正确答案
解题思路
本地难度为中等,首先需要读懂题目意思,本题的入参数组count
其实算是一个压缩数据后的数组。
我们对0
到255
之间的整数进行采样,并将结果存储在数组count
中:count[k]
就是整数k
的采样个数。
简单来说就是,数组count
的第k
个元素就是k
在压缩前的数组中出现count[k]
个。以示例1的count
为例
[0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
解压的过程如下:
第0个元素为0,则解压后的数组为[]第1个元素为4,则解压后的数组为[1,1,1,1]第2个元素为3,则解压后的数组为[1,1,1,1,2,2,2]第3个元素为2,则解压后的数组为[1,1,1,1,2,2,2,3,3]第4个元素为2,则解压后的数组为[1,1,1,1,2,2,2,3,3,4,4]......省略后续步骤
搞清楚count
的数据特征后,选择使用TreeMap
对count
进行处理,将有效数字及其出现个数存储起来(有效数字指的是count[k]
不为0的元素)。根据就是根据题目要求分别处理以下指标:
- 最小值:
TreeMap
中第一个key
- 最大值:
TreeMap
中最后一个key
- 平均值:
TreeMap
的key
之和除以value
之和 中位数:
- 计算出数组实际的元素个数(即
value
之和) - 根据元素个数的奇偶性,获取对应的值
- 计算出数组实际的元素个数(即
- 众数:出现次数最多的数字,即
TreeMap
中value
最大的键值对的key
实现代码
/** * 1093. 大样本统计 * * @param count * @return */ public double[] sampleStats(int[] count) { // 使用TreeMap有序存储数字及其出现次数 TreeMap<Integer, Integer> countMap = new TreeMap<>(); double[] result = new double[5]; // 总和 double sum = 0L; // 数字出现总次数 double total = 0L; // 最大出现次数 long maxTimes = 0; // 最小值 double min; // 最大值 double max; // 平均值 double average; // 中位数 double middle = 0; // 众数,出现次数最多的数字 double mode = 0; for (int i = 0; i < count.length; i++) { if (count[i] != 0) { countMap.put(i, count[i]); sum = sum + i * count[i]; total += count[i]; if (count[i] > maxTimes) { maxTimes = count[i]; mode = i; } } } min = countMap.firstKey().doubleValue(); max = countMap.lastKey().doubleValue(); average = sum / total; // 是否为奇数 boolean odd = total % 2 != 0; // 中位数索引 int middleIndex = (int) ((total - 1) / 2); int index = -1; Iterator<Map.Entry<Integer, Integer>> it = countMap.entrySet().iterator(); while (it.hasNext()) { Map.Entry<Integer, Integer> entry = it.next(); int num = entry.getKey(); int times = entry.getValue(); index += times; if (index > middleIndex) { middle = num; break; } else if (index == middleIndex) { if (odd) { middle = num; break; } else { middle = (num + it.next().getKey()) / 2.0; break; } } } result[0] = min; result[1] = max; result[2] = average; result[3] = middle; result[4] = mode; return result; }