前言

Weekly Contest 142的 大样本统计:

我们对 0255 之间的整数进行采样,并将结果存储在数组 count 中:count[k] 就是整数 k 的采样个数。

我们以 浮点数 数组的形式,分别返回样本的最小值、最大值、平均值、中位数和众数。其中,众数是保证唯一的。

我们先来回顾一下中位数的知识:

  • 如果样本中的元素有序,并且元素数量为奇数时,中位数为最中间的那个元素;
  • 如果样本中的元素有序,并且元素数量为偶数时,中位数为中间的两个元素的平均值。

示例1:

输入:count = [0,1,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]输出:[1.00000,3.00000,2.37500,2.50000,3.00000]

示例2:

输入:count = [0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]输出:[1.00000,4.00000,2.18182,2.00000,1.00000]

提示:

  1. count.length == 256
  2. 1 <= sum(count) <= 10^9
  3. 计数表示的众数是唯一的
  4. 答案与真实值误差在 10^-5 以内就会被视为正确答案

解题思路

本地难度为中等,首先需要读懂题目意思,本题的入参数组count其实算是一个压缩数据后的数组。

我们对 0255 之间的整数进行采样,并将结果存储在数组 count 中:count[k] 就是整数 k 的采样个数。

简单来说就是,数组count的第k个元素就是k在压缩前的数组中出现count[k]个。以示例1count为例

[0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]

解压的过程如下:

第0个元素为0,则解压后的数组为[]第1个元素为4,则解压后的数组为[1,1,1,1]第2个元素为3,则解压后的数组为[1,1,1,1,2,2,2]第3个元素为2,则解压后的数组为[1,1,1,1,2,2,2,3,3]第4个元素为2,则解压后的数组为[1,1,1,1,2,2,2,3,3,4,4]......省略后续步骤

搞清楚count的数据特征后,选择使用TreeMapcount进行处理,将有效数字及其出现个数存储起来(有效数字指的是count[k]不为0的元素)。根据就是根据题目要求分别处理以下指标:

  • 最小值:TreeMap中第一个key
  • 最大值:TreeMap中最后一个key
  • 平均值:TreeMapkey之和除以value之和
  • 中位数:

    1. 计算出数组实际的元素个数(即value之和)
    2. 根据元素个数的奇偶性,获取对应的值
  • 众数:出现次数最多的数字,即TreeMapvalue最大的键值对的key

实现代码

    /**     * 1093. 大样本统计     *     * @param count     * @return     */    public double[] sampleStats(int[] count) {        // 使用TreeMap有序存储数字及其出现次数        TreeMap<Integer, Integer> countMap = new TreeMap<>();        double[] result = new double[5];        // 总和        double sum = 0L;        // 数字出现总次数        double total = 0L;        // 最大出现次数        long maxTimes = 0;        // 最小值        double min;        // 最大值        double max;        // 平均值        double average;        // 中位数        double middle = 0;        // 众数,出现次数最多的数字        double mode = 0;        for (int i = 0; i < count.length; i++) {            if (count[i] != 0) {                countMap.put(i, count[i]);                sum = sum + i * count[i];                total += count[i];                if (count[i] > maxTimes) {                    maxTimes = count[i];                    mode = i;                }            }        }        min = countMap.firstKey().doubleValue();        max = countMap.lastKey().doubleValue();        average = sum / total;        // 是否为奇数        boolean odd = total % 2 != 0;        // 中位数索引        int middleIndex = (int) ((total - 1) / 2);        int index = -1;        Iterator<Map.Entry<Integer, Integer>> it = countMap.entrySet().iterator();        while (it.hasNext()) {            Map.Entry<Integer, Integer> entry = it.next();            int num = entry.getKey();            int times = entry.getValue();            index += times;            if (index > middleIndex) {                middle = num;                break;            } else if (index == middleIndex) {                if (odd) {                    middle = num;                    break;                } else {                    middle = (num + it.next().getKey()) / 2.0;                    break;                }            }        }        result[0] = min;        result[1] = max;        result[2] = average;        result[3] = middle;        result[4] = mode;        return result;    }