关于数据库:MongoDB学习之聚合操作

3次阅读

共计 29207 个字符,预计需要花费 74 分钟才能阅读完成。

MongoDB 学习之基本操作

mongo 的聚合操作和 mysql 的查问类比

SQL 操作 / 函数 mongodb 聚合操作
where $match
group by $group
having $match
select $project
order by $sort
limit $limit
sum() $sum
count() $sum
join $lookup

上面一些例子和 sql 做比照

上面是应用数据库的一个根本构造

数据链接(提取码:gqh2)

{_id: ObjectId("5dbe7a545368f69de2b4d36e"),
  street: '493 Hilll Curve',
  city: 'Champlinberg',
  state: 'Texas',
  country: 'Malaysia',
  zip: '24344-1715',
  phone: '425.956.7743 x4621',
  name: 'Destinee Schneider',
  userId: 3573,
  orderDate: 2019-03-26T03:20:08.805Z,
  status: 'created',
  shippingFee: Decimal128("8.00"),
  orderLines: [ 
     { 
       product: 'Refined Fresh Tuna',
       sku: '2057',
       qty: 25,
       price: Decimal128("56.00"),
       cost: Decimal128("46.48") },
     {
       product: 'Refined Concrete Ball',
       sku: '1732',
       qty: 61,
       price: Decimal128("47.00"),
       cost: Decimal128("47") 
     },
  ],
  total: Decimal128("407") 
}

先来一些操作案例

select sum(total) from orders

db.orders.aggregate({$group:{_id:null,total:{$sum:"$total"}}})
后果:{_id: null, 'total': Decimal128("44019609") }
select count(1) from orders

db.orders.aggregate({$group:{_id:null,total:{$sum:1}}})
后果:{_id: null, total: 100000}
select count(1) from orders group by status

db.orders.aggregate({$group:{_id:"$status",total:{$sum:1}}})
后果:{_id: 'created', total: 20087}
{_id: 'shipping', total: 20017}
{_id: 'cancelled', total: 19978}
{_id: 'completed', total: 20015}
{_id: 'fulfilled', total: 19903}
select count(1) from orders group by status having count(1) > 20000

db.orders.aggregate([{$group:{_id:{status:'$status'},total:{$sum:1}}},
    {$match:{total:{$gte:20000}}}
 ])
后果:{_id: { status: 'created'}, total: 20087 }
{_id: { status: 'shipping'}, total: 20017 }
{_id: { status: 'completed'}, total: 20015 }
select count(1) total
from orders 
group by status,year(orderDate),month(orderDate)
order by year(orderDate),month(orderDate)

db.orders.aggregate([
  {
    $group:{
      _id:{
         status:'$status',
         orderDate:{year:{$year:"$orderDate"},
           month:{$month:"$orderDate"}
         }
       },
       total:{$sum:1}
     }
  },{$sort:{"_id.orderDate.year":1,"_id.orderDate.month":1}
  }
])
后果:{_id: { status: 'cancelled', orderDate: { year: 2019, month: 1} }, total: 2066 }
{_id: { status: 'shipping', orderDate: { year: 2019, month: 1} }, total: 2058 }
{_id: { status: 'completed', orderDate: { year: 2019, month: 1} }, total: 2068 }
{_id: { status: 'created', orderDate: { year: 2019, month: 1} }, total: 2047 }
{_id: { status: 'fulfilled', orderDate: { year: 2019, month: 1} }, total: 2076 }
{_id: { status: 'cancelled', orderDate: { year: 2019, month: 2} }, total: 1816 }
{_id: { status: 'created', orderDate: { year: 2019, month: 2} }, total: 1817 }
{_id: { status: 'shipping', orderDate: { year: 2019, month: 2} }, total: 1844 }
{_id: { status: 'completed', orderDate: { year: 2019, month: 2} }, total: 1813 }
{_id: { status: 'fulfilled', orderDate: { year: 2019, month: 2} }, total: 1913 }
......
select *
from(select month(orderDate) month,name,status
    from orders
) order
where month = 2

db.orders.aggregate([{$project:{month:{$month:"$orderDate"},name:1,status:1}},{$match:{month:2}}]) 
后果:{_id: ObjectId("5dbe7a542411dc9de6429190"),name: 'Kris Hansen',status: 'cancelled',month: 2 }
{_id: ObjectId("5dbe7a542411dc9de6429191"),name: 'Constantin Wuckert',status: 'completed',month: 2 }
{_id: ObjectId("5dbe7a545368f69de2b4d375"),name: 'Reed Jerde',status: 'fulfilled',month: 2 }
{_id: ObjectId("5dbe7a54cd023b9de4efc1d2"),name: 'Lyric Hodkiewicz',status: 'cancelled',month: 2 }
.....
select count(*) from orders where month(orderDate) >= 3 group by month(orderDate)

db.orders.aggregate([{$match:{$expr:{$gte:[{$month:"$orderDate"},3]}}},
    {$group:{_id:{month:{$month:"$orderDate"}},count:{$sum:1}}}
 ]);
#后果
{_id: { month: 6}, count: 9915 }
{_id: { month: 4}, count: 10083 }
{_id: { month: 10}, count: 9928 }
{_id: { month: 5}, count: 10142 }
{_id: { month: 8}, count: 10194 }
{_id: { month: 9}, count: 9779 }
{_id: { month: 7}, count: 10240 }
{_id: { month: 3}, count: 10201 }

MongoDB 聚合(Aggregate)

MongoDB 中聚合 (aggregate) 次要用于解决数据(诸如统计平均值,求和等),并返回计算后的数据后果

一个 aggregate 由多个阶段 (Stage) 组成。上一阶段产生的后果会作为下一阶段的输出,所以也会被形象的称为流水线(Pipeline)。

表达式:解决输出文档并输入。表达式是无状态的,只能用于计算以后聚合管道的文档,不能解决其它的文档。

这里咱们介绍一下聚合框架中罕用的几个操作:

  • $project:批改输出文档的构造。能够用来重命名、减少或删除域,也能够用于创立计算结果以及嵌套文档。
  • $match:用于过滤数据,只输入符合条件的文档。$match应用 MongoDB 的规范查问操作。
  • $limit:用来限度 MongoDB 聚合管道返回的文档数。
  • $skip:在聚合管道中跳过指定数量的文档,并返回余下的文档。
  • $unwind:将文档中的某一个数组类型字段拆分成多条,每条蕴含数组中的一个值。
  • $group:将汇合中的文档分组,可用于统计后果。
  • $sort:将输出文档排序后输入。
  • $geoNear:输入靠近某一地理位置的有序文档。

上面是一个 aggregate 的根本解决流程

  • db.collection.aggregate() 能够用多个构件创立一个管道,对于一连串的文档进行解决。这些构件包含:筛选操作的match、映射操作的project、分组操作的group、排序操作的sort、限度操作的limit、和跳过操作的skip
  • db.collection.aggregate()应用了 MongoDB 内置的原生操作,聚合效率十分高, 反对相似于 SQL Group By 操作的性能。
  • 每个阶段管道限度为 100MB 的内存。如果一个节点管道超过这个极限,MongoDB将产生一个谬误。为了可能在解决大型数据集, 能够设置 allowDiskUse 为 true 来在聚合管道节点把数据写入临时文件。这样就能够解决 100MB 的内存的限度。
  • db.collection.aggregate()能够作用在分片汇合,但后果不能输在分片汇合,MapReduce能够 作用在分片汇合,后果也能够输在分片汇合。
  • db.collection.aggregate()办法能够返回一个指针(cursor),数据放在内存中,间接操作。跟Mongo shell 一样指针操作。
  • db.collection.aggregate()输入的后果只能保留在一个文档中,BSON Document大小限度为 16M。能够通过返回指针解决,版本 2.6 中前面:db.collect.aggregate() 办法返回一个指针,能够返回任何后果集的大小。

$count

返回文档统计数

先看一些非聚合操作中的 count 应用办法

# 对应查问进去的是 orders 这个汇合中的所有数据总和
db.orders.count();
#后果:{"result": 100000}
#对应查出来自 Malaysia 这个国家的订单总和
db.orders.find({country:"Malaysia"}).count()
#后果:{"result": 392}

应用聚合操作中 $count 来汇总行数

# 应用聚合查出来自 Malaysia 这个国家的订单总和, 并且返回给 counts 字段
db.orders.aggregate([{$match:{country:"Malaysia"}},
    {$count:"counts"}
])
#后果:{"counts": 392}
#上面是两种不同的写法只是在 $match 的时候有所区别(能够先体验以下)
db.orders.aggregate([{$match:{country:{$eq:"Malaysia"}}},
    {$count:"counts"}
])
db.orders.aggregate([{$match:{$expr:{$eq:["$country","Malaysia"]}}},
    {$count:"counts"}
])
#后果:{"counts": 392}

除此以外能够灵便应用 group+$sum 来实现$count

# 对应查问进去的是 orders 这个汇合中的所有数据总和, 并且返回给 counts 字段
db.orders.aggregate({$group:{_id:null,counts:{$sum:1}}})
#后果:{"_id": null,"counts": 100000}
#应用聚合查出来自 Malaysia 这个国家的订单总和, 并且返回给 counts 字段
db.orders.aggregate([{$match:{country:{$eq:"Malaysia"}}},
    {$group:{_id:null,counts:{$sum:1}}}
])
#后果:{"_id": null,"counts": 392}

$group

依照指定表达式对文档进行分组

$group应用的根本语法:

{$group: { _id: <expression>, <field1>: { <accumulator1> : <expression1>}, ... } }
  1. _id+表达式 用来做分组条件,也就是 _id 前面的内容与 sqlgroup by前面的表达式的用处雷同
  2. _id前面的 字段 +accumulator操作符与 sql 中做完 group by 后在 select 前面的的聚合函数用处雷同,例如:sum()avg()max()min()

例如:

db.orders.aggregate({$group:{_id:"$country",total:{$sum:"$total"}}})
#后果
{_id: 'Guam', total: Decimal128("182335") }
{_id: 'El Salvador', total: Decimal128("159475") }
{_id: 'Saint Martin', total: Decimal128("163267") }
{_id: 'Botswana', total: Decimal128("189330") }
{_id: 'San Marino', total: Decimal128("174200") }
{_id: 'Czech Republic', total: Decimal128("178602") }
{_id: 'Estonia', total: Decimal128("172816") }
.......
#下面的 mql 相当于 sql 中的
select sum(total) from orders group by country

$group阶段的内存限度为 100M。默认状况下,如果stage 超过此限度,$group将产生谬误。然而,要容许解决大型数据集,请将 allowDiskUse 选项设置为 true 以启用 $group 操作以写入临时文件。

名称 形容 类比 sql
$avg 计算均值 avg
$first 返回每组第一个文档,如果有排序,依照排序,如果没有依照默认的存储的程序的第一个文档。 limit 0,1
$last 返回每组最初一个文档,如果有排序,依照排序,如果没有依照默认的存储的程序的最初个文档。
$max 依据分组,获取汇合中所有文档对应值得最大值。 max
$min 依据分组,获取汇合中所有文档对应值得最小值。 min
$push 将指定的表达式的值增加到一个数组中。
$addToSet 将表达式的值增加到一个汇合中(无反复值,无序)。
$sum 计算总和 sum
$stdDevPop 返回输出值的总体标准偏差(population standard deviation)
$stdDevSamp 返回输出值的样本标准偏差(the sample standard deviation)

上面咱们依照以上文档顺次用一下每一个表达式

  1. $avg计算平均值

    -- 计算每个国家的每个订单的均匀生产
    db.orders.aggregate({$group:{
            _id:"$country",
            avgMoney:{$avg:"$total"}
        }})
    -- 后果    
    {_id: 'Saudi Arabia',avgMoney: Decimal128("433.4898419864559819413092550790068") }
    {_id: 'New Caledonia',avgMoney: Decimal128("441.9833729216152019002375296912114") }
    {_id: 'Congo',avgMoney: Decimal128("451.8834951456310679611650485436893") }
    {_id: 'Turkey',avgMoney: Decimal128("425.7422434367541766109785202863962") }
    {_id: 'Cuba',avgMoney: Decimal128("437.2074074074074074074074074074074") }
    {_id: 'Uruguay',avgMoney: Decimal128("434.1564792176039119804400977995110") }
    ......
  2. $first返回第一个文档

    -- 依据国家分组,每组第一笔订单的订单商品列表
    db.orders.aggregate({$group:{
        _id:"$country",
        firstOrderLines:{$first:"$orderLines"}
    }})
    -- 后果
    { _id: 'Malta',firstOrderLines: [{ product: 'Sleek Frozen Table',sku: '368',qty: 30,price: Decimal128("24.00"),cost:Decimal128("19.44") },
      {product: 'Intelligent Metal',sku: '179',qty: 62,price: Decimal128("91.00"),cost: Decimal128("90.09") },
      {product: 'Intelligent Granite',sku: '9',qty: 31,price: Decimal128("68.00"),cost: Decimal128("61.88") },
      {product: 'Licensed Cotton',sku: '6846',qty: 9,price: Decimal128("16.00"),cost: Decimal128("15.68") } 
    ] }
    { _id: 'Papua New Guinea',firstOrderLines: [{ product: 'Fantastic Wooden',sku: '4333',qty: 32,price: Decimal128("58.00"),cost: Decimal128("57.42") }
      ...
    ] }
    ......
    
    -- 依据国家分组,每组第一笔订单的订单商品列表外面的第一条商品信息
    db.orders.aggregate({$group:{
        _id:"$country",
        firstOrder:{$first:{$first:"$orderLines"}}
    }})
    --- 后果
    { _id: 'Malta',firstOrder:  
     {product: 'Sleek Frozen Table',sku: '368',qty: 30,price: Decimal128("24.00"),cost:Decimal128("19.44") }
    }
    { _id: 'Papua New Guinea',firstOrder:
      {product: 'Fantastic Wooden',sku: '4333',qty: 32,price: Decimal128("58.00"),cost: Decimal128("57.42") }
    }
    ......
  3. $last返回最初一个文档

    -- 依据每个国家分组,每笔最初一个订单的 orderDate
    db.orders.aggregate([{$group:{
        _id:"$country",
        lastOrderDate:{$last:"$orderDate"}
    }}])
    -- 后果
    {_id: 'Micronesia', lastOrderDate: 2019-01-15T07:23:18.002Z}
    {_id: 'Malaysia', lastOrderDate: 2019-05-15T20:16:56.644Z}
    {_id: 'San Marino', lastOrderDate: 2019-09-29T06:10:07.292Z}
  4. $max$min:最大值和最小值

    -- 依据年月分组,查出每组第一笔订单工夫和最初一组订单工夫
    db.orders.aggregate({$group:{_id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
        maxDate:{$max:"$orderDate"},
        minDate:{$min:"$orderDate"}
    }})
    -- 后果
    {_id: { year: 2019, month: 1}, maxDate: 2019-01-31T23:53:57.308Z, minDate: 2019-01-01T00:03:59.661Z }
    {_id: { year: 2019, month: 4}, maxDate: 2019-04-30T23:57:03.352Z, minDate: 2019-04-01T00:02:12.224Z }
    {_id: { year: 2019, month: 3}, maxDate: 2019-03-31T23:55:10.312Z, minDate: 2019-03-01T00:13:53.761Z }
    {_id: { year: 2019, month: 7}, maxDate: 2019-07-31T23:55:51.718Z, minDate: 2019-07-01T00:00:07.540Z }
  5. $push将指定值增加到一个数组当中 能够 push 到一个曾经存在的数组当中,如果不存在会创立这样一个数组

    -- 依据城市、年、月分组,将每组的下单工夫 push 到一个新的 orderDates 数组当中
    db.orders.aggregate({$group:{_id:{city:"$city",year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
        orderDates:{$push:"$orderDate"},
    }})
    -- 后果
    {_id: { city: 'Kennedifurt', year: 2019, month: 9}, orderDates: [2019-09-30T10:25:19.763Z] }
    {_id: { city: 'South Jewelstad', year: 2019, month: 1}, orderDates: [2019-01-06T19:59:03.380Z] }
    {_id: { city: 'Germanmouth', year: 2019, month: 9}, orderDates: [2019-09-25T07:45:54.260Z] }
    {_id: { city: 'Fayebury', year: 2019, month: 8}, orderDates: [2019-08-12T11:08:37.815Z] }
    {_id: { city: 'New Lailaport', year: 2019, month: 1}, orderDates: [2019-01-19T12:28:56.978Z] }
    {_id: { city: 'Port Bennyside', year: 2019, month: 2}, orderDates: [2019-02-25T01:18:21.657Z] }
    {_id: { city: 'Abernathymouth', year: 2019, month: 6}, orderDates: 
       [ 2019-06-03T18:03:21.149Z,
         2019-06-13T23:35:32.994Z,
         2019-06-18T11:32:22.229Z ] 
    }
  6. $addToSet将指定值增加到一个汇合当中 汇合是无序的并且会去重

    -- 依据月份分组,将每个月都下单过的国家都增加到 countrySet 中去
    db.orders.aggregate({
        $group:{_id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            countrySet:{$addToSet:"$country"}
        }
    })
    -- 后果  
    {
        "_id": {
          "year": 2019,
          "month": 1
        },
        "countrySet": ["French Guiana", "Germany", "Poland", "Comoros", "Portugal", "Fiji", "France", "Benin", "Greece", "Belarus", "Vietnam", "Ireland", "Vanuatu", "Netherlands Antilles", "Iceland", "Palestinian Territory", "Malawi", "Brazil", "Libyan Arab Jamahiriya", "Kuwait", "Liechtenstein", "Suriname", "Uganda", "New Caledonia", "Bolivia", "Nicaragua", "Burundi", "Uzbekistan", "Jamaica", "South Georgia and the South Sandwich Islands", "Tajikistan", "Mexico", "Singapore", "Sri Lanka", "Antarctica (the territory South of 60 deg S)", "Myanmar", "Tonga", "Slovenia", "Latvia", "Ukraine", "Oman", "Saint Helena", "Bosnia and Herzegovina", "Hungary", "Aruba", "Jordan", "Solomon Islands", "Mozambique", "Svalbard & Jan Mayen Islands", "Taiwan", "Cyprus", "Thailand", "Equatorial Guinea", "Belize", "Niger", "Israel", "Hong Kong", "Senegal", "Costa Rica", "Sierra Leone", "Kiribati", "Lesotho", "Nepal", "Serbia", "Barbados", "Spain", "Czech Republic", "Saint Martin", "Saint Pierre and Miquelon", "Togo", "Somalia", "Northern Mariana Islands", "Maldives", "British Indian Ocean Territory (Chagos Archipelago)", "Montenegro", "Cote d'Ivoire","United Arab Emirates","Guernsey","Bulgaria","Netherlands","Greenland","Niue","Colombia","Egypt","Madagascar","Brunei Darussalam","Iraq","Mauritius","French Polynesia","Jersey","Canada","Grenada","Honduras","New Zealand","Cocos (Keeling) Islands","Mayotte","Virgin Islands, British","Finland","Macedonia","Cook Islands","Micronesia","Christmas Island","Turks and Caicos Islands","Falkland Islands (Malvinas)","El Salvador","Estonia","Eritrea","Afghanistan","San Marino","Malaysia","Cambodia","Anguilla","Philippines","Zambia","Republic of Korea","Mauritania","Yemen","South Africa","Gambia","Namibia","Peru","Samoa","Qatar","Guinea","Monaco","Mongolia","Cayman Islands","Bouvet Island (Bouvetoya)","Romania","Sweden","Guam","Guyana","Japan","Bangladesh","Djibouti","Reunion","Central African Republic","Martinique","Sudan","Norway","Guadeloupe","Malta","Papua New Guinea","Macao","Tunisia","Iran","Ghana","Trinidad and Tobago","Syrian Arab Republic","French Southern Territories","Russian Federation","Botswana","Pakistan","Luxembourg","Ethiopia","Austria","Rwanda","Holy See (Vatican City State)","American Samoa","Tanzania","Morocco","Lao People's Democratic Republic", "Faroe Islands", "Bahrain", "China", "Indonesia", "Ecuador", "Tuvalu", "Panama", "Algeria", "Gibraltar", "Nigeria", "Kyrgyz Republic", "Chile", "Cape Verde", "Palau", "Armenia", "Dominican Republic", "Bhutan", "Liberia", "India", "Mali", "Switzerland", "Isle of Man", "Argentina", "Virgin Islands, U.S.", "Swaziland", "Timor-Leste", "Azerbaijan", "Bahamas", "Guatemala", "Saint Lucia", "Sao Tome and Principe", "United States Minor Outlying Islands", "Australia", "Italy", "Paraguay", "Tokelau", "Gabon", "Wallis and Futuna", "Cameroon", "Norfolk Island", "Guinea-Bissau", "Chad", "Zimbabwe", "Nauru", "Pitcairn Islands", "Georgia", "Kenya", "Bermuda", "Kazakhstan", "Democratic People's Republic of Korea","Puerto Rico","Croatia","Antigua and Barbuda","Seychelles","Marshall Islands","Burkina Faso","Denmark","United Kingdom","Dominica","Albania","Angola","Slovakia (Slovak Republic)","Western Sahara","Belgium","Saudi Arabia","Turkey","Congo","Cuba","Uruguay","Montserrat","United States of America","Lebanon","Saint Vincent and the Grenadines","Saint Kitts and Nevis","Saint Barthelemy","Haiti","Moldova","Heard Island and McDonald Islands","Lithuania","Turkmenistan","Venezuela","Andorra"]
      },
      {
        "_id": {
          "year": 2019,
          "month": 9
        },
        "countrySet": ["Germany", "Poland", "French Guiana", "Fiji", "France", "Comoros", "Portugal", "Benin", "Greece", "Belarus", "Ireland", "Vietnam", "Brazil", "Malawi", "Vanuatu", "Netherlands Antilles", "Palestinian Territory", "Iceland", "Kuwait", "Libyan Arab Jamahiriya", "Liechtenstein", "New Caledonia", "Suriname", "Uganda", "Bolivia", "Uzbekistan", "Burundi", "Nicaragua", "Tajikistan", "Jamaica", "South Georgia and the South Sandwich Islands", "Sri Lanka", "Mexico", "Singapore", "Antarctica (the territory South of 60 deg S)", "Tonga", "Myanmar", "Slovenia", "Latvia", "Oman", "Saint Helena", "Ukraine", "Bosnia and Herzegovina", "Aruba", "Jordan", "Hungary", "Mozambique", "Solomon Islands", "Svalbard & Jan Mayen Islands", "Thailand", "Taiwan", "Cyprus", "Equatorial Guinea", "Belize", "Niger", "Senegal", "Hong Kong", "Israel", "Kiribati", "Costa Rica", "Sierra Leone", "Lesotho", "Saint Martin", "Spain", "Barbados", "Nepal", "Togo", "Maldives", "Czech Republic", "Somalia", "Saint Pierre and Miquelon", "Serbia", "Northern Mariana Islands", "Montenegro", "British Indian Ocean Territory (Chagos Archipelago)", "Cote d'Ivoire","United Arab Emirates","Guernsey","Niue","Bulgaria","Netherlands","Egypt","Colombia","Greenland","Brunei Darussalam","Madagascar","Mauritius","Iraq","Canada","French Polynesia","Jersey","Grenada","Cocos (Keeling) Islands","New Zealand","Honduras","Virgin Islands, British","Mayotte","Cook Islands","Finland","Macedonia","Micronesia","Turks and Caicos Islands","Christmas Island","Estonia","Falkland Islands (Malvinas)","El Salvador","Eritrea","Malaysia","San Marino","Afghanistan","Anguilla","Cambodia","Zambia","Republic of Korea","Mauritania","Philippines","South Africa","Gambia","Yemen","Qatar","Peru","Namibia","Guinea","Samoa","Cayman Islands","Monaco","Mongolia","Bouvet Island (Bouvetoya)","Romania","Sweden","Guam","Guyana","Djibouti","Japan","Bangladesh","Reunion","Central African Republic","Sudan","Norway","Martinique","Guadeloupe","Papua New Guinea","Malta","Tunisia","Macao","Iran","Ghana","Syrian Arab Republic","Trinidad and Tobago","French Southern Territories","Botswana","Luxembourg","Russian Federation","Pakistan","Ethiopia","Holy See (Vatican City State)","Panama","Austria","Rwanda","American Samoa","Faroe Islands","Tanzania","Morocco","Lao People's Democratic Republic", "Ecuador", "China", "Indonesia", "Bahrain", "Algeria", "Tuvalu", "Gibraltar", "Nigeria", "Kyrgyz Republic", "Chile", "Palau", "Cape Verde", "Bhutan", "Dominican Republic", "Armenia", "Mali", "Isle of Man", "Liberia", "India", "Switzerland", "Argentina", "Virgin Islands, U.S.", "Timor-Leste", "Swaziland", "Azerbaijan", "United States Minor Outlying Islands", "Saint Lucia", "Bahamas", "Guatemala", "Australia", "Sao Tome and Principe", "Tokelau", "Paraguay", "Italy", "Wallis and Futuna", "Gabon", "Cameroon", "Guinea-Bissau", "Chad", "Norfolk Island", "Zimbabwe", "Nauru", "Georgia", "Kenya", "Pitcairn Islands", "Bermuda", "Kazakhstan", "Democratic People's Republic of Korea","Croatia","Puerto Rico","Antigua and Barbuda","Seychelles","Marshall Islands","Burkina Faso","Dominica","Denmark","Albania","United Kingdom","Angola","Slovakia (Slovak Republic)","Western Sahara","Belgium","Turkey","Congo","Saudi Arabia","Uruguay","Cuba","United States of America","Montserrat","Lebanon","Saint Kitts and Nevis","Saint Vincent and the Grenadines","Saint Barthelemy","Haiti","Moldova","Lithuania","Heard Island and McDonald Islands","Turkmenistan","Venezuela","Andorra"]
      }
    .......
  7. $sum计算总和

    -- 依据月份分组,获取每组的支出总和 sumTotal
    db.orders.aggregate({
        $group:{_id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            sumTotal:{$sum:"$total"}
        }
    })
    -- 后果
    {_id: { year: 2019, month: 2}, sumTotal: Decimal128("4072808") }
    {_id: { year: 2019, month: 10}, sumTotal: Decimal128("4356471") }
    {_id: { year: 2019, month: 5}, sumTotal: Decimal128("4460433") }
    ......
  8. $stdDevPop返回输出值的总体标准偏差

    -- 依据月份分组, 计算总体准偏差计算
    db.orders.aggregate({
        $group:{_id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            popTotal:{$stdDevPop:"$total"}
        }
    })
    -- 后果
    {_id: { year: 2019, month: 2}, popTotal: 189.3064965965138 }
    {_id: { year: 2019, month: 10}, popTotal: 187.19676293125292 }
    {_id: { year: 2019, month: 5}, popTotal: 189.54277980510432 }
    {_id: { year: 2019, month: 8}, popTotal: 189.52305549485735 }
    {_id: { year: 2019, month: 6}, popTotal: 189.99641948294692 }
    {_id: { year: 2019, month: 1}, popTotal: 188.89723701416594 }
    {_id: { year: 2019, month: 4}, popTotal: 189.33635941008336 }
    {_id: { year: 2019, month: 3}, popTotal: 190.39465578257668 }
    {_id: { year: 2019, month: 7}, popTotal: 189.01641050584374 }
    {_id: { year: 2019, month: 9}, popTotal: 188.10379143822877 }
  9. $stdDevSamp返回输出值的样本标准偏差

    -- 依据月份分组, 计算样本准偏差计算
    db.orders.aggregate({
        $group:{_id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            sampTotal:{$stdDevSamp:"$total"}
        }
    })
    -- 后果
    {_id: { year: 2019, month: 2}, sampTotal: 189.31678247750685 }
    {_id: { year: 2019, month: 9}, sampTotal: 188.1134099175866 }
    {_id: { year: 2019, month: 7}, sampTotal: 189.02564049879336 }
    {_id: { year: 2019, month: 3}, sampTotal: 190.40398862519802 }
    {_id: { year: 2019, month: 5}, sampTotal: 189.55212494401323 }
    {_id: { year: 2019, month: 4}, sampTotal: 189.34574899869335 }
    {_id: { year: 2019, month: 1}, sampTotal: 188.90639411415503 }
    {_id: { year: 2019, month: 8}, sampTotal: 189.53235199281477 }
    {_id: { year: 2019, month: 6}, sampTotal: 190.00600146946147 }
    {_id: { year: 2019, month: 10}, sampTotal: 187.20619136123352 }

$match

承受一个指定查问条件的文档。查问语法与读操作查问语法雷同。

根本的语法{$match: { <query>} }

在理论利用中尽可能将 $match 放在管道的后面地位。这样有两个益处:

  1. 能够疾速将不须要的文档过滤掉,以 缩小管道的工作量
  2. 如果再投射和分组之前执行 $match 查问能够应用索引
-- 相似于 in 查问
db.orders.aggregate({
    $match:{country:{$in:["Romania", "Sweden", "Guam", "Guyana"]}
    }
})
-- 后果:查出这几个国家的订单
-- 范畴查问
db.orders.aggregate({
    $match:{orderDate:{$gte:ISODate("2019-02-01"),$lt:ISODate("2019-02-04")}
    }
})
-- 后果: 查出 2019-02-01 到 2019-02-03 这三天的所有订单

$expr应用聚合表达式

-- 查问 3 月份以及往后的数据 - 依据月份分组 - sum 订单数
db.orders.aggregate([{$match:{$expr:{$gte:[{$month:"$orderDate"},3]}}},
    {$group:{_id:{month:{$month:"$orderDate"}},count:{$sum:1}}}
 ]);
-- 后果
{_id: { month: 7}, count: 10240 }
{_id: { month: 5}, count: 10142 }
{_id: { month: 6}, count: 9915 }
{_id: { month: 4}, count: 10083 }
{_id: { month: 10}, count: 9928 }
{_id: { month: 9}, count: 9779 }
{_id: { month: 3}, count: 10201 }
{_id: { month: 8}, count: 10194 }

$mod应用取模运算符

-- 查问 total 属性前面是 00 结尾的订单
db.orders.aggregate([{$match:{total:{$mod:[100,0]}}}
 ]);
-- 后果
{_id: ObjectId("5dbe7a575368f69de2b4d4db"),
  street: '5929 Elroy Points',
  city: 'Retaberg',
  state: 'Utah',
  country: 'Cote d\'Ivoire',
  zip: '73722-0034',
  phone: '113.509.1520',
  name: 'Sanford Runte',
  userId: 7843,
  orderDate: 2019-02-21T20:26:32.458Z,
  status: 'completed',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Steel Shoes',
       sku: '1374',
       qty: 82,
       price: Decimal128("15.00"),
       cost: Decimal128("13.35") },
     { product: 'Sleek Frozen Salad',
       sku: '2698',
       qty: 79,
       price: Decimal128("41.00"),
       cost: Decimal128("33.21") },
     { product: 'Intelligent Granite Mouse',
       sku: '17',
       qty: 55,
       price: Decimal128("54.00"),
       cost: Decimal128("50.76") },
     { product: 'Handcrafted Wooden Chicken',
       sku: '2079',
       qty: 4,
       price: Decimal128("17.00"),
       cost: Decimal128("17") } ],
  total: Decimal128("500") }
{_id: ObjectId("5dbe7a575368f69de2b4d50c"),
  street: '6159 Vandervort Camp',
  city: 'South Bobby',
  state: 'Montana',
  country: 'Guernsey',
  zip: '55141',
  phone: '173.672.8440 x661',
  name: 'Jovan Rice',
  userId: 3526,
  orderDate: 2019-09-14T21:05:45.049Z,
  status: 'shipping',
  shippingFee: Decimal128("9.00"),
  orderLines: 
   [ { product: 'Small Metal Sausages',
       sku: '8130',
       qty: 11,
       price: Decimal128("80.00"),
       cost: Decimal128("67.2") },
     { product: 'Intelligent Rubber Chicken',
       sku: '3775',
       qty: 61,
       price: Decimal128("10.00"),
       cost: Decimal128("8") },
     { product: 'Generic Rubber Table',
       sku: '7102',
       qty: 36,
       price: Decimal128("10.00"),
       cost: Decimal128("8.5") } ],
  total: Decimal128("100") }
......

$regex应用正则表达式匹配

-- 以 184 结尾的手机号的订单数量
db.orders.aggregate([{$match:{ phone: { $regex: /^184/}}},
    {$count:"counts"}
 ]);
-- 后果
{"counts": 55}

$unwind

将数组拆分成独自的文档

格局

{
  $unwind:
  {
      path: <field path>,
      includeArrayIndex: <string>,  
      preserveNullAndEmptyArrays: <boolean> 
  } 
}

includeArrayIndex:可选, 一个新字段的名称用于寄存元素的数组索引。该名称不能以 $ 结尾。

preserveNullAndEmptyArrays:可选, 默认为 false,若为 true, 如果 path 没有对应的字段或者对应的数组 size 为 0,则 $unwind 输入文档,默认 false 不输入。

-- 筛选一条数据,将数组拆分
db.orders.aggregate([{$match:{_id:ObjectId("5dbe7aa650fc769de3e1b551")}},
    {$unwind:"$orderLines"},
 ]);
-- 后果
{_id: ObjectId("5dbe7aa650fc769de3e1b551"),
  street: '3340 Marielle Manors',
  city: 'New Maymie',
  state: 'Connecticut',
  country: 'Malawi',
  zip: '22434-3104',
  phone: '184.544.4826 x4858',
  name: 'Annette Langworth',
  userId: 9830,
  orderDate: 2019-01-23T11:56:14.972Z,
  status: 'shipping',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   { product: 'Sleek Granite Gloves',
     sku: '6176',
     qty: 31,
     price: Decimal128("74.00"),
     cost: Decimal128("71.04") },
  total: Decimal128("313") }
{_id: ObjectId("5dbe7aa650fc769de3e1b551"),
  street: '3340 Marielle Manors',
  city: 'New Maymie',
  state: 'Connecticut',
  country: 'Malawi',
  zip: '22434-3104',
  phone: '184.544.4826 x4858',
  name: 'Annette Langworth',
  userId: 9830,
  orderDate: 2019-01-23T11:56:14.972Z,
  status: 'shipping',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   { product: 'Licensed Soft Cheese',
     sku: '2702',
     qty: 70,
     price: Decimal128("55.00"),
     cost: Decimal128("53.9") },
  total: Decimal128("313") }
......

$project

从文档中指定想要的字段和不想要的字段

格局

{$project: { <specification(s)> } }

specifications 有以下模式:<field>: <1 or true> 蕴含该字段
<field>: <0 or false> 不蕴含该字段

_id: <0 or false>        是否指定_id 字段
-- 如果有一个属性为或几个属性为 1,那么只显示这一个或几个属性 + _id
db.orders.aggregate({$project:{name:1}
})
-- 后果
{_id: ObjectId("5dbe7a545368f69de2b4d36e"), name: 'Destinee Schneider' }
{_id: ObjectId("5dbe7a545368f69de2b4d36f"), name: 'Ashlynn Sipes' }
{_id: ObjectId("5dbe7a54cd023b9de4efc1cc"), name: 'Genoveva Bauch' }
{_id: ObjectId("5dbe7a542411dc9de6429190"), name: 'Kris Hansen' }

-- 如果有一个属性为或几个属性为 0,那么显示除了这个一个或几个属性的其它所有属性
db.orders.aggregate({$project:{orderLines:0}
})
-- 后果
{"_id": {"$oid": "5dbe7a545368f69de2b4d36e"},
    "city": "Champlinberg",
    "country": "Malaysia",
    "name": "Destinee Schneider",
    "orderDate": {"$date": "2019-03-26T03:20:08.805Z"},
    "phone": "425.956.7743 x4621",
    "shippingFee": {"$numberDecimal": 8.00},
    "state": "Texas",
    "status": "created",
    "street": "493 Hilll Curve",
    "total": {"$numberDecimal": 407},
    "userId": 3573,
    "zip": "24344-1715"
},
{"_id": {"$oid": "5dbe7a545368f69de2b4d36f"},
    "city": "Linwoodburgh",
    "country": "United States of America",
    "name": "Ashlynn Sipes",
    "orderDate": {"$date": "2019-07-18T07:21:53.530Z"},
    "phone": "508.326.5494 x1218",
    "shippingFee": {"$numberDecimal": 7.00},
    "state": "Indiana",
    "status": "shipping",
    "street": "39476 Lacey Harbor",
    "total": {"$numberDecimal": 439},
    "userId": 2500,
    "zip": "84551"
}
......

-- 只展现嵌套属性
db.orders.aggregate({$project:{"orderLines.price":1}
})
或者
db.orders.aggregate({$project:{orderLines:{price:1}}
})
-- 后果
{_id: ObjectId("5dbe7a542411dc9de6429193"),
  orderLines: 
   [{ price: Decimal128("75.00") },
     {price: Decimal128("64.00") },
     {price: Decimal128("34.00") },
     {price: Decimal128("98.00") },
     {price: Decimal128("88.00") },
     {price: Decimal128("20.00") },
     {price: Decimal128("59.00") },
     {price: Decimal128("20.00") },
     {price: Decimal128("90.00") },
     {price: Decimal128("45.00") },
     {price: Decimal128("42.00") },
     {price: Decimal128("28.00") } ] }
{_id: ObjectId("5dbe7a5450fc769de3e19d20"),
  orderLines: 
   [{ price: Decimal128("51.00") },
     {price: Decimal128("10.00") },
     {price: Decimal128("63.00") },
     {price: Decimal128("12.00") },
     {price: Decimal128("37.00") },
     {price: Decimal128("43.00") },
     {price: Decimal128("39.00") },
     {price: Decimal128("68.00") },
     {price: Decimal128("21.00") } ] }
......

$condif-then-else的应用相当于 SQL 中的case-when-then-else

$$REMOVE是在满足这个条件的时候移除这个属性

-- 不是 7 月的文档,移除这个属性
db.orders.aggregate({
    $project:{
        name:1,
        orderDate:{
            $cond: {if: { $ne: [ {"$month":"$orderDate"}, 7 ] },
                   then: "$$REMOVE",
                   else: "$orderDate"
            }
        }
    }
})
-- 后果
{_id: ObjectId("5dbe7a545368f69de2b4d36e"), name: 'Destinee Schneider' }
{_id: ObjectId("5dbe7a545368f69de2b4d36f"), name: 'Ashlynn Sipes', orderDate: 2019-07-18T07:21:53.530Z }
{_id: ObjectId("5dbe7a54cd023b9de4efc1cc"), name: 'Genoveva Bauch' }
{_id: ObjectId("5dbe7a542411dc9de6429190"), name: 'Kris Hansen' }

映射到一个属性蕴含多个属性

-- 应用 substr 截取第一个字母,应用 strLenCP 取 name 的长度
db.orders.aggregate({
    $project:{
        _id: 0,
        nm:{
            name:"$name",
            firstLetter:{$substr:["$name",0,1]},
            nameLenth:{$strLenCP:"$name"}
        }
    }
})
-- 后果
{nm: { name: 'Destinee Schneider', firstLetter: 'D', nameLenth: 18} }
{nm: { name: 'Ashlynn Sipes', firstLetter: 'A', nameLenth: 13} }
{nm: { name: 'Genoveva Bauch', firstLetter: 'G', nameLenth: 14} }
{nm: { name: 'Kris Hansen', firstLetter: 'K', nameLenth: 11} }
{nm: { name: 'Dudley Kertzmann', firstLetter: 'D', nameLenth: 16} }
......

将多个属性的值映射到一个数组当中

db.orders.aggregate({
    $project:{
        _id: 0,
        msg:["$name","$orderDate","$orderLines.price"]
    }
})
-- 后果
{msg: 
   [ 
     'Gudrun Stamm',
     2019-09-10T01:00:00.679Z,
     [Decimal128("17.00"),
       Decimal128("91.00"),
       Decimal128("51.00"),
       Decimal128("10.00"),
       Decimal128("18.00"),
       Decimal128("46.00"),
       Decimal128("69.00"),
       Decimal128("18.00"),
       Decimal128("89.00"),
       Decimal128("99.00") 
    ] 
  ] 
}
{ msg: 
   [ 
     'Jalon Erdman',
     2019-03-06T08:30:55.042Z,
     [Decimal128("37.00"),
       Decimal128("91.00"),
       Decimal128("88.00"),
       Decimal128("20.00"),
       Decimal128("75.00"),
       Decimal128("46.00") 
    ] 
  ] 
}
{ msg: 
   [ 
     'Mossie Ankunding',
     2019-05-25T09:40:13.662Z,
     [Decimal128("14.00"),
       Decimal128("49.00"),
       Decimal128("38.00"),
       Decimal128("55.00"),
       Decimal128("20.00") 
     ] 
   ] 
}
{ msg: 
   [ 
     'Jorge Toy',
     2019-09-28T23:07:35.137Z,
     [Decimal128("71.00"),
       Decimal128("62.00"),
       Decimal128("59.00"),
       Decimal128("43.00"),
       Decimal128("55.00"),
       Decimal128("65.00"),
       Decimal128("57.00") 
     ] 
   ] 
}
......

$limit

限度条数,获取前 n 条数据

db.orders.aggregate({$limit:2})
-- 后果
{_id: ObjectId("5dbe7a545368f69de2b4d36e"),
  street: '493 Hilll Curve',
  city: 'Champlinberg',
  state: 'Texas',
  country: 'Malaysia',
  zip: '24344-1715',
  phone: '425.956.7743 x4621',
  name: 'Destinee Schneider',
  userId: 3573,
  orderDate: 2019-03-26T03:20:08.805Z,
  status: 'created',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   [ { product: 'Refined Fresh Tuna',
       sku: '2057',
       qty: 25,
       price: Decimal128("56.00"),
       cost: Decimal128("46.48") },
     { product: 'Intelligent Wooden Towels',
       sku: '5674',
       qty: 72,
       price: Decimal128("84.00"),
       cost: Decimal128("68.88") },
     { product: 'Refined Steel Bacon',
       sku: '5009',
       qty: 8,
       price: Decimal128("53.00"),
       cost: Decimal128("50.35") } ],
  total: Decimal128("407") }
{_id: ObjectId("5dbe7a545368f69de2b4d36f"),
  street: '39476 Lacey Harbor',
  city: 'Linwoodburgh',
  state: 'Indiana',
  country: 'United States of America',
  zip: '84551',
  phone: '508.326.5494 x1218',
  name: 'Ashlynn Sipes',
  userId: 2500,
  orderDate: 2019-07-18T07:21:53.530Z,
  status: 'shipping',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Soft Soap',
       sku: '6274',
       qty: 71,
       price: Decimal128("91.00"),
       cost: Decimal128("89.18") },
     { product: 'Intelligent Steel Chair',
       sku: '8278',
       qty: 13,
       price: Decimal128("67.00"),
       cost: Decimal128("62.31") },
     { product: 'Small Rubber Shoes',
       sku: '3534',
       qty: 60,
       price: Decimal128("76.00"),
       cost: Decimal128("71.44") } ],
  total: Decimal128("439") }
......

$skip

跳过前 n 行数据查问

-- 查问第 2、3 条
db.orders.aggregate([{$skip:1},{$limit:2}])
-- 后果
{_id: ObjectId("5dbe7a545368f69de2b4d36f"),
  street: '39476 Lacey Harbor',
  city: 'Linwoodburgh',
  state: 'Indiana',
  country: 'United States of America',
  zip: '84551',
  phone: '508.326.5494 x1218',
  name: 'Ashlynn Sipes',
  userId: 2500,
  orderDate: 2019-07-18T07:21:53.530Z,
  status: 'shipping',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Soft Soap',
       sku: '6274',
       qty: 71,
       price: Decimal128("91.00"),
       cost: Decimal128("89.18") },
     { product: 'Incredible Concrete Chips',
       sku: '3756',
       qty: 6,
       price: Decimal128("18.00"),
       cost: Decimal128("15.12") },
     { product: 'Intelligent Steel Chair',
       sku: '8278',
       qty: 13,
       price: Decimal128("67.00"),
       cost: Decimal128("62.31") },
     { product: 'Small Rubber Shoes',
       sku: '3534',
       qty: 60,
       price: Decimal128("76.00"),
       cost: Decimal128("71.44") } ],
  total: Decimal128("439") }
{_id: ObjectId("5dbe7a54cd023b9de4efc1cc"),
  street: '699 Harvey Row',
  city: 'Electamouth',
  state: 'South Dakota',
  country: 'Burundi',
  zip: '61826',
  phone: '(936) 449-4255 x58095',
  name: 'Genoveva Bauch',
  userId: 8302,
  orderDate: 2019-03-15T13:53:48.925Z,
  status: 'shipping',
  shippingFee: Decimal128("5.00"),
  orderLines: 
   [ { product: 'Intelligent Soft Salad',
       sku: '3711',
       qty: 85,
       price: Decimal128("86.00"),
       cost: Decimal128("76.54") },
     { product: 'Generic Cotton Ball',
       sku: '2112',
       qty: 44,
       price: Decimal128("21.00"),
       cost: Decimal128("19.32") },
     { product: 'Rustic Plastic Keyboard',
       sku: '6451',
       qty: 19,
       price: Decimal128("81.00"),
       cost: Decimal128("77.76") } ],
  total: Decimal128("341") }

$sort

对文档进行排序 升序:1 降序:-1

-- 用名字排序
db.orders.aggregate([{$sort:{name:1}
    },{$project:{_id:0,name:1}
    }
])
-- 后果
{name: 'Aaliyah Bruen'}
{name: 'Aaliyah Erdman'}
{name: 'Aaliyah Fahey'}
{name: 'Aaliyah Gerhold'}
{name: 'Aaliyah Graham'}
{name: 'Aaliyah Greenfelder'}
{name: 'Aaliyah Konopelski'}
{name: 'Aaliyah Kovacek'}
{name: 'Aaliyah Kuphal'}
{name: 'Aaliyah Lueilwitz'}
{name: 'Aaliyah Maggio'}
......

$sortByCount

依据某个字段分组计算 count 值而后依照这个值降序排序

db.orders.aggregate({$sortByCount:"$status"})
-- 后果
{_id: 'created', count: 20087}
{_id: 'shipping', count: 20017}
{_id: 'completed', count: 20015}
{_id: 'cancelled', count: 19978}
{_id: 'fulfilled', count: 19903}
正文完
 0