Python Time Series Analysis Basics


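Among the many models used in time series analysis, the most basic and most important are the AR(p) model, the MA(q) model, and their combination, the ARMA(p,q) model. The stationarity of the ARMA model also has to be considered: if one or more roots of the autoregressive characteristic polynomial lie on the unit circle, the model is called an autoregressive integrated moving average process, the ARIMA(p,d,q) model.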
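For reference, the models named above can be written in common textbook notation. The symbols are assumptions, since the original does not define them: $\phi_i$ and $\theta_j$ are coefficients, $\varepsilon_t$ is white noise, $B$ is the backshift operator with $B x_t = x_{t-1}$, and the series is taken to have zero mean.

```latex
\begin{aligned}
\text{AR}(p):\quad & x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t \\
\text{MA}(q):\quad & x_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} \\
\text{ARMA}(p,q):\quad & x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p}
  + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} \\
\text{ARIMA}(p,d,q):\quad & \Phi(B)\,(1-B)^d x_t = \Theta(B)\,\varepsilon_t, \\
& \Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p, \qquad
  \Theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q
\end{aligned}
```

In this form, $d$ is the number of times the series has to be differenced before an ARMA(p,q) model is fitted to it.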

This post shows how to draw ACF and PACF plots in Python and use them to choose the model order.

Import modules
import sys
import os
import pandas as pd
import matplotlib.pylab as plt
%matplotlib inline
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.tsa.api as smt
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.graphics.api import qqplot
# ADF is used in the unit root test below; it is assumed to come from the arch package
from arch.unitroot import ADF
# Make matplotlib render Chinese characters correctly
plt.rcParams['font.family'] = ['sans-serif']
plt.rcParams['font.sans-serif'] = ['SimHei']
Load the data
# "年份" (year) is the name of the date column in data.xlsx
data = pd.read_excel("data.xlsx", index_col="年份", parse_dates=True)
data.head()
                   xt
年份
1952-01-01  100.00000
1953-01-01  101.60000
1954-01-01  103.30000
1955-01-01  111.50000
1956-01-01  116.50000
Stationarity test
Time series plot
# First and second differences of xt; the leading entries are NaN
data["diff1"] = data["xt"].diff(1)
data["diff2"] = data["diff1"].diff(1)
data1 = data.loc[:, ["xt", "diff1", "diff2"]]
data1.plot(subplots=True, figsize=(18, 12), title="Differenced series")
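To make the differencing step concrete, here is a small sketch that applies it to the five values shown in the `data.head()` output. Only those five rows are used, and the name `table` is introduced just for this illustration.

```python
import pandas as pd

# First five values of xt, copied from the data.head() output
xt = pd.Series(
    [100.0, 101.6, 103.3, 111.5, 116.5],
    index=pd.to_datetime(["1952-01-01", "1953-01-01", "1954-01-01",
                          "1955-01-01", "1956-01-01"]),
    name="xt",
)

diff1 = xt.diff(1)     # x_t - x_{t-1}; the first entry is NaN
diff2 = diff1.diff(1)  # difference of the differences; the first two entries are NaN

# Hypothetical helper table for side-by-side inspection
table = pd.DataFrame({"xt": xt, "diff1": diff1, "diff2": diff2})
print(table.round(2))
```

Each round of differencing costs one observation at the start of the series, which is why the later tests call `dropna()` on `diff1`.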

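Checking stationarity from a time series plot relies entirely on visual inspection and on the analyst's experience: different people looking at the same plot may well reach different conclusions. We therefore need a more convincing, more objective statistical method to test whether a time series is stationary. That method is the unit root test.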

Unit root test
print("Unit root test:\n")
print(ADF(data.diff1.dropna()))
Unit root test:

Test Statistic -3.156
P-value 0.023

Trend: Constant
Critical Values: -3.63 (1%), -2.95 (5%), -2.61 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
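Unit root test: we apply the test to the first-differenced series and compare the ADF test statistic with the critical values at the 1%, 5%, and 10% levels. Here the statistic is -3.156, which is below the 5% (-2.95) and 10% (-2.61) critical values but not below the 1% value (-3.63), and the p-value of 0.023 is below 0.05. We therefore reject the null hypothesis of a unit root at the 5% level, which means the first-differenced series is stationary.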
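White noise test
Check whether the series is a non-white-noise series.

from statsmodels.stats.diagnostic import acorr_ljungbox
# Older statsmodels versions return a tuple of arrays, as shown here;
# newer versions return a DataFrame instead
acorr_ljungbox(data.diff1.dropna(), lags=list(range(1, 12)), boxpierce=True)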
(array([11.30402, 13.03896, 13.37637, 14.24184, 14.6937 , 15.33042,
        16.36099, 16.76433, 18.15565, 18.16275, 18.21663]),
 array([0.00077, 0.00147, 0.00389, 0.00656, 0.01175, 0.01784, 0.02202,
        0.03266, 0.03341, 0.05228, 0.07669]),
 array([10.4116 , 11.96391, 12.25693, 12.98574, 13.35437, 13.85704,
        14.64353, 14.94072, 15.92929, 15.93415, 15.9696 ]),
 array([0.00125, 0.00252, 0.00655, 0.01135, 0.02027, 0.03127, 0.04085,
        0.06031, 0.06837, 0.10153, 0.14226]))

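The Ljung-Box p-values (the second array) are below α = 0.05 for lags 1 through 9, so we reject the null hypothesis that the series is white noise. The differenced series is therefore a stationary, non-white-noise series, and we can move on to modeling.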
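For readers who want to see what the test computes, here is a minimal sketch of the Ljung-Box statistic, Q = n(n+2) Σ ρ̂_k² / (n − k) summed over lags 1 to h, compared against a chi-square distribution with h degrees of freedom. The function `ljung_box` and the simulated AR(1) series are assumptions made for illustration. This is not the statsmodels implementation.

```python
import numpy as np
from scipy.stats import chi2

def ljung_box(x, max_lag):
    """Return (Q, p-value) arrays for lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    lags = np.arange(1, max_lag + 1)
    # Sample autocorrelations at lags 1..max_lag
    rho = np.array([np.sum(xc[k:] * xc[:-k]) / denom for k in lags])
    # Cumulative Q statistic: one value per maximum lag h
    q = n * (n + 2) * np.cumsum(rho ** 2 / (n - lags))
    pvalues = chi2.sf(q, df=lags)
    return q, pvalues

rng = np.random.default_rng(1)
# Hypothetical AR(1) series with coefficient 0.8: strongly autocorrelated
e = rng.normal(size=300)
ar1 = np.zeros(300)
for t in range(1, 300):
    ar1[t] = 0.8 * ar1[t - 1] + e[t]

q_stats, p_values = ljung_box(ar1, max_lag=5)
print(q_stats.round(2), p_values.round(4))
```

A small p-value means the autocorrelations up to that lag are jointly too large to be explained by white noise.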

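Model order selection
Now that we have a stationary time series, the next step is to choose a suitable ARIMA model, that is, suitable values of p and q.
First we look at the autocorrelation plot and the partial autocorrelation plot of the stationary series, drawn with sm.graphics.tsa.plot_acf and sm.graphics.tsa.plot_pacf.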


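Cut-off means that the autocorrelation function (ACF) or partial autocorrelation function (PACF) of a time series is zero beyond a certain lag (for example, the PACF of an AR model). Tailing off means that the ACF or PACF does not become zero beyond any lag (for example, the ACF of an AR model).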

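Cut-off: the function drops quickly to 0 for lags greater than some constant k, which is called a cut-off at lag k.
Tailing off: the function keeps taking nonzero values. It never becomes identically zero (or merely fluctuates randomly around 0) once the lag exceeds some constant k.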
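The two behaviors can be seen side by side in the theoretical ACFs of an AR(1) and an MA(1) process. The sketch below uses the standard closed-form expressions. The coefficients 0.6 are illustrative values, not estimates from the article's data.

```python
import numpy as np

lags = np.arange(0, 8)
phi, theta = 0.6, 0.6  # illustrative coefficients, not estimated from the data

# AR(1): the theoretical ACF is phi**k, decaying geometrically without reaching zero
acf_ar1 = phi ** lags

# MA(1): the theoretical ACF is theta / (1 + theta**2) at lag 1 and exactly zero afterwards
acf_ma1 = np.zeros(len(lags))
acf_ma1[0] = 1.0
acf_ma1[1] = theta / (1 + theta ** 2)

for k in lags:
    print(f"lag {k}: AR(1) ACF = {acf_ar1[k]:.4f}   MA(1) ACF = {acf_ma1[k]:.4f}")
```

The AR(1) column tails off, while the MA(1) column cuts off after lag 1. In sample plots the cut-off is never exact, so values inside the confidence band are treated as zero.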

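From the ACF and PACF plots of the first-differenced series we can see that:

the ACF tails off or cuts off after lag 1
the PACF cuts off after lag 1
A PACF cut-off at lag 1 points to an AR(1) term and an ACF cut-off at lag 1 points to an MA(1) term, so we can build ARIMA(1,1,0), ARIMA(1,1,1), and ARIMA(0,1,1) models.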
def draw_acf_pacf(data):
    """
    Plot the series, run a unit root test, and draw its ACF and PACF.

    data: the series to analyze, e.g. data["xt"]
    """
    plt.rcParams['font.sans-serif'] = ['SimHei']

    # Time series plot
    data.plot(figsize=(12, 8))
    plt.legend(bbox_to_anchor=(1.25, 0.5))
    plt.title("Time series plot")

    # Unit root test
    print("Unit root test:\n")
    print(adfuller(data))

    # ACF on the upper panel of a new figure
    fig = plt.figure(figsize=(12, 8))
    ax1 = fig.add_subplot(211)
    sm.graphics.tsa.plot_acf(data, lags=20, ax=ax1)
    ax1.xaxis.set_ticks_position('bottom')

    # PACF on the lower panel
    ax2 = fig.add_subplot(212)
    sm.graphics.tsa.plot_pacf(data, lags=20, ax=ax2)
    ax2.xaxis.set_ticks_position('bottom')
    fig.tight_layout()

# Pass data["diff1"].dropna() instead to inspect the first-differenced series
draw_acf_pacf(data["xt"])

That's all for this post. If it helped you, likes, follows, and comments are welcome; your likes mean a lot to me.
