MUArch

UArch is short for Univariate ARCH models, and MUArch stands for multiple (or many) Univariate ARCH models. In essence, MUArch is a list of UArch models. This helps when you need to simulate many univariate ARCH models together, and when you need to specify the marginals in a Copula-GARCH model.

class muarch.muarch.MUArch(n: Union[int, Collection[muarch.uarch.UArch]], mean='Constant', lags=0, vol='GARCH', p=1, o=0, q=1, power=2.0, dist='Normal', hold_back=None, scale=1)

Multi-univariate ARCH model. Unlike a multivariate ARCH model, this fits each univariate time series individually. Simulations return the simulated univariate series bound together column-wise.

__init__(n: Union[int, Collection[muarch.uarch.UArch]], mean='Constant', lags=0, vol='GARCH', p=1, o=0, q=1, power=2.0, dist='Normal', hold_back=None, scale=1)

Initializes the MUArch model.

The MUArch model holds multiple univariate models which are determined during fitting. If the models are not specified, the global default options will be used. Models can be individually specified after initializing the MUArch instance.

Parameters
  • n (int or list of UArch models) – Number of univariate models to fit. Alternatively, a list of UArch (univariate) models can be specified.

  • mean ({ 'zero', 'constant', 'harx', 'har', 'ar', 'arx', 'ls' }, optional) –

    Name of the global default mean model. Currently supported options are:

    • Constant (default) - Constant mean model

    • Zero - Zero mean model

    • AR - Autoregression model

    • ARX - Autoregression model with exogenous regressors. Falls back to AR if no exogenous regressors

    • HAR - Heterogeneous Autoregression model

    • HARX - Heterogeneous Autoregressions with exogenous regressors

    • LS - Least squares model

    For more information on the different models, check out the documentation at https://arch.readthedocs.io/en/latest/univariate/mean.html

  • lags (int or list (int), optional) – Global default lag. Either a scalar integer value indicating lag length or a list of integers specifying lag locations.

  • vol ({ 'GARCH', 'ARCH', 'EGARCH', 'FIGARCH', 'HARCH', 'CONSTANT' }, optional) –

    Name of the global default volatility model. Currently supported options are:

    • GARCH (default) - Standard GARCH process which can be used to specify the following models:

      • ARCH(p)

      • GARCH(p,q)

      • GJR-GARCH(p,o,q)

      • AVARCH(p)

      • AVGARCH(p,q)

      • TARCH(p,o,q)

      • Models with arbitrary, pre-specified powers

    • ARCH - ARCH process

    • EGARCH - EGARCH process

    • FIGARCH - Fractionally Integrated (FI) GARCH process

    • HARCH - Heterogeneous ARCH process

    • Constant - Constant volatility process

  • p (int, optional) – Global default lag order of the symmetric innovation

  • o (int, optional) – Global default lag order of the asymmetric innovation

  • q (int, optional) – Global default lag order of lagged volatility or equivalent

  • power (float, optional) – Global default power to use with the innovations. The default is 2.0, which produces ARCH and related models. Using 1.0 produces AVARCH and related models. Other powers can be specified, although these should be strictly positive, and usually larger than 0.25.

  • dist ({ 'normal', 'gaussian', 'studentst', 't', 'skewstudent', 'skewt', 'ged', 'generalized error' }, optional) –

    Name of the global default distribution for the innovations. Currently supported options are:

    • normal, gaussian (default) - Standard Normal distribution

    • t, studentst - Standardized Student’s distribution

    • skewstudent, skewt - Standardized Skewed Student’s distribution.

    • ged, generalized error - Generalized Error Distribution

  • hold_back (int) – Global default number of observations at the start of the sample to exclude when estimating model parameters. Used when comparing models with different lag lengths to estimate on the common sample.

  • scale (float) – Global default factor to scale data up or down by. This is useful when the values in your data are so small that fitting runs into numerical errors. The factor is also used to scale simulated data.
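As a rough illustration of how p, o, q and power interact, the sketch below computes the conditional volatility of a GARCH-type recursion with p = o = q = 1. It assumes the recursion documented for the arch package, sigma_t^power = omega + alpha*|e_{t-1}|^power + gamma*|e_{t-1}|^power*I[e_{t-1} < 0] + beta*sigma_{t-1}^power. The function name, the parameter names and the start value are chosen for illustration only and are not part of MUArch.

```python
import numpy as np


def power_garch_volatility(eps, omega, alpha, gamma, beta, power=2.0):
    """Conditional volatility of a power GARCH recursion with p = o = q = 1.

    Illustrative helper (not part of muarch or arch).
    """
    eps = np.asarray(eps, dtype=float)
    sigma_pow = np.empty_like(eps)
    # Assumption: start the recursion at the sample average of |eps|^power.
    sigma_pow[0] = np.mean(np.abs(eps) ** power)
    for t in range(1, eps.size):
        shock = np.abs(eps[t - 1]) ** power
        sigma_pow[t] = (
            omega
            + alpha * shock                      # symmetric innovation (p)
            + gamma * shock * (eps[t - 1] < 0)   # asymmetric innovation (o)
            + beta * sigma_pow[t - 1]            # lagged volatility (q)
        )
    # Convert sigma^power back to volatility.
    return sigma_pow ** (1.0 / power)


rng = np.random.default_rng(0)
shocks = rng.standard_normal(250)
# power=2.0 gives a GJR-GARCH style recursion, power=1.0 a TARCH style one.
vol_garch = power_garch_volatility(shocks, 0.05, 0.05, 0.05, 0.85, power=2.0)
vol_tarch = power_garch_volatility(shocks, 0.05, 0.05, 0.05, 0.85, power=1.0)
```

Setting gamma to 0 in this sketch recovers the symmetric GARCH(1, 1) case, which corresponds to the default o=0.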

fit(y: Union[pandas.core.frame.DataFrame, numpy.ndarray], x: Optional[Union[Collection[Optional[numpy.ndarray]], numpy.ndarray]] = None, update_freq=1, disp='off', cov_type='robust', show_warning=True, tol: Optional[float] = None, options=None)

Fits the MUArch model.

If finer control over the individual models is required, set the UArch models separately. Otherwise, this method uses the default parameters.

Parameters
  • y ({ndarray, Series}) – The dependent variable. If a vector is passed in, it is assumed that the same vector (endog) is used for all models. Otherwise, the last dimension of its shape must match the number of models.

  • x (list of {ndarray, None}, optional) – Exogenous regressors. Ignored if the model does not permit exogenous regressors. If passed in, the first dimension must match the number of models.

  • update_freq (int, optional) – Frequency of iteration updates. Output is generated every update_freq iterations. Set to 0 to disable iterative output

  • disp ('final' or 'off' (default)) – Either ‘final’ to print optimization result or ‘off’ to display nothing

  • cov_type (str, optional) – Estimation method of parameter covariance. Supported options are ‘robust’, which does not assume the Information Matrix Equality holds, and ‘classic’, which does. In the ARCH literature, ‘robust’ corresponds to the Bollerslev-Wooldridge covariance estimator.

  • show_warning (bool, optional) – Flag indicating whether convergence warnings should be shown.

  • tol (float, optional) – Tolerance for termination

  • options (dict, optional) – Options to pass to scipy.optimize.minimize. Valid entries include ‘ftol’, ‘eps’, ‘disp’, and ‘maxiter’

Returns

Fitted self instance

Return type

MUArch
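The shape rule for y can be sketched with plain numpy. The helper below is hypothetical (it is not a muarch function) and only mirrors the rule stated above: a vector is reused for every model, while a matrix must have one column per model.

```python
import numpy as np


def as_model_columns(y, n_models):
    """Return y as a (nobs, n_models) array following the rule described for fit.

    Hypothetical helper for illustration; not part of muarch.
    """
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        # A single vector is assumed to be shared by all models.
        return np.tile(y[:, None], (1, n_models))
    if y.shape[-1] != n_models:
        raise ValueError(
            f"last dimension of y is {y.shape[-1]}, expected {n_models}"
        )
    return y


rng = np.random.default_rng(0)
shared = as_model_columns(rng.standard_normal(400), n_models=3)
per_model = as_model_columns(rng.standard_normal((400, 3)), n_models=3)
```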

residuals(standardize=True) → numpy.ndarray

Model residuals

The residuals are trimmed by the maximum lag of the underlying models. For example, given 3 models, AR(1), AR(10) and Constant, with 400 data points each, the residuals are 399, 390 and 400 long respectively. In this instance, the function cuts off the first 10 data points so that every series has 390 residuals.

Parameters

standardize (bool, optional) – Whether to standardize residuals. Residuals are standardized by dividing them by the conditional volatility.

Returns

Residuals

Return type

ndarray
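The trimming and standardization described above can be reproduced with synthetic numbers. This is only a sketch of the arithmetic, not the library's implementation: the residual and volatility series below are made up, and the constant volatility of 2.0 is an arbitrary assumption.

```python
import numpy as np

nobs = 400
max_lags = [1, 10, 0]  # AR(1), AR(10), Constant

rng = np.random.default_rng(0)
# Made-up raw residuals and conditional volatilities; each model loses
# its first `lag` observations, giving lengths 399, 390 and 400.
raw = [rng.standard_normal(nobs - lag) for lag in max_lags]
vol = [np.full(nobs - lag, 2.0) for lag in max_lags]
lengths = [len(r) for r in raw]

# Keep only the observations every model has in common (the last 390).
common = nobs - max(max_lags)
residuals = np.column_stack([r[-common:] for r in raw])
volatility = np.column_stack([v[-common:] for v in vol])

# Standardized residuals: raw residuals divided by conditional volatility.
standardized = residuals / volatility
```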

simulate(nobs, burn=500, initial_value=None, x=None, initial_value_vol=None, data_only=True, custom_dist: Optional[Union[Callable[[Union[int, Collection[int]]], numpy.ndarray], numpy.ndarray]] = None)

Simulates data from the multiple ARMA-GARCH models

Parameters
  • nobs (int) – Length of series to simulate

  • burn (int, optional) – Number of values to simulate to initialize the model and remove dependence on initial values

  • initial_value ({ndarray, float}, optional) – Either a scalar value or an array of max(lags) initial values to use when initializing the model. If omitted, 0.0 is used. If an array is given, its last dimension must match the number of models.

  • x ({ndarray, list of ndarray}, optional) –

    If supplied as a list, this list should have the same number of elements as the number of models in the MUArch model. Each array inside is the specified exogenous variable for that particular model and this must be a nobs + burn by k matrix of exogenous variables to include in the simulation. Otherwise, leave the value as None to indicate no exogenous variables are used for simulation in the model.

    If an array is supplied directly, it means every model has an exogenous variable associated with it. In this case, it should be a 3 dimensional tensor where the first dimension represents the number of models.

  • initial_value_vol ({ndarray, float}, optional) – An array or scalar to use when initializing the volatility process. If an array is given, its last dimension must match the number of models.

  • data_only (bool, default True) – If True, only the simulated data is returned, as a numpy array, and the volatility and errors are omitted. Otherwise, the data, volatility and errors are returned as data frames.

  • custom_dist ({ndarray, Callable}, optional) –

    Optional density from which to simulate the innovations (Distribution) in the GARCH models. This is useful when working with the copula-GARCH model, where the innovations of each univariate model depend on those of the others. The values supplied are assumed to be standardized innovations (mean 0, variance 1) rather than unstandardized residuals.

    The shape of the array must be at least as long as the simulation size required after accounting for burn and type of innovation process. If unsure, use simulation_size_required to check. It must also account for the number of dimensions of the MUArch model. For example, if the MUArch model is simulating a horizon of 120 time steps, 10000 trials and has 5 UArch models, the shape of the numpy array should be (120, 10000, 5).

    If a random number generator function is passed in, ensure that it takes only one argument and returns a numpy array. The argument can be an integer or a tuple of integers. In this case, the size is derived automatically.

Returns

simulated_data – If data_only is False, a list of DataFrames with the columns data, containing the simulated values; volatility, containing the conditional volatility; and errors, containing the errors used in the simulation. Otherwise, an array of the simulated values.

Return type

{List[DataFrame], ndarray}

See also

UArch.simulation_horizon_required

Calculates the simulation size required
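One way to build an array for custom_dist is to draw correlated standard normal innovations, as a Gaussian copula with normal marginals would produce. The sketch below uses numpy only. The correlation of 0.3 is arbitrary, 1000 trials are used instead of 10000 to keep the array small, and the horizon of 120 is assumed to already be the full simulation size required (including any burn-in).

```python
import numpy as np

horizon, reps, n_models = 120, 1000, 5

# Arbitrary equicorrelation matrix with unit diagonal, so each
# marginal keeps mean 0 and variance 1.
corr = np.full((n_models, n_models), 0.3)
np.fill_diagonal(corr, 1.0)
chol = np.linalg.cholesky(corr)

rng = np.random.default_rng(42)
independent = rng.standard_normal((horizon, reps, n_models))

# Mixing the last axis with the Cholesky factor induces the correlation;
# the result has shape (horizon, reps, n_models).
innovations = independent @ chol.T
```

The last axis indexes the univariate models, which matches the (120, 10000, 5) layout described for custom_dist.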

simulate_mc(nobs, reps, burn=500, initial_value=None, x=None, initial_value_vol=None, custom_dist: Optional[Union[Callable[[Union[int, Collection[int]]], numpy.ndarray], numpy.ndarray]] = None, n_jobs: Optional[int] = None)

Simulates data from the multiple ARCH-GARCH models.

This function is specially crafted for Monte-Carlo simulations.

Parameters
  • nobs (int) – Length of series to simulate

  • reps (int) – Number of repetitions

  • burn (int, optional) – Number of values to simulate to initialize the model and remove dependence on initial values

  • initial_value ({ndarray, float}, optional) – Either a scalar value or an array of max(lags) initial values to use when initializing the model. If omitted, 0.0 is used. If an array is given, its last dimension must match the number of models.

  • x ({ndarray, list of ndarray}, optional) –

    If supplied as a list, this list should have the same number of elements as the number of models in the MUArch model. Each array inside is the specified exogenous variable for that particular model and this must be a nobs + burn by k matrix of exogenous variables to include in the simulation. Otherwise, leave the value as None to indicate no exogenous variables are used for simulation in the model.

    If an array is supplied directly, it means every model has an exogenous variable associated with it. In this case, it should be a 3 dimensional tensor where the first dimension represents the number of models.

  • initial_value_vol ({ndarray, float}, optional) – An array or scalar to use when initializing the volatility process. If an array is given, its last dimension must match the number of models.

  • custom_dist ({ndarray, Callable}, optional) –

    Optional density from which to simulate the innovations (Distribution) in the GARCH models. This is useful when working with the copula-GARCH model, where the innovations of each univariate model depend on those of the others. The values supplied are assumed to be standardized innovations (mean 0, variance 1) rather than unstandardized residuals.

    The shape of the array must be at least as long as the simulation size required after accounting for burn and type of innovation process. If unsure, use simulation_size_required to check.

    If a random number generator function is passed in, the size is derived automatically. However, the function must:

    • take as its first argument an integer or a tuple of integers

    • have other parameters that are optional

    • return a numpy array

  • n_jobs (int or None, optional) – The number of jobs to run in parallel for simulation. This is particularly useful when simulating a large number of repetitions with more than 1 dimension. None defaults to using 1 processor. Any number less than or equal to 0 means all processors are used. A number larger than the number of available processors is capped at that maximum.
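A generator function that meets the three requirements listed for custom_dist could look like the sketch below. The name standardized_t and its df and seed parameters are hypothetical. The draws are rescaled so that the innovations have unit variance, using the fact that a Student's t variable with df > 2 has variance df / (df - 2).

```python
import numpy as np


def standardized_t(size, df=8.0, seed=None):
    """Draw Student's t innovations rescaled to mean 0 and variance 1.

    size is an int or a tuple of ints; df and seed are optional.
    Hypothetical example function, not part of muarch.
    """
    rng = np.random.default_rng(seed)
    raw = rng.standard_t(df, size=size)
    # Divide by the standard deviation of a t(df) variable.
    return raw / np.sqrt(df / (df - 2.0))


vector = standardized_t(10)
cube = standardized_t((120, 50, 5), seed=0)
```

Because only the first argument is required, the function can be called with just a size, whether that size is an integer or a tuple of integers.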

Returns

simulated_data – Array containing simulated data from the Monte Carlo Simulation

Return type

numpy array
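The n_jobs rule described above amounts to a small piece of arithmetic. The function below is a hypothetical restatement of that rule for illustration and is not taken from the muarch source; the `available` argument exists only so the behaviour can be checked on any machine.

```python
import os


def resolve_n_jobs(n_jobs, available=None):
    """Map an n_jobs setting to a worker count, following the documented rule.

    Hypothetical helper for illustration; not part of muarch.
    """
    if available is None:
        # os.cpu_count() may return None, so fall back to 1.
        available = os.cpu_count() or 1
    if n_jobs is None:
        return 1                      # None means a single processor
    if n_jobs <= 0:
        return available              # 0 or negative means all processors
    return min(n_jobs, available)     # large values are capped


workers = resolve_n_jobs(-1)
```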

summary(short=False, dp=4)

Summary of fitted models

Parameters
  • short (bool, default False) – Whether to show short summary or full summary.

  • dp (int, default 4) – Number of decimal places to show in short summary

Returns

Summary of fitted models

Return type

SummaryList