A short tutorial on Fuzzy Time Series — Part III

= [u, l].

Some models (such as EnsembleFTS and PWFTS) let you choose the method used to compute the interval.


IntervalFTS (IFTS): The most basic method for generating prediction intervals; it is an extension of HighOrderFTS.

The generated prediction intervals do not have a probabilistic meaning; they just measure the upper and lower bounds of the fuzzy sets that were involved in the forecasting process, i.e., the fuzzy uncertainty.

The method is described here.


AllMethodEnsembleFTS: EnsembleFTS is a meta model composed of several base models. AllMethodEnsembleFTS creates one instance of each monovariate FTS method implemented in pyFTS and sets them as its base models. The forecast is computed from the forecasts of the base models.

A brief description of the method can be found here.

There are basically two ways to compute prediction intervals in EnsembleFTS: extremum and quantile (default). In the extremum method, the minimum and maximum values among the forecasts of the base models are chosen as the interval bounds. In the quantile method, the alpha parameter must be provided; the forecasts of the base models are then sorted and the quantile interval is extracted.

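    from pyFTS.partitioners import Grid
    from pyFTS.models.ensemble import ensemble

    # train and test are the training and testing subsets of the time series
    part = Grid.GridPartitioner(data=train, npart=11)
    model = ensemble.AllMethodEnsembleFTS(partitioner=part)
    model.fit(train)

    # Interval bounded by the extreme forecasts of the base models
    forecasts = model.predict(test, type='interval', mode='extremum')

    # Interval extracted from the quantiles of the base model forecasts
    forecasts2 = model.predict(test, type='interval', mode='quantile', alpha=.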
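To make the difference between the two interval methods concrete, here is a minimal pure-Python sketch. It is not the pyFTS implementation: the function names, the sample forecasts and the convention that alpha is the probability left in each tail are all assumptions made for illustration.

```python
import math

def extremum_interval(forecasts):
    """Interval spanning the smallest and largest base-model forecast."""
    return [min(forecasts), max(forecasts)]

def quantile_interval(forecasts, alpha):
    """Interval between the alpha and (1 - alpha) empirical quantiles.

    Assumption: alpha is the probability mass left out on each side.
    """
    ordered = sorted(forecasts)
    n = len(ordered)

    def quantile(q):
        # Linear interpolation between the two nearest order statistics.
        pos = q * (n - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    return [quantile(alpha), quantile(1 - alpha)]

# Hypothetical point forecasts of five base models for a single time step
base_forecasts = [10.0, 12.0, 11.0, 15.0, 9.0]

extremum = extremum_interval(base_forecasts)
quantiles = quantile_interval(base_forecasts, alpha=0.25)
```

With these sample values the extremum interval covers every base forecast, while the quantile interval discards the most extreme ones and is therefore narrower.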


ProbabilisticWeightedFTS (PWFTS): As its name says, this is the most complex method and it is still under review (on its way to being published).

There are basically two ways to produce prediction intervals with PWFTS: heuristic (default) and quantile. In the heuristic method, the interval bounds are calculated as the expected value of the bounds of the fuzzy sets, weighted by their empirical probabilities. The quantile method generates a full probability distribution and then extracts the quantiles (using the alpha parameter).

    # model is a fitted PWFTS model
    forecasts1 = model.predict(test, type='interval', method='heuristic')
    forecasts2 = model.predict(test, type='interval', method='quantile', alpha=.
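The heuristic idea (an expected value of the fuzzy set bounds, weighted by empirical probabilities) can be sketched in a few lines. This is only an illustration of the arithmetic under simplified assumptions, not the PWFTS code: the helper name and the (lower, upper, probability) tuples are hypothetical.

```python
def heuristic_interval(weighted_sets):
    """Expected lower and upper bounds over fuzzy sets.

    weighted_sets is a list of (lower, upper, probability) tuples, where
    the probability is the empirical weight of each fuzzy set.
    """
    total = sum(p for _, _, p in weighted_sets)
    lower = sum(lo * p for lo, _, p in weighted_sets) / total
    upper = sum(up * p for _, up, p in weighted_sets) / total
    return [lower, upper]

# Hypothetical fuzzy sets activated for one forecast
candidate_sets = [(0.0, 10.0, 0.5), (5.0, 15.0, 0.25), (10.0, 20.0, 0.25)]
interval = heuristic_interval(candidate_sets)
```

The most probable set pulls both bounds toward itself, so the interval is narrower than the full span of the activated sets.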



MVFTS: This multivariate method uses the same approach as IFTS to produce prediction intervals.

WeightedMVFTS: This weighted multivariate method also uses the same approach as IFTS to produce prediction intervals.

In the module pyFTS.common.Util we can find the function plot_interval, which allows us to easily draw the intervals:

[Figure: Intervals generated by the monovariate methods (source)]

[Figure: Intervals generated by the multivariate methods (source)]

The generated intervals try to show the range of possible variations that the model takes into account. You can see that some models generate wider intervals than others and that sometimes (especially in the weighted models, which have the narrowest intervals) the original values fall outside the interval. The best intervals have balanced widths: not so wide that they signal high uncertainty, and not so narrow that they fail to cover the real values.

In contrast to interval forecasting, probabilistic forecasting has its own class to represent a probability distribution: the class ProbabilityDistribution.

There are several ways to represent this probability distribution, which is, by definition, a discrete probability distribution. Some methods of this class are of special interest to us now: density (returns the probability of the input value(s)), cumulative (returns the cumulative probability of the input value(s)), quantile (returns the value(s) found at the input quantile level(s)) and plot (plots the probability distribution on the input matplotlib axis).
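To illustrate what density, cumulative and quantile mean for a discrete distribution, here is a toy class written from scratch. It is not the pyFTS ProbabilityDistribution class and does not reproduce its signatures; the class name, constructor and sample bins are assumptions for this sketch.

```python
from bisect import bisect_right

class DiscreteDistribution:
    """Toy discrete distribution over a set of bins (not the pyFTS class)."""

    def __init__(self, values, probabilities):
        pairs = sorted(zip(values, probabilities))
        total = sum(p for _, p in pairs)
        self.values = [v for v, _ in pairs]
        self.probs = [p / total for _, p in pairs]  # normalise to sum to 1

    def density(self, value):
        # Probability mass of a bin; values that are not bins have mass 0.
        return dict(zip(self.values, self.probs)).get(value, 0.0)

    def cumulative(self, value):
        # P(X <= value): add up the mass of every bin up to `value`.
        return sum(self.probs[:bisect_right(self.values, value)])

    def quantile(self, q):
        # Smallest bin whose cumulative probability reaches q.
        accumulated = 0.0
        for v, p in zip(self.values, self.probs):
            accumulated += p
            if accumulated >= q:
                return v
        return self.values[-1]

# Hypothetical distribution over four bins
dist = DiscreteDistribution([1.0, 2.0, 3.0, 4.0], [0.125, 0.125, 0.25, 0.5])
```

Here density answers "how likely is this bin?", cumulative answers "how likely is a value up to this bin?", and quantile inverts cumulative.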

Like the intervals, probabilistic forecasting has its own boolean flag to indicate which models are able to perform it:

    if model.has_probabilistic_forecasting:
        distributions = model.predict(test, type='distribution')

Now let’s take a look at some of the methods in pyFTS that support probabilistic forecasting:

pwfts.ProbabilisticWeightedFTS (PWFTS): This method was designed entirely for probabilistic forecasting and is the best method for this in pyFTS. Its rules contain empirical probabilities associated with the fuzzy sets, and it also has a specific defuzzification rule that transforms a (crisp) input value into a probability distribution for the future value.

ensemble.EnsembleFTS: The previously mentioned ensemble creates probabilistic forecasts using Kernel Density Estimators (KDE) over the point forecasts of the base models. The KDE also requires the specification of the kernel function and the width parameter.
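As a rough sketch of how a KDE turns a handful of point forecasts into a probability distribution, the snippet below uses a Gaussian kernel in plain Python. It is not how pyFTS implements it: the function name, the bandwidth value, the sample forecasts and the evaluation grid are all assumptions.

```python
import math

def gaussian_kde(points, bandwidth):
    """Return a density function made of one Gaussian kernel per point."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
        )

    return density

# Hypothetical point forecasts of the base models for one time step
base_forecasts = [9.0, 10.0, 11.0, 12.0, 15.0]
density = gaussian_kde(base_forecasts, bandwidth=1.0)

# Evaluate the density on a grid (5.0 to 19.0 in steps of 0.5) and
# normalise it into a discrete probability distribution.
grid = [5.0 + 0.5 * i for i in range(29)]
weights = [density(x) for x in grid]
total = sum(weights)
probabilities = [w / total for w in weights]
```

The kernel function and the bandwidth play the role of the two KDE parameters mentioned above: a wider bandwidth smooths the distribution, a narrower one keeps it peaked around each base forecast.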

Let’s see what the probabilistic forecasts look like:

[Figure: The probabilistic forecasting of 4 days using the Util.plot_density function (source)]

[Figure: The probabilistic forecasting of 24 hours using the ProbabilityDistribution.plot function (source)]

In the pictures above, the probabilistic forecasting is shown from two different perspectives. The first picture is generated with the function plot_density in the module common.Util, where each probability distribution is plotted in shades of blue whose intensity corresponds to the probability. This function allows us to plot the original time series with the forecast probability distributions on top of it. The second picture shows each probability distribution individually in relation to the universe of discourse, using the method plot of the class ProbabilityDistribution.

Of course, that is not everything! We also have to consider interval and probabilistic forecasting for many steps ahead, which we expect to tell us how the uncertainty evolves as the prediction horizon increases. Yes! It is fascinating, but I still have many things to show, so I will leave it as an exercise for you, ok?

Let’s now walk a trickier road…

The land of the Non-Stationarities

You may remember that old and universally known quote: “the only certainty is that nothing is certain”. Forecasting may be unfair because things change all the time and we have to deal with it.

Stationarity means, in layman’s terms, that the statistical properties of a stochastic process (like its expected value and variance) do not change over time; ultimately, it means stability. This is awesome for forecasting models: the test data set will behave just like the training data set! On the other hand, non-stationarity means that the statistical properties change.

But not all non-stationarities are created equal. Some of them are predictable, such as trend and seasonality. Dealing with seasonality is not tricky: you can use High Order, Seasonal or Multivariate methods (you may remember our last tutorial). Dealing with trends is not too complicated either: you can de-trend the data using a difference transformation.

[Figure: The original time series (NASDAQ) and the differentiated time series (source)]

Suppose that we split the above time series in half, and call these subsets training and testing data. You can see that the test subset (after instance number 2000) has values that did not appear before, in the training subset. This is a drawback for most FTS methods: what happens when the input data falls outside the known Universe of Discourse? The model never saw that region before and does not know how to proceed, so it fails tragically.

You can also see in the above image that the differentiated time series is much better behaved and, indeed, it is stationary.
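The effect of a first-order difference transformation can be checked on a tiny example. The sketch below is plain Python, not the pyFTS Transformations module; the function names and the sample series are made up for illustration.

```python
def difference(series, lag=1):
    """Replace each value by its change with respect to `lag` steps before."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def inverse_difference(diffs, previous_values):
    """Undo the transformation by adding each change to the value before it."""
    return [d + prev for d, prev in zip(diffs, previous_values)]

# Hypothetical series with an upward trend
trend = [100.0, 103.0, 105.0, 110.0, 112.0, 118.0]

diffs = difference(trend)
restored = inverse_difference(diffs, trend[:-1])
```

The original series keeps reaching new values, but its differences stay inside a small, stable range, which is why a model trained on the differences does not leave its Universe of Discourse so easily. The inverse step maps forecasts made in the differenced space back to the original scale.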

How can we use the Difference transformation in pyFTS? Just import the Transformations module from pyFTS.common and instantiate it. Don’t forget to pass the transformation to the partitioning method and also to add it to the model (with the method append_transformation).

    from pyFTS.data import NASDAQ
    from pyFTS.models import chen
    from pyFTS.partitioners import Grid
    from pyFTS.common import Transformations

    # First-order difference transformation
    diff = Transformations.Differential(1)

    data = NASDAQ.get_data()
    train = data[:2000]
    test = data[2000:]

    part = Grid.GridPartitioner(data=train, npart=15, transformation=diff)

    # Chen's classical model; the class name and the append_transformation
    # call are reconstructed from the surrounding text
    model = chen.ConventionalFTS(partitioner=part)
    model.append_transformation(diff)
    model.fit(train)

    forecasts = model.predict(test)

Look at the behavior of the classical Chen’s model with and without the Differential transformation for the NASDAQ dataset:

[Figure: The degradation effect of the FTS when the test data falls out of the known Universe of Discourse (source)]

While the time series is still fluctuating inside the known Universe of Discourse, both models perform well. But when the time series jumps out of the Universe of Discourse of the training data, the model without the Differential transformation starts to deteriorate because it does not know what to do with that unknown data. So the transformations help us not only with trending patterns, but also with unknown ranges of the universe of discourse.

But some non-stationarities are unpredictable, and sometimes they are painful to deal with.

The nightmare of the Concept-Drift

Concept drifts are unforeseen changes (in mean, variance or both) which can happen gradually or suddenly. Sometimes these drifts occur in cycles (with irregular periods) and, in other scenarios, the drift is temporary. There are some questions to answer when a concept drift happens: Is it temporary? Is the change finished (established) or will it keep changing?

We also have to make the distinction between a concept drift and an outlier (or a blip). Outliers are not changes; they belong to the known signal but are rare events.

Concept drifts are nightmares (not only for FTS methods; other computational intelligence and statistical techniques fear them too), but we need to learn how to live with them. Despite the complexity of the problem, there are some simple (though, unfortunately, somewhat expensive) techniques to tackle them.

Time Variant Methods

All the FTS methods we saw before are time invariant methods, which means that they assume that the future fluctuations of the time series will behave according to patterns that have already happened. In other words: the behavior of the time series, which is described by the fuzzy temporal rules of the model, will not change in the future. This works fine for many time series (for example, the environmental seasonal time series we studied before) but fails terribly for others (for instance, stock exchange asset prices). In those cases we need to apply time variant models.

incremental.TimeVariant.Retrainer: The Time Variant model is the simplest (but efficient) approach to tackle concept drifts and non-stationarity. This class implements a meta model, which means that you can choose any FTS method as its base method; then, every batch_size inputs, the meta model retrains its internal model with the last window_length inputs. These are the main parameters of the model: window_length and batch_size. Since it is a meta model, you can also specify which FTS method to use (the fts_method parameter) and which partitioner to use inside it (the partitioner_method and partitioner_params parameters).
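The retraining loop described above can be sketched without any FTS machinery. In the snippet below the base model is a deliberately trivial stand-in (it forecasts the mean of its training window), and the class and function names are hypothetical; only the roles of batch_size and window_length follow the description.

```python
class MeanModel:
    """Toy base model: forecasts the mean of its training window."""

    def fit(self, data):
        self.mean = sum(data) / len(data)

    def predict(self):
        return self.mean

def retraining_forecasts(train, stream, batch_size, window_length):
    history = list(train)
    model = MeanModel()
    model.fit(history[-window_length:])
    forecasts = []
    for count, value in enumerate(stream, start=1):
        forecasts.append(model.predict())  # forecast before seeing `value`
        history.append(value)
        if count % batch_size == 0:
            # Throw the old model away and retrain on the latest window.
            model = MeanModel()
            model.fit(history[-window_length:])
    return forecasts

# Hypothetical data: the level jumps from 1.0 to 5.0 inside the stream
train = [1.0] * 4
stream = [1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
forecasts = retraining_forecasts(train, stream, batch_size=2, window_length=2)
```

The forecasts stay at the old level until the next retraining point and then switch to the new one, which shows how batch_size sets the reaction delay and window_length sets how much past data the new model sees.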

    from pyFTS.partitioners import Grid
    from pyFTS.models import hofts
    from pyFTS.models.incremental import TimeVariant

    model = TimeVariant.Retrainer(
        partitioner_method=Grid.GridPartitioner,
        partitioner_params={'npart': 35},
        fts_method=hofts.WeightedHighOrderFTS,
        fts_params={},
        order=2,
        batch_size=50,        # retrain after every 50 inputs
        window_length=200)    # using the 200 most recent inputs

incremental.IncrementalEnsembleFTS: Works similarly to TimeVariant, but with an EnsembleFTS approach.

In TimeVariant there is only one internal model, which is recreated after every n inputs (which means that the batch_size is the only memory it has). In IncrementalEnsemble we also have the window_length and batch_size parameters, plus num_models, which says how many internal models to hold. As new models are created (from the incoming data), the oldest ones are dropped from the ensemble.
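A minimal sketch of the "keep the last num_models models" idea follows. To stay short, each internal model is reduced to a single number (the mean of its training window) and the ensemble forecast is the average of its members; those simplifications and all the names are assumptions, not the pyFTS implementation.

```python
from collections import deque

def window_mean(history, window_length):
    """Stand-in for training a model: the mean of the latest window."""
    window = history[-window_length:]
    return sum(window) / len(window)

def incremental_ensemble_forecasts(train, stream, batch_size,
                                   window_length, num_models):
    history = list(train)
    # A bounded deque drops the oldest model when a new one is appended.
    models = deque(maxlen=num_models)
    models.append(window_mean(history, window_length))
    forecasts = []
    for count, value in enumerate(stream, start=1):
        forecasts.append(sum(models) / len(models))  # average of the members
        history.append(value)
        if count % batch_size == 0:
            models.append(window_mean(history, window_length))
    return forecasts

# Hypothetical data: the level jumps from 1.0 to 5.0 inside the stream
train = [1.0] * 4
stream = [1.0, 1.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
forecasts = incremental_ensemble_forecasts(
    train, stream, batch_size=2, window_length=2, num_models=2)
```

Compared with a single retrained model, the ensemble moves to the new level in steps: while an old model is still among its members, the forecast sits between the old and the new level.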

    from pyFTS.partitioners import Grid
    from pyFTS.models import hofts
    from pyFTS.models.incremental import IncrementalEnsemble

    model = IncrementalEnsemble.IncrementalEnsembleFTS(
        partitioner_method=Grid.GridPartitioner,
        partitioner_params={'npart': 35},
        fts_method=hofts.WeightedHighOrderFTS,
        fts_params={},
        order=2,
        batch_size=50,
        window_length=200,
        num_models=3)         # keep only the 3 most recent internal models

nonstationary.nsfts.NonStationaryFTS (NSFTS): Non-stationary fuzzy sets are fuzzy sets that can be modified over time, allowing them to adapt to changes in the data by translating and/or scaling their parameters. The NSFTS method is very similar to the time invariant FTS methods, except that its fuzzy sets are not static: for each forecast performed by an NSFTS model, the error is calculated and stored, and the fuzzy sets are changed to compensate for that error. For this method, the error is a measure of how different the test data is from the training data.

This method is on its way to being published.

    from pyFTS.models.nonstationary import partitioners as nspart
    from pyFTS.models.nonstationary import nsfts

    part = nspart.simplenonstationary_gridpartitioner_builder(
        data=train, npart=35, transformation=None)
    model3 = nsfts.NonStationaryFTS(partitioner=part)

The pyFTS.data module contains many non-stationary and concept-drifted time series, such as NASDAQ, TAIEX, S&P 500, Bitcoin, Ethereum, etc. You can also use the class data.artificial.SignalEmulator to create synthetic and complex patterns. SignalEmulator is designed to work as a method chain / fluent interface, so you can simulate complex signals by chaining methods that produce specific signals, each of which is added to the previous one or replaces it.

The method stationary_signal creates a simple stationary signal with constant mean and variance, the method incremental_gaussian creates a signal whose mean and/or variance is incremented at each time step, the method periodic_gaussian makes the mean and/or variance fluctuate in constant periods, and the blip method adds an outlier at a random location. Every time one of these methods is called, its effects are added to the previous signal, unless you provide the start parameter (indicating at which iteration the method starts to work) or set the boolean parameter additive to False, which stops the previous signal and starts the new one. To render the whole signal you just need to call the function run.
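The composition idea (start from a stationary signal, add a drift from a given iteration on, add a blip) can be imitated in plain Python. This sketch does not use SignalEmulator and does not mirror its method signatures; the function names, parameters and the fixed blip position are assumptions chosen so the example stays deterministic.

```python
import random

def stationary(length, mean, std, rng):
    """Gaussian noise around a constant mean."""
    return [rng.gauss(mean, std) for _ in range(length)]

def add_incremental_mean(signal, step, start):
    """From iteration `start` on, raise the mean by `step` at each point."""
    return [v + step * (t - start) if t >= start else v
            for t, v in enumerate(signal)]

def add_blip(signal, position, magnitude):
    """Add a single outlier at a chosen position."""
    out = list(signal)
    out[position] += magnitude
    return out

rng = random.Random(42)  # fixed seed so the signal is reproducible
signal = stationary(200, mean=10.0, std=1.0, rng=rng)
signal = add_incremental_mean(signal, step=0.1, start=100)
signal = add_blip(signal, position=50, magnitude=8.0)

first_mean = sum(signal[:50]) / 50
last_mean = sum(signal[-50:]) / 50
```

The first half of the resulting series is stationary apart from one outlier, while the second half drifts upward, which is the kind of concept drift the time variant methods have to follow.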

    from pyFTS.data import artificial
    signal = artificial. … run()

Now let’s put it all together: create 3 non-stationary time series with concept drifts and employ the methods presented above to forecast them:

[Figure: Performance of the time variant methods for artificial time series with concept drifts (source)]

Time variant methods have to balance some kind of exploitation and exploration when dealing with non-stationarities and concept drifts: either exploit what the model already knows (its memory, the last patterns learned from the data) or explore new data and learn new patterns. Each method has its own mechanisms: the Retrainer is controlled by window_length and batch_size, the Incremental Ensemble by both of these plus num_models, and NSFTS uses the magnitude of its own errors to adjust the fuzzy sets. After all, the time spent adapting to concept drifts is one of the most important aspects of the time variant methods.

The same principle we saw in the previous tutorials applies to this one: each FTS method has its own features and parameters, and the best method will depend on the context.

Well guys! That’s enough for today, ok?

In these tutorials we have covered, even if superficially, a good portion of the time series forecasting field, with its problems and their solutions using FTS methods. We are not finished yet! There will always be problems to solve, new and improved methods, and optimizations. In the next tutorials I will cover some new approaches, like hyperparameter optimization and how to tackle big time series with distributed computing.

See you there guys!
