ARIMA vs. LSTM: Forecasting Electricity Consumption
Which model performs better?

Note: The full article can be found here, along with a link to the relevant GitHub repository for this example.
In this example, the ARIMA and LSTM models are used to predict electricity consumption patterns for the Dublin City Council Civic Offices, Ireland.
The data in question is sourced from data.gov.ie.
Specifically, the data is provided in terms of kilowatt consumption every 15 minutes.
The analysis proceeds in three stages:
- Aggregate the 15-minute readings into total kilowatt consumption per day, i.e. form a daily time series.
- Forecast kilowatt consumption across the test set using an ARIMA model.
- Generate another forecast across the test set using an LSTM model and examine whether the predictions improve.
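Comparing two forecasts over the same test set needs a chronological train/test split and a common error measure. The following is only a minimal sketch of that idea: the helper names `chronological_split` and `rmse`, the choice of RMSE as the metric, the synthetic series, and the two placeholder forecasts are all assumptions made for illustration and are not taken from the analysis itself.

```python
import numpy as np

def chronological_split(series, test_size):
    """Split a 1-D series into train and test parts without shuffling."""
    series = np.asarray(series, dtype=float)
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Synthetic daily series, used only to show the mechanics.
daily = np.arange(10, dtype=float)
train, test = chronological_split(daily, test_size=3)

# Placeholder forecasts standing in for the two models' outputs.
forecast_a = np.full(len(test), train[-1])     # repeat the last training value
forecast_b = np.full(len(test), train.mean())  # repeat the training mean

print("RMSE A:", rmse(test, forecast_a))
print("RMSE B:", rmse(test, forecast_b))
```

Keeping the split chronological matters for time series: shuffling would let the model see observations that come after the ones it is asked to predict.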
Data Manipulation
Here is the original set of data loaded into Python.
import numpy as np
import pandas as pd
df = pd.read_csv('dccelectricitycivicsblocks34p20130221-1840.csv', engine='python', skipfooter=3)  # skipfooter=3 ignores the last three lines of the file
df

We can see that for each date, the relevant kilowatt consumption is provided across 15-minute intervals.
However, the aim here is to forecast total daily consumption. A time series formed on a 15-minute basis is expected to be too volatile, which would make any forecasts of little practical value.
For this reason, the data is aggregated on a daily basis:
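# Use the first row as the column headers, then drop that row from the data
df2 = df.rename(columns=df.iloc[0])
df3 = df2.drop(df.index[0])
# Keep only the 15-minute reading columns
df4 = df3.drop('Date', axis=1)
df5 = df4.drop('Values', axis=1)
# Discard days with missing readings and convert to a NumPy array
df6 = df5.dropna()
df7 = df6.values
# Sum across the 15-minute intervals to get one total per day
dataset = np.sum(df7, axis=1, dtype=float)
dataset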
The relevant columns are renamed accordingly, and the kilowatt consumption per day is aggregated.
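The same row-wise aggregation can be illustrated on a small synthetic table. This is a sketch under stated assumptions: the wide layout (one row per day, one column per 15-minute slot), the slot labels, the dates, and the values are all hypothetical and do not come from the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical wide layout: one row per day, 96 columns (one per 15-minute slot).
slots = [f"{h:02d}:{m:02d}" for h in range(24) for m in (0, 15, 30, 45)]
readings = pd.DataFrame(
    np.ones((3, len(slots))),  # synthetic readings: 1.0 in every slot
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
    columns=slots,
)
readings.iloc[1, 5] = np.nan  # simulate a day with one missing reading

# Drop days with any missing slot, then sum across the slots of each day.
complete_days = readings.dropna()
daily_totals = complete_days.sum(axis=1).to_numpy(dtype=float)
print(daily_totals)
```

Note that `dropna()` removes an entire day if a single 15-minute reading is missing, so the resulting daily series can have gaps in its dates.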
The result, dataset, is a NumPy array containing the aggregated daily data: