Hourly Interpolation for Electricity Cost Forecasting

Forecasting electricity costs is a basic competency for energy retailers. It is critical for the retailer’s cash flow management and accounting. But it also enables the retailer to hedge financial risk and sign fixed-price contracts with customers. Accurate electricity cost forecasting can be a key advantage in competitive markets.

Bounding the Problem

Calculating electricity costs follows a basic formula:

Electricity Cost = Electricity Volume * Electricity Price

To forecast electricity costs we need to forecast electricity volumes and prices, then multiply the two together. Forecasting electricity volume is interesting, but it falls under the general umbrella of time-series forecasting. This area has lots of prior art, and although I am happy to write a future blog post on it if there is interest, I am going to focus here on electricity price forecasting.
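As a minimal sketch of the formula above, applied hour by hour: the function name and the three-hour numbers below are made up for illustration, with volumes assumed to be in MWh and prices in $/MWh.

```python
def electricity_cost(volumes_mwh, prices_per_mwh):
    """Total cost: sum over hours of (volume * price)."""
    if len(volumes_mwh) != len(prices_per_mwh):
        raise ValueError("volume and price series must have the same length")
    return sum(v * p for v, p in zip(volumes_mwh, prices_per_mwh))


# Hypothetical three-hour example (not real market data).
volumes = [10.0, 12.0, 8.0]   # MWh per hour
prices = [30.0, 50.0, 20.0]   # $/MWh per hour
total_cost = electricity_cost(volumes, prices)
```

Because the multiplication happens per hour, an error in the hourly price shape only matters to the extent that volume also varies by hour, which it normally does.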

Big Caveat

Let us not forget that the cost of electricity is a major part of the cost of providing power, but in many markets it is only 60% of the total cost. The remaining 40% includes a plethora of costs such as grid improvement projects, ancillary services, capacity markets, and renewable portfolio standards. Below is a nice chart from Open Energy Services that illustrates this point.

For the sake of this post I am going to focus solely on the cost of electricity, but don’t forget that this is not the whole story!

Electricity Forward Curves

Forward curves are the foundation of electricity price forecasting. Forward curves are monthly commodity prices that extend years into the future. They are based on activity in forward markets. It is possible to buy various contracts in electricity forward markets, but the most common contract is a baseload contract with the same volume every hour of every day.

The chart below from KYOS illustrates the relationship between forward markets and forward curves.

The yellow lines are three-month baseload forward contracts and the grey lines are one-year baseload contracts. The various forward contracts can be combined using mathematical formulas into the red line – the forward curve.

Not all contracts in electricity forward markets are public. As a result, it is typical to purchase forward curves from parties who have access to privileged information. Their forward curves should be better than what can be generated using public trading information.

Using Forward Curves to Forecast Prices

Forward curves are not price forecasts (!!!). They are a snapshot in time of the forward market. So why are they used to forecast electricity costs?

Forward curves represent the prices you can (maybe*) get in forward markets right now. If you buy forward contracts now, then you can clearly use those prices for forecasting costs. But even if you do not buy a forward contract, other people will. When they sell the electricity purchased through the forward market contract, it will put pressure on the market to converge toward the price they paid.

This sounds pretty deterministic, but the reality is not so rosy. Using forward curves to forecast prices is a huge approximation based on incomplete and imperfect data. What makes forward curves different and more useful than standard forecasts is that they reflect current market conditions instantaneously and they have (limited) indicative power over realized future prices.

*Just because someone can sign a forward contract at a certain price doesn’t guarantee that price is widely available.

The Hourly Price Problem

We have now established that forward curves should be the basis of our price forecasts. And we should be able to forecast our hourly electricity supply volume using standard time-series forecasting techniques. Are we done? Can we now accurately forecast our electricity costs? Let’s take an example using Texas real-time market price data for the Houston load zone.

Let’s assume we have a forward curve that represents the average monthly cost of electricity with 100% accuracy. We can construct an hourly forward curve where every hour of the month is equal to our forward curve price for that month. Our hourly forward curve will have a median hourly error of 33% compared to real prices in 2023! This significant inaccuracy is due to the large hourly variation in electricity prices, as shown by the following histogram of the logarithmic absolute difference between the monthly average price and the realized price for every hour of 2020 in the Houston load zone:

Much of this hourly variation is seasonal and predictable. We need to modify (interpolate) the monthly forward curve values to reflect this hourly variation and produce accurate price forecasts.
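The flat hourly curve described above can be sketched in a few lines. This assumes the error metric is the absolute percentage difference between the monthly average and each realized hourly price, which is my reading rather than something spelled out above. The prices are invented, and hours with a zero price are skipped to avoid dividing by zero.

```python
from statistics import mean, median


def flat_curve_errors(hourly_prices):
    """Absolute percentage error of a flat curve set to the monthly mean."""
    monthly_avg = mean(hourly_prices)
    return [
        abs(monthly_avg - p) / abs(p) * 100
        for p in hourly_prices
        if p != 0  # assumption: skip zero-price hours
    ]


# Hypothetical prices for a tiny six-hour "month" ($/MWh).
prices = [20.0, 25.0, 40.0, 80.0, 50.0, 25.0]
errors = flat_curve_errors(prices)
median_error = median(errors)
```

Even though the flat curve matches the monthly average exactly here, the typical hour is still far off, which is the same effect seen in the Houston data.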

Data Format for Hourly Price Interpolation

It may seem like a minor point, but the choice of data format to store hourly price interpolation data is important because it determines the level of granularity possible. I have tried creating a yearly interpolation where there is one row per hour of the year (~8760 hours):

Hour of year	Interpolation factor
0	0.01

But this format is fragile against daylight saving time and does not properly handle weekends. What works better is to have one row per combination of hour of day, month of year, and weekend flag (24 × 12 × 2 = 576 rows):

Hour of day	Month of year	Is weekend	Interpolation factor
0	0	False	0.01

This format is robust against daylight saving time and is easy to join with volume forecasts. It does not capture seasonality within a week, but that effect is minimal, and the format could be extended to include day of week instead of the boolean weekend flag.
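Here is a rough sketch of how a table in this format might be looked up for a given timestamp. The names `factors`, `factor_key`, and `hourly_price` are hypothetical, the factor values are invented, and treating the factor as a multiplier on the monthly forward price is an assumption. Months are zero-indexed to match the sample row above.

```python
from datetime import datetime

# Hypothetical factors keyed by (hour of day, month of year, is weekend).
# A full table would have 24 * 12 * 2 = 576 entries.
factors = {
    (17, 6, False): 1.8,  # July weekday, 17:00
    (3, 6, False): 0.6,   # July weekday, 03:00
    (17, 6, True): 1.4,   # July weekend, 17:00
}


def factor_key(ts):
    """Map a local timestamp to its (hour, zero-indexed month, is_weekend) key."""
    return (ts.hour, ts.month - 1, ts.weekday() >= 5)


def hourly_price(ts, monthly_forward_price, factors):
    """Assumption: the factor scales the monthly forward price."""
    return monthly_forward_price * factors[factor_key(ts)]


ts = datetime(2025, 7, 15, 17)  # a Tuesday at 17:00
price = hourly_price(ts, 50.0, factors)
```

Because the key is derived from the local wall-clock hour, a 23-hour or 25-hour day around a clock change does not shift any rows, unlike the hour-of-year layout.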

Comparing Strategies for Hourly Price Interpolation

Again using Texas real-time market price data for the Houston load zone, this time from 2013-2023, we can compare three different strategies for hourly interpolation of monthly forward curves.

Flat interpolation strategy

This strategy sets the price for every hour of every month to the forward curve value (as discussed in The Hourly Price Problem).

Previous year strategy

This strategy sets the price for each hour based on the hourly variation from the most recent year we have complete data. The assumption here is that the hourly variation should persist across years.
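One way this strategy could be sketched, not necessarily how the comparison code does it: group the previous year's prices by (hour of day, month, is weekend) and divide each bucket's average by that month's average. The function name and the two-row history are hypothetical, and the sketch assumes no month has an average price of zero.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def previous_year_factors(history):
    """history: iterable of (datetime, price) pairs covering one complete year.

    Returns {(hour, zero-indexed month, is_weekend): bucket mean / month mean}.
    """
    by_month = defaultdict(list)
    by_bucket = defaultdict(list)
    for ts, price in history:
        month = ts.month - 1
        by_month[month].append(price)
        by_bucket[(ts.hour, month, ts.weekday() >= 5)].append(price)
    month_avg = {m: mean(p) for m, p in by_month.items()}
    return {key: mean(p) / month_avg[key[1]] for key, p in by_bucket.items()}


# Hypothetical two-row history: a Monday in July 2024.
history = [
    (datetime(2024, 7, 1, 0), 20.0),
    (datetime(2024, 7, 1, 17), 60.0),
]
prev_factors = previous_year_factors(history)
```

Multiplying a future month's forward price by these factors reproduces last year's hourly shape at this year's price level.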

Machine learning

I use the XGBoost machine learning library, with the most recent two years of complete data as input, to generate the hourly interpolation values.

I compare these three strategies on their ability to predict hourly prices two years into the future.

	Flat interpolation	Previous year	XGBoost
50th percentile error	35.7	8.1	6.8
90th percentile error	91.5	66.6	45.1
99th percentile error	482.6	413.4	294.8
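For reference, percentile errors like the ones in the table could be computed along these lines. This is a sketch that assumes the error is an absolute percentage error and uses a simple nearest-rank percentile. The function names are hypothetical, and the actual comparison code may define both differently.

```python
def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceil(q * n / 100)
    return ordered[rank - 1]


def error_summary(forecast, actual):
    """50th, 90th and 99th percentile of absolute percentage error."""
    errors = [
        abs(f - a) / abs(a) * 100
        for f, a in zip(forecast, actual)
        if a != 0  # assumption: skip zero-price hours
    ]
    return {q: percentile(errors, q) for q in (50, 90, 99)}


# Hypothetical forecast and realized prices ($/MWh).
summary = error_summary([110.0, 90.0], [100.0, 100.0])
```

Reporting the 90th and 99th percentiles alongside the median matters here because price spikes dominate the tail, and a strategy can look fine at the median while missing those hours badly.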

The big advantage of machine learning models is their ability to incorporate additional data. Future directions for this research could include incorporating long term weather forecasts into the price interpolation model.

You can view the code used to compare these strategies at my GitHub.

Honestly, all this is a lot of work. Maybe you should just use our interpolation factors instead:

Post by Max Willard
Feb 19, 2025 8:20:02 PM