Solar mixture

0.1 Nomenclatures

$H_{t}^{extra}$: extraterrestrial radiation for day $t$. It is the theoretical radiation outside the atmosphere.
$C_{t}^{clearsky}$: clear sky radiation for day $t$. It refers to the amount of solar irradiation that would be received at the Earth’s surface under cloud-free conditions.
$GHI_{t}$: global horizontal irradiance for day $t$. It is the radiation that is able to reach the ground.

1 Clear sky radiation

Clear sky radiation provides an upper bound for the actual GHI received at the ground level. Notably, Hottel (1976) proposed an approach to estimate the transmittance of direct solar radiation through the clear atmosphere for each location in the world. He develop a deterministic model that takes into account factors that affect the transmittance, such as latitude, altitude, the relative positions of the Earth and the Sun, and the extraterrestrial radiation. However such model is much more reliable when considering hourly data with respect to daily ones. Therefore in order to estimate the clear sky radiation for daily data, denoted as $C_t^{clearsky}$, we will use the following functional form:

\[C_t^{clearsky} = \delta H_t^{extra}\]

The equation above express the clear sky radiation as fraction of extraterrestrial radiation $H_t^{extra}$. In order to estimate the parameter $\delta$ we cannot rely on minimum squared error minimization since we need to find the smallest $\delta$ such as $C_t^{clearsky} - GHI_t > 0 \; \forall t$. Therefore the optimal parameter $\delta$ is estimated to solve this problem:

\[\inf_{\delta \in(0,1)}\biggl\{\delta \text{:} \sum_{\forall t} \mathbb{1}_{\delta H_t^{extra} - GHI_t < 0} = 0 \biggl\}\]

2 The model

The formulation of a stochastic model for solar radiation presents unique challenges distinct from those encountered in modeling other natural phenomena, such as wind or temperature. The fundamental distinction lies in the necessity for the solar radiation model to conform to strict lower and upper bounds, a requirement not as pronounced in other environmental variables.

To elucidate, it is not merely adequate to assume that the $GHI_t$ remains positive $GHI_t > 0$ over time, but also in addition to this lower bound constraint, it is imperative that we establish an upper bound, denoted as $C_{t}^{clearsky}$, which reflects the maximum plausible value for $GHI$. This dual requirement arises from the physical realities governing solar radiation, where excessive values beyond an upper bound are inherently unattainable. In our pursuit of formulating a stochastic model that seamlessly integrates these constraints, we introduce a transformed process, denoted as $X_t$, which encapsulates the essence of solar radiation dynamics. Mathematically, the transformation is expressed as follows:

\[X_t = \frac{C^{clearsky}_t - GHI}{C^{clearsky}_t} = 1 - \frac{GHI_t}{C^{clearsky}_t}\]

The numerator in our transformation, representing the disparity between $GHI$ and the clear sky value ($C^{clearsky}_t$), serves as a pivotal element in our endeavor to model solar radiation dynamics. An important consideration is that this numerator, by definition remains strictly positive and greater than zero. This characteristic is ingrained in the definition of $C^{clearsky}_t$ itself, thereby ensuring that the transformed variable, $X_t$, cannot assume its extreme values of 0 or 1 on the estimated data. To mitigate the seasonal behavior throughout the year, we adopt a normalization strategy by dividing the difference by $C^{clearsky}_t$, thereby rendering it dimensionless while retaining its essential characteristics.

To facilitate the proper modeling of this normalized variable, we introduce the transformed variable, denoted as $Y_t$, defined as follows:

\[Y_t = \log(-\log(X_t))\]

The rationale behind employing the log-transform of $-log(X_t)$ is grounded in the fact that $-log(x)$ inherently resides within the positive domain $\forall x \in (0, 1)$. Furthermore, to seamlessly transition back to the original variable, $X_t$, we can employ an inverse transform:

\[X_t = e^{-e^{Y_t}}\]

This transformation retains the bounded nature of $X_t$, ensuring that it resides within the interval $[0, 1]$. In theory with simulated data it could be possible that $X_t = 0$ or $X_t = 1$, but in practice $\mathbb{P}\{X_t = 0\}\approx 0$ and $\mathbb{P}\{X_t = 1\}\approx 0$. With this approach we can express the $GHI_t$ as:

\[GHI_t = C^{clearsky}_t - X_t \cdot C^{clearsky}_t = C^{clearsky}_t - e^{-e^{Y_t}} \cdot C^{clearsky}_t = C^{clearsky}_t(1 - e^{-e^{Y_t}})\]

To capture the temporal dependencies within the transformed variable $Y_t$, we employ an AutoRegressive (AR) process of order 7, denoted $AR(7)$, and defined as:

\[Y_t = \alpha_0 + \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \alpha_3 Y_{t-3} + \alpha_4 Y_{t-4} + \alpha_5 Y_{t-5} + \alpha_6 Y_{t-6} + \alpha_7 Y_{t-7} + \epsilon_t\]

It is crucial to acknowledge and account for the seasonality present in the residuals, denoted as $\epsilon_t$, of the AR(7) process. To achieve this, we introduce a seasonal component, $\sigma_t^{S}$, which standardizes these residuals as follows:

\[\sigma_t^{S} = \sqrt{c_0 + c_1 \cos(\omega t) + c_2 \sin(\omega t)}\]

This seasonal variance $\sigma_t^{S}$ effectively normalizes the residuals, allowing us to examine their temporal patterns and fluctuations within a consistent framework. Building upon this standardized representation, we proceed to model the residuals of the AR(7) process, expressed as $\bar{\epsilon_t} = \frac{\epsilon_t}{\sigma_t^{g}}$, using a GARCH(1,1) model:

\[\bar{\epsilon_t} = \sigma_t u_t\]

Within the GARCH(1,1) framework, the conditional variance $\sigma_t^2$ is estimated as follows:

\[\sigma_t^2 = \omega_0 + \omega_1 \bar{\epsilon}_{t-1}^2 + \omega_2 \sigma^2_{t-1}\]

This comprehensive modeling strategy not only captures the temporal dependencies within the transformed variable but also accounts for the heteroscedasticity in the residuals, providing a robust framework for understanding and forecasting solar radiation dynamics. In subsequent sections, we delve into the empirical results and practical implications of this modeling approach, shedding light on its utility in the domain of solar energy forecasting and risk management.

Finally we fit the GARCH(1,1) residuals with a normal mixture model for each month $m = 1,2,...12$. For the $t$-th day of the year we will have that:

\[u_t \sim \sum_{m = 1}^{12} \mathbb{1}_{m_t} \bigl[ (1-p_{m}) \cdot \phi(\mu_{m}^{1}, \sigma_{m}^{1}) + p_{m} \cdot \phi(\mu_{m}^{2}, \sigma_{m}^{2})\bigl]\] where

$\mathbb{1}_{m_t}$ is a dummy variable that assumes 1 if the $t$-th day of the year is in the month $m$, otherwise 0.
$\phi(\mu, \sigma)$ is the normal pdf.

The application of the normal mixture model to solar radiation not only offers a robust modeling framework but also provides valuable insights into the underlying physical processes governing solar radiation variability. This model can be intuitively interpreted as representing distinct distributions for different weather conditions, shedding light on the dynamics of cloudy and sunny days. In this framework, we can conceptualize the two component distributions as models for these weather conditions, each characterized by its own unique parameters. Let’s delve deeper into this interpretation:

Cloudy Days: this distribution typically exhibits a positive mean, represented as $\mu^{cloudy}$, capturing the expected solar radiation intensity on such days. It is worth noting that cloudy days tend to have higher variability, reflected in the standard deviation $\sigma^{cloudy}$. This dispersion is indicative of the unpredictable and fluctuating nature of solar radiation under overcast conditions.
Sunny Days: in contrast, sunny days tend to have a positive mean, represented as $\mu^{sunny}$, which reflects the higher solar radiation intensity typically observed during clear skies. The standard deviation, denoted as $\sigma^{sunny}$, is comparatively lower for sunny days, signifying a more stable and predictable solar radiation pattern.

This interpretation aligns with the practical observation that solar radiation dynamics are fundamentally influenced by the prevailing weather conditions. cloudy days, characterized by their lower solar radiation levels and higher variability, manifest in a distribution centered around a negative mean. Conversely, sunny days, marked by their higher radiation levels and relative stability, manifest in a distribution centered around a positive mean.

In the subsequent sections, we explore the empirical implications of this normal mixture model, highlighting its capacity to capture the nuanced interplay between weather conditions and solar radiation dynamics. Furthermore, we examine the utility of this model in enhancing the accuracy of solar energy forecasting and risk management, offering a valuable tool for stakeholders in the solar energy sector.

3 Validation

Here we validate our model on Bologna, the results for the others cities are presented in Appendix B.

3.1 AR model

The transformed variable $Y_t$ presents a persistent autocorrelation function that decays slowly. Therefore in order to have good compromise between precision and parsimony we have found adequate to model it with an AR(7) model.

[1] "The process is: stationary"

AR(7):estimated parameters (Berlino)
term	estimate	std.error	p.value
$$\alpha_0$$	-0.064	0.012	0.000
$$\alpha_1$$	0.378	0.017	0.000
$$\alpha_2$$	0.049	0.018	0.006
$$\alpha_3$$	0.046	0.018	0.009
$$\alpha_4$$	0.014	0.018	0.426
$$\alpha_5$$	0.053	0.018	0.003
$$\alpha_6$$	0.049	0.018	0.005
$$\alpha_7$$	0.044	0.017	0.007

The autocorrelation function computed on the AR(7) residuals shows we were able to remove autocorrelation in mean but not in variance.

3.2 Seasonal variance

Firstly we estimate a seasonal function $\sigma_t^S$ and then we fit a GARCH(1,1) model on the standardized residuals. For most of the locations in Italy, the seasonal variance exhibits a decreasing behavior in Spring and Summer, while in Autumn/Winter an increasing one. As shown in Larsson, Green, and Benth (2023), in order to ensure the positivity of the process we must have that:

\[ c_0 > \sqrt{c_1^2 + c_2^2} \]

[1] "The process is ensured to be positive"

Seasonal variance: estimated parameters (Berlino)
term	estimate	std.error	statistic	p.value
$$c_0$$	0.452	0.009	48.551	0.000
$$c_1$$	0.025	0.013	1.881	0.060
$$c_2$$	0.022	0.013	1.680	0.093

3.3 Garch model

Since the Garch model is estimated on the standardized residuals we should ensure that the unconditional mean is equal to 1. In order to do so we impose a constraint on $\omega_0$.

[1] "Unconditional variance: 1.001"

[1] "Unconditional variance: 1"

GARCH (1, 1) model: estimated parameters (Berlino)
term	estimate	std.error	p.value
$$\omega_0$$	0.098	0.054	0.078
$$\omega_1$$	0.041	0.014	0.006
$$\omega_2$$	0.862	0.064	0.000

3.4 Normal mixture: Monthly estimate

Here we estimate a set of parameters for each month. As we can see from the below figure, for each month we identify two distributions. In general one is centered around negative values and it is more dispersed while the other is centered around positive values and it is less disperced. For each day the probability of being in one state or the other is determined by a bernoulli variable that can take values in 0,1.

3.4.1 Parameters

Mixture normal: Monthly estimate (Berlino)
Month	$$p^{sunny}$$	$$\mu^{sunny}$$	$$\sigma^{sunny}$$	$$p^{cloudy}$$	$$\mu^{cloudy}$$	$$\sigma^{cloudy}$$
Jan	0.513	0.372	0.816	0.487	-0.917	0.587
Feb	0.459	0.852	0.686	0.541	-0.766	0.737
Mar	0.194	1.215	0.633	0.806	-0.358	1.007
Apr	0.255	0.978	0.466	0.745	-0.183	0.888
May	0.640	0.528	0.806	0.360	-0.785	1.087
Jun	0.382	0.788	0.628	0.618	-0.228	0.977
Jul	0.253	0.917	0.484	0.747	-0.112	0.977
Aug	0.347	0.815	0.563	0.653	-0.069	0.916
Sep	0.716	0.608	0.661	0.284	-1.134	0.736
Oct	0.420	0.922	0.684	0.580	-0.680	1.033
Nov	0.665	0.323	0.880	0.335	-1.203	0.605
Dec	0.202	0.803	0.561	0.798	-0.508	0.722

3.5 Simulation

3.5.1 More simulations

On the figure on the left we overlap the realized GHI for each year while on the right the GHI simulated with our model. The expected value computed respectively on realized and simulated paths is the line in yellow.

3.6 Empirical density

Here we compare the empirical density of the $GHI$ and of $X_t$ with the density that we can obtain from Monte Carlo simulations.

3.7 Esscher tranform

Here we apply the Esscher transform on the pdf obtained from Monte Carlo simulations.

\[\mathbb{E}_{\lambda}\{ f(x) \} = f(x;\lambda) = \frac{e^{\lambda x \cdot f(x)}}{\int_{-\infty}^{\infty} e^{\lambda x} \cdot f(x) dx} \qquad(1)\]

In general, the empirical distribution of a discrete random variable $X$ can be defined as follows:

\[\widehat{F}_{X}(x) = \frac{1}{n} \sum_{i = 1}^{n} \text{1}_{[X_i \leq x]} \qquad(2)\]

Given two probability distributions $Q$ and $P$, defined on the same sample space $\chi$, the Kullback-Leibler divergence from $Q$ to $P$ can be calculated as:

\[\mathcal{D}_{\text{KL}} (P \;||\;Q) = \sum_{x \in \chi } P(x) \cdot \text{log} \biggl( \frac{P(x)}{Q(x)}\biggl) \qquad(3)\]

Here, we can interpret the distance $\mathcal{D}_{\text{KL}} (P \;||\;Q)$ as the information gain achieved if $P$ is used instead of $Q$. Since we have not contracts traded on the market, in order to price derivative contracts following a quasi-risk-neutral pricing we choose as target distribution $P$, the empirical distribution. In this context, $Q$ will be the theoretical distribution computed as an Esscher transform of a normal random variable. Finally we seek for the optimal lambda that minimizes the information loss, given by the distance $\mathcal{D}_{\text{KL}} (P \;||\;Q)$, that comes by using the theoretical model $Q$ in place of the empirical one $P$.

\[\underset{\lambda_u}{\text{argmin}} \; \mathcal{D}_{\text{KL}}(\hat{F}_{u}(x) \;||\; E_{\lambda_u}\{ f_X^u(x) \}) \qquad(4)\]

Finally here we compare the realized cumulated GHI over the year for each year considered with the same computations done on simulated values.

4 Weather derivative

In this section we dive into the definition of a weather derivative contract with the aim of reducing the volatility of the cash flows. Given that the solar irradiance exhibit a seasonal behaviour we can immagine a derivative contract having a deterministic time dependent strike moving during the year.

$GHI_t^{m}$: seasonal mean estimated as proportion of extraterrestrial radiation $GHI_t^m = \beta H_0$. The parameter $\beta$ is estimated with minimizing the mse of the residuals. This procedure should be very clear in the definition of “seasonal mean” into the contract.

To be more generic we can increase/decrease the hedging around the seasonal mean setting an additional parameter $h$

For example we can immagine a put contract:

\[Put = (hGHI_t^m - GHI_t)^+\]

For example, a contract that pays if the $GHI_t$ drops more than $-5\%$ from the seasonal mean. It can be seen as strong protection and the premium will be higher.:

\[Put = (0.95 \cdot GHI_t^m - GHI_t)^+\]

For example, a contract that pays if the $GHI_t$ drops below the seasonal mean increased by $2\%$. It can be seen as not strong protection and the premium will be lower:

\[Put = (1.02 \cdot GHI_t^m - GHI_t)^+\]

Is a contract that pays if the $GHI_t$ drops below the seasonal mean increased by $2\%$. It can be seen as weak protection and the premium will be lower.

$$Year$$	$$Hedged$$	$$Unhedged$$	$$Sd(Hedged)$$	$$Sd(Unhedged)$$	$$protection$$	$$premia$$
2010	224.4617	228.999	0.3792545	0.4435175	1	0

References

Hottel, H. C. 1976. “A Simple Model for Estimating the Trasmittance of Direct Solar Radiation Through Clear Atmospheres.” Solar Energy 18: 129–34. https://doi.org/10.1016/0038-092X(76)90045-1.

Larsson, Karl, Rikard Green, and Fred Espen Benth. 2023. “A Stochastic Time-Series Model for Solar Irradiation.” Energy Economics 117: 106421. https://doi.org/https://doi.org/10.1016/j.eneco.2022.106421.