Off-by-one error in clear sky mid-month calculation for Linke turbidity #2287

Open
josephmckinsey opened this issue Oct 31, 2024 · 2 comments


@josephmckinsey

Description

On line 260:

dayofyear = time_utc.dayofyear

Here _interpolate_turbidity uses time_utc.dayofyear, which is 1-indexed. The middle of each month, however, is calculated with 0-indexing.

Other parts of the code usually correct for this with an offset of 1, but this one does not.

Reproduction

For instance, this year the midpoint of February was 12:00 PM on February 15th. That is the 46th day of the year, so its 0-indexed position is 45.5, which _calendar_month_middles calculates correctly. time_utc.dayofyear, however, gives 46 for that timestamp.
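The mismatch can be sketched with the standard library alone. The helper `month_middles` below is hypothetical and only mimics the idea of a 0-indexed mid-month position; it is not pvlib's `_calendar_month_middles`, and it adds no padding entries, so February is at index 1 here.

```python
import calendar
from datetime import datetime


def month_middles(year):
    """Hypothetical helper: 0-indexed day-of-year position of each month's midpoint."""
    middles = []
    start = 0  # days elapsed before the current month begins
    for month in range(1, 13):
        ndays = calendar.monthrange(year, month)[1]
        middles.append(start + ndays / 2)
        start += ndays
    return middles


feb_middle = month_middles(2024)[1]  # index 1 is February in this list
# tm_yday is 1-indexed and integer-valued, like a day-of-year attribute
day_of_year = datetime(2024, 2, 15, 12).timetuple().tm_yday

print(feb_middle)                # 45.5
print(day_of_year)               # 46
print(day_of_year - feb_middle)  # 0.5
```

The half-day gap is what keeps the lookup at the exact midpoint from landing on the tabulated value.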

Expected Behavior

This shows up at midpoints where you would expect the result to be exactly the value stored in the HDF5 file, but it is not. For example, we would expect $\frac{57}{20} = 2.85$ for these coordinates at this time:

>>> pvlib.clearsky.lookup_linke_turbidity(pd.DatetimeIndex([datetime.datetime(2024,2,15,12)]), 38, -93)
2024-02-15 12:00:00    2.861667
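A rough sketch of how a half-day offset could produce this number, using plain linear interpolation between the February and March midpoints. The February value 57/20 is from this issue; the March value 71/20 is an assumption chosen for illustration so that the arithmetic matches the output above, and was not read from the HDF5 file.

```python
def lerp(x, x0, y0, x1, y1):
    """Linear interpolation between (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# 0-indexed mid-month positions for 2024 (leap year)
FEB_MIDDLE = 31 + 29 / 2       # 45.5
MAR_MIDDLE = 31 + 29 + 31 / 2  # 75.5

FEB_VALUE = 57 / 20  # from the issue text
MAR_VALUE = 71 / 20  # ASSUMED for illustration, not taken from the data file

at_midpoint = lerp(45.5, FEB_MIDDLE, FEB_VALUE, MAR_MIDDLE, MAR_VALUE)
at_dayofyear = lerp(46, FEB_MIDDLE, FEB_VALUE, MAR_MIDDLE, MAR_VALUE)

print(at_midpoint)             # 2.85
print(round(at_dayofyear, 6))  # 2.861667
```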

I imagine this is of very low priority, since these calculations only have the resolution of a single day, but I noticed it and thought I should make an issue.

  • pvlib.__version__ == '0.11.1'
  • pandas.__version__ == '2.2.3'
  • Python 3.13.0
@cwhanse
Member

cwhanse commented Nov 1, 2024

Maybe the issue is a lack of clarity in what to expect. What value should be used to represent the entire month in these cases?

February 2024: the middle of the month is noon on Feb 15 (14.5 days before, 14.5 days remaining). The returned index is 45.5, which is what I would expect (31 + 14.5). But starting from the day of year (46 for Feb 15, 2024), I can understand why 45.5 appears to be off.

February 2023: the middle of the month is midnight at the end of Feb 14 (00:00 on Feb 15). The returned index is 45. The day of year for that datetime is 46, but 1 second earlier it would be 45.

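import datetime
import pandas as pd
from pvlib import clearsky


# Leap year: the middle of February 2024 is noon on Feb 15
dt = pd.DatetimeIndex([datetime.datetime(2024,2,15,12)])
print(dt.dayofyear)  # day of year is 46 (1-indexed)

# middles[0] is the previous December, so index 2 is February
middles = clearsky._calendar_month_middles(2024)
print(middles[2])  # 45.5

# Non-leap year: the middle of February 2023 is 00:00 on Feb 15
dt = pd.DatetimeIndex([datetime.datetime(2023,2,15,0)])
print(dt.dayofyear)  # day of year is 46
# One second earlier is still Feb 14
dt = pd.DatetimeIndex([datetime.datetime(2023,2,14,23,59,59)])
print(dt.dayofyear)  # day of year is 45

middles = clearsky._calendar_month_middles(2023)
print(middles[2])  # 45.0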

The lookup and interpolation methods originated with pvlib code, so we aren't bound to follow a reference here and can choose how to do this.

@josephmckinsey
Author

I don't really have too much stake in this; I just thought it was weird. My own suggestion would be to use actual times instead of dayofyear calculations, so the interpolated value is a continuous function of datetime. You still need to get the middle of the month in epoch time or something like that.
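As a minimal sketch of that idea, a 0-indexed fractional day of year can be computed directly from the timestamp, so that it varies continuously within a year. The function name `fractional_day_of_year` is hypothetical, and the sketch assumes naive datetimes that are already in UTC.

```python
from datetime import datetime


def fractional_day_of_year(t):
    """Days elapsed since 00:00 on Jan 1 of t's year (0-indexed, fractional)."""
    start_of_year = datetime(t.year, 1, 1)
    return (t - start_of_year).total_seconds() / 86400.0


print(fractional_day_of_year(datetime(2024, 2, 15, 12)))  # 45.5
print(fractional_day_of_year(datetime(2023, 2, 15, 0)))   # 45.0
```

With this convention both timestamps coincide with the mid-month positions quoted earlier in the thread (45.5 for February 2024 and 45 for February 2023).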

I haven't tested the sensitivity of a final PV power output with respect to turbidity, so I'm not sure how much this matters, especially given how the input data is in uint8.
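For a sense of scale, here is a quick comparison of the discrepancy reported above against the storage resolution. It assumes the stored uint8 values are divided by 20, as the 57/20 example in this issue suggests.

```python
SCALE = 20  # ASSUMED divisor for the stored uint8 values, inferred from 57/20

step = 1 / SCALE      # smallest change one uint8 increment can represent
expected = 57 / SCALE  # value expected at the February midpoint
reported = 2.861667   # value returned by lookup_linke_turbidity in this issue

discrepancy = reported - expected

print(step)                   # 0.05
print(round(discrepancy, 6))  # 0.011667
print(discrepancy < step)     # True
```

Under that assumption, the half-day offset moves the result by less than one quantization step of the input data.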
