Specific suggestions and questions to think about:
State clearly that all (public) NASA data is accessible from Fornax, regardless of where the data lives (in an archive's in-house storage or in cloud storage).
Explain when and why users should care about where the datasets live.
Use case 1: "I really care about X dataset and already know how to access it."
More than likely, they're accessing in-house data. What basic instructions can we provide to help them determine when and why they should go to the effort of looking for it somewhere else (i.e., in cloud storage)?
Use case 2: "I really care about Y targets (stars, galaxies, ...) and am doing a mass search for data in NASA archives."
Some background:
Currently ~all of the data being put in cloud storage by NASA archives is a copy of what they're already serving from their in-house storage. If the user wants to access the cloud copy, they'll usually have to make an explicit choice to do this. But they've probably never had to make this kind of choice before and the current documentation is not very clear about what is available from where. Users may assume that if the data is available in cloud storage, that's what they'll automatically be accessing without having to do anything different or proactive.
This confusion is compounded when we point users to the NASA-NAVO Workshops Notebooks. It is a very useful overview for NASA data, but AFAIK it doesn't contain any information about accessing data from cloud storage. Since the Fornax documentation emphasizes cloud-hosted data and NAVO documentation doesn't even mention it, this can lead the user to assume that by following the NAVO tutorials they are already accessing cloud-hosted data.
> Users may assume that if the data is available in cloud storage, that's what they'll automatically be accessing without having to do anything different or proactive.
Can we make it so that all users in Fornax automatically access cloud data if it exists? Perhaps something like an environment variable? I think that would spare the user from having to deal with much of this information.
That would be great, though I don't see a solution right now. The ways I know how to tell people to load data all involve knowing the path (and thus the location), like `pd.read_parquet('path-to-catalog')` or `astropy.io.fits.open('path-to-image')`. Also, accessing cloud storage sometimes requires different arguments to handle the different filesystem and/or the permissions/credentials.
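To make the point about differing arguments concrete, here is a minimal sketch, assuming the cloud copy sits in a public S3 bucket and that `s3fs` is installed alongside pandas and astropy. The helper names (`cloud_read_kwargs`, `read_catalog`, `open_image`) are hypothetical and only illustrate how the call changes with the location; they are not part of any Fornax or archive API.

```python
from urllib.parse import urlparse


def cloud_read_kwargs(path, anonymous=True):
    """Return fsspec-style options for an S3 URI, or {} for any other path.

    Hypothetical helper: it only inspects the URI scheme.
    """
    scheme = urlparse(str(path)).scheme
    if scheme == "s3":
        # "anon" is the s3fs option for unsigned requests to public buckets.
        return {"anon": anonymous}
    return {}


def read_catalog(path):
    """Read a Parquet catalog from a local path or an S3 URI."""
    import pandas as pd  # imported lazily; needs s3fs for s3:// paths

    options = cloud_read_kwargs(path)
    # pandas forwards storage_options to fsspec; None means "no extras".
    return pd.read_parquet(path, storage_options=options or None)


def open_image(path):
    """Open a FITS file from a local path or an S3 URI."""
    from astropy.io import fits  # imported lazily; needs astropy >= 5.2

    options = cloud_read_kwargs(path)
    if options:
        # Cloud access goes through fsspec and needs the extra arguments.
        return fits.open(path, use_fsspec=True, fsspec_kwargs=options)
    return fits.open(path)
```

Even with a wrapper like this, the user still has to supply the cloud path explicitly, which is the choice the issue describes.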
I don't think any one solution will ever work for all use cases because there are so many different ways to access data. But potentially in a more narrow context... I know there is ongoing cloud-access related work happening in the astropy + VO universe. Maybe some option for this is envisioned? I'm not up on the details enough to know.
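For the narrower context, the environment-variable idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the variable name `FORNAX_PREFER_CLOUD`, the `CLOUD_COPIES` mapping, and the example URLs are hypothetical, and no such mechanism exists in Fornax today as far as this thread establishes.

```python
import os

# Hypothetical mapping from an in-house URL prefix to its cloud copy.
CLOUD_COPIES = {
    "https://archive.example/data/": "s3://example-bucket/data/",
}


def resolve_location(url, env=None):
    """Return the cloud copy of `url` when the user opted in and one is known.

    `env` defaults to os.environ; passing a dict makes the function testable.
    """
    env = os.environ if env is None else env
    flag = env.get("FORNAX_PREFER_CLOUD", "").lower()
    if flag in {"1", "true", "yes"}:
        for in_house_prefix, cloud_prefix in CLOUD_COPIES.items():
            if url.startswith(in_house_prefix):
                # Swap only the prefix; the rest of the path is kept as-is.
                return cloud_prefix + url[len(in_house_prefix):]
    # No opt-in or no known cloud copy: fall back to the in-house location.
    return url
```

This only rewrites the location. The caller would still need to pass the filesystem and credential arguments that the cloud copy requires, so it covers one access pattern rather than all of them.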