
fix: fsutil: account for subvolumes under storage path #9089

Draft

clinta wants to merge 1 commit into master
Conversation

clinta (Contributor) commented Jul 27, 2022

Related Issues

Filesystems like ZFS can contain sub-volumes: distinct datasets that share a pool of free space. An operator may have good reasons to place the sealed or cache directory on distinct sub-volumes, but doing so makes the accounting for drive capacity inaccurate.

Proposed Changes

Enumerate all mounts under the storage path and accumulate their used space to accurately determine filesystem capacity.
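
A minimal sketch of that accounting on Linux, assuming the golang.org/x/sys/unix package; usedUnder is an illustrative helper, not necessarily the code in this PR:

package fsutil

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// usedUnder sums the used space of the storage path itself plus every
// mount found beneath it (e.g. as returned by a mount-enumeration helper
// like the PR's getMountsUnder).
func usedUnder(path string, mounts []string) (int64, error) {
	var used int64
	for _, m := range append([]string{path}, mounts...) {
		var st unix.Statfs_t
		if err := unix.Statfs(m, &st); err != nil {
			return 0, fmt.Errorf("statfs %s: %w", m, err)
		}
		// Used bytes on this mount: total blocks minus blocks available
		// to unprivileged users, multiplied by the filesystem block size.
		used += int64(st.Blocks-st.Bavail) * int64(st.Bsize)
	}
	return used, nil
}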

Checklist

Before you mark the PR ready for review, please make sure that:

  • All commits have a clear commit message.
  • The PR title is in the form of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, INTERFACE BREAKING CHANGE, CONSENSUS BREAKING, build, chore, ci, docs, perf, refactor, revert, style, test
    • area: api, chain, state, vm, data transfer, market, mempool, message, block production, multisig, networking, paychan, proving, sealing, wallet, deps
  • This PR has tests for new functionality or change in behaviour
  • If new user-facing features are introduced, clear usage guidelines and/or documentation updates should be included in https://lotus.filecoin.io or Discussion Tutorials.
  • CI is green

clinta (Contributor, Author) commented Jul 28, 2022

This has been tested on a calibnet miner and works as expected, now accurately showing disk usage that includes subvolumes.

I'm interested in any advice on writing tests for this. Testing this behavior would require creating filesystems, which would require the tests to run as root; I don't think that is desirable.

magik6k (Contributor) left a comment

Given that #9013 now lets you specify what you want stored in each storage path, I'd say that maybe it's better to require that there are no sub-mounts in storage paths.

Supporting multiple mounts in each path generates far too many weird edge cases to support correctly - e.g. we check available space for cache files in storagepath/ and see that it has enough space, but it turns out that storagepath/cache is a different mount without enough space, so we'll fail to put the file there.

log.Warnf("could not get stats for mount %s: %s", mount, err)
continue
}
used += int64(mountStat.Blocks-mountStat.Bavail) * int64(mountStat.Bsize)
Contributor


I feel like we'll need to make this configurable somehow, or at the very least a bit smarter.

  • What if there are separate filesystems here - in that case we'd want to add those to available as well
  • Does this work with btrfs subvolumes the same way? (I'm forgetting how those work now, but it would be nice if they also worked with this)

Contributor Author


You're right that this would not be accurate if the filesystems do not share available space. I'm not sure how that can be accurately accounted for. I don't know of any way to reliably determine if two mounts share a pool of available space. You could guess that they do if the available space is the same. But with reservations and quotas you could have a situation where the available space is different, but still shared.

My thinking was this wouldn't be any worse than it works today, and would improve things for one specific use case, even if it doesn't fix every possible situation.

I don't have any experience with btrfs, so I can't comment on that.
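
For what it's worth, the guess described above could look roughly like the following - a hypothetical helper, not part of this PR, using golang.org/x/sys/unix on Linux; as noted, reservations and quotas can defeat it:

package fsutil

import "golang.org/x/sys/unix"

// likelySharedPool guesses that two mounts draw from the same pool of free
// space when they report identical available bytes. This is only a
// heuristic: reservations or quotas can make a shared pool report different
// values per dataset, and unrelated pools can coincidentally match.
func likelySharedPool(a, b string) (bool, error) {
	var sa, sb unix.Statfs_t
	if err := unix.Statfs(a, &sa); err != nil {
		return false, err
	}
	if err := unix.Statfs(b, &sb); err != nil {
		return false, err
	}
	return sa.Bavail*uint64(sa.Bsize) == sb.Bavail*uint64(sb.Bsize), nil
}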

// force int64 to handle platform specific differences
//nolint:unconvert
return FsStat{
-	Capacity: int64(stat.Blocks) * int64(stat.Bsize),
+	Capacity: used + available,
Contributor


I'm actually not sure if this is correct; the capacity we care about here is really "how much data can I put in this path in total", so it maybe makes sense that it drops when space is used in another subvolume.

Afaict we're only using the Capacity field in the CLI, which I guess can break the display in some weird ways?

Contributor Author


It is primarily an interface issue I was aiming to fix. As it stands, when the sealed subvolume grows, instead of showing more drive space used, the total size of the drive gradually goes down.


func getMountsUnder(path string) ([]string, error) {
mounts := []string{}
f, err := os.Open("/proc/self/mountinfo")
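
A fuller sketch of such mountinfo parsing, with illustrative names rather than the PR's exact code; note that /proc/self/mountinfo is Linux-specific, and per proc(5) the mount point is the fifth whitespace-separated field:

package fsutil

import (
	"bufio"
	"os"
	"strings"
)

// parseMountsUnder returns the mount points strictly below path by
// scanning /proc/self/mountinfo. Mount points containing spaces are
// octal-escaped in that file, so splitting on whitespace is safe.
func parseMountsUnder(path string) ([]string, error) {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var mounts []string
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 5 {
			continue
		}
		// fields[4] is the mount point; keep those nested under path.
		if strings.HasPrefix(fields[4], path+"/") {
			mounts = append(mounts, fields[4])
		}
	}
	return mounts, sc.Err()
}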
Contributor


Does that work on non-Linux unixes?

clinta (Contributor, Author) commented Aug 1, 2022

How would #9013 handle the situation where two storage paths share a pool of available space?

If a storage path for sealed data and a storage path for unsealed data share a pool of free space, and there is enough space to store one but not both of the incoming items, will it fill the drives?

I wasn't aware of #9013 when I wrote this, but now that I am, I think there needs to be some way to inform lotus that two storage paths share free space, so that space reservations are counted against every path sharing that space.
