[Python] Support lists of sources and destinations on pyarrow.fs.copy_files() #44864

Open
Tom-Newton opened this issue Nov 27, 2024 · 0 comments

Describe the enhancement requested

I have a use case where we want to copy a list of files from a source to a destination, and the files are not all in one directory.

For example:

pyarrow.fs.copy_files(
    ["file0", "directory/file1"],
    ["directory/file0", "file1"],
    source_filesystem=...,
    destination_filesystem=...,
)

I think it's reasonable not to support directories in the source or destination lists, and to assume that any required directories in the destination already exist.

On the C++ side this is already implemented: https://github.com/apache/arrow/tree/main/cpp/src/arrow/filesystem/filesystem.cc#L625-L660

To keep the public API compatible, we could support both a list and an individual value, with something like https://github.com/apache/arrow/tree/main/python/pyarrow/acero.py#L295-L298
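
A rough sketch of what accepting both forms might look like on the Python side. The helper name `normalize_copy_pairs` is hypothetical and not part of pyarrow, and this is not necessarily how the linked acero.py code does it; it only illustrates turning "single path or list of paths" into a uniform list of (source, destination) pairs before handing off to the C++ implementation.

```python
from typing import List, Sequence, Tuple, Union

PathOrPaths = Union[str, Sequence[str]]


def normalize_copy_pairs(
    source: PathOrPaths, destination: PathOrPaths
) -> List[Tuple[str, str]]:
    """Hypothetical helper (not a pyarrow API).

    Accepts either two single paths or two equal-length sequences of paths
    and returns a list of (source, destination) pairs.
    """
    source_is_single = isinstance(source, str)
    destination_is_single = isinstance(destination, str)

    # Mixing a single path with a list is ambiguous, so reject it.
    if source_is_single != destination_is_single:
        raise TypeError(
            "source and destination must both be paths or both be lists"
        )

    # Existing behaviour: one source, one destination.
    if source_is_single:
        return [(source, destination)]

    sources, destinations = list(source), list(destination)
    if len(sources) != len(destinations):
        raise ValueError(
            "source and destination lists must have the same length"
        )
    return list(zip(sources, destinations))


pairs = normalize_copy_pairs(
    ["file0", "directory/file1"], ["directory/file0", "file1"]
)
```

With the current API, each pair could presumably be passed to `pyarrow.fs.copy_files(src, dst, source_filesystem=..., destination_filesystem=...)` one at a time as a workaround, since a file source copies a single file. That would not share one batched call the way a native list overload could.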

Component(s)

Python
