Implement upload API #1085

Merged on Nov 19, 2024 (50 commits)
Commits
bc72346
set up upload framework
psilabs-dev Oct 5, 2024
89bc617
initial implementation
psilabs-dev Oct 6, 2024
b1492b8
add title and summary
psilabs-dev Oct 6, 2024
03bd9d2
include category id in upload API
psilabs-dev Oct 6, 2024
6092acf
complete title summary args for handling_incoming_file
psilabs-dev Oct 6, 2024
f20af34
clean and refactor
psilabs-dev Oct 6, 2024
e0a9f2e
rename upload to create
psilabs-dev Oct 6, 2024
8a23168
Merge pull request #1 from psilabs-dev/feature/upload-dev
psilabs-dev Oct 6, 2024
28a0b15
add exception to handle missing payload
psilabs-dev Oct 6, 2024
ca11644
handle missing file better
psilabs-dev Oct 6, 2024
16e87e2
implement sha1 verification
psilabs-dev Oct 6, 2024
fe3fe09
Merge pull request #3 from psilabs-dev/feature/upload-dev
psilabs-dev Oct 6, 2024
c9f6544
reflect mojo upload object
psilabs-dev Oct 6, 2024
19b711a
Merge pull request #4 from psilabs-dev/feature/upload-dev
psilabs-dev Oct 6, 2024
d86637d
add thumbnail generation for upload api
psilabs-dev Oct 7, 2024
acecc72
Merge pull request #5 from psilabs-dev/feature/upload-dev
psilabs-dev Oct 7, 2024
74d5576
Merge branch 'dev' into feature/upload
psilabs-dev Oct 14, 2024
fed1f5b
remove utf downgrade
psilabs-dev Oct 19, 2024
f0c50fd
Merge pull request #6 from psilabs-dev/feature/upload-dev
psilabs-dev Oct 20, 2024
87cf4ae
Merge branch 'dev' into feature/upload
psilabs-dev Oct 23, 2024
1a0fd65
standardize api response
psilabs-dev Oct 23, 2024
6a6f259
put upload API behind login
psilabs-dev Oct 23, 2024
ebf2cea
fix error
psilabs-dev Oct 23, 2024
53d3c79
add status for 400 messages, add sc
psilabs-dev Oct 23, 2024
52aa679
misplaced status code
psilabs-dev Oct 23, 2024
64ffcdf
Merge branch 'dev' into feature/upload
psilabs-dev Oct 26, 2024
4d22602
Merge pull request #11 from psilabs-dev/feature/upload-dev
psilabs-dev Oct 26, 2024
4d36f7f
edit documentation (#12)
psilabs-dev Oct 26, 2024
808cf6b
implement redis lock
psilabs-dev Oct 26, 2024
20de021
move checksum mismatch to 417 and add utf downgrade handler
psilabs-dev Oct 26, 2024
8d5c79a
Merge pull request #13 from psilabs-dev/feature/upload-lock
psilabs-dev Oct 26, 2024
a5a72a0
Merge branch 'dev' into feature/upload
psilabs-dev Oct 30, 2024
69f49d1
show error reason for file upload
psilabs-dev Nov 1, 2024
22a9708
handle_incoming_file return http status code
psilabs-dev Nov 1, 2024
13f677f
be more explicit with responses
psilabs-dev Nov 1, 2024
aa2510c
Merge pull request #16 from psilabs-dev/feature/upload-dev-b
psilabs-dev Nov 1, 2024
a707fc4
Merge branch 'feature/upload' into dev-resolve-mc
psilabs-dev Nov 7, 2024
57077b7
Merge pull request #18 from psilabs-dev/dev-resolve-mc
psilabs-dev Nov 7, 2024
d626832
Update lib/LANraragi/Controller/Api/Archive.pm
psilabs-dev Nov 10, 2024
27ae2eb
edit response when fail move file
psilabs-dev Nov 10, 2024
0fa9d57
no need for success status
psilabs-dev Nov 10, 2024
5e3ef51
upload api docs
psilabs-dev Nov 10, 2024
5376e0e
Merge branch 'dev' into feature/upload
psilabs-dev Nov 10, 2024
3c80f53
this is why i have a chatgpt subscription
psilabs-dev Nov 10, 2024
0fde3d1
add redis lock response code
psilabs-dev Nov 13, 2024
389511e
Update tools/Documentation/api-documentation/archive-api.md
psilabs-dev Nov 15, 2024
3d1bf1b
Update tools/Documentation/api-documentation/archive-api.md
psilabs-dev Nov 15, 2024
228490a
Update tools/Documentation/api-documentation/archive-api.md
psilabs-dev Nov 15, 2024
7e93d28
Merge branch 'dev' into feature/upload
psilabs-dev Nov 18, 2024
9d1e06e
replace utf downgrade with redis encode
psilabs-dev Nov 19, 2024
163 changes: 162 additions & 1 deletion lib/LANraragi/Controller/Api/Archive.pm
@@ -1,14 +1,21 @@
package LANraragi::Controller::Api::Archive;
use Mojo::Base 'Mojolicious::Controller';

use Digest::SHA qw(sha1_hex);
use Redis;
use Encode;
use Storable;
use Mojo::JSON qw(decode_json);
use Scalar::Util qw(looks_like_number);

use LANraragi::Utils::Generic qw(render_api_response);
use File::Temp qw(tempdir);
use File::Basename;
use File::Find;

use LANraragi::Utils::Archive qw(extract_thumbnail);
use LANraragi::Utils::Generic qw(render_api_response is_archive get_bytelength);
use LANraragi::Utils::Database qw(get_archive_json set_isnew);
use LANraragi::Utils::Logging qw(get_logger);

use LANraragi::Model::Archive;
use LANraragi::Model::Category;
@@ -107,6 +114,160 @@ sub serve_file {
    $self->render_file( filepath => $file );
}

# Create a file archive along with any metadata.
# adapted from Upload.pm
sub create_archive {
    my $self = shift;

    my $logger = get_logger( "Archive API ", "lanraragi");
    my $redis = LANraragi::Model::Config->get_redis;

    # receive uploaded file
    my $upload = $self->req->upload('file');
    my $expected_checksum = $self->req->param('file_checksum'); # optional

    # require file
    if ( ! defined $upload || !$upload ) {
        return $self->render(
            json => {
                operation => "upload",
                success => 0,
                error => "No file attached"
            },
            status => 400
        );
    }

    # checksum verification stage.
    if ( $expected_checksum ) {
        my $file_content = $upload->slurp;
        my $actual_checksum = sha1_hex($file_content);
        if ( $expected_checksum ne $actual_checksum ) {
            return $self->render(
                json => {
                    operation => "upload",
                    success => 0,
                    error => "Checksum mismatch: expected $expected_checksum, got $actual_checksum."
                },
                status => 417
            );
        }
    }

    my $filename = $upload->filename;
    my $uploadMime = $upload->headers->content_type;
    $filename = LANraragi::Utils::Database::redis_encode( $filename );

    # lock resource
    my $lock = $redis->setnx( "upload:$filename", 1 );
    if ( !$lock ) {
        return $self->render(
            json => {
                operation => "upload",
                success => 0,
                error => "Locked resource: $filename."
            },
            status => 423
        );
    }
    $redis->expire( "upload:$filename", 10 );

    # metadata extraction
    my $catid = $self->req->param('category_id');
    my $tags = $self->req->param('tags');
    my $title = $self->req->param('title');
    my $summary = $self->req->param('summary');

    # return error if archive is not supported.
    if ( !is_archive($filename) ) {
Owner

You don't really need this bit if it's already done in handle_incoming_file, right?

(I'm aware Controller->Upload does the same, but I think I'll eventually deprecate that code path in favor of just using the API in the webclient)

Contributor Author

is_archive ensures that the filename has a valid extension. I don't know how removing this will affect the tempdir logic, I'll need to test it out

Contributor Author

When the archive check block is removed, an unhandled exception will be thrown when a filename with an invalid extension or no extension is passed:

... fileparse(): need a valid pathname at /home/koyomi/lanraragi/script/../lib/LANraragi/Utils/Archive.pm line 38.

which is the is_pdf subroutine... which is really weird

Owner

huh, I guess is_pdf wasn't written to handle invalid filenames at all and the issue just never popped up - good to know.
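
The `fileparse(): need a valid pathname` message quoted in this thread is what File::Basename croaks with when its first argument is undefined. A minimal sketch of a guard around that call, assuming a hypothetical helper `safe_suffix` (not part of LANraragi) and reusing the suffix pattern from the diff:

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper (not part of LANraragi): return the extension of a
# filename, or an empty string when the name is undefined or empty.
sub safe_suffix {
    my ($name) = @_;
    return '' unless defined $name && length $name;
    my ( undef, undef, $suffix ) = fileparse( $name, qr/\.[^.]*/ );
    return $suffix;
}

# Without the guard, fileparse croaks on an undefined pathname.
my $survived = eval { fileparse(undef); 1 };
print "fileparse(undef) died: $@" unless $survived;

print "suffix of book.pdf: '", safe_suffix("book.pdf"), "'\n";
print "suffix of noext: '",    safe_suffix("noext"),    "'\n";
```

Whether `is_pdf` should return false or raise a clearer error in that case is a design choice the thread leaves open; the sketch only shows where a check could sit.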

        $redis->del("upload:$filename");
        $redis->quit();
        return $self->render(
            json => {
                operation => "upload",
                success => 0,
                error => "Unsupported file extension ($filename)"
            },
            status => 415
        );
    }

    # Move file to a temp folder (not the default LRR one)
    my $tempdir = tempdir();

    my ( $fn, $path, $ext ) = fileparse( $filename, qr/\.[^.]*/ );
    my $byte_limit = LANraragi::Model::Config->enable_cryptofs ? 143 : 255;
Owner

I wonder if this shouldn't be enforced at the handle_incoming_file level rather than separately in here, in Model->Upload->Download_url, and in the old Controller upload.

Might be one for later tho

Contributor Author (psilabs-dev, Nov 15, 2024)

What was the initial purpose of the byte limit? I only kept it there to mirror, I don't know how this will affect temporary file creation if a filename exceeds this limit.

Owner

Well, it's mostly there for Synology hardware, which has that 255-byte cap, and for the notably insane who use cryptoFS on top of that, which limits it further, down to 143 bytes.

I believe this would break even temporary filenames if exceeded.
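
The truncation discussed in this thread can be sketched in isolation. This is a minimal, self-contained version that assumes UTF-8 byte counting; `byte_length` and `truncate_filename` are hypothetical names standing in for the PR's `get_bytelength` call and inline loop, not LANraragi functions:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use File::Basename qw(fileparse);

# Hypothetical stand-in for get_bytelength: size of a string in UTF-8 bytes.
sub byte_length {
    my ($string) = @_;
    return length( encode_utf8($string) );
}

# Drop characters from the end of the stem until stem + extension +
# ".upload" fits within the byte limit. The extension is kept intact.
sub truncate_filename {
    my ( $filename, $byte_limit ) = @_;
    my ( $stem, undef, $ext ) = fileparse( $filename, qr/\.[^.]*/ );
    while ( length $stem && byte_length( $stem . $ext . ".upload" ) > $byte_limit ) {
        $stem = substr( $stem, 0, -1 );
    }
    return $stem . $ext;
}

my $long = ( "a" x 300 ) . ".zip";
my $cut  = truncate_filename( $long, 255 );
print "kept ", length($cut), " characters, ",
  byte_length( $cut . ".upload" ), " bytes with the .upload suffix\n";
```

Removing one character at a time, as the PR does, keeps multi-byte characters whole because `substr` works on characters while the limit is checked in bytes.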


    $filename = $fn;
    while ( get_bytelength( $filename . $ext . ".upload") > $byte_limit ) {
        $filename = substr( $filename, 0, -1 );
    }
    $filename = $filename . $ext;

    my $tempfile = $tempdir . '/' . $filename;
    if ( !$upload->move_to($tempfile) ) {
        $logger->error("Could not move uploaded file $filename to $tempfile");
        $redis->del("upload:$filename");
        $redis->quit();
        return $self->render(
            json => {
                operation => "upload",
                success => 0,
                error => "Couldn't move uploaded file to temporary location."
            },
            status => 500
        );
    }

    # Update $tempfile to the exact reference created by the host filesystem
    # This is done by finding the first (and only) file in $tempdir.
    find(
        sub {
            return if -d $_;
            $tempfile = $File::Find::name;
            $filename = $_;
        },
        $tempdir
    );

    my ( $status_code, $id, $response_title, $message ) = LANraragi::Model::Upload::handle_incoming_file( $tempfile, $catid, $tags, $title, $summary );

    # post-processing thumbnail generation
    my %hash = $redis->hgetall($id);
    my ( $thumbhash ) = @hash{qw(thumbhash)};
    unless ( length $thumbhash ) {
        $logger->info("Thumbnail hash invalid, regenerating.");
        my $thumbdir = LANraragi::Model::Config->get_thumbdir;
        $thumbhash = "";
        extract_thumbnail( $thumbdir, $id, 0, 1 );
        $thumbhash = $redis->hget( $id, "thumbhash" );
        $thumbhash = LANraragi::Utils::Database::redis_decode($thumbhash);
    }
    $redis->del("upload:$filename");
    $redis->quit();

    unless ( $status_code == 200 ) {
        return $self->render(
            json => {
                operation => "upload",
                success => 0,
                error => $message,
                id => $id
            },
            status => $status_code
        );
    }

    return $self->render(
        json => {
            operation => "upload",
            success => 1,
            id => $id
        },
        status => 200
    );
}

# Serve an archive page from the temporary folder, using RenderFile.
sub serve_page {
    my $self = shift;
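
The handler above compares the optional `file_checksum` parameter against `sha1_hex` of the uploaded bytes. A client could produce a matching value as in the sketch below; `file_sha1_hex` is a hypothetical helper name, and only the core Digest::SHA module is used:

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical client-side helper: hex SHA-1 digest of a file on disk,
# read in binary mode so the digest matches the raw uploaded bytes.
sub file_sha1_hex {
    my ($path) = @_;
    return Digest::SHA->new(1)->addfile( $path, "b" )->hexdigest;
}

# Print the digest of a file given on the command line, if any.
if (@ARGV) {
    print file_sha1_hex( $ARGV[0] ), "\n";
}
```

Digest::SHA emits lowercase hex, which matters here because the comparison in the diff is a plain string `ne`.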
30 changes: 20 additions & 10 deletions lib/LANraragi/Model/Upload.pm
@@ -10,8 +10,8 @@ use File::Temp qw(tempdir);
 use File::Find qw(find);
 use File::Copy qw(move);

-use LANraragi::Utils::Database qw(invalidate_cache compute_id);
-use LANraragi::Utils::Logging qw(get_logger);
+use LANraragi::Utils::Database qw(invalidate_cache compute_id set_title set_summary);
+use LANraragi::Utils::Logging qw(get_logger);
 use LANraragi::Utils::Database qw(redis_encode);
 use LANraragi::Utils::Generic qw(is_archive get_bytelength);
 use LANraragi::Utils::String qw(trim trim_CRLF trim_url);
@@ -29,18 +29,18 @@ use LANraragi::Model::Category;
 # The file will be added to a category, if its ID is specified.
 # You can also specify tags to add to the metadata for the processed file before autoplugin is ran. (if it's enabled)
 #
-# Returns a status value, the ID and title of the file, and a status message.
+# Returns an HTTP status code, the ID and title of the file, and a status message.
 sub handle_incoming_file {

-    my ( $tempfile, $catid, $tags ) = @_;
+    my ( $tempfile, $catid, $tags, $title, $summary ) = @_;
     my ( $filename, $dirs, $suffix ) = fileparse( $tempfile, qr/\.[^.]*/ );
     $filename = $filename . $suffix;
     my $logger = get_logger( "File Upload/Download", "lanraragi" );

     # Check if file is an archive
     unless ( is_archive($filename) ) {
         $logger->debug("$filename is not an archive, halting upload process.");
-        return ( 0, "deadbeef", $filename, "Unsupported File Extension ($filename)" );
+        return ( 415, "deadbeef", $filename, "Unsupported File Extension ($filename)" );
     }

     # Compute an ID here
@@ -71,7 +71,7 @@ sub handle_incoming_file {
             ? "This file already exists in the Library." . $suffix
             : "A file with the same name is present in the Library." . $suffix;

-        return ( 0, $id, $filename, $msg );
+        return ( 409, $id, $filename, $msg );
     }

     # If we are replacing an existing one, just remove the old one first.
@@ -109,19 +109,29 @@ sub handle_incoming_file {
         }
     }

+    # Set title
+    if ($title) {
+        set_title( $id, $title );
+    }
+
+    # Set summary
+    if ($summary) {
+        set_summary( $id, $summary );
+    }
+
     # Move the file to the content folder.
     # Move to a .upload first in case copy to the content folder takes a while...
     move( $tempfile, $output_file . ".upload" )
-      or return ( 0, $id, $name, "The file couldn't be moved to your content folder: $!" );
+      or return ( 500, $id, $name, "The file couldn't be moved to your content folder: $!" );

     # Then rename inside the content folder itself to proc Shinobu.
     move( $output_file . ".upload", $output_file )
-      or return ( 0, $id, $name, "The file couldn't be renamed in your content folder: $!" );
+      or return ( 500, $id, $name, "The file couldn't be renamed in your content folder: $!" );

     # If the move didn't signal an error, but still doesn't exist, something is quite spooky indeed!
     # Really funky permissions that prevents viewing folder contents?
     unless ( -e $output_file ) {
-        return ( 0, $id, $name, "The file couldn't be moved to your content folder!" );
+        return ( 500, $id, $name, "The file couldn't be moved to your content folder!" );
     }

     # Now that the file has been copied, we can add the timestamp tag and calculate pagecount.
@@ -157,7 +167,7 @@ sub handle_incoming_file {
     # Invalidate search cache ourselves, Shinobu won't do it since the file is already in the database
     invalidate_cache();

-    return ( 1, $id, $name, $successmsg );
+    return ( 200, $id, $name, $successmsg );
 }

 # Download the given URL, using the given Mojo::UserAgent object.
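
With this change, `handle_incoming_file` returns an HTTP status code as its first value, and `create_archive` adds a few codes of its own. The codes that appear in this diff can be gathered into a lookup table, as in the sketch below. The descriptions are paraphrased, and `describe_upload_status` and `upload_succeeded` are hypothetical helpers, not LANraragi functions:

```perl
use strict;
use warnings;

# Status codes as they appear in this diff; descriptions are paraphrased.
my %UPLOAD_STATUS = (
    200 => "archive stored",
    400 => "no file attached",
    409 => "file already exists in the library",
    415 => "unsupported file extension",
    417 => "checksum mismatch",
    423 => "upload of this filename is locked",
    500 => "file could not be moved",
);

# Hypothetical helper: human-readable description for a status code.
sub describe_upload_status {
    my ($code) = @_;
    return $UPLOAD_STATUS{$code} // "unknown status $code";
}

# Hypothetical helper: collapse the status code to a 1/0 success flag,
# the same mapping the Minion tasks in this PR apply.
sub upload_succeeded {
    my ($code) = @_;
    return $code == 200 ? 1 : 0;
}

for my $code ( sort { $a <=> $b } keys %UPLOAD_STATUS ) {
    printf "%d (%s): %s\n", $code,
      upload_succeeded($code) ? "success" : "failure",
      describe_upload_status($code);
}
```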
7 changes: 4 additions & 3 deletions lib/LANraragi/Utils/Minion.pm
@@ -175,8 +175,8 @@ sub add_tasks {
             $logger->info("Processing uploaded file $file...");

             # Since we already have a file, this goes straight to handle_incoming_file.
-            my ( $status, $id, $title, $message ) = LANraragi::Model::Upload::handle_incoming_file( $file, $catid, "" );
-
+            my ( $status_code, $id, $title, $message ) = LANraragi::Model::Upload::handle_incoming_file( $file, $catid, "", "", "" );
+            my $status = $status_code == 200 ? 1 : 0;
             $job->finish(
                 { success => $status,
                     id => $id,
@@ -253,7 +253,8 @@ sub add_tasks {
             my $tag = "source:$og_url";

             # Hand off the result to handle_incoming_file
-            my ( $status, $id, $title, $message ) = LANraragi::Model::Upload::handle_incoming_file( $tempfile, $catid, $tag );
+            my ( $status_code, $id, $title, $message ) = LANraragi::Model::Upload::handle_incoming_file( $tempfile, $catid, $tag, "", "" );
+            my $status = $status_code == 200 ? 1 : 0;

             $job->finish(
                 { success => $status,
1 change: 1 addition & 0 deletions lib/LANraragi/Utils/Routing.pm
@@ -117,6 +117,7 @@ sub apply_routes {
     $public_api->get('/api/archives/:id/categories')->to('api-archive#get_categories');
     $public_api->get('/api/archives/:id/tankoubons')->to('api-tankoubon#get_tankoubons_file');
     $public_api->get('/api/archives/:id/metadata')->to('api-archive#serve_metadata');
+    $logged_in_api->put('/api/archives/upload')->to('api-archive#create_archive');
     $logged_in_api->put('/api/archives/:id/thumbnail')->to('api-archive#update_thumbnail');
     $logged_in_api->put('/api/archives/:id/metadata')->to('api-archive#update_metadata');
     $logged_in_api->delete('/api/archives/:id')->to('api-archive#delete_archive');