I ran into numerous problems getting this installed.
First, I think your documentation left out the step of creating the `models` folder. I found this comment in sd3_infer.py:
# NOTE: Must have folder `models` with the following files:
# - `clip_g.safetensors` (openclip bigG, same as SDXL)
# - `clip_l.safetensors` (OpenAI CLIP-L, same as SDXL)
# - `t5xxl.safetensors` (google T5-v1.1-XXL)
# - `sd3_medium.safetensors` (or whichever main MMDiT model file)
# Also can have
# - `sd3_vae.safetensors` (holds the VAE separately if needed)
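Since the docs skip this step, I ended up writing myself a small pre-flight check before running the script. This is purely my own helper (not part of the repo); the file names are taken from the comment above:

```python
from pathlib import Path

# Files the sd3_infer.py comment says must be in the `models` folder.
REQUIRED = [
    "clip_g.safetensors",
    "clip_l.safetensors",
    "t5xxl.safetensors",
    "sd3_medium.safetensors",  # or whichever main MMDiT model file you use
]

def missing_model_files(models_dir="models"):
    """Return the names of required files not present in models_dir."""
    root = Path(models_dir)
    return [name for name in REQUIRED if not (root / name).exists()]

print(missing_model_files())  # an empty list means the folder is complete
```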
Also, to get this to work, I had to run these two installs:
pip install fire safetensors tqdm einops transformers sentencepiece protobuf pillow
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
I think your link to the t5xxl.safetensors file is wrong, or the filename your Python code expects is wrong. I downloaded the file from Hugging Face, and it was named t5xxl_F16.safetensors, while the script was looking for t5xxl.safetensors. I renamed the file to drop the F16 suffix, and I got to the "Models loaded" point.
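For anyone else hitting this, the rename can be scripted. This is my own snippet, and the F16 name is just what the Hugging Face download happened to give me:

```python
from pathlib import Path

def normalize_t5_name(models_dir="models"):
    """Rename the downloaded t5xxl_F16.safetensors to the t5xxl.safetensors
    name sd3_infer.py looks for. Returns True if a rename happened."""
    src = Path(models_dir) / "t5xxl_F16.safetensors"
    dst = Path(models_dir) / "t5xxl.safetensors"
    if src.exists() and not dst.exists():
        src.rename(dst)
        return True
    return False
```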
Then it started generating the images; it took a long time and then printed this:
(.sd3.5) E:\SD35Turbo.sd3.5\sd3.5>python sd3_infer.py --prompt "cute picture of a dog" --model E:\SD35Turbo\sd3.5_large_turbo.safetensors --width 1920 --height 1080
Loading tokenizers...
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565
Loading OpenAI CLIP L...
Loading OpenCLIP bigG...
Loading Google T5-v1-XXL...
Skipping key 'shared.weight' in safetensors file as 'shared' does not exist in python model
Loading SD3 model sd3.5_large_turbo.safetensors...
Loading VAE model...
Models loaded.
Saving images to outputs\sd3.5_large_turbo\cute picture of a dog_2024-11-01T08-58-20
0%| | 0/4 [00:04<?, ?it/s]
0%| | 0/1 [01:40<?, ?it/s]
Traceback (most recent call last):
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 481, in <module>
fire.Fire(main)
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\fire\core.py", line 135, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\fire\core.py", line 468, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\fire\core.py", line 684, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 465, in main
inferencer.gen_image(
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 358, in gen_image
sampled_latent = self.do_sampling(
^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 286, in do_sampling
latent = sample_fn(
^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\amp\autocast_mode.py", line 44, in decorate_autocast
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 285, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 151, in forward
batched = self.model.apply_model(
^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 126, in apply_model
return self.model_sampling.calculate_denoised(sigma, model_output, x)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 47, in calculate_denoised
return model_input - model_output * sigma
~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (135) must match the size of tensor b (134) at non-singleton dimension 2
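My guess after staring at the numbers: 1080 / 8 = 135 latent rows, and 135 vs 134 is exactly the mismatch in the error, so the height probably needs to divide evenly through both the VAE downscale (x8) and the model's patch size (x2). Here's a quick sanity check I wrote; those two factors are my assumption, not something I found documented in the repo:

```python
# Assumed factors: the SD3 VAE downscales images by 8, and the MMDiT
# patchifies latents by 2, so both dimensions should divide evenly by 16.
VAE_FACTOR = 8
PATCH_SIZE = 2

def is_valid_size(width, height):
    """Check that width and height survive downscale + patchify cleanly."""
    step = VAE_FACTOR * PATCH_SIZE  # 16
    return width % step == 0 and height % step == 0

print(is_valid_size(1920, 1080))  # False: 1080/8 = 135 latent rows, an odd count
print(is_valid_size(1920, 1088))  # True: 1088 = 16 * 68
```

With that check, 1920x1088 passes while 1920x1080 fails, which matches what I'm seeing.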
Any suggestions on how to fix this, or did I do something wrong?
Thanks