the-database / mpv-upscale-2x_animejanai
Real-time anime upscaling to 4K in mpv with Real-ESRGAN Compact models
License: Other
Since Plex for Windows is mpv-based, it should be possible to upscale with AnimeJaNai in Plex. The steps just need to be documented. Most likely setup will be similar to https://www.svp-team.com/wiki/SVP:Plex_Media_Player
Does this only work on NVIDIA GPUs?
When I close the player and then open it again, playback starts from the beginning instead of where it was before. The option save-position-on-quit is set to yes in the config, but it's not working properly.
First of all, thank you so much for making such a great program.
This time, I had difficulty setting up 2x_animejanai, including mpv. However, I found two issues, corrected them, and I think I have overcome them.
Computer used: 13600K & NVIDIA RTX 4070 Ti
My coding knowledge: close to zero
English proficiency: poor, but Google Translate handles it
The first time I tried Standard_V1_UltraCompact, there was loud fan noise and dropped frames.
After several rounds of trial and error, I overcame it by modifying a value in 2x_SharpLines.vpy:
core.num_threads = 14
I saw in Task Manager that my CPU has 14 cores and changed the value to match.
I think many people would see improvements by adjusting this to match their own computer's performance.
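That per-machine tuning can be automated instead of hardcoding a number. A hedged sketch in plain Python (the `reserve` parameter is my own illustration, not part of any AnimeJaNai script):

```python
import os

def pick_num_threads(reserve: int = 2) -> int:
    """Pick a VapourSynth thread count from the machine's CPU count,
    keeping a couple of cores free for mpv and the OS."""
    cores = os.cpu_count() or 4  # os.cpu_count() may return None
    return max(1, cores - reserve)

# In a .vpy script this value would feed core.num_threads:
#   core.num_threads = pick_num_threads()
```

Whether reserving cores helps at all depends on the rest of the pipeline, so treat the default as a starting point.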
I also noticed a huge overload when upscaling twice.
To summarize, I added an SHD_ENGINE as shown in the example below.
Running the Compact engine twice seemed quite heavy for the computer, so I stepped the engine down one tier for each pass.
The result was very satisfying, and I would be happy if it helps the developer.
Result:
CPU: 99% → 20-40%
GPU: 99% → 60-80%
import vapoursynth as vs
import os

SD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_Compact_net_g_120000"
HD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_UltraCompact_net_g_100000"
SHD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_SuperUltraCompact_net_g_100000"

core = vs.core
core.num_threads = 14  # can influence ram usage

colorspace = "709"


def scaleTo1080(clip, w=1920, h=1080):
    if clip.width / clip.height > 16 / 9:
        prescalewidth = w
        prescaleheight = w * clip.height / clip.width
    else:
        prescalewidth = h * clip.width / clip.height
        prescaleheight = h
    return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)


def upscale2x(clip):
    engine_name = SD_ENGINE_NAME if clip.height < 720 else HD_ENGINE_NAME
    # raw f-string (rf"") so backslashes in the Windows path stay literal
    return core.trt.Model(
        clip,
        engine_path=rf"C:\Program Files (x86)\mpv-lazy-20230404-vsCuda\mpv-lazy\vapoursynth64\plugins\vsmlrt-cuda\{engine_name}.engine",
        num_streams=4,
    )


def upscale4x(clip):
    engine_name = HD_ENGINE_NAME if clip.height < 720 else SHD_ENGINE_NAME
    return core.trt.Model(
        clip,
        engine_path=rf"C:\Program Files (x86)\mpv-lazy-20230404-vsCuda\mpv-lazy\vapoursynth64\plugins\vsmlrt-cuda\{engine_name}.engine",
        num_streams=4,
    )


clip = video_in
if clip.height < 720:
    colorspace = "170m"
clip = vs.core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s=colorspace)
if clip.height >= 720 or clip.width >= 1280:
    clip = scaleTo1080(clip)
clip = upscale2x(clip)
if clip.height < 2160 and clip.width < 3840:
    if clip.height > 1080 or clip.width > 1920:
        clip = scaleTo1080(clip)
    clip = upscale4x(clip)
clip = vs.core.resize.Bicubic(clip, format=vs.YUV420P8, matrix_s=colorspace)
clip.set_output()
This suggestion is from Lycoris2013. Low bitrate MPEG2 sources are very common among anime captured from Japanese broadcasts. The main 2x_AnimeJaNai models are intended to preserve details as much as possible so they cannot have heavy artifact correction which tends to also remove details. So a separate model could be trained specifically for these MPEG2 sources such as the image below.
This time, I've made optimizations after a lot of trial and error.
The amendments are as follows.
Upscaling twice, or with 60 fps videos above 1080p, almost unconditionally produced frame drops.
To find a solution, I experimented with the animejanai_v2.conf settings to find values that prevent frame drops while preserving the most quality.
In conclusion, frame drops were handled best with resize_height_before_first_2x=540, so I built the code around that.
scale_to_810: 1440p60 upscaling
scale_to_675: 1080p60 upscaling
scale_to_540: 809p30 to 540p30, upscale twice
scale_to_1080: 1079p to 810p upscaling
Compact / UltraCompact / SuperUltraCompact (Strong V1)
I used all three of these to find the best combination.
30 fps
2159p-810p >>> resize1080 + UltraCompact >>> 2160p
809p-540p >>> resize540 + SuperUltraCompact + SuperUltraCompact >>> 2160p
539p-1p >>> Compact + UltraCompact >>> 2156p-4p
60 fps
2159p-1081p >>> resize810 + SuperUltraCompact >>> 1620p
1080p-810p >>> resize675 + UltraCompact >>> 1350p
809p-540p >>> UltraCompact >>> 1618p-1082p
539p-1p >>> Compact >>> 1078p-2p
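The ladder above can be expressed as a small selection function. This is only an illustrative sketch of the tables: the 45 fps cutoff and thresholds mirror my description, and the engine names are shorthand, not the full model file names.

```python
def pick_pipeline(height: int, fps: float):
    """Return (pre_resize_height_or_None, list_of_engines) following the
    30/60 fps ladder described above (shorthand engine names)."""
    if fps >= 45:  # "60 frame" branch
        if height > 1080:
            return (810, ["SuperUltraCompact"])
        if height >= 810:
            return (675, ["UltraCompact"])
        if height >= 540:
            return (None, ["UltraCompact"])
        return (None, ["Compact"])
    # "30 frame" branch
    if height >= 810:
        return (1080, ["UltraCompact"])
    if height >= 540:
        return (540, ["SuperUltraCompact", "SuperUltraCompact"])
    return (None, ["Compact", "UltraCompact"])
```

For example, a 720p 24 fps source resolves to resize540 followed by two SuperUltraCompact passes, matching the second row of the 30 fps table.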
I am very pleased with this result.
I also tried rife_cuda.py, but I think it's going to be hard.
Below is the code I used.
I'll mark ###new### on the parts where I changed the code.
import vapoursynth as vs
import os
import subprocess
import logging
import configparser
import sys
from logging.handlers import RotatingFileHandler

sys.path.append(os.path.dirname(os.path.abspath(__file__)))

import rife_cuda
import animejanai_v2_config
# import gmfss_cuda

# trtexec num_streams
TOTAL_NUM_STREAMS = 4

core = vs.core
core.num_threads = 4  # can influence ram usage

plugin_path = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                           r"..\..\vapoursynth64\plugins\vsmlrt-cuda")
model_path = os.path.join(plugin_path, r"..\models\animejanai")

formatter = logging.Formatter(fmt='%(asctime)s %(levelname)-8s %(message)s',
                              datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger('animejanai_v2')

config = {}


def init_logger():
    global logger
    logger.setLevel(logging.DEBUG)
    rfh = RotatingFileHandler(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'animejanai_v2.log'),
                              mode='a', maxBytes=1 * 1024 * 1024, backupCount=2, encoding=None, delay=0)
    rfh.setFormatter(formatter)
    rfh.setLevel(logging.DEBUG)
    logger.addHandler(rfh)
# model_type: HD or SD
# binding: 1 through 9
def find_model(model_type, binding):
    section_key = f'slot_{binding}'
    key = f'{model_type.lower()}_model'
    if section_key in config and key in config[section_key]:
        return config[section_key][key]
    return None


def create_engine(onnx_name):
    onnx_path = os.path.join(model_path, f"{onnx_name}.onnx")
    if not os.path.isfile(onnx_path):
        raise FileNotFoundError(onnx_path)
    engine_path = os.path.join(model_path, f"{onnx_name}.engine")
    subprocess.run([os.path.join(plugin_path, "trtexec"), "--fp16", f"--onnx={onnx_path}",
                    "--minShapes=input:1x3x8x8", "--optShapes=input:1x3x1080x1920",
                    "--maxShapes=input:1x3x1080x1920",
                    f"--saveEngine={engine_path}", "--tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT"],
                   cwd=plugin_path)
def scale_to_1080(clip, w=1920, h=1080):
    if clip.width / clip.height > 16 / 9:
        prescalewidth = w
        prescaleheight = w * clip.height / clip.width
    else:
        prescalewidth = h * clip.width / clip.height
        prescaleheight = h
    return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)


###new###
def scale_to_810(clip, w=1440, h=810):
    if clip.width / clip.height > 16 / 9:
        prescalewidth = w
        prescaleheight = w * clip.height / clip.width
    else:
        prescalewidth = h * clip.width / clip.height
        prescaleheight = h
    return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)


###new###
def scale_to_675(clip, w=1200, h=675):
    if clip.width / clip.height > 16 / 9:
        prescalewidth = w
        prescaleheight = w * clip.height / clip.width
    else:
        prescalewidth = h * clip.width / clip.height
        prescaleheight = h
    return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)


###new###
def scale_to_540(clip, w=960, h=540):
    if clip.width / clip.height > 16 / 9:
        prescalewidth = w
        prescaleheight = w * clip.height / clip.width
    else:
        prescalewidth = h * clip.width / clip.height
        prescaleheight = h
    return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)


###new###
def upscale2x(clip, sd_engine_name, hd_engine_name, shd_engine_name, num_streams):
    engine_name = None  # return the clip unchanged if no case below matches
    if clip.height == 675 or clip.width == 1200:
        engine_name = hd_engine_name
    else:
        if clip.height < 540 or clip.width < 960:
            engine_name = sd_engine_name
        if (clip.height == 1080 or clip.width == 1920) or ((clip.height > 540 and clip.width > 960) and (clip.height < 1080 or clip.width < 1920)):
            engine_name = hd_engine_name
        if (clip.height == 540 or clip.width == 960) or (clip.height == 810 or clip.width == 1440):
            engine_name = shd_engine_name
    if engine_name is None:
        return clip
    engine_path = os.path.join(model_path, f"{engine_name}.engine")
    message = f"upscale2x: scaling 2x from {clip.width}x{clip.height} with engine={engine_name}; num_streams={num_streams}"
    logger.debug(message)
    print(message)
    if not os.path.isfile(engine_path):
        create_engine(engine_name)
    return core.trt.Model(
        clip,
        engine_path=engine_path,
        num_streams=num_streams,
    )
###new###
def upscale22x(clip, hd_engine_name, shd_engine_name, num_streams):
    engine_name = None  # return the clip unchanged if no case below matches
    if clip.height == 1080 or clip.width == 1920:
        engine_name = shd_engine_name
    elif clip.height < 1080 or clip.width < 1920:
        engine_name = hd_engine_name
    if engine_name is None:
        return clip
    engine_path = os.path.join(model_path, f"{engine_name}.engine")
    message = f"upscale22x: scaling 2x from {clip.width}x{clip.height} with engine={engine_name}; num_streams={num_streams}"
    logger.debug(message)
    print(message)
    if not os.path.isfile(engine_path):
        create_engine(engine_name)
    return core.trt.Model(
        clip,
        engine_path=engine_path,
        num_streams=num_streams,
    )
def run_animejanai(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                   resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                   resize_to_1080_before_second_2x, upscale_twice, use_rife):
    if do_upscale:
        colorspace = "709"
        colorlv = clip.get_frame(0).props._ColorRange
        fmt_in = clip.format.id
        if clip.height < 720 or clip.width < 1280:
            colorspace = "170m"
        if resize_height_before_first_2x != 0:
            resize_factor_before_first_2x = 1
        try:
            # try half precision first
            clip = vs.core.resize.Bicubic(clip, format=vs.RGBH, matrix_in_s=colorspace,
                                          width=clip.width / resize_factor_before_first_2x,
                                          height=clip.height / resize_factor_before_first_2x)
            clip = run_animejanai_upscale(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps,
                                          resize_factor_before_first_2x, resize_height_before_first_2x,
                                          resize_720_to_1080_before_first_2x, do_upscale,
                                          resize_to_1080_before_second_2x, upscale_twice, use_rife,
                                          colorspace, colorlv, fmt_in)
        except:
            # fall back to single precision
            clip = vs.core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s=colorspace,
                                          width=clip.width / resize_factor_before_first_2x,
                                          height=clip.height / resize_factor_before_first_2x)
            clip = run_animejanai_upscale(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps,
                                          resize_factor_before_first_2x, resize_height_before_first_2x,
                                          resize_720_to_1080_before_first_2x, do_upscale,
                                          resize_to_1080_before_second_2x, upscale_twice, use_rife,
                                          colorspace, colorlv, fmt_in)

    ###new###
    if use_rife:
        clip = rife_cuda.rife(clip, clip.width, clip.height, container_fps)

    clip.set_output()
###new###
def run_animejanai_upscale(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                           resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                           resize_to_1080_before_second_2x, upscale_twice, use_rife, colorspace, colorlv, fmt_in):
    ###new###
    if resize_height_before_first_2x == 540:
        if (clip.height >= 540 or clip.width >= 960) and container_fps >= 45:
            if clip.height <= 1080 or clip.width <= 1920:
                clip = scale_to_675(clip)
            else:
                clip = scale_to_810(clip)
        else:
            if (clip.height >= 540 or clip.width >= 960) and clip.height < 810 and clip.width < 1440 and container_fps < 45:
                clip = scale_to_540(clip)
            if (clip.height >= 810 or clip.width >= 1440) and clip.height < 2160 and clip.width < 3840:
                clip = scale_to_1080(clip)

    # values other than 540 (and 0) caused errors at upscale2x/upscale22x
    if resize_height_before_first_2x != 540 and resize_height_before_first_2x != 0:
        clip = scale_to_1080(clip, resize_height_before_first_2x * 16 / 9, resize_height_before_first_2x)

    # pre-scale 720p or higher to 1080 > NO
    if resize_720_to_1080_before_first_2x:
        if (clip.height >= 810 or clip.width >= 1440) and clip.height < 1080 and clip.width < 1920:
            clip = scale_to_1080(clip)

    num_streams = TOTAL_NUM_STREAMS
    if upscale_twice and (clip.height <= 540 or clip.width <= 960) and container_fps < 45:
        num_streams //= 2  # integer division so num_streams stays an int

    # upscale 2x
    clip = upscale2x(clip, sd_engine_name, hd_engine_name, shd_engine_name, num_streams)

    # upscale 2x again if necessary
    if upscale_twice and (clip.height <= 1080 or clip.width <= 1920) and container_fps < 45:
        # downscale down to 1080 if first 2x went over 1080,
        # or scale up to 1080 if enabled >> NO
        if resize_to_1080_before_second_2x and (clip.height > 720 or clip.width > 1280):
            clip = scale_to_1080(clip)
        # upscale 2x again
        clip = upscale22x(clip, hd_engine_name, shd_engine_name, num_streams)

    fmt_out = fmt_in
    if fmt_in not in [vs.YUV410P8, vs.YUV411P8, vs.YUV420P8, vs.YUV422P8, vs.YUV444P8, vs.YUV420P10, vs.YUV422P10,
                      vs.YUV444P10]:
        fmt_out = vs.YUV420P10

    return vs.core.resize.Bicubic(clip, format=fmt_out, matrix_s=colorspace, range=1 if colorlv == 0 else None)
# keybinding: 1-9
def run_animejanai_with_keybinding(clip, container_fps, keybinding):
    sd_engine_name = find_model("SD", keybinding)
    hd_engine_name = find_model("HD", keybinding)
    shd_engine_name = find_model("SHD", keybinding)
    section_key = f'slot_{keybinding}'
    do_upscale = config[section_key].get('upscale_2x', True)
    upscale_twice = config[section_key].get('upscale_4x', True)
    use_rife = config[section_key].get('rife', True)
    resize_720_to_1080_before_first_2x = config[section_key].get('resize_720_to_1080_before_first_2x', True)
    resize_factor_before_first_2x = config[section_key].get('resize_factor_before_first_2x', 1)
    resize_height_before_first_2x = config[section_key].get('resize_height_before_first_2x', 0)
    resize_to_1080_before_second_2x = config[section_key].get('resize_to_1080_before_second_2x', True)
    if do_upscale:
        if sd_engine_name is None and hd_engine_name is None and shd_engine_name is None:
            raise FileNotFoundError(
                f"2x upscaling is enabled but no SD model and HD model defined for slot {keybinding}. Expected at least one of SD or HD model to be specified with sd_model or hd_model in animejanai.conf.")
    ###new###
    if (clip.height < 2160 or clip.width < 3840) and container_fps < 100:
        run_animejanai(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                       resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                       resize_to_1080_before_second_2x, upscale_twice, use_rife)


def init():
    global config
    config = animejanai_v2_config.read_config()
    if config['global']['logging']:
        init_logger()


init()
import sys, os
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
import animejanai_v2
animejanai_v2.run_animejanai_with_keybinding(video_in, container_fps, 1)
[slot_1]
SD_model=2x_AnimeJaNai_Strong_V1_Compact_net_g_120000
HD_model=2x_AnimeJaNai_Strong_V1_UltraCompact_net_g_100000
SHD_model=2x_AnimeJaNai_Strong_V1_SuperUltraCompact_net_g_100000
resize_factor_before_first_2x=1
resize_height_before_first_2x=540
resize_720_to_1080_before_first_2x=no
upscale_2x=yes
upscale_4x=yes
resize_to_1080_before_second_2x=no
rife=no
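One subtlety worth noting with .conf files like the one above: if they are read with Python's configparser, plain indexing returns raw strings, so a value like "no" is truthy unless parsed explicitly. The real animejanai_v2_config module may handle this differently; this is just a hedged sketch of reading one slot with the standard library:

```python
import configparser

# A fragment mirroring the slot_1 section above.
conf_text = """
[slot_1]
upscale_2x=yes
upscale_4x=yes
rife=no
resize_height_before_first_2x=540
"""

parser = configparser.ConfigParser()
parser.read_string(conf_text)
slot = parser["slot_1"]

# getboolean/getint parse the strings; plain indexing returns raw text.
use_rife = slot.getboolean("rife")                        # False
do_upscale = slot.getboolean("upscale_2x")                # True
resize_h = slot.getint("resize_height_before_first_2x")   # 540
raw_rife = slot["rife"]                                   # "no" (a truthy string!)
```

This is why a check like `if config[section].get('rife', True):` can behave unexpectedly when the stored value is the string "no".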
I'm collecting benchmarks for different hardware configurations on the wiki here: https://github.com/the-database/mpv-upscale-2x_animejanai/wiki/Benchmarks
If you would like to contribute your benchmarks, please try downloading the prerelease archive 1.0.1 here: https://github.com/the-database/mpv-upscale-2x_animejanai/releases/tag/1.0.1
To run the benchmark, extract the release archive and then run the bat file at mpv-upscale-2x_animejanai\portable_config\shaders\animejanai_v2_benchmark_all.bat. If running the bat just opens and closes the console window immediately, an error has occurred. If that happens, please open a command window, navigate to the mpv-upscale-2x_animejanai\portable_config\shaders directory, and then run this command: .\animejanai_v2_benchmark_all.bat. Any errors that occurred should be printed to the console; please share them here.
Please add:
It works fine on my Intel Arc A750.
Thank you for your hard work :)
I'm using rife-cuda in mpv_lazy and wanted to see if I could get AnimeJaNai working alongside it. I'm totally new to using these machine learning models. Is the only way to do it to merge the two desired ONNX models? It's easy to apply anime4k shaders with RIFE so I'm hoping there is a solution to get this working too.
After starting mpv or mpv.net and loading a video over URL, it shows a black screen for more than a minute.
On normal mpv, this does not happen; the video starts instantly.
Is it because it has to load the model? Any way to speed up the process?
(RTX 3080, i7 8700K @ 5 GHz, SSD, Win11)
Is there a way to run these on a phone? Like, loading in android mpv player, etc?
Hello, when I play a video with the upscale feature, all other system sounds are muted. I want to talk on Discord while using the program, but when I start the video, I stop hearing it. Is there any way to hear the other programs besides the audio track of the video I'm playing?
Will it be possible to do this (in real time) with an AMD card in the near future?
I sold my 1070 Ti and I'm going to get a 6800 XT + 13600K. I was also considering the RTX 4070, but 12 GB of VRAM and a 192-bit bus made me look at AMD.
animejanai_v2_encode.vpy doesn't work, needs investigation
V2 models are being developed to address some feedback provided for the V1 models:
The primary goal of the V2 models is to produce results that appear as if the source was originally produced in 4K while faithfully retaining the original look as much as possible. This will be tracked by downscaling the upscaled results to the native resolution of the anime. The results should be difficult to distinguish from the original anime source. Performing this test on the V1 models more easily reveals the above oversharpening, line darkening, and loss of detail.
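That downscale test can be sketched with plain NumPy. This is only an illustrative harness: the box-filter downscale and PSNR metric are stand-ins I chose, not necessarily what is used to track the V2 models.

```python
import numpy as np

def box_downscale_2x(img: np.ndarray) -> np.ndarray:
    """Downscale an HxW float image 2x with a simple box filter."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]  # crop to even dimensions
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 1.0) -> float:
    mse = float(np.mean((a - b) ** 2))
    return float("inf") if mse == 0.0 else 10.0 * np.log10(peak ** 2 / mse)

# A faithful 2x upscale, downscaled back to the native resolution, should be
# nearly indistinguishable from the untouched source (very high PSNR).
src = np.random.default_rng(0).random((540, 960))
upscaled = np.repeat(np.repeat(src, 2, axis=0), 2, axis=1)  # trivial 2x stand-in
roundtrip = box_downscale_2x(upscaled)
```

Oversharpening or line darkening by a real model would show up here as a measurably lower round-trip score against the source.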
It's expected that V2 will not require Soft/Standard/Strong variants; there will simply be three models: V2 Compact, V2 UltraCompact, and V2 SuperUltraCompact.
As with V1, V2 will not be intended for use on low quality sources with heavy artifacting, as those artifacts will be preserved and upscaled. But with less oversharpening, V2 should perform better than V1 on low quality sources.
V2 will undoubtedly lose some sharpness when directly compared to V1. V1 models will remain available for those that prefer the extra sharp look.
Screenshots of progress on the V2 models are included in the following comments. Please note those screenshots are not final and the released V2 model may produce different results.
Hi there,
Setting up animejanai v3 on a system with AMD Radeon graphics can be challenging, especially for a newbie.
I'd like to request a lite version of animejanai v3 implemented as a GLSL shader for mpv.
This would allow users to use animejanai easily.
Example,
https://github.com/igv/FSRCNN-TensorFlow/releases/tag/1.1
Thank you.
The Powershell script should be deprecated in favor of releasing prepackaged archives for simpler installation. Two archives should be supplied - one as a full standalone mpv archive like mpv_lazy, and another which can be dropped into any existing mpv installation. TensorRT engines should be generated on the fly.
I use 2x_AnimeJaNai_Strong_V1_SuperUltraCompact_net_g_100000 on my GTX 1060 to upscale SD material to Full HD.
It would be nice to have the normal and soft variant of the model. Do you plan to train and release them?
Instead of
set video_settings=hevc_nvenc -preset slow -profile:v main10 -b:v 50M
use
set video_settings=hevc_nvenc -preset p7 -profile:v main10 -b:v 50M
Also works for h264_nvenc.
You can read more about the more modern nvenc presets here:
https://developer.nvidia.com/blog/introducing-video-codec-sdk-10-presets/
When attempting to build an engine, TensorRT throws errors and fails (trtexec log below). Nothing fixed it until I tried to downgrade TRT. From mpv-janai v2.0.2, I copied the entire vsmlrt-cuda folder into v3.0's vapoursynth64/plugins folder, overwriting the old one.
Then v3 finally generated the engine and I tested it as working, but trtexec still complained that because of INT64, the result might be less accurate. Is that true? Should I be worried about that?
Error log before I downgraded TRT (regular janai v3)
&&&& RUNNING TensorRT.trtexec [TensorRT v9200] # C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\..\vapoursynth64\plugins\vsmlrt-cuda\trtexec --fp16 --onnx=C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.onnx --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --skipInference --infStreams=4 --builderOptimizationLevel=4 --saveEngine=C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.engine --tacticSources=-CUDNN,-CUBLAS,-CUBLAS_LT
[05/19/2024-14:15:37] [I] === Model Options ===
[05/19/2024-14:15:37] [I] Format: ONNX
[05/19/2024-14:15:37] [I] Model: C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.onnx
[05/19/2024-14:15:37] [I] Output:
[05/19/2024-14:15:37] [I] === Build Options ===
[05/19/2024-14:15:37] [I] Max batch: explicit batch
[05/19/2024-14:15:37] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[05/19/2024-14:15:37] [I] minTiming: 1
[05/19/2024-14:15:37] [I] avgTiming: 8
[05/19/2024-14:15:37] [I] Precision: FP32+FP16
[05/19/2024-14:15:37] [I] LayerPrecisions:
[05/19/2024-14:15:37] [I] Layer Device Types:
[05/19/2024-14:15:37] [I] Calibration:
[05/19/2024-14:15:37] [I] Refit: Disabled
[05/19/2024-14:15:37] [I] Weightless: Disabled
[05/19/2024-14:15:37] [I] Version Compatible: Disabled
[05/19/2024-14:15:37] [I] ONNX Native InstanceNorm: Disabled
[05/19/2024-14:15:37] [I] TensorRT runtime: full
[05/19/2024-14:15:37] [I] Lean DLL Path:
[05/19/2024-14:15:37] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[05/19/2024-14:15:37] [I] Exclude Lean Runtime: Disabled
[05/19/2024-14:15:37] [I] Sparsity: Disabled
[05/19/2024-14:15:37] [I] Safe mode: Disabled
[05/19/2024-14:15:37] [I] Build DLA standalone loadable: Disabled
[05/19/2024-14:15:37] [I] Allow GPU fallback for DLA: Disabled
[05/19/2024-14:15:37] [I] DirectIO mode: Disabled
[05/19/2024-14:15:37] [I] Restricted mode: Disabled
[05/19/2024-14:15:37] [I] Skip inference: Enabled
[05/19/2024-14:15:37] [I] Save engine: C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.engine
[05/19/2024-14:15:37] [I] Load engine:
[05/19/2024-14:15:37] [I] Profiling verbosity: 0
[05/19/2024-14:15:37] [I] Tactic sources: cublas [OFF], cublasLt [OFF], cudnn [OFF],
[05/19/2024-14:15:37] [I] timingCacheMode: local
[05/19/2024-14:15:37] [I] timingCacheFile:
[05/19/2024-14:15:37] [I] Enable Compilation Cache: Enabled
[05/19/2024-14:15:37] [I] errorOnTimingCacheMiss: Disabled
[05/19/2024-14:15:37] [I] Heuristic: Disabled
[05/19/2024-14:15:37] [I] Preview Features: Use default preview flags.
[05/19/2024-14:15:37] [I] MaxAuxStreams: -1
[05/19/2024-14:15:37] [I] BuilderOptimizationLevel: 4
[05/19/2024-14:15:37] [I] Calibration Profile Index: 0
[05/19/2024-14:15:37] [I] Input(s)s format: fp32:CHW
[05/19/2024-14:15:37] [I] Output(s)s format: fp32:CHW
[05/19/2024-14:15:37] [I] Input build shape (profile 0): input=1x3x8x8+1x3x1080x1920+1x3x1080x1920
[05/19/2024-14:15:37] [I] Input calibration shapes: model
[05/19/2024-14:15:37] [I] === System Options ===
[05/19/2024-14:15:37] [I] Device: 0
[05/19/2024-14:15:37] [I] DLACore:
[05/19/2024-14:15:37] [I] Plugins:
[05/19/2024-14:15:37] [I] setPluginsToSerialize:
[05/19/2024-14:15:37] [I] dynamicPlugins:
[05/19/2024-14:15:37] [I] ignoreParsedPluginLibs: 0
[05/19/2024-14:15:37] [I]
[05/19/2024-14:15:37] [I] === Inference Options ===
[05/19/2024-14:15:37] [I] Batch: Explicit
[05/19/2024-14:15:37] [I] Input inference shape : input=1x3x1080x1920
[05/19/2024-14:15:37] [I] Iterations: 10
[05/19/2024-14:15:37] [I] Duration: 3s (+ 200ms warm up)
[05/19/2024-14:15:37] [I] Sleep time: 0ms
[05/19/2024-14:15:42] [I] Idle time: 0ms
[05/19/2024-14:15:42] [I] Inference Streams: 4
[05/19/2024-14:15:42] [I] ExposeDMA: Disabled
[05/19/2024-14:15:42] [I] Data transfers: Enabled
[05/19/2024-14:15:42] [I] Spin-wait: Disabled
[05/19/2024-14:15:42] [I] Multithreading: Disabled
[05/19/2024-14:15:42] [I] CUDA Graph: Disabled
[05/19/2024-14:15:42] [I] Separate profiling: Disabled
[05/19/2024-14:15:42] [I] Time Deserialize: Disabled
[05/19/2024-14:15:42] [I] Time Refit: Disabled
[05/19/2024-14:15:42] [I] NVTX verbosity: 0
[05/19/2024-14:15:42] [I] Persistent Cache Ratio: 0
[05/19/2024-14:15:42] [I] Optimization Profile Index: 0
[05/19/2024-14:15:42] [I] Inputs:
[05/19/2024-14:15:42] [I] === Reporting Options ===
[05/19/2024-14:15:42] [I] Verbose: Disabled
[05/19/2024-14:15:42] [I] Averages: 10 inferences
[05/19/2024-14:15:42] [I] Percentiles: 90,95,99
[05/19/2024-14:15:42] [I] Dump refittable layers:Disabled
[05/19/2024-14:15:42] [I] Dump output: Disabled
[05/19/2024-14:15:42] [I] Profile: Disabled
[05/19/2024-14:15:42] [I] Export timing to JSON file:
[05/19/2024-14:15:42] [I] Export output to JSON file:
[05/19/2024-14:15:42] [I] Export profile to JSON file:
[05/19/2024-14:15:42] [I]
[05/19/2024-14:15:42] [I] === Device Information ===
[05/19/2024-14:15:42] [I] Available Devices:
[05/19/2024-14:15:42] [I] Device 0: "NVIDIA GeForce RTX 3080" UUID: GPU-44b0b0ec-4a4a-2291-c949-ad5f2d47ac82
[05/19/2024-14:15:42] [I] Selected Device: NVIDIA GeForce RTX 3080
[05/19/2024-14:15:42] [I] Selected Device ID: 0
[05/19/2024-14:15:42] [I] Selected Device UUID: GPU-44b0b0ec-4a4a-2291-c949-ad5f2d47ac82
[05/19/2024-14:15:42] [I] Compute Capability: 8.6
[05/19/2024-14:15:42] [I] SMs: 68
[05/19/2024-14:15:42] [I] Device Global Memory: 10239 MiB
[05/19/2024-14:15:42] [I] Shared Memory per SM: 100 KiB
[05/19/2024-14:15:42] [I] Memory Bus Width: 320 bits (ECC disabled)
[05/19/2024-14:15:42] [I] Application Compute Clock Rate: 1.8 GHz
[05/19/2024-14:15:42] [I] Application Memory Clock Rate: 9.501 GHz
[05/19/2024-14:15:42] [I]
[05/19/2024-14:15:42] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[05/19/2024-14:15:42] [I]
[05/19/2024-14:15:42] [I] TensorRT version: 9.2.0
[05/19/2024-14:15:42] [I] Loading standard plugins
[05/19/2024-14:15:43] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 8080, GPU 1166 (MiB)
[05/19/2024-14:15:50] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +2726, GPU +312, now: CPU 11097, GPU 1478 (MiB)
[05/19/2024-14:15:50] [I] Start parsing network model.
[05/19/2024-14:15:50] [I] [TRT] ----------------------------------------------------------------
[05/19/2024-14:15:50] [I] [TRT] Input filename: C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.onnx
[05/19/2024-14:15:50] [I] [TRT] ONNX IR version: 0.0.7
[05/19/2024-14:15:50] [I] [TRT] Opset version: 14
[05/19/2024-14:15:50] [I] [TRT] Producer name: pytorch
[05/19/2024-14:15:50] [I] [TRT] Producer version: 2.1.2
[05/19/2024-14:15:50] [I] [TRT] Domain:
[05/19/2024-14:15:50] [I] [TRT] Model version: 0
[05/19/2024-14:15:50] [I] [TRT] Doc string:
[05/19/2024-14:15:50] [I] [TRT] ----------------------------------------------------------------
[05/19/2024-14:15:50] [I] Finished parsing network model. Parse time: 0.0284306
[05/19/2024-14:15:50] [I] Set shape of input tensor input for optimization profile 0 to: MIN=1x3x8x8 OPT=1x3x1080x1920 MAX=1x3x1080x1920
[05/19/2024-14:15:50] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[05/19/2024-14:15:53] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_lessZero = lt(/Conv_output_0', /PRelu_zero), name=/PRelu_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:15:53] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_lessZero = lt(/Conv_output_0', /PRelu_zero), name=/PRelu_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_1_lessZero = lt(/Conv_1_output_0', /PRelu_1_zero), name=/PRelu_1_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_1_lessZero = lt(/Conv_1_output_0', /PRelu_1_zero), name=/PRelu_1_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_2_lessZero = lt(/Conv_2_output_0', /PRelu_2_zero), name=/PRelu_2_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_2_lessZero = lt(/Conv_2_output_0', /PRelu_2_zero), name=/PRelu_2_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_3_lessZero = lt(/Conv_3_output_0', /PRelu_3_zero), name=/PRelu_3_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_3_lessZero = lt(/Conv_3_output_0', /PRelu_3_zero), name=/PRelu_3_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_4_lessZero = lt(/Conv_4_output_0', /PRelu_4_zero), name=/PRelu_4_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_4_lessZero = lt(/Conv_4_output_0', /PRelu_4_zero), name=/PRelu_4_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_5_lessZero = lt(/Conv_5_output_0', /PRelu_5_zero), name=/PRelu_5_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_5_lessZero = lt(/Conv_5_output_0', /PRelu_5_zero), name=/PRelu_5_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_6_lessZero = lt(/Conv_6_output_0', /PRelu_6_zero), name=/PRelu_6_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_6_lessZero = lt(/Conv_6_output_0', /PRelu_6_zero), name=/PRelu_6_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_7_lessZero = lt(/Conv_7_output_0', /PRelu_7_zero), name=/PRelu_7_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_7_lessZero = lt(/Conv_7_output_0', /PRelu_7_zero), name=/PRelu_7_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_8_lessZero = lt(/Conv_8_output_0', /PRelu_8_zero), name=/PRelu_8_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_8_lessZero = lt(/Conv_8_output_0', /PRelu_8_zero), name=/PRelu_8_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:10] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[05/19/2024-14:16:11] [I] [TRT] Total Host Persistent Memory: 56032
[05/19/2024-14:16:11] [I] [TRT] Total Device Persistent Memory: 0
[05/19/2024-14:16:11] [I] [TRT] Total Scratch Memory: 4608
[05/19/2024-14:16:11] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 40 steps to complete.
[05/19/2024-14:16:11] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 0.6522ms to assign 4 blocks to 40 nodes requiring 1111454208 bytes.
[05/19/2024-14:16:11] [I] [TRT] Total Activation Memory: 1111454208
[05/19/2024-14:16:11] [I] [TRT] Total Weights Memory: 670720
[05/19/2024-14:16:11] [I] [TRT] Engine generation completed in 21.705 seconds.
[05/19/2024-14:16:11] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
Works great, but can anything be done to improve seek performance?
Running a 4090 + i7 12700K.
Is filling the buffers the cause of the pause on seek? If so, can they be reduced?