GitHub: https://github.com/KyokuSai/HentaiSR/

HentaiSR

To date, no publicly available image super-resolution model has been designed specifically for hentai.
Buying hentai DVDs and WEB sources costs a great deal of money, which makes it very hard to assemble a training set that is both large and of high quality.
Through a series of fortunate coincidences we happen to have the means to do so, and we plan to train one (or more) dedicated models.

First, what does "designed for hentai" mean? Isn't a model trained on general anime good enough?
In theory, the closer the training material is to the actual use case, the better the results. A model trained on real-world photographs will inevitably disappoint when it is used to upscale anime.
By the same logic, a model trained on extreme material, such as anime with very poor image quality, does not meet our needs either: we are not trying to forcibly restore decades-old antiques, nor to "convert" anime into a high-resolution version of itself.
In addition, issues specific to hentai sources, such as DVD artifacts and the flaws of certain (for example low-bitrate) WEB sources, can also affect a model's performance.

We therefore selected a number of hentai titles from recent years. For the high-resolution (HR) images we used high-quality 1080p or 720p WEB-DLs, and for the low-resolution (LR) images we used the corresponding low-resolution WEB-DLs or DVD ISOs. In other words, we did not generate LR from HR; we used real low-quality sources of the kind the model will ultimately be applied to.
We resolved frame misalignment between the sources with a sufficiently rigorous procedure, corrected spatial misalignment between images with a combination of manual work and AI, and repaired some errors in the sources themselves.
Finally, we manually filtered the large number of extracted images.
Using the WEB-DL directly as HR did not give good results, because WEB-DLs themselves contain many artifacts. The HR images we ultimately used therefore underwent some additional processing that we are keeping secret.
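
The source does not describe how frame misalignment was resolved. As an illustration only, one common way to line up two releases of the same episode is to reduce every frame to a cheap signature (assumed here to be its mean luma) and search for the constant shift at which the two signature sequences agree best. The function name, the choice of signature, and the `max_shift` and `min_overlap` parameters below are all hypothetical, and this sketch ignores dropped or duplicated frames, which real sources may also contain.

```python
from typing import Optional, Sequence


def estimate_frame_offset(
    hr_sig: Sequence[float],
    lr_sig: Sequence[float],
    max_shift: int = 48,
    min_overlap: Optional[int] = None,
) -> int:
    """Return the shift s for which lr_sig[i + s] best matches hr_sig[i].

    hr_sig / lr_sig are per-frame signatures (assumed: mean luma per frame).
    The cost of a shift is the mean absolute difference over the overlap.
    Returns 0 if no candidate shift has enough overlapping frames.
    """
    if min_overlap is None:
        # Hypothetical default: require half of the shorter clip to overlap,
        # so that tiny overlaps cannot win with a spuriously low cost.
        min_overlap = max(1, min(len(hr_sig), len(lr_sig)) // 2)

    best_shift, best_cost = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        # HR frame i is paired with LR frame i + shift; keep both in range.
        start = max(0, -shift)
        stop = min(len(hr_sig), len(lr_sig) - shift)
        overlap = stop - start
        if overlap < min_overlap:
            continue
        cost = sum(
            abs(hr_sig[i] - lr_sig[i + shift]) for i in range(start, stop)
        ) / overlap
        if cost < best_cost:
            best_cost, best_shift = cost, shift
    return best_shift


# Usage: the LR source has three extra leading frames (e.g. a logo).
hr = [10, 12, 50, 90, 30, 31, 75, 20, 64, 5]
lr = [0, 0, 0] + hr
shift = estimate_frame_offset(hr, lr, max_shift=5)
pairs = [(i, i + shift) for i in range(len(hr)) if 0 <= i + shift < len(lr)]
```

Once the shift is known, `pairs` lists which HR frame index goes with which LR frame index. Spatial alignment within each pair is a separate step that this sketch does not attempt.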

Comparison

Point: nearest-neighbor interpolation
RealESRGAN_animevideo_xsx2
CUGAN_pro_alpha1.0: CUGAN pro with alpha 1.0
HentaiSR_V0
*WEB: WEB-DL, ideal output
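
For readers unfamiliar with the "Point" baseline, the sketch below shows what nearest-neighbor upscaling by an integer factor does: every source pixel is simply repeated, so no new detail is created. The function name `upscale_nearest` and the nested-list image representation are assumptions made for illustration and are not taken from the comparison tooling.

```python
from typing import List, Sequence, TypeVar

Pixel = TypeVar("Pixel")


def upscale_nearest(
    image: Sequence[Sequence[Pixel]], scale: int
) -> List[List[Pixel]]:
    """Upscale a row-major image by an integer factor with pixel repetition."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    out: List[List[Pixel]] = []
    for row in image:
        # Repeat every pixel `scale` times horizontally ...
        wide = [px for px in row for _ in range(scale)]
        # ... then repeat the widened row `scale` times vertically.
        out.extend(list(wide) for _ in range(scale))
    return out


small = [[1, 2],
         [3, 4]]
big = upscale_nearest(small, 2)
# big is a 4x4 image in which each source pixel covers a 2x2 block.
```

Because the output is only a blocky copy of the input, this row of the comparison serves as a reference for what the LR source itself contains.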

Looking at background characters and objects, you can see that HentaiSR preserves depth of field and blur, while for the characters in focus its output is the closest to the WEB-DL.
Compared with the WEB-DL, HentaiSR may be slightly sharper, but not to the point of causing visible smearing.
HentaiSR also repairs some source artifacts, such as aliasing and halos.
One more point, which may or may not matter: when upscaling images that contain mosaics, HentaiSR does not accentuate the mosaic, which would otherwise look very ugly.

This set of comparison images may look odd at first, but quite a few details become noticeable once it is enlarged:
The wood grain on the desk is almost entirely preserved by HentaiSR and is clearer, with some noise reduction, while the other two models destroy the grain almost completely.
The fine strands of the rope are difficult to restore. HentaiSR does its best to restore them without over-smearing the lines, whereas the other two models smear them considerably.
HentaiSR is comparatively non-destructive and does not heavily alter the look of the picture.

Others

In short, this is our first attempt at training a model, so both the dataset processing and the training parameters can certainly still be improved.
The initial version is named "HentaiSR_V0" precisely because the model is not yet mature, and a new version may follow in a few months.
Some later models might not be released publicly, although this is unlikely. As for the dataset, it will not be made public in the foreseeable future.