Releases: Phhofm/models
4xmssim_realplksr_dysample_pretrain
Scale: 4
Architecture: RealPLKSR with Dysample
Architecture Option: realplksr
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Pretrained
Subject: Photography
Input Type: Images
Release Date: 27.06.2024
Dataset: nomosv2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: None (=From Scratch)
Iterations: 200'000
Batch Size: 8
GT Size: 192, 512
Description: DySample was recently added to RealPLKSR; from what I have seen, it can resolve or help avoid the checkerboard / grid pattern on inference outputs. With the neosr commits from three days ago (24.06.24), I wanted to create a 4x photo pretrain that I can then use to train more RealPLKSR models with DySample, specifically to stabilize the beginning of training.
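As an aside on why an upsampler choice matters here: a quick way to see where checkerboard artifacts come from is to count how many kernel taps land on each output position of a stride-2 transposed convolution versus an upsample-then-convolve layout. The sketch below is a hypothetical 1-D toy illustration of that overlap argument, not DySample's actual implementation:

```python
import numpy as np

def transposed_conv_coverage(out_len=12, stride=2, k=3):
    """Count how many kernel taps contribute to each output position
    of a 1-D stride-2 transposed convolution (toy model)."""
    cov = np.zeros(out_len)
    for i in range(out_len // stride):   # each input writes a k-wide window
        cov[i * stride:i * stride + k] += 1
    return cov

def upsample_conv_coverage(out_len=12, k=3):
    """After nearest-neighbour upsampling, a plain stride-1 convolution
    gives every interior output position the same number of taps."""
    return np.array([min(i + k // 2, out_len - 1) - max(i - k // 2, 0) + 1
                     for i in range(out_len)])

print(transposed_conv_coverage())  # alternating 1/2 coverage -> grid tendency
print(upsample_conv_coverage())    # uniform interior coverage -> no grid
```

The uneven alternating coverage of the transposed-conv path is the classic source of the grid pattern; content-aware, interpolation-style upsamplers sidestep that uneven kernel overlap.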
4xTextures_GTAV_rgt-s_dither
Scale: 4
Architecture: RGT
Architecture Option: RGT-S
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Game Textures
Input Type: Images
Release Date: 08.05.2024
Dataset: GTAV_512_Textures
Dataset Size: 7061
OTF (on the fly augmentations): No
Pretrained Model: 4xTextures_GTAV_rgt-s
Iterations: 128'000
Batch Size: 6, 4
GT Size: 128, 256
Description: A model to upscale game textures, trained on GTAV textures. It handles JPG compression down to quality 80 and was trained with dithering; essentially the previous 4xTextures_GTAV_rgt-s model, extended to handle dithering.
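Dithering degradations of this kind are straightforward to reproduce when preparing such a dataset. Below is a minimal, hypothetical sketch of Floyd-Steinberg error-diffusion dithering in NumPy; the function name and the 4-level palette are my own illustrative choices, not the settings actually used for this model:

```python
import numpy as np

def floyd_steinberg(img, levels=4):
    """Quantize a grayscale float image (0..1) to `levels` values,
    diffusing each pixel's quantization error to its neighbours.
    Hypothetical helper for illustration only."""
    img = img.astype(np.float64).copy()
    h, w = img.shape
    step = levels - 1
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = np.clip(np.round(old * step) / step, 0.0, 1.0)
            img[y, x] = new
            err = old - new
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16       # right
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16  # below-left
                img[y + 1, x] += err * 5 / 16          # below
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16  # below-right
    return img

gradient = np.tile(np.linspace(0, 1, 64), (16, 1))  # smooth test ramp
dithered = floyd_steinberg(gradient, levels=4)      # banded but error-diffused
```

Applying something like this to the low-resolution training inputs teaches the model to smooth dither patterns back out.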
Showcase:
Slow Pics 25 Examples
4xNomosWebPhoto_esrgan
Scale: 4
Architecture: ESRGAN
Architecture Option: esrgan
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Photography
Input Type: Images
Release Date: 16.06.2024
Dataset: Nomos-v2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: RealESRGAN_x4plus
Iterations: 210'000
Batch Size: 12
GT Size: 256
Description:
4x ESRGAN model for photography, trained with realistic noise, lens blur, and JPG and WebP re-compression.
The ESRGAN version of 4xNomosWebPhoto_RealPLKSR, trained on the same dataset and in the same way. For more information, see the 4xNomosWebPhoto_RealPLKSR release and the PDF file in its attachments.
Showcase:
Slow Pics 6 Examples
4xNomosWebPhoto_atd
Scale: 4
Architecture: ATD
Architecture Option: atd
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Photography
Input Type: Images
Release Date: 07.06.2024
Dataset: Nomos-v2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: 003_ATD_SRx4_finetune.pth
Iterations: 460'000
Batch Size: 6, 2
GT Size: 128, 192
Description:
4x ATD model for photography, trained with realistic noise, lens blur, and JPG and WebP re-compression.
The ATD version of 4xNomosWebPhoto_RealPLKSR, trained on the same dataset and in the same way. For more information, see the 4xNomosWebPhoto_RealPLKSR release and the PDF file in its attachments.
Showcase:
Slow Pics 18 Examples
4xNomosWebPhoto_RealPLKSR
Scale: 4
Architecture: RealPLKSR
Architecture Option: realplksr
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Photography
Input Type: Images
Release Date: 28.05.2024
Dataset: Nomos-v2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: 4x_realplksr_gan_pretrain
Iterations: 404'000, 445'000
Batch Size: 12, 4
GT Size: 128, 256, 512
Description:
short: 4x RealPLKSR model for photography, trained with realistic noise, lens blur, and JPG and WebP re-compression.
full: The newest version of my RealWebPhoto series; this time I used the newly released Nomos-v2 dataset by musl.
I then made 12 different low-resolution degraded folders, using Kim's datasetdestroyer for scaling and compression, my ludvae200 model for realistic noise, and umzi's wtp_dataset_destroyer with its floating-point lens blur implementation, since I needed to control the lens blur strength more precisely.
I then mixed them together into a single LR folder and trained for 460'000 iterations. After checking the results, and on Kim's suggestion to use interpolation, I tested and am releasing this interpolation between the 404'000 and 445'000 checkpoints.
This model was trained on neosr using mixup, cutmix, resizemix, cutblur, nadam, unet, multisteplr, and the mssim, perceptual, gan, dists, ldl, ff, color and luma losses, and was interpolated using the current chaiNNer nightly version.
The model took multiple retrainings and reworks of the dataset until I was satisfied enough with the quality to release this version.
For more details on the whole process, see the PDF file in the attachments.
I am also attaching the 404'000, 445'000 and 460'000 checkpoints for completeness.
PS: In general, degradation strengths have been reduced/adjusted compared to my previous RealWebPhoto models.
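The checkpoint interpolation mentioned above (performed in chaiNNer) amounts to a per-parameter weighted average of two checkpoints of the same architecture. A minimal sketch with toy stand-in state dicts; the 0.5 blend factor here is illustrative, not necessarily the ratio used for this release:

```python
import numpy as np

def interpolate_checkpoints(state_a, state_b, alpha=0.5):
    """Blend two checkpoints of the same architecture:
    w = (1 - alpha) * a + alpha * b for every parameter tensor."""
    assert state_a.keys() == state_b.keys(), "architectures must match"
    return {k: (1.0 - alpha) * state_a[k] + alpha * state_b[k]
            for k in state_a}

# toy "state dicts" standing in for the 404'000 / 445'000 checkpoints
ckpt_404 = {"conv.weight": np.full((3, 3), 1.0), "conv.bias": np.zeros(3)}
ckpt_445 = {"conv.weight": np.full((3, 3), 3.0), "conv.bias": np.ones(3)}
merged = interpolate_checkpoints(ckpt_404, ckpt_445, alpha=0.5)
```

Averaging nearby checkpoints of the same training run often smooths out per-checkpoint quirks while keeping the shared behaviour, which is why it is worth testing before picking a single iteration.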
Showcase:
Slow Pics 10 Examples
4xNomos2_otf_esrgan
Scale: 4
Architecture: ESRGAN
Architecture Option: esrgan
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Photography
Input Type: Images
Release Date: 22.06.2024
Dataset: Nomos-v2
Dataset Size: 6000
OTF (on the fly augmentations): Yes
Pretrained Model: RealESRGAN_x4plus
Iterations: 246'000
Batch Size: 8
GT Size: 256
Description:
4x ESRGAN model for photography, trained using the Real-ESRGAN otf degradation pipeline.
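For illustration, here is a greatly simplified stand-in for one on-the-fly degradation step. The real Real-ESRGAN pipeline randomizes blur kernels, noise types, resize modes, and JPEG quality per batch; this toy version only area-downscales and adds Gaussian noise, and the helper name is my own:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_otf_degrade(hr, scale=4, noise_sigma=0.02):
    """Toy on-the-fly degradation: area-average downscale
    (which also acts as a mild blur) followed by gaussian noise.
    Hypothetical helper, not part of neosr or Real-ESRGAN."""
    h, w = hr.shape
    lr = hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    lr = lr + rng.normal(0.0, noise_sigma, lr.shape)
    return np.clip(lr, 0.0, 1.0)

hr = rng.random((64, 64))   # stand-in for a ground-truth crop
lr = toy_otf_degrade(hr)    # 16x16 degraded low-resolution input
```

Because the degradations are re-sampled every batch, the network never sees the same LR twice, which is the main appeal of OTF training over a fixed pre-degraded dataset.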
Showcase:
Slow Pics 8 Examples
4xTextures_GTAV_rgt-s
Scale: 4
Architecture: RGT
Architecture Option: RGT-S
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Game Textures
Input Type: Images
Release Date: 04.05.2024
Dataset: GTAV_512_Textures
Dataset Size: 8492
OTF (on the fly augmentations): No
Pretrained Model: RGT_S_x4
Iterations: 165'000
Batch Size: 6, 4
GT Size: 128, 256
Description: A model to upscale game textures, trained on GTAV textures; it handles JPG compression down to quality 80.
Showcase:
Slow Pics
4xRealWebPhoto_v4_drct-l
Scale: 4
Architecture: DRCT
Architecture Option: DRCT-L
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Realistic, Photography
Input Type: Images
Release Date: 02.05.2024
Dataset: 4xRealWebPhoto_v4
Dataset Size: 8492
OTF (on the fly augmentations): No
Pretrained Model: 4xmssim_drct-l_pretrain
Iterations: 260'000
Batch Size: 6, 4
GT Size: 128, 192
Description: The first real-world DRCT model, so I am releasing it, or at least my attempt at one; maybe others will be able to get better results than me. If a real-world model for upscaling photos downloaded from the web is desired, I would recommend my 4xRealWebPhoto_v3_atd model over this one.
This model is based on my previously released DRCT pretrain. I used the mixup, cutmix and resizemix augmentations, and the mssim, perceptual, gan, dists, focalfrequency, gradvar, ldl, color and luma losses.
Showcase:
Slow.pics
4xDRCT-mssim-pretrains
Since no DRCT model releases exist yet, I am releasing my DRCT-S, DRCT and DRCT-L pretrains.
These can be used to speed up and stabilize the early stages of training new DRCT models.
They were trained with mssim (plus augmentations, and the color and luma losses) on downscaled nomos_uni.
Training files are also provided additionally.
4xmssim_drct-s_pretrain
Scale: 4
Architecture: DRCT
Architecture Option: DRCT-S
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Pretrained
Subject: Anime, Photography
Input Type: Images
Release Date: 28.04.2024
Dataset: nomos_uni
Dataset Size: 2989
OTF (on the fly augmentations): No
Pretrained Model: None (=From Scratch)
Iterations: 75'000
Batch Size: 6
GT Size: 128-320
Description: A pretrain to start DRCT-S model training. Since there are no officially released DRCT pretrains yet, I trained these myself and am releasing them here.
4xmssim_drct_pretrain
Scale: 4
Architecture: DRCT
Architecture Option: DRCT
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Pretrained
Subject: Anime, Photography
Input Type: Images
Release Date: 28.04.2024
Dataset: nomos_uni
Dataset Size: 2989
OTF (on the fly augmentations): No
Pretrained Model: None (=From Scratch)
Iterations: 95'000
Batch Size: 2-6
GT Size: 128-320
Description: A pretrain to start DRCT model training. Since there are no officially released DRCT pretrains yet, I trained these myself and am releasing them here.
4xmssim_drct-l_pretrain
Scale: 4
Architecture: DRCT
Architecture Option: DRCT-L
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Pretrained
Subject: Anime, Photography
Input Type: Images
Release Date: 28.04.2024
Dataset: nomos_uni
Dataset Size: 2989
OTF (on the fly augmentations): No
Pretrained Model: None (=From Scratch)
Iterations: 108'000
Batch Size: 2-6
GT Size: 128-320
Description: A pretrain to start DRCT-L model training. Since there are no officially released DRCT pretrains yet, I trained these myself and am releasing them here.
4xTextureDAT2_otf
Name: 4xTextureDAT2_otf
Author: Philip Hofmann
Release: 13.12.2023
License: CC BY 4.0
Network: DAT
Arch Option: DAT2
Scale: 4
Purpose: 4x texture image upscaler, handling jpg compression, some noise and slight blur
Iterations: 125'000
batch_size: 6
HR_size: 128
Dataset: GTAV_512_Dataset
Number of train images: 30122
OTF Training: Yes
Pretrained_Model_G: DAT_2_x4
Description: 4x upscaling for texture images, trained with the Real-ESRGAN OTF pipeline, so it handles JPG compression, some noise and slight blur.