
Releases: Phhofm/models

Nature Dataset

13 Aug 09:15
196f42d

This is a curated version of the iNaturalist 2017 dataset for the purpose of training single image super resolution models. The original dataset consists of 675'170 images and is 200GB in size.

There is a small version that consists of 3000 images of 512x512px and can be used to train lightweight networks such as compact or SPAN. The average hyperiqa score of hr_small (3000 images) is 0.767434819261233.

There is also a medium version that consists of 7000 images of 512x512px and can be used to train medium or heavy networks such as RealPLKSR or RGT/DAT/ATD. The average hyperiqa score of hr_medium (7000 images) is 0.754073106459209.

The HR folder, the LRx2 and LRx4 folders, and a validation folder are provided in the Assets as zip files.

I will list the changes I applied (or simply what I did) below:

For the HR folder, I

  • moved all images into the same folder
  • removed all files that were smaller than 300kB -> 240'833 images left (from 675'170)
  • tiled to 512x512
  • hyperiqa scored all of them and removed all that were below 0.7 -> 32'499 images left, 18GB in size
  • checked all images for visual similarity and removed duplicates
  • removed a lot of human hand photos (too many human hands)
  • made a small version with 3k images that can be used for training lightweight sisr networks.
  • made a medium version with images that can be used for training medium/heavy sisr networks.
  • normalized filenames
  • oxipng -o 4 --strip safe --alpha *.png
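
A rough sketch of the size-filtering, 512x512 tiling and hyperiqa-scoring steps above. It assumes the pyiqa package (IQA-PyTorch), which to my knowledge ships a "hyperiqa" metric; the folder names and the non-overlapping tiling are placeholders, not the exact tooling I used:

```python
from pathlib import Path

from PIL import Image
import pyiqa  # assumption: IQA-PyTorch, which to my knowledge provides a "hyperiqa" metric

SRC = Path("inaturalist_raw")   # placeholder input folder
DST = Path("hr_tiles")          # placeholder output folder
DST.mkdir(parents=True, exist_ok=True)
TILE = 512

# 1) drop files smaller than 300kB, 2) tile everything to non-overlapping 512x512 crops
for path in SRC.rglob("*.jpg"):
    if path.stat().st_size < 300 * 1024:
        continue
    img = Image.open(path).convert("RGB")
    w, h = img.size
    for y in range(0, h - TILE + 1, TILE):
        for x in range(0, w - TILE + 1, TILE):
            img.crop((x, y, x + TILE, y + TILE)).save(DST / f"{path.stem}_{y}_{x}.png")

# 3) hyperiqa-score every tile and remove those below 0.7
metric = pyiqa.create_metric("hyperiqa")
kept = []
for tile in sorted(DST.glob("*.png")):
    score = float(metric(str(tile)))
    if score < 0.7:
        tile.unlink()
    else:
        kept.append(score)
if kept:
    print(f"{len(kept)} tiles kept, average hyperiqa score {sum(kept) / len(kept):.4f}")
```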

For the LRx4 folder, I took the HR folder and applied

  • scaling with randomized down_up (range 0.75, 1.5), linear, cubic_mitchell, lanczos, gauss and box
  • slight randomized gaussian blurring
  • randomized jpg compression with quality 75 - 100
  • oxipng -o 4 --strip safe --alpha *.png

The same approach was used for the LRx2 folder; a simplified sketch of this degradation pass is shown below.
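
A simplified sketch of such an LR degradation pass for one scale, assuming Pillow. Pillow has no exact cubic_mitchell or gauss filter, so its standard resampling filters stand in for the list above, and the parameter values are illustrative rather than the exact settings used:

```python
import random
from io import BytesIO
from pathlib import Path

from PIL import Image, ImageFilter

HR_DIR = Path("hr")        # placeholder paths
LR_DIR = Path("lr_x4")
LR_DIR.mkdir(parents=True, exist_ok=True)
SCALE = 4
# Pillow stand-ins for the listed filters (no exact cubic_mitchell/gauss kernels in Pillow)
FILTERS = [Image.Resampling.BILINEAR, Image.Resampling.BICUBIC,
           Image.Resampling.LANCZOS, Image.Resampling.BOX]

for path in sorted(HR_DIR.glob("*.png")):
    img = Image.open(path).convert("RGB")
    w, h = img.size

    # randomized down_up: rescale by a factor in [0.75, 1.5] before the final downscale
    if random.random() < 0.5:
        f = random.uniform(0.75, 1.5)
        img = img.resize((round(w * f), round(h * f)), random.choice(FILTERS))

    # slight randomized gaussian blur
    img = img.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.0, 1.0)))

    # final downscale to the LR size with a randomly chosen filter
    img = img.resize((w // SCALE, h // SCALE), random.choice(FILTERS))

    # randomized jpg compression with quality 75-100, stored back as png
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=random.randint(75, 100))
    buf.seek(0)
    Image.open(buf).convert("RGB").save(LR_DIR / path.name)
```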

The corresponding zip files are in the Assets below. Since GitHub's file size limit for release assets is 2GB, the HR_medium archive was split into 2 files.

Example of HR images from the dataset:
Example1
Example2

Example of bad images removed from the original iNaturalist 2017 dataset:
Example_bad

The small HR folder:
smallversion

4xNature_realplksr_dysample

13 Aug 09:15
196f42d

Scale: 4
Architecture: RealPLKSR with Dysample
Architecture Option: realplksr

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Realistic
Input Type: Images
Release Date: 13.08.2024

Dataset: Nature
Dataset Size: 7'000
OTF (on the fly augmentations): No
Pretrained Model: 4xNomos2_realplksr_dysample
Iterations: 265'000
Batch Size: 8
Patch Size: 64

Description:
A Dysample RealPLKSR 4x upscaling model for nature photographs (animals, plants).
LR prepared with down_up, linear, cubic_mitchell, lanczos, gauss and box scaling with some gaussian blur and jpg compression down to 75 (as released with my dataset, the LRx4 folder).
Trained with dysample, ea2fpn, ema, eco, adan_sf, mssim, perceptual, color, luma, dists, ldl and ff (see config toml file).
Based on my Nature Dataset which is a curated version of the iNaturalist 2017 Dataset for the purpose of training single image super resolution models.

Use the 4xNature_realplksr_dysample.pth file for inference. Also provided is a static onnx conversion with a 3x256x256 input. Config, state, and net_d files are additionally provided for trainers, to maybe create an improved version 2 of this model or to train a similar model from this state.
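
For reference, a minimal sketch of running the static onnx conversion with onnxruntime; the file names, the NCHW float32 layout and the [0, 1] value range are assumptions, so adjust to the actual export:

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

# placeholder file names; the conversion is static, so the input patch must be 256x256
sess = ort.InferenceSession("4xNature_realplksr_dysample.onnx",
                            providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0].name

img = Image.open("lr_patch_256.png").convert("RGB")            # a 256x256 LR patch
x = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0

y = sess.run(None, {inp: x})[0]                                # expected (1, 3, 1024, 1024) for 4x
out = (y[0].transpose(1, 2, 0).clip(0, 1) * 255).round().astype(np.uint8)
Image.fromarray(out).save("sr_patch.png")
```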

Showcase:
(Click Image to enlarge)
Example1
Example2
Example3
Example4
Example5

1xDeNoise_realplksr_otf

08 Aug 15:16
196f42d

Scale: 1
Architecture: RealPLKSR
Architecture Option: realplksr

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration, Denoise
Subject: Photography
Input Type: Images
Release Date: 08.08.2024

Dataset: nomosv2
Dataset Size: 6000
OTF (on the fly augmentations): Yes
Pretrained Model: 1xDeJPG_realplksr_otf
Iterations: 200'000
Batch Size: 8
Patch Size: 64

Description:
A 1x realplksr model to denoise, trained with the realesrgan-otf pipeline. It also handles a bit of jpg compression (if stronger jpg compression handling is needed, 1xDeJPG_realplksr_otf can be used).

Showcase:
(Click on image for better view)
Example1
Example2
Example3
Example4

1xDeH264_realplksr

08 Aug 10:19
196f42d

Scale: 1
Architecture: RealPLKSR
Architecture Option: realplksr

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration, H264
Subject: Photography
Input Type: Images
Release Date: 08.08.2024

Dataset: nomosv2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: 1xDeJPG_realplksr_otf
Iterations: 210'000
Batch Size: 8
Patch Size: 64

Description:
A 1x DeH264 model to remove h264 compression artifacts.

Showcase:
Slowpics
Imgsli

(Click on image for better view)
Example1
Example2
Example3
Example4

ArtFaces Dataset

08 Aug 08:40
3c76034

This is a curated version of the metfaces dataset for the purpose of training single image super resolution models.
It consists of 5630 images of 512x512px.
The HR folder, the LRx2 and LRx4 folders, and a validation folder are provided in the Assets as zip files.

I will list the changes I applied (or simply what I did) below:

For the HR folder, I

  • applied the multi-scale strategy, but with scales 1 and 0.5
  • cropped to sub-images -> all images are now 512x512
  • checked all images for visual similarity
  • hyperiqa scoring; average score with 6992 images is: 0.425619977407021
  • removed all images that scored lower than 0.3; the zip now also fits GitHub's 2GB file size limit for release Assets
  • hyperiqa scoring; average new score with 5630 images is: 0.4653959208652774
  • extracted val images from the previously removed images
  • normalized filenames
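
A rough sketch of the multi-scale (1 and 0.5) plus 512x512 sub-image cropping steps above, assuming Pillow; the folder names are placeholders, and the similarity check and hyperiqa scoring are left out:

```python
from pathlib import Path

from PIL import Image

SRC = Path("metfaces")        # placeholder input folder
DST = Path("artfaces_hr")     # placeholder output folder
DST.mkdir(parents=True, exist_ok=True)
TILE = 512
SCALES = [1.0, 0.5]           # the multi-scale strategy above: scale 1 and 0.5

for path in sorted(SRC.glob("*.png")):
    src = Image.open(path).convert("RGB")
    for s in SCALES:
        img = src if s == 1.0 else src.resize(
            (round(src.width * s), round(src.height * s)), Image.Resampling.LANCZOS)
        # crop into non-overlapping 512x512 sub-images
        for y in range(0, img.height - TILE + 1, TILE):
            for x in range(0, img.width - TILE + 1, TILE):
                img.crop((x, y, x + TILE, y + TILE)).save(
                    DST / f"{path.stem}_s{int(s * 100)}_{y}_{x}.png")
```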

For the LRx4 folder, I took the HR folder and applied

  • scaling with down_up (range 0.75, 1.5), linear, cubic_mitchell, lanczos, gauss and box
  • slight randomized gaussian blurring
  • randomized jpg compression with quality 75 - 100

The same approach was used for the LRx2 folder.

The corresponding zip files are in the Assets below.

Example of HR images:
image

4xArtFaces_realplksr_dysample

08 Aug 08:47
3c76034

Scale: 4
Architecture: RealPLKSR with Dysample
Architecture Option: realplksr

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Art
Input Type: Images
Release Date: 08.08.2024

Dataset: ArtFaces
Dataset Size: 5'630
OTF (on the fly augmentations): No
Pretrained Model: 4xNomos2_realplksr_dysample
Iterations: 139'000
Batch Size: 6
Patch Size: 64

Description:
A Dysample RealPLKSR 4x upscaling model for art / painted faces.
Based on my ArtFaces Dataset which is a curated version of the metfaces dataset for the purpose of training single image super resolution models.

Showcase:
(Click Image to enlarge)
Example1
Example2
Example3
Example4
Example5

4xmssim_hma_pretrains

19 Jul 13:41
a579cef

Since no official HMA model releases exist yet, I am releasing my hma and hma_medium mssim pretrains.
These can be used to speed up and stabilize early training stages when training new hma models.
Trained with mssim on nomosv2.
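
As a sketch of how such a pretrain could be loaded outside of a training framework (in neosr you would normally just point the pretrain path of your config at the file), assuming PyTorch and the common "params_ema"/"params" key layout of SR releases:

```python
import torch

def load_pretrain(net: torch.nn.Module, path: str = "4xmssim_hma_pretrain.pth") -> None:
    """Copy the pretrain weights into a freshly built HMA network with matching options."""
    state = torch.load(path, map_location="cpu")
    # SR releases commonly nest weights under "params_ema" or "params" (an assumption here)
    for key in ("params_ema", "params"):
        if isinstance(state, dict) and key in state:
            state = state[key]
            break
    missing, unexpected = net.load_state_dict(state, strict=False)
    print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")

# usage, assuming build_hma() constructs an HMA network with the same architecture options:
# load_pretrain(build_hma())
```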

4xmssim_hma_pretrain

Scale: 4
Architecture: HMA
Architecture Option: hma

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Pretrained
Subject: Photography
Input Type: Images
Release Date: 19.07.2024

Dataset: nomosv2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: None (=From Scratch)
Iterations: 205'000
Batch Size: 4
Patch Size: 96

Description: A pretrain to start hma model training.


4xmssim_hma_medium_pretrain

Scale: 4
Architecture: HMA
Architecture Option: hma_medium

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Pretrained
Subject: Photography
Input Type: Images
Release Date: 19.07.2024

Dataset: nomosv2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: None (=From Scratch)
Iterations: 150'000
Batch Size: 4
Patch Size: 48

Description: A pretrain to start hma_medium model training.


Showcase:

slow.pics

Example1
Example2
Example3
Example4

4xHFA2k_ludvae_realplksr_dysample

13 Jul 11:11
258a27a

Scale: 4
Architecture: RealPLKSR with Dysample
Architecture Option: realplksr

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Restoration
Subject: Anime
Input Type: Images
Release Date: 13.07.2024

Dataset: HFA2k_LUDVAE
Dataset Size: 10'272
OTF (on the fly augmentations): No
Pretrained Model: 4xNomos2_realplksr_dysample
Iterations: 165'000
Batch Size: 12
GT Size: 256

Description:
A Dysample RealPLKSR 4x upscaling model for anime single-image super-resolution.
The dataset has been degraded using DM600_LUDVAE for more realistic noise/compression. Downscaling algorithms used were imagemagick box, triangle, catrom, lanczos and mitchell. Blurs applied were gaussian, box and lens blur (using chaiNNer). Some images were further compressed using -quality 75-92. Down-up was applied to roughly 10% of the dataset (5 to 15% variation in size). Degradation orders were shuffled to give as many variations as possible.
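
A minimal sketch of the shuffled-degradation-order idea, assuming Pillow; its resampling filters stand in for the imagemagick ones named above, and the parameters are illustrative rather than the real pipeline's settings:

```python
import random
from io import BytesIO

from PIL import Image, ImageFilter

def downscale(img: Image.Image, scale: int = 4) -> Image.Image:
    # Pillow resampling filters stand in for the imagemagick filters named above
    f = random.choice([Image.Resampling.BOX, Image.Resampling.BILINEAR,
                       Image.Resampling.BICUBIC, Image.Resampling.LANCZOS])
    return img.resize((img.width // scale, img.height // scale), f)

def blur(img: Image.Image) -> Image.Image:
    return img.filter(ImageFilter.GaussianBlur(random.uniform(0.2, 1.0)))

def compress(img: Image.Image) -> Image.Image:
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=random.randint(75, 92))
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def degrade(img: Image.Image, scale: int = 4) -> Image.Image:
    # shuffle the degradation order per image to get as many variations as possible
    steps = [lambda im: downscale(im, scale), blur, compress]
    random.shuffle(steps)
    for step in steps:
        img = step(img)
    return img

# usage: degrade(Image.open("hr_tile.png").convert("RGB")).save("lr_tile.png")
```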

Examples are inferenced with the neosr testscript and the released pth file. I also include the test images as a zip file in this release, together with the model outputs, so others can test their models against these test images as well and compare.

The onnx conversions are static since dysample doesn't allow dynamic conversion; I tested the conversions with chaiNNer.

Showcase:
Slowpics

(Click Image to enlarge)
Example1
Example2
Example3
Example4
Example5
Example6
Example7
Example8
Example9
Example10
Example11
Example12
Example13
Example14

1xDeJPG_realplksr_otf

09 Jul 08:48
9109fd3

Scale: 1
Architecture: RealPLKSR
Architecture Option: realplksr

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: JPEG, Restoration
Subject: Photography
Input Type: Images
Release Date: 09.07.2024

Dataset: nomosv2
Dataset Size: 6000
OTF (on the fly augmentations): Yes
Pretrained Model: 1xDeJPG_realplksr_otf_60 (which itself used 4xNomos2_realplksr_dysample as its pretrain)
Iterations: 196'000
Batch Size: 24
GT Size: 64

Description:
A 1x dejpg model, trained with otf down to jpg quality 40 in both degradation processes; see the yml config file for details.
I additionally release the 1xDeJPG_realplksr_otf_60 model, which was trained with less jpg compression (down to quality 60 in both cases), in the hope that it could give a bit higher quality on less compressed inputs.
The 40 model is released as the default since it will be able to handle more compressed inputs in general.
Since these are trained with otf, they will also slightly deblur. The examples here are compressed, and then recompressed, with jpg quality 40.
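
For reproducing that kind of test input, a small sketch that compresses and then recompresses an image with jpg quality 40, assuming Pillow; the file names are placeholders:

```python
from io import BytesIO

from PIL import Image

def jpeg_roundtrip(img: Image.Image, quality: int) -> Image.Image:
    """Encode the image to an in-memory JPEG at the given quality and decode it again."""
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# compress, then recompress, with jpg quality 40 (placeholder file names)
src = Image.open("clean_input.png").convert("RGB")
jpeg_roundtrip(jpeg_roundtrip(src, 40), 40).save("input_jpg40_twice.png")
```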

Showcase:
Slowpics
Imgsli

(Click on image for better view)
Example1
Example2
Example3
Example4
Example5
Example6
Example7
Example8

4xNomos2_realplksr_dysample

30 Jun 11:13
c472c82

Scale: 4
Architecture: RealPLKSR with Dysample
Architecture Option: realplksr

Author: Philip Hofmann
License: CC-BY-4.0
Purpose: Pretrained
Subject: Photography
Input Type: Images
Release Date: 30.06.2024

Dataset: nomosv2
Dataset Size: 6000
OTF (on the fly augmentations): No
Pretrained Model: 4xmssim_realplksr_dysample_pretrain
Iterations: 185'000
Batch Size: 8
GT Size: 256, 512

Description:
A Dysample RealPLKSR 4x upscaling model trained on the Nomosv2 dataset; it handles jpg compression down to quality 70 and preserves DoF.
Based on the 4xmssim_realplksr_dysample_pretrain I released 3 days ago.
This model affects / saturates colors, which can be counteracted a bit by using wavelet color fix, as was done in these examples.
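
For context, a single-scale approximation of the wavelet color fix idea (keep the model's high-frequency detail, take the low-frequency color from the upscaled input); this is a sketch assuming Pillow and numpy, not the exact multi-scale implementation used for these examples:

```python
import numpy as np
from PIL import Image, ImageFilter

def simple_color_fix(sr: Image.Image, lr: Image.Image, radius: float = 5.0) -> Image.Image:
    """Keep high-frequency detail from the SR output, low-frequency color from the input."""
    ref = lr.resize(sr.size, Image.Resampling.BICUBIC)
    sr_arr = np.asarray(sr, dtype=np.float32)
    sr_low = np.asarray(sr.filter(ImageFilter.GaussianBlur(radius)), dtype=np.float32)
    ref_low = np.asarray(ref.filter(ImageFilter.GaussianBlur(radius)), dtype=np.float32)
    out = (sr_arr - sr_low) + ref_low          # detail from the SR output + color from the input
    return Image.fromarray(out.clip(0, 255).astype(np.uint8))

# usage: simple_color_fix(Image.open("sr.png").convert("RGB"),
#                         Image.open("lr.png").convert("RGB")).save("sr_colorfixed.png")
```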

Added a static (3x256x256) onnx conversion, with fp32 and fully optimized. This can be used with chaiNNer, since the dysample pth file would be unsupported. (Other conversions, like static 128, were removed because they would produce different results; the static 256 conversion gives the same result as using the pth file with the neosr testscript.)

Showcase:
Slowpics

(Click on image for better view)
Example1
Example2
Example3
Example4
Example5
Example6
Example7
Example8
Example9
Example10
Example11
Example12
Example13