hongw.qin committed on
Commit 3c0e74c · 1 Parent(s): 54abac0

upload models
.gitignore ADDED
@@ -0,0 +1,6 @@
+ .vscode/
+ .venv/
+ *.pyc
+ __pycache__/
+ outputs/
+ datasets/
README.md ADDED
@@ -0,0 +1,166 @@
+ ---
+ license: apache-2.0
+ tags:
+ - RyzenAI
+ - Int8 quantization
+ - Single Image Super Resolution
+ - SESR
+ - ONNX
+ - Computer Vision
+ metrics:
+ - PSNR
+ - MS_SSIM
+ - FID
+ ---
+
+ # SESR for 2x Single Image Super Resolution
+
+ We provide 2x super-resolution models at a resolution of 256x256.
+
+ The model was introduced in the paper _Collapsible Linear Blocks for Super-Efficient Super Resolution_ by Bhardwaj et al. The official code for this work is available at [sesr](https://github.com/ARM-software/sesr).
+
+ We have developed a modified version optimized for [AMD Ryzen AI](https://onnxruntime.ai/docs/execution-providers/Vitis-AI-ExecutionProvider.html).
+
+ ## Model description
+
+ SESR is based on linear overparameterization of CNNs and defines an efficient model architecture for single image super resolution (SISR).
+
+ ## Intended uses & limitations
+
+ You can use this model for single image super resolution tasks. See the [model hub](https://huggingface.co/models?search=amd/ryzenai-sesr) for all available models.
+
+ ## How to use
+
+ ### Installation
+
+ ```bash
+ # inference only
+ pip install -r requirements-infer.txt
+ # inference & evaluation
+ pip install -r requirements-eval.txt
+ ```
+
+ ### Data Preparation (optional: for evaluation)
+
+ Run `python download_edsr_benchmark.py` to automatically download and extract the EDSR benchmark dataset into the `datasets` directory. After it completes, your `datasets` folder should have the following structure:
+
+ ```plain
+ datasets/edsr_benchmark
+ ├── B100
+ │   ├── HR
+ │   │   ├── 3096.png
+ │   │   ├── ...
+ │   └── LR_bicubic/X2
+ │       ├── 3096x4.png
+ │       ├── ...
+ └── Set5
+     ├── HR
+     │   ├── baby.png
+     │   ├── ...
+     └── LR_bicubic/X2
+         ├── babyx4.png
+         ├── ...
+ ```
+
+ ### Test & Evaluation
+
+ - **Run inference on images**
+
+ ```bash
+ python onnx_inference.py --onnx sesr_nchw_fp32.onnx --input /Path/To/Image --out-dir outputs
+ python onnx_inference.py --onnx sesr_nchw_int8.onnx --input /Path/To/Image --out-dir outputs
+ ```
+
+ _Arguments:_
+
+ `--input`: Accepts either a single image file path or a directory path. If it is a file, the script processes that image only. If it is a directory, the script recursively scans for `.png`, `.jpg`, and `.jpeg` files and processes all of them.
+
+ `--out-dir`: Output directory where the restored images will be saved.
+
+ - **Evaluate the quantized model**
+
+ _Arguments:_
+
+ `--onnx`: Path to the ONNX model file.
+
+ `--hq-dir`: Directory containing high-quality (ground-truth) images.
+
+ `--lq-dir`: Directory containing low-quality (input) images.
+
+ `--out-dir`: Output directory where evaluation results will be saved.
+
+ `--max-samples`: (Optional) Limit the number of samples to evaluate. Useful for debugging. If not specified, all samples are evaluated.
+
+ `-clean`: (Optional) If specified, the generated super-resolution images are deleted after evaluation to save disk space.
+
+ ```bash
+ # ===================== eval int8 =====================
+ python onnx_eval.py \
+ --onnx sesr_nchw_int8.onnx \
+ --hq-dir datasets/edsr_benchmark/Set5/HR \
+ --lq-dir datasets/edsr_benchmark/Set5/LR_bicubic/X2 \
+ --out-dir outputs/Set5 -clean
+
+ python onnx_eval.py \
+ --onnx sesr_nchw_int8.onnx \
+ --hq-dir datasets/edsr_benchmark/Set14/HR \
+ --lq-dir datasets/edsr_benchmark/Set14/LR_bicubic/X2 \
+ --out-dir outputs/Set14 -clean
+
+ python onnx_eval.py \
+ --onnx sesr_nchw_int8.onnx \
+ --hq-dir datasets/edsr_benchmark/B100/HR \
+ --lq-dir datasets/edsr_benchmark/B100/LR_bicubic/X2 \
+ --out-dir outputs/B100 -clean
+
+ python onnx_eval.py \
+ --onnx sesr_nchw_int8.onnx \
+ --hq-dir datasets/edsr_benchmark/Urban100/HR \
+ --lq-dir datasets/edsr_benchmark/Urban100/LR_bicubic/X2 \
+ --out-dir outputs/Urban100 -clean
+
+
+ # ===================== eval fp32 =====================
+ python onnx_eval.py \
+ --onnx sesr_nchw_fp32.onnx \
+ --hq-dir datasets/edsr_benchmark/Set5/HR \
+ --lq-dir datasets/edsr_benchmark/Set5/LR_bicubic/X2 \
+ --out-dir outputs/Set5 -clean
+
+ python onnx_eval.py \
+ --onnx sesr_nchw_fp32.onnx \
+ --hq-dir datasets/edsr_benchmark/Set14/HR \
+ --lq-dir datasets/edsr_benchmark/Set14/LR_bicubic/X2 \
+ --out-dir outputs/Set14 -clean
+
+ python onnx_eval.py \
+ --onnx sesr_nchw_fp32.onnx \
+ --hq-dir datasets/edsr_benchmark/B100/HR \
+ --lq-dir datasets/edsr_benchmark/B100/LR_bicubic/X2 \
+ --out-dir outputs/B100 -clean
+
+ python onnx_eval.py \
+ --onnx sesr_nchw_fp32.onnx \
+ --hq-dir datasets/edsr_benchmark/Urban100/HR \
+ --lq-dir datasets/edsr_benchmark/Urban100/LR_bicubic/X2 \
+ --out-dir outputs/Urban100 -clean
+ ```
+
+ ### Performance
+
+ | Model      |         | Set5       |        |         | Set14      |        |         | B100       |        |         | Urban100   |        |
+ | :--------- | ------- | ---------- | ------ | ------- | ---------- | ------ | ------- | ---------- | ------ | ------- | ---------- | ------ |
+ |            | PSNR(↑) | MS_SSIM(↑) | FID(↓) | PSNR(↑) | MS_SSIM(↑) | FID(↓) | PSNR(↑) | MS_SSIM(↑) | FID(↓) | PSNR(↑) | MS_SSIM(↑) | FID(↓) |
+ | sesr(fp32) | 35.65   | 0.9971     | 26.46  | 30.98   | 0.9935     | 17.69  | 30.23   | 0.9921     | 17.00  | 28.84   | 0.9929     | 0.25   |
+ | sesr(int8) | 34.65   | 0.9952     | 28.37  | 30.46   | 0.9916     | 20.70  | 29.80   | 0.9900     | 19.38  | 28.25   | 0.9906     | 1.47   |
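The PSNR figures above are computed by pyiqa with `test_y_channel=True`. For intuition only, PSNR reduces to a log of the mean squared error; a minimal plain-RGB sketch (the `psnr_uint8` helper is ours, not part of this repo, and skips the Y-channel conversion used in the table):

```python
import numpy as np

def psnr_uint8(a: np.ndarray, b: np.ndarray) -> float:
    """PSNR between two uint8 images of identical shape (plain RGB)."""
    assert a.shape == b.shape and a.dtype == np.uint8 and b.dtype == np.uint8
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(255.0**2 / mse)

# toy example: constant offset of 1 everywhere -> mse = 1
a = np.zeros((4, 4, 3), dtype=np.uint8)
b = np.ones((4, 4, 3), dtype=np.uint8)
print(round(psnr_uint8(a, b), 2))  # 48.13
```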
+
+ ---
+
+ ```bibtex
+ @article{bhardwaj2021collapsible,
+   title={Collapsible Linear Blocks for Super-Efficient Super Resolution},
+   author={Bhardwaj, Kartikeya and Milosavljevic, Milos and O'Neil, Liam and Gope, Dibakar and Matas, Ramon and Chalfin, Alex and Suda, Naveen and Meng, Lingchuan and Loh, Danny},
+   journal={arXiv preprint arXiv:2103.09404},
+   year={2021}
+ }
+ ```
download_edsr_benchmark.py ADDED
@@ -0,0 +1,76 @@
+ from pathlib import Path
+ from urllib.request import urlretrieve
+ import tarfile
+ import shutil
+
+ from tqdm import tqdm
+
+ ITEMS = [
+     {
+         "url": "https://cv.snu.ac.kr/research/EDSR/benchmark.tar",
+         "name": "EDSR_benchmark",
+     },
+ ]
+
+
+ def download_with_progress(url: str, out_path: Path) -> None:
+     """Download url to out_path with a tqdm progress bar."""
+     out_path.parent.mkdir(parents=True, exist_ok=True)
+
+     print(f"Downloading {url} -> {out_path}")
+
+     bar = None
+     last_b = 0
+
+     def reporthook(b: int, bsize: int, tsize: int):
+         nonlocal bar, last_b
+         if bar is None:
+             total = tsize if tsize > 0 else None
+             bar = tqdm(
+                 total=total,
+                 unit="B",
+                 unit_scale=True,
+                 unit_divisor=1024,
+                 desc=out_path.name,
+                 dynamic_ncols=True,
+             )
+         delta_blocks = b - last_b
+         if delta_blocks > 0:
+             bar.update(delta_blocks * bsize)
+         last_b = b
+
+     try:
+         urlretrieve(url, out_path, reporthook=reporthook)
+     finally:
+         if bar is not None:
+             bar.close()
+
+
+ def extract_tar_flatten(tar_path: Path, dest_dir: Path) -> None:
+     dest_dir.mkdir(parents=True, exist_ok=True)
+     print(f"Extracting {tar_path} -> {dest_dir} (flatten top folder)")
+
+     with tarfile.open(tar_path, "r") as tf:
+         tf.extractall(dest_dir)
+
+     # rename benchmark -> edsr_benchmark
+     print("Renaming benchmark -> edsr_benchmark")
+     shutil.move(str(dest_dir / "benchmark"), str(dest_dir / "edsr_benchmark"))
+
+
+ def main() -> None:
+     base = Path(__file__).resolve().parent
+     root = base / "datasets"
+     root.mkdir(parents=True, exist_ok=True)
+
+     for it in ITEMS:
+         tar_path = root / f"{it['name']}.tar"
+
+         download_with_progress(it["url"], tar_path)
+         extract_tar_flatten(tar_path, root)
+
+     print("All done.")
+
+
+ if __name__ == "__main__":
+     main()
onnx_eval.py ADDED
@@ -0,0 +1,202 @@
+ import sys
+ import json
+ from pathlib import Path
+
+ sys.path.insert(0, Path(__file__).parent.as_posix())
+
+ import cv2
+ import pyiqa
+ import torch
+ import numpy as np
+ from tqdm import tqdm
+ from onnx_runner import OnnxRunner
+
+
+ def collect_common_image_pairs(
+     lq_dir: Path, hq_dir: Path
+ ) -> tuple[list[Path], list[Path]]:
+     exts = {".png", ".jpg", ".jpeg"}
+
+     def is_img(p: Path) -> bool:
+         return p.is_file() and p.suffix.lower() in exts
+
+     hq_map = {p.stem: p for p in hq_dir.iterdir() if is_img(p)}
+     hq_names = sorted(hq_map.keys())
+
+     lq_files = [p for p in lq_dir.iterdir() if is_img(p)]
+
+     lq_paths: list[Path] = []
+     hq_paths: list[Path] = []
+     for base in hq_names:
+         # try an exact stem match first
+         best_lq = next((p for p in lq_files if p.stem == base), None)
+
+         # then fall back to a prefix match (e.g. HR "baby" -> LR "babyx2")
+         if best_lq is None:
+             best_lq = next(
+                 (
+                     p
+                     for p in lq_files
+                     if p.stem.startswith(base) and len(p.stem) > len(base)
+                 ),
+                 None,
+             )
+
+         if best_lq is not None:  # matched
+             hq_paths.append(hq_map[base])
+             lq_paths.append(best_lq)
+
+     return lq_paths, hq_paths
+
+
+ def align_shape_by_crop(sr_bgr: np.ndarray, hq_bgr: np.ndarray):
+     """Crop both images to their common top-left region so the shapes match."""
+     if sr_bgr.shape != hq_bgr.shape:
+         min_h = min(sr_bgr.shape[0], hq_bgr.shape[0])
+         min_w = min(sr_bgr.shape[1], hq_bgr.shape[1])
+
+         sr_bgr = sr_bgr[:min_h, :min_w]
+         hq_bgr = hq_bgr[:min_h, :min_w]
+
+     return sr_bgr, hq_bgr
+
+
+ def gen_sr_images(
+     hq_dir: Path, lq_dir: Path, out_dir: Path, onnx_path: Path, max_samples: int
+ ):
+     out_dir.mkdir(exist_ok=True, parents=True)
+
+     onnx_runner = OnnxRunner(onnx_path, sr_scale=2, tile_overlap=8)
+
+     lq_paths, hq_paths = collect_common_image_pairs(lq_dir, hq_dir)
+
+     if max_samples is not None:
+         lq_paths = lq_paths[: max(max_samples, 1)]
+         hq_paths = hq_paths[: max(max_samples, 1)]
+
+     sr_paths = []
+     for i in tqdm(range(len(lq_paths)), desc="generating"):
+         lq_img_path = lq_paths[i]
+         lq_bgr = cv2.imread(lq_img_path.as_posix(), cv2.IMREAD_COLOR)
+         assert lq_bgr is not None
+         sr_bgr = onnx_runner.run(lq_bgr)
+
+         hq_img_path = hq_paths[i]
+         hq_bgr = cv2.imread(hq_img_path.as_posix(), cv2.IMREAD_COLOR)
+
+         aligned_sr_bgr, aligned_hq_bgr = align_shape_by_crop(sr_bgr, hq_bgr)
+         if aligned_hq_bgr.shape != hq_bgr.shape:
+             # note: overwrites the ground-truth image on disk so that the
+             # directory-level FID below compares images of identical size
+             cv2.imwrite(hq_img_path.as_posix(), aligned_hq_bgr)
+
+         out_path = out_dir / f"{lq_img_path.stem}.png"
+         cv2.imwrite(out_path.as_posix(), aligned_sr_bgr)
+
+         sr_paths.append(out_path)
+
+     return hq_paths, sr_paths
+
+
+ def eval_metrics(
+     hq_paths: list[Path],
+     sr_paths: list[Path],
+     hq_dir: Path,
+     sr_dir: Path,
+     device: torch.device | None = None,
+ ) -> dict[str, float]:
+     assert len(hq_paths) == len(sr_paths)
+
+     device = device or (
+         torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
+     )
+
+     # full-reference metrics: (sr, ref)
+     psnr_metric = pyiqa.create_metric("psnr", device=device, test_y_channel=True)
+     ms_ssim_metric = pyiqa.create_metric("ms_ssim", device=device, test_y_channel=True)
+     # distribution metric computed over the two directories
+     fid_metric = pyiqa.create_metric("fid")
+
+     with torch.inference_mode():
+         psnr_vals = []
+         ms_ssim_vals = []
+         for sr_p, hq_p in zip(sr_paths, hq_paths):
+             sr_p = sr_p.as_posix()
+             hq_p = hq_p.as_posix()
+             psnr_vals.append(psnr_metric(sr_p, hq_p).detach())
+             ms_ssim_vals.append(ms_ssim_metric(sr_p, hq_p).detach())
+
+         psnr = torch.stack(psnr_vals).mean().item()
+         ms_ssim = torch.stack(ms_ssim_vals).mean().item()
+
+         fid = fid_metric(
+             sr_dir.as_posix(),
+             hq_dir.as_posix(),
+             mode="clean",
+             batch_size=1,
+             num_workers=0,
+         ).item()
+
+     return {"psnr": psnr, "ms_ssim": ms_ssim, "fid": fid}
+
+
+ def main(args):
+     onnx_path = Path(args.onnx)
+     hq_dir = Path(args.hq_dir)
+     lq_dir = Path(args.lq_dir)
+     out_dir = Path(args.out_dir)
+
+     assert onnx_path.suffix == ".onnx"
+     assert lq_dir.is_dir(), f"{lq_dir} is not a dir!"
+     assert hq_dir.is_dir(), f"{hq_dir} is not a dir!"
+
+     sr_dir = out_dir / "sr"
+     hq_paths, sr_paths = gen_sr_images(
+         hq_dir, lq_dir, sr_dir, onnx_path, args.max_samples
+     )
+
+     scores = eval_metrics(hq_paths, sr_paths, hq_dir, sr_dir)
+
+     summary = {
+         "onnx": onnx_path.as_posix(),
+         "psnr": scores["psnr"],
+         "ms_ssim": scores["ms_ssim"],
+         "fid": scores["fid"],
+     }
+
+     out_file = out_dir / f"eval_{onnx_path.stem}_result.json"
+     with open(out_file, "w") as f:
+         json.dump(summary, f, indent=2)
+     dataset_name = hq_dir.parent.name
+     print(f"summary of {dataset_name}: PSNR | MS_SSIM | FID")
+     print(
+         f"{dataset_name}: {scores['psnr']:.2f} | {scores['ms_ssim']:.4f} | {scores['fid']:.2f}"
+     )
+     print(f"result saved to {out_file}")
+
+     if args.clean:
+         import shutil
+
+         print(f"cleaning SR output dir: {sr_dir}")
+         shutil.rmtree(sr_dir.as_posix(), ignore_errors=True)
+
+
+ if __name__ == "__main__":
+     from argparse import ArgumentParser
+
+     parser = ArgumentParser()
+     parser.add_argument("--onnx", type=str, required=True)
+     parser.add_argument("--hq-dir", type=str, required=True)
+     parser.add_argument("--lq-dir", type=str, required=True)
+     parser.add_argument("--out-dir", type=str, default="outputs")
+     parser.add_argument(
+         "--max-samples",
+         type=int,
+         default=None,
+         help="limit the number of evaluated samples (debug only); None means unlimited",
+     )
+     parser.add_argument(
+         "-clean",
+         action="store_true",
+         default=False,
+         help="delete the generated SR images when finished",
+     )
+     main(parser.parse_args())
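The exact-stem-then-prefix matching in `collect_common_image_pairs` above is what lets an HR image like `baby.png` pair with an LR file such as `babyx2.png`. A standalone sketch of the same rule on bare name stems (no filesystem access; the `pair_stems` helper is ours, for illustration only):

```python
def pair_stems(hq_stems: list[str], lq_stems: list[str]) -> dict[str, str]:
    """Mirror collect_common_image_pairs: exact stem match first, then fall
    back to an LR stem that strictly extends the HR stem. Unmatched HR
    stems are dropped, as in the script above."""
    pairs = {}
    for base in sorted(hq_stems):
        match = next((s for s in lq_stems if s == base), None)
        if match is None:
            match = next(
                (s for s in lq_stems if s.startswith(base) and len(s) > len(base)),
                None,
            )
        if match is not None:
            pairs[base] = match
    return pairs

print(pair_stems(["baby", "bird"], ["babyx2", "bird", "head"]))
# {'baby': 'babyx2', 'bird': 'bird'}
```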
onnx_inference.py ADDED
@@ -0,0 +1,53 @@
+ from pathlib import Path
+ import sys
+
+ sys.path.insert(0, Path(__file__).parent.as_posix())
+
+ import cv2
+ from onnx_runner import OnnxRunner
+
+
+ def main(args):
+     onnx_path = Path(args.onnx)
+     input_path = Path(args.input)
+     out_dir = Path(args.out_dir)
+
+     assert onnx_path.suffix == ".onnx"
+
+     onnx_runner = OnnxRunner(onnx_path, sr_scale=2, debug=False)
+
+     if input_path.is_file():
+         input_images_path = [input_path]
+     else:
+         input_images_path = sorted(
+             p
+             for p in input_path.rglob("*")
+             if p.suffix.lower() in (".png", ".jpg", ".jpeg")
+         )
+
+     out_dir.mkdir(exist_ok=True, parents=True)
+     for input_img_path in input_images_path:
+         input_img_path: Path
+
+         img_bgr = cv2.imread(input_img_path.as_posix(), cv2.IMREAD_COLOR)
+         assert img_bgr is not None
+         sr_img_bgr = onnx_runner.run(img_bgr)
+
+         out_path = out_dir / f"{input_img_path.stem}.png"
+         cv2.imwrite(out_path.as_posix(), sr_img_bgr)
+         print(f"saved {out_path}")
+
+
+ if __name__ == "__main__":
+     from argparse import ArgumentParser
+
+     parser = ArgumentParser()
+     parser.add_argument("--onnx", type=str, required=True)
+     parser.add_argument("--input", type=str, required=True)
+     parser.add_argument("--out-dir", type=str, required=True)
+
+     main(parser.parse_args())
onnx_runner.py ADDED
@@ -0,0 +1,275 @@
+ import math
+ from pathlib import Path
+
+ import numpy as np
+ import onnxruntime as ort
+
+ __all__ = [
+     "OnnxRunner",
+ ]
+
+
+ def split_into_tiles_with_context(
+     img_chw: np.ndarray,
+     patch_size_hw: tuple[int, int],
+     overlap: int,
+ ):
+     """
+     Args:
+         img_chw: (C, H, W)
+         patch_size_hw: (ph, pw) size of each patch.
+         overlap: overlap between neighboring patches.
+
+     Returns:
+         tiles_chw: list[np.ndarray], each tile of shape (C, ph, pw)
+         orig_hw: (H, W)
+         padded_hw: (H_pad, W_pad)
+     """
+     assert img_chw.ndim == 3
+     C, H, W = img_chw.shape
+     ph, pw = patch_size_hw
+
+     assert 2 * overlap < ph and 2 * overlap < pw, "2*overlap must be < patch_size"
+
+     # core region size (patch minus the overlap context on both sides)
+     core_h = ph - 2 * overlap
+     core_w = pw - 2 * overlap
+     assert core_h > 0 and core_w > 0
+
+     # compute how many tiles are required
+     n_tiles_h = math.ceil(H / core_h)
+     n_tiles_w = math.ceil(W / core_w)
+
+     # size after center padding
+     H_pad = n_tiles_h * core_h
+     W_pad = n_tiles_w * core_w
+
+     # first padding: make the padded image divisible by the core size
+     pad_h = H_pad - H
+     pad_w = W_pad - W
+     img_pad = np.pad(
+         img_chw,
+         pad_width=((0, 0), (0, pad_h), (0, pad_w)),
+         mode="reflect",
+     )  # (C, H_pad, W_pad)
+
+     # second padding: add reflected context at the boundaries
+     big_pad = np.pad(
+         img_pad,
+         pad_width=((0, 0), (overlap, overlap), (overlap, overlap)),
+         mode="reflect",
+     )  # (C, H_pad+2o, W_pad+2o)
+
+     tiles = []
+     for iy in range(n_tiles_h):
+         for ix in range(n_tiles_w):
+             y0 = iy * core_h
+             x0 = ix * core_w
+             tile = big_pad[:, y0 : y0 + ph, x0 : x0 + pw]
+             tiles.append(tile)
+
+     return tiles, (H, W), (H_pad, W_pad)
+
+
+ def merge_tiles_with_context(
+     tiles_chw: list[np.ndarray],
+     orig_hw: tuple[int, int],
+     padded_hw: tuple[int, int],
+     overlap: int,
+ ) -> np.ndarray:
+     """
+     Args:
+         tiles_chw: tiles produced by split_into_tiles_with_context.
+         orig_hw: original image size.
+         padded_hw: center-padded image size.
+         overlap: overlap between neighboring patches.
+
+     Returns:
+         img_chw: (C, H, W)
+     """
+     assert len(tiles_chw) > 0
+     C, ph, pw = tiles_chw[0].shape
+     H, W = orig_hw
+     H_pad, W_pad = padded_hw
+
+     assert 2 * overlap < ph and 2 * overlap < pw
+     core_h = ph - 2 * overlap
+     core_w = pw - 2 * overlap
+     n_h = H_pad // core_h
+     n_w = W_pad // core_w
+     assert n_h * n_w == len(tiles_chw), "tiles != padded_hw"
+
+     img_pad_recon = np.zeros((C, H_pad, W_pad), dtype=tiles_chw[0].dtype)
+
+     idx = 0
+     for iy in range(n_h):
+         for ix in range(n_w):
+             cy = iy * core_h
+             cx = ix * core_w
+
+             tile = tiles_chw[idx]
+             core = tile[:, overlap : overlap + core_h, overlap : overlap + core_w]
+             img_pad_recon[:, cy : cy + core_h, cx : cx + core_w] = core
+             idx += 1
+
+     img_recon = img_pad_recon[:, :H, :W]
+     return np.ascontiguousarray(img_recon)
+
+
+ def parse_input_shape_fmt(input_shape):
+     """Parse whether the input shape is NCHW or NHWC.
+     We assume the channel dimension is smaller than the H and W dimensions.
+     """
+     assert len(input_shape) == 4
+
+     c1, c2, c3 = input_shape[1:]
+
+     if c1 < min(c2, c3):  # c1 is the channel dimension
+         return "nchw"
+     elif c3 < min(c1, c2):  # c3 is the channel dimension
+         return "nhwc"
+     else:
+         raise ValueError(f"can not parse input format for shape: {input_shape}")
+
+
+ def is_channel_last(img: np.ndarray):
+     return img.shape[2] < min(img.shape[0], img.shape[1])
+
+
+ def preprocess(img_bgr: np.ndarray):
+     """Convert a BGR channel-last uint8 image to an RGB channel-first fp32 image."""
+     img_rgb = img_bgr[..., ::-1]  # bgr -> rgb
+     img_chw = np.transpose(img_rgb, [2, 0, 1])  # hwc -> chw
+     img_chw = np.float32(img_chw)  # uint8 -> fp32
+
+     return np.ascontiguousarray(img_chw)
+
+
+ def postprocess(pred_chw: np.ndarray):
+     """Convert an RGB channel-first fp32 image to a BGR channel-last uint8 image."""
+     uint8_chw = pred_chw.clip(0, 255).astype(np.uint8)  # fp32 -> uint8
+     img_rgb = np.transpose(uint8_chw, [1, 2, 0])  # chw -> hwc
+     img_bgr = img_rgb[..., ::-1]  # rgb -> bgr
+
+     return np.ascontiguousarray(img_bgr)
+
+
+ class OnnxRunner:
+     """Single Image Super Resolution ONNX runner."""
+
+     def __init__(self, onnx_path, sr_scale, tile_overlap: int = 8, debug=False):
+         if "CUDAExecutionProvider" in ort.get_available_providers():
+             providers = ["CUDAExecutionProvider"]
+         else:
+             providers = ["CPUExecutionProvider"]
+
+         ort_session = ort.InferenceSession(str(onnx_path), providers=providers)
+
+         input0 = ort_session.get_inputs()[0]
+         self.input_name = input0.name
+         self.input_shape = tuple(input0.shape)
+         self.input_format = parse_input_shape_fmt(input0.shape)
+         self.ort_session = ort_session
+         self.sr_scale = sr_scale
+         self.tile_overlap = max(tile_overlap, 0)
+         self.debug = debug
+
+         if self.input_format == "nchw":
+             self._in_h, self._in_w = self.input_shape[2:]
+         else:  # nhwc
+             self._in_h, self._in_w = self.input_shape[1:3]
+
+         if debug:
+             self._dbg_out_dir = Path(__file__).parent / "outputs"
+             self._dbg_out_dir.mkdir(exist_ok=True, parents=True)
+
+     def _save_dbg_img(self, savename, img):
+         if not self.debug:
+             return
+         import cv2
+
+         cv2.imwrite(str(self._dbg_out_dir / savename), img)
+
+     def run(self, img_bgr: np.ndarray) -> np.ndarray:
+         """Run 2x super resolution on the given uint8 BGR image
+         and return the upscaled uint8 BGR image.
+         """
+         assert img_bgr.dtype == np.uint8, img_bgr.dtype
+         assert img_bgr.ndim in (2, 3), img_bgr.ndim
+
+         if img_bgr.ndim == 3:
+             assert is_channel_last(img_bgr), img_bgr.shape
+
+         if self.debug:
+             self._save_dbg_img("original_input_bgr.png", img_bgr)
+
+         # =====================
+         # preprocessing
+         # =====================
+         img_chw = preprocess(img_bgr)
+         tiles_chw, origin_size_hw, padded_size_hw = split_into_tiles_with_context(
+             img_chw, (self._in_h, self._in_w), self.tile_overlap
+         )
+         if self.debug:
+             print(f"tiling to {len(tiles_chw)} tiles")
+             tile_bgr = postprocess(tiles_chw[0])
+             self._save_dbg_img("tile_bgr.png", tile_bgr)
+
+         # =====================
+         # inference
+         # =====================
+         sr_tiles_chw = []
+         for tile_chw in tiles_chw:
+             if self.input_format == "nhwc":
+                 input_3d = np.transpose(tile_chw, [1, 2, 0])  # chw -> hwc
+             else:
+                 input_3d = tile_chw
+
+             outputs = self.ort_session.run(None, {self.input_name: input_3d[None, ...]})
+             sr_tile = outputs[0][0]  # chw or hwc format
+
+             if self.input_format == "nhwc":
+                 sr_tile_chw = np.transpose(sr_tile, [2, 0, 1])  # hwc -> chw
+             else:
+                 sr_tile_chw = sr_tile
+
+             sr_tiles_chw.append(sr_tile_chw)
+
+         if self.debug:
+             sr_padded_tile_bgr = postprocess(sr_tiles_chw[0])
+             self._save_dbg_img("sr_padded_tile_bgr.png", sr_padded_tile_bgr)
+
+         # =====================
+         # postprocessing
+         # =====================
+         sr_origin_hw = (
+             int(origin_size_hw[0] * self.sr_scale),
+             int(origin_size_hw[1] * self.sr_scale),
+         )
+         sr_padded_hw = (
+             int(padded_size_hw[0] * self.sr_scale),
+             int(padded_size_hw[1] * self.sr_scale),
+         )
+         sr_overlap = int(self.tile_overlap * self.sr_scale)
+
+         sr_padded_chw = merge_tiles_with_context(
+             sr_tiles_chw,
+             orig_hw=sr_origin_hw,
+             padded_hw=sr_padded_hw,
+             overlap=sr_overlap,
+         )
+         if self.debug:
+             sr_padded_img_bgr = postprocess(sr_padded_chw)
+             self._save_dbg_img("sr_padded_img_bgr.png", sr_padded_img_bgr)
+
+         sr_chw = sr_padded_chw[..., : sr_origin_hw[0], : sr_origin_hw[1]]
+
+         sr_img_bgr = postprocess(sr_chw)
+
+         return sr_img_bgr
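The tiling scheme in `OnnxRunner` is designed to be lossless at identity: reflect-pad, cut overlapping tiles, then stitch the non-overlapping cores back together. A condensed standalone sketch of the same split/merge round trip (re-implemented here so it runs on its own; `split_merge_roundtrip` is our name, not the repo's):

```python
import math
import numpy as np

def split_merge_roundtrip(img_chw: np.ndarray, ph: int, pw: int, overlap: int) -> np.ndarray:
    """Condensed split_into_tiles_with_context + merge_tiles_with_context:
    pad so the core grid covers the image, cut (ph, pw) tiles with reflected
    context, then reassemble only the core regions and crop to the input size."""
    C, H, W = img_chw.shape
    core_h, core_w = ph - 2 * overlap, pw - 2 * overlap
    n_h, n_w = math.ceil(H / core_h), math.ceil(W / core_w)
    H_pad, W_pad = n_h * core_h, n_w * core_w
    pad = np.pad(img_chw, ((0, 0), (0, H_pad - H), (0, W_pad - W)), mode="reflect")
    big = np.pad(pad, ((0, 0), (overlap, overlap), (overlap, overlap)), mode="reflect")
    recon = np.zeros((C, H_pad, W_pad), dtype=img_chw.dtype)
    for iy in range(n_h):
        for ix in range(n_w):
            tile = big[:, iy * core_h : iy * core_h + ph, ix * core_w : ix * core_w + pw]
            # keep only the core, dropping the overlap context on every side
            recon[:, iy * core_h : (iy + 1) * core_h, ix * core_w : (ix + 1) * core_w] = \
                tile[:, overlap : overlap + core_h, overlap : overlap + core_w]
    return recon[:, :H, :W]

# round trip on an awkward, non-divisible size reproduces the input exactly
img = np.random.default_rng(0).random((3, 37, 53), dtype=np.float32)
assert np.array_equal(split_merge_roundtrip(img, 16, 16, 4), img)
```

With a real model in between, the SR tiles replace `tile` above and the overlap context is what suppresses seams at tile borders.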
requirements-eval.txt ADDED
@@ -0,0 +1,6 @@
+ onnxruntime==1.22
+ numpy==1.26.*
+ opencv-python==4.8.*
+ tqdm
+ torch==2.6.0
+ pyiqa @ git+https://github.com/chaofengc/IQA-PyTorch.git@e851fd62e66a97345e1281d80e8deb4ab7b93c83
requirements-infer.txt ADDED
@@ -0,0 +1,4 @@
+ onnxruntime==1.22
+ numpy==1.26.*
+ opencv-python==4.8.*
+ tqdm
sesr_nchw_fp32.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4b686864a8b17cf9aaad0d787f7b7a133c95317f408cac5204701d7291199711
+ size 93732
sesr_nchw_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:11b6adfb3d5d9cc46405af6e684237a1efa7ff6956f82c489631476edc813237
+ size 120432