This paper presents a video quality assessment tool that allows selecting quality (QP), temporal (FPS), and spatial (bitrate) scalability parameters. The proposal integrates the traditional metrics Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) with the perceptual metric Learned Perceptual Image Patch Similarity (LPIPS), which is based on deep neural networks. To validate its effectiveness, a two-phase subjective evaluation methodology was applied. In the first phase, participants assessed videos encoded with the same scalability parameter, showing a strong correspondence between visual perception and the objective metrics. In the second phase, different configurations were compared, revealing a preference for high quality and intermediate spatial scalability. Additionally, in experiments with common distortions such as blurring and noise, LPIPS achieved a sensitivity of 73.64%, outperforming PSNR and SSIM in their alignment with human perception. The main contribution of this work is a tool that combines objective and subjective evaluations, enabling a more comprehensive analysis that closely reflects human visual perception.
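To make the two traditional metrics concrete, the sketch below computes PSNR and a simplified single-window SSIM between a reference frame and a noise-distorted copy using only NumPy. This is an illustrative sketch, not the tool's actual implementation; the LPIPS component requires a pretrained deep network (e.g. the `lpips` Python package) and is omitted here.

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two frames."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(ref: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Simplified SSIM using one global window (the standard metric
    averages over local windows; this keeps the example short)."""
    c1 = (0.01 * max_val) ** 2
    c2 = (0.03 * max_val) ** 2
    x, y = ref.astype(np.float64), test.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

# Synthetic 64x64 grayscale frame plus additive Gaussian noise,
# one of the common distortions mentioned above.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(frame + rng.normal(0, 10, frame.shape), 0, 255)

print(psnr(frame, noisy))         # stronger noise -> lower PSNR
print(ssim_global(frame, noisy))  # values near 1 mean high similarity
```

Both metrics operate purely on pixel statistics, which is precisely why a learned perceptual metric such as LPIPS can align better with human judgments on distortions like blur, where pixel error and perceived quality diverge.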

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.