This article presents a tool for video quality evaluation that allows selecting quality (QP), temporal (FPS), and spatial (bitrate) scalability parameters. The proposal integrates traditional metrics such as Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM), together with the perceptual metric Learned Perceptual Image Patch Similarity (LPIPS), which is based on deep neural networks. To validate its effectiveness, a two-phase subjective evaluation methodology was applied. In the first phase, participants rated videos encoded with a single scalability parameter, showing a high correspondence between visual perception and the metrics. In the second, different configurations were compared, revealing a preference for high quality and intermediate spatial scalability. In additional experiments with common distortions (blurring and noise), LPIPS reached a sensitivity of 73.64%, outperforming PSNR and SSIM in its alignment with human perception. The main contribution of this work is a tool that combines objective and subjective evaluations, enabling a more complete analysis that is closer to human visual perception.
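As a rough illustration of the kind of full-reference comparison described above, the following minimal Python sketch computes PSNR, SSIM, and LPIPS between a reference frame and a degraded frame. It is not the authors' implementation: the file names, the AlexNet backbone for LPIPS, and the example FFmpeg invocations in the comments are illustrative assumptions, and the sketch presumes that opencv-python, scikit-image, torch, and the lpips package are installed.

```python
# Minimal sketch (illustrative only): compare a reference frame against a
# frame from an encoded/distorted video using PSNR, SSIM, and LPIPS.
#
# Example encodings that vary one scalability parameter at a time (assumed
# FFmpeg invocations, not the authors' exact settings):
#   ffmpeg -i src.y4m -c:v libx264 -qp 32 out_qp32.mp4   # quality (QP)
#   ffmpeg -i src.y4m -c:v libx264 -b:v 1M out_1m.mp4    # spatial (bitrate)
#   ffmpeg -i src.y4m -r 15 -c:v libx264 out_15fps.mp4   # temporal (FPS)
import cv2
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def to_lpips_tensor(img_rgb):
    """Convert an HxWx3 uint8 RGB image to a 1x3xHxW float tensor in [-1, 1]."""
    t = torch.from_numpy(img_rgb).permute(2, 0, 1).unsqueeze(0).float()
    return t / 127.5 - 1.0

# Placeholder frame files extracted from the reference and the encoded video.
ref = cv2.cvtColor(cv2.imread("ref_frame.png"), cv2.COLOR_BGR2RGB)
dist = cv2.cvtColor(cv2.imread("dist_frame.png"), cv2.COLOR_BGR2RGB)

# Traditional full-reference metrics (higher is better).
psnr = peak_signal_noise_ratio(ref, dist, data_range=255)
ssim = structural_similarity(ref, dist, channel_axis=2, data_range=255)

# LPIPS: distance in deep-feature space (lower is better).
loss_fn = lpips.LPIPS(net="alex")
with torch.no_grad():
    lp = loss_fn(to_lpips_tensor(ref), to_lpips_tensor(dist)).item()

print(f"PSNR={psnr:.2f} dB  SSIM={ssim:.4f}  LPIPS={lp:.4f}")
```

In practice, such per-frame scores would be averaged over the whole sequence and then contrasted with the subjective ratings collected in the two evaluation phases.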

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.