Professor Raouf Hamzaoui

Job: Professor in Media Technology

Faculty: Computing, Engineering and Media

School/department: School of Engineering and Sustainable Development

Research group(s): Institute of Engineering Sciences

Address: De Montfort University, The Gateway, Leicester, LE1 9BH

T: +44 (0)116 207 8096

E: rhamzaoui@dmu.ac.uk

W: http://www.tech.dmu.ac.uk/~hamzaoui/


Personal profile

Raouf Hamzaoui received the MSc degree in mathematics from the University of Montreal, Canada, in 1993, the Dr.rer.nat. degree from the University of Freiburg, Germany, in 1997, and the Habilitation degree in computer science from the University of Konstanz, Germany, in 2004. He was an Assistant Professor with the Department of Computer Science of the University of Leipzig, Germany, and with the Department of Computer and Information Science of the University of Konstanz. In September 2006, he joined DMU, where he is a Professor in Media Technology and Head of the Signal Processing and Communications Systems Group in the Institute of Engineering Sciences. He is an IEEE Senior Member and was a member of the editorial boards of the IEEE Transactions on Multimedia and the IEEE Transactions on Circuits and Systems for Video Technology. He has published more than 100 research papers in books, journals, and conferences. His research has been funded by the EU, DFG, Royal Society, Chinese Academy of Sciences, China Ministry of Science and Technology, and industry, and has received best paper awards (ICME 2002, PV’07, CONTENT 2010, MESM’2012, UIC-2019).

Research group affiliations

Institute of Engineering Sciences (IES)

Signal Processing and Communications Systems (SPCS)


Publications and outputs

  • 3DAttGAN: A 3D attention-based generative adversarial network for joint space-time video super-resolution
    Authors: Fu, Congrui; Yuan, Hui; Shen, Liquan; Hamzaoui, Raouf; Zhang, Hao
    Abstract: Joint space-time video super-resolution aims to increase both the spatial resolution and the frame rate of a video sequence. As a result, details become more apparent, leading to a better and more realistic viewing experience. This is particularly valuable for applications such as video streaming, video surveillance (object recognition and tracking), and digital entertainment. Over the last few years, several joint space-time video super-resolution methods have been proposed. While those built on deep learning have shown great potential, their performance still falls short. One major reason is that they heavily rely on two-dimensional (2D) convolutional networks, which restricts their capacity to effectively exploit spatio-temporal information. To address this limitation, we propose a novel generative adversarial network for joint space-time video super-resolution. The novelty of our network is twofold. First, we propose a three-dimensional (3D) attention mechanism instead of traditional two-dimensional attention mechanisms. Our generator uses 3D convolutions associated with the proposed 3D attention mechanism to process temporal and spatial information simultaneously and focus on the most important channel and spatial features. Second, we design two discriminator strategies to enhance the performance of the generator. The discriminative network uses a two-branch structure to handle the intra-frame texture details and inter-frame motion occlusions in parallel, making the generated results more accurate. Experimental results on the Vid4, Vimeo-90K, and REDS datasets demonstrate the effectiveness of the proposed method. The source code is publicly available at https://github.com/FCongRui/3DAttGan.git.
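
    A minimal PyTorch sketch of the 3D attention idea described above, combining channel gating with a spatio-temporal mask over (batch, channels, time, height, width) tensors. The module layout and sizes are illustrative assumptions, not the authors' architecture.

        import torch
        import torch.nn as nn

        class Attention3D(nn.Module):
            """Toy 3D attention: channel gating followed by a spatio-temporal mask."""
            def __init__(self, channels, reduction=4):
                super().__init__()
                # Channel attention: squeeze T*H*W away, excite across channels.
                self.channel_gate = nn.Sequential(
                    nn.AdaptiveAvgPool3d(1),
                    nn.Conv3d(channels, channels // reduction, kernel_size=1),
                    nn.ReLU(inplace=True),
                    nn.Conv3d(channels // reduction, channels, kernel_size=1),
                    nn.Sigmoid(),
                )
                # Spatio-temporal attention: a 3D conv producing one (T, H, W) mask.
                self.position_gate = nn.Sequential(
                    nn.Conv3d(channels, 1, kernel_size=3, padding=1),
                    nn.Sigmoid(),
                )

            def forward(self, x):                    # x: (B, C, T, H, W)
                x = x * self.channel_gate(x)         # reweight channels
                return x * self.position_gate(x)     # reweight time/space positions

        x = torch.randn(2, 16, 5, 32, 32)            # two 5-frame, 32x32 feature maps
        print(Attention3D(16)(x).shape)              # torch.Size([2, 16, 5, 32, 32])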
  • PU-Mask: 3D Point Cloud Upsampling via an Implicit Virtual Mask
    Authors: Liu, Hao; Yuan, Hui; Hamzaoui, Raouf; Liu, Qi; Li, Shuai
    Abstract: We present PU-Mask, a virtual mask-based network for 3D point cloud upsampling. Unlike existing upsampling methods, which treat point cloud upsampling as an “unconstrained generative” problem, we propose to address it from the perspective of “local filling”, i.e., we assume that the sparse input point cloud (i.e., the unmasked point set) is obtained by locally masking the original dense point cloud with virtual masks. Therefore, given the unmasked point set and virtual masks, our goal is to fill the point set hidden by the virtual masks. Specifically, because the masks do not actually exist, we first locate and form each virtual mask by a virtual mask generation module. Then, we propose a mask-guided transformer-style asymmetric autoencoder (MTAA) to restore the upsampled features. Moreover, we introduce a second-order unfolding attention mechanism to enhance the interaction between the feature channels of MTAA. Next, we generate a coarse upsampled point cloud using a pooling technique that is specific to the virtual masks. Finally, we design a learnable pseudo Laplacian operator to calibrate the coarse upsampled point cloud and generate a refined upsampled point cloud. Extensive experiments demonstrate that PU-Mask is superior to the state-of-the-art methods. Our code will be made available at: https://github.com/liuhaoyun/PU-Mask
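
    The “local filling” view can be illustrated by how training pairs might be constructed: locally mask a dense cloud to obtain the sparse input, and treat the hidden points as the reconstruction target. A hypothetical NumPy sketch (the seed count and neighbourhood size are arbitrary choices, not the paper's settings):

        import numpy as np

        def locally_mask(dense, num_masks=8, k=32, rng=None):
            """Hide num_masks local neighbourhoods of k points each.
            Returns the visible (sparse) points and a boolean mask of hidden points."""
            rng = np.random.default_rng(rng)
            hidden = np.zeros(len(dense), dtype=bool)
            for s in rng.choice(len(dense), size=num_masks, replace=False):
                # Mask the k nearest neighbours of each seed point (seed included).
                d = np.linalg.norm(dense - dense[s], axis=1)
                hidden[np.argsort(d)[:k]] = True
            return dense[~hidden], hidden

        dense = np.random.rand(2048, 3)        # toy dense point cloud
        sparse, hidden = locally_mask(dense)
        print(sparse.shape, hidden.sum())      # visible points, number hidden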
  • Enhancing Context Models for Point Cloud Geometry Compression with Context Feature Residuals and Multi-Loss
    Authors: Sun, Chang; Yuan, Hui; Li, Shuai; Lu, Xin; Hamzaoui, Raouf
    Abstract: In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label, and the cross-entropy between the one-hot encoding and the probability distribution predicted by the context model as the loss function. However, this approach has two main weaknesses. First, the differences between contexts of different nodes are not significant, making it difficult for the context model to accurately predict the probability distribution of node occupancy. Second, as the one-hot encoding is not the actual probability distribution of node occupancy, the cross-entropy loss function is inaccurate. To address these problems, we propose a general structure that can enhance existing context models. We introduce the context feature residuals into the context model to amplify the differences between contexts. We also add a multi-layer perceptron branch that uses the mean squared error between its output and node occupancy as a loss function to provide accurate gradients in backpropagation. We validate our method by showing that it can improve the performance of an octree-based model (OctAttention) and a voxel-based model (VoxelDNN) on the object point cloud datasets MPEG 8i and MVUB, as well as the LiDAR point cloud dataset SemanticKITTI.
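
    A hedged PyTorch sketch of the two-branch objective described above: cross-entropy between the predicted occupancy distribution and the true symbol, plus an MSE branch against the one-hot occupancy. The symbol count, layer sizes, and loss weight are assumptions, and the context-feature-residual input is reduced here to a plain feature vector.

        import torch
        import torch.nn as nn

        NUM_SYMBOLS = 255     # 8-child octree occupancy patterns (assumption)
        CTX_DIM = 64          # context feature size (assumption)

        backbone = nn.Sequential(nn.Linear(CTX_DIM, 128), nn.ReLU())
        prob_head = nn.Linear(128, NUM_SYMBOLS)          # occupancy distribution
        mlp_branch = nn.Sequential(nn.Linear(128, 32),   # auxiliary MSE branch
                                   nn.ReLU(), nn.Linear(32, NUM_SYMBOLS))

        ctx = torch.randn(16, CTX_DIM)                   # toy context features
        target = torch.randint(0, NUM_SYMBOLS, (16,))    # true occupancy symbols
        one_hot = nn.functional.one_hot(target, NUM_SYMBOLS).float()

        h = backbone(ctx)
        loss = (nn.functional.cross_entropy(prob_head(h), target)
                + 0.5 * nn.functional.mse_loss(mlp_branch(h), one_hot))
        loss.backward()      # both branches shape the shared context features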
  • Support vector regression-based reduced-reference perceptual quality model for compressed point clouds
    Authors: Su, Honglei; Liu, Qi; Yuan, Hui; Cheng, Qiang; Hamzaoui, Raouf
    Abstract: Video-based point cloud compression (V-PCC) is a state-of-the-art Moving Picture Experts Group (MPEG) standard for point cloud compression. V-PCC can be used to compress both static and dynamic point clouds in a lossless, near lossless, or lossy way. Many objective quality metrics have been proposed for distorted point clouds. Most of these metrics are full-reference metrics that require both the original point cloud and the distorted one. However, in some real-time applications, the original point cloud is not available, and no-reference or reduced-reference quality metrics are needed. Three main challenges in the design of a reduced-reference quality metric are how to build a set of features that characterize the visual quality of the distorted point cloud, how to select the most effective features from this set, and how to map the selected features to a perceptual quality score. We address the first challenge by proposing a comprehensive set of features consisting of compression, geometry, normal, curvature, and luminance features. To deal with the second challenge, we use the least absolute shrinkage and selection operator (LASSO) method, which is a variable selection method for regression problems. Finally, we map the selected features to the mean opinion score in a nonlinear space. Although we have used only 19 features in our current implementation, our metric is flexible enough to allow any number of features, including future more effective ones. Experimental results on the Waterloo point cloud dataset version 2 (WPC2.0) and the MPEG point cloud compression dataset (M-PCCD) show that our method, namely PCQAML, outperforms state-of-the-art full-reference and reduced-reference quality metrics in terms of Pearson linear correlation coefficient, Spearman rank-order correlation coefficient, Kendall's rank-order correlation coefficient, and root mean squared error.
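
    The selection-then-regression pipeline reads naturally as a scikit-learn sketch: LASSO keeps the informative features, and a support vector regressor maps them nonlinearly to MOS. Data shapes, the LASSO alpha, and the SVR hyperparameters below are placeholders, not the fitted PCQAML model.

        import numpy as np
        from sklearn.linear_model import Lasso
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVR

        X = np.random.rand(200, 30)   # toy: 200 clouds x 30 candidate features
        y = np.random.rand(200) * 5   # toy mean opinion scores in [0, 5]

        # Step 1: LASSO as a variable selector.
        selector = make_pipeline(StandardScaler(), Lasso(alpha=0.001)).fit(X, y)
        keep = selector.named_steps["lasso"].coef_ != 0
        print(f"selected {keep.sum()} of {X.shape[1]} features")

        # Step 2: nonlinear mapping of the kept features to MOS with an RBF SVR.
        model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
        model.fit(X[:, keep], y)
        mos_hat = model.predict(X[:, keep])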
  • GQE-Net: A Graph-based Quality Enhancement Network for Point Cloud Color Attribute
    Authors: Xing, Jinrui; Yuan, Hui; Hamzaoui, Raouf; Liu, Hao; Hou, Junhui
    Abstract: In recent years, point clouds have become increasingly popular for representing three-dimensional (3D) visual objects and scenes. To efficiently store and transmit point clouds, compression methods have been developed, but they often result in a degradation of quality. To reduce color distortion in point clouds, we propose a graph-based quality enhancement network (GQE-Net) that uses geometry information as an auxiliary input and graph convolution blocks to extract local features efficiently. Specifically, we use a parallel-serial graph attention module with a multi-head graph attention mechanism to focus on important points or features and help them fuse together. Additionally, we design a feature refinement module that takes into account the normals and geometry distance between points. To work within the limitations of GPU memory capacity, the distorted point cloud is divided into overlap-allowed 3D patches, which are sent to GQE-Net for quality enhancement. To account for differences in data distribution among different color components, three models are trained for the three color components. Experimental results show that our method achieves state-of-the-art performance. For example, when implementing GQE-Net on a recent test model of the geometry-based point cloud compression (G-PCC) standard, 0.43 dB, 0.25 dB and 0.36 dB Bjøntegaard delta (BD)-peak signal-to-noise ratio (PSNR) gains, corresponding to 14.0%, 9.3% and 14.5% BD-rate savings, were achieved on dense point clouds for the Y, Cb, and Cr components, respectively. The source code of our method is available at https://github.com/xjr998/GQE-Net.
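
    A toy PyTorch sketch of single-head graph attention over k-nearest-neighbour patches of a point cloud. GQE-Net's module is multi-head and parallel-serial, so this is only a reading aid for the general mechanism; all sizes are made up.

        import torch
        import torch.nn as nn

        def knn(points, k):                     # points: (N, 3)
            d = torch.cdist(points, points)     # pairwise distances
            return d.topk(k + 1, largest=False).indices[:, 1:]   # drop self

        class GraphAttention(nn.Module):
            def __init__(self, dim):
                super().__init__()
                self.score = nn.Linear(2 * dim, 1)   # weight per (centre, neighbour)
                self.proj = nn.Linear(dim, dim)

            def forward(self, feats, idx):           # feats: (N, C), idx: (N, k)
                nbrs = feats[idx]                            # (N, k, C)
                ctr = feats.unsqueeze(1).expand_as(nbrs)     # (N, k, C)
                w = torch.softmax(self.score(torch.cat([ctr, nbrs], -1)), dim=1)
                return self.proj((w * nbrs).sum(1)) + feats  # residual update

        pts = torch.rand(1024, 3)                # toy patch geometry
        feats = torch.rand(1024, 32)             # per-point colour features
        print(GraphAttention(32)(feats, knn(pts, 16)).shape)   # (1024, 32)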
  • Relaxed forced choice improves performance of visual quality assessment methods
    Authors: Jenadeleh, Mohsen; Zagermann, Johannes; Reiterer, Harald; Reips, Ulf-Dietrich; Hamzaoui, Raouf; Saupe, Dietmar
    Abstract: In image quality assessment, a collective visual quality score for an image or video is obtained from the individual ratings of many subjects. One commonly used format for these experiments is the two-alternative forced choice method. Two stimuli with the same content but differing visual quality are presented sequentially or side by side. Subjects are asked to select the one of better quality, and when uncertain, they are required to guess. The relaxed alternative forced choice format aims to reduce the cognitive load and the noise in the responses due to guessing by providing a third response option, namely, “not sure”. This work presents a large and comprehensive crowdsourcing experiment to compare these two response formats: the one with the “not sure” option and the one without it. To provide unambiguous ground truth for quality evaluation, subjects were shown pairs of images with differing numbers of dots and asked each time to choose the one with more dots. Our crowdsourcing study involved 254 participants and was conducted using a within-subject design. Each participant was asked to respond to 40 pair comparisons with and without the “not sure” response option and completed a questionnaire to evaluate their cognitive load for each testing condition. The experimental results show that the inclusion of the “not sure” response option in the forced choice method reduced mental load and led to models with better data fit and correspondence to ground truth. We also tested for the equivalence of the models and found that they were different. The dataset is available at http://database.mmsp-kn.de/cogvqa-database.html.
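
    Responses from the two formats can be scored side by side; one simple convention (an assumption here, not necessarily the paper's analysis) is to count a “not sure” as half a correct answer, as if the subject had guessed:

        from collections import Counter

        def proportion_correct(responses):
            """responses: iterable of 'correct', 'wrong', or 'not_sure'."""
            c = Counter(responses)
            # Split 'not sure' answers evenly between correct and wrong.
            return (c["correct"] + 0.5 * c["not_sure"]) / sum(c.values())

        forced = ["correct"] * 29 + ["wrong"] * 11                     # toy 2AFC data
        relaxed = ["correct"] * 26 + ["wrong"] * 6 + ["not_sure"] * 8  # third option
        print(proportion_correct(forced), proportion_correct(relaxed))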
  • CAS-NET: Cascade attention-based sampling neural network for point cloud simplification
    Authors: Chen, Chen; Yuan, Hui; Liu, Hao; Hou, Junhui; Hamzaoui, Raouf
    Abstract: Point cloud sampling can reduce storage requirements and computation costs for various vision tasks. Traditional sampling methods, such as farthest point sampling, are not geared towards downstream tasks and may fail on such tasks. In this paper, we propose a cascade attention-based sampling network (CAS-Net), which is end-to-end trainable. Specifically, we propose an attention-based sampling module (ASM) to capture the semantic features and preserve the geometry of the original point cloud. Experimental results on the ModelNet40 dataset show that CAS-Net outperforms state-of-the-art methods in a sampling-based point cloud classification task, while preserving the geometric structure of the sampled point cloud.
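
    For reference, the traditional farthest point sampling baseline mentioned above fits in a few lines of NumPy; it greedily spreads samples over the shape with no notion of the downstream task, which is exactly the limitation CAS-Net targets.

        import numpy as np

        def farthest_point_sampling(points, m, seed=0):
            """Greedily pick m points, each farthest from those already chosen."""
            rng = np.random.default_rng(seed)
            first = rng.integers(len(points))             # random starting point
            dist = np.linalg.norm(points - points[first], axis=1)
            chosen = [first]
            for _ in range(m - 1):
                nxt = int(dist.argmax())                  # farthest remaining point
                chosen.append(nxt)
                dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
            return points[chosen]

        cloud = np.random.rand(4096, 3)
        print(farthest_point_sampling(cloud, 512).shape)  # (512, 3)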
  • PUFA-GAN: A Frequency-Aware Generative Adversarial Network for 3D Point Cloud Upsampling
    Authors: Liu, Hao; Yuan, Hui; Hou, Junhui; Hamzaoui, Raouf; Gao, Wei
    Abstract: We propose a generative adversarial network for point cloud upsampling, which can not only make the upsampled points evenly distributed on the underlying surface but also efficiently generate clean high frequency regions. The generator of our network includes a dynamic graph hierarchical residual aggregation unit and a hierarchical residual aggregation unit for point feature extraction and upsampling, respectively. The former extracts multiscale point-wise descriptive features, while the latter captures rich feature details with hierarchical residuals. To generate neat edges, our discriminator uses a graph filter to extract and retain high frequency points. The generated high resolution point cloud and corresponding high frequency points help the discriminator learn the global and high frequency properties of the point cloud. We also propose an identity distribution loss function to make sure that the upsampled points remain on the underlying surface of the input low resolution point cloud. To assess the regularity of the upsampled points in high frequency regions, we introduce two evaluation metrics. Objective and subjective results demonstrate that the visual quality of the upsampled point clouds generated by our method is better than that of the state-of-the-art methods.
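
    One minimal way to realise a high-pass graph filter on points, sketched here as an illustration rather than the paper's filter: score each point by its offset from the centroid of its k nearest neighbours, so points on edges and fine detail score highest.

        import numpy as np

        def high_frequency_points(points, k=16, keep_ratio=0.1):
            """Rank points by distance to their k-NN centroid; keep the top fraction."""
            d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
            idx = np.argsort(d, axis=1)[:, 1:k + 1]       # k nearest neighbours
            centroids = points[idx].mean(axis=1)          # local neighbourhood means
            score = np.linalg.norm(points - centroids, axis=1)
            top = np.argsort(score)[-int(len(points) * keep_ratio):]
            return points[top]

        cloud = np.random.rand(1024, 3)                   # toy point cloud
        print(high_frequency_points(cloud).shape)         # (102, 3)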
  • No-reference Bitstream-layer Model for Perceptual Quality Assessment of V-PCC Encoded Point Clouds
    Authors: Liu, Qi; Su, Honglei; Chen, Tianxin; Yuan, Hui; Hamzaoui, Raouf
    Abstract: No-reference bitstream-layer models for point cloud quality assessment (PCQA) use the information extracted from a bitstream for real-time and nonintrusive quality monitoring. We propose a no-reference bitstream-layer model for the perceptual quality assessment of video-based point cloud compression (V-PCC) encoded point clouds. First, we describe the fundamental relationship between perceptual coding distortion and the texture quantization parameter (TQP) when geometry encoding is lossless. Then, we incorporate the texture complexity (TC) into the proposed model while considering the fact that the perceptual coding distortion of a point cloud depends on the texture characteristics. TC is estimated using TQP and the texture bitrate per pixel (TBPP), both of which are extracted from the compressed bitstream without resorting to complete decoding. Then, we construct a texture distortion assessment model upon TQP and TBPP. By combining this texture distortion model with the geometry quantization parameter (GQP), we obtain an overall no-reference bitstream-layer PCQA model that we call bitstreamPCQ. Experimental results show that the proposed model markedly outperforms existing models in terms of widely used performance criteria, including the Pearson linear correlation coefficient (PLCC), the Spearman rank order correlation coefficient (SRCC) and the root mean square error (RMSE). The dataset developed in this study is publicly available at https://github.com/qdushl/Waterloo-Point-Cloud-Database-3.0.
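
    Because every input (TQP, TBPP, GQP) is a scalar parsed from the bitstream, the final stage amounts to a parametric fit to subjective scores. The functional form below is a generic placeholder for illustration only, not the published bitstreamPCQ formula.

        import numpy as np
        from scipy.optimize import curve_fit

        def quality(X, a, b, c, d):
            """Placeholder form: linear in TQP and GQP, logarithmic in bitrate."""
            tqp, tbpp, gqp = X
            return a + b * tqp + c * gqp + d * np.log(tbpp)

        # Toy training data: bitstream features plus subjective MOS.
        tqp = np.random.uniform(22, 42, 100)
        tbpp = np.random.uniform(0.01, 1.0, 100)
        gqp = np.random.uniform(22, 42, 100)
        mos = 5 - 0.05 * tqp - 0.04 * gqp + 0.2 * np.log(tbpp)
        mos += np.random.normal(0, 0.1, 100)

        params, _ = curve_fit(quality, (tqp, tbpp, gqp), mos)
        print(params)    # fitted coefficients a, b, c, d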
  • Large-scale crowdsourced subjective assessment of picturewise just noticeable difference
    Authors: Lin, Hanhe; Chen, Guangan; Jenadeleh, Mohsen; Hosu, Vlad; Reips, Ulf-Dietrich; Hamzaoui, Raouf; Saupe, Dietmar
    Abstract: The picturewise just noticeable difference (PJND) for a given image, compression scheme, and subject is the smallest distortion level that the subject can perceive when the image is compressed with this compression scheme. The PJND can be used to determine the compression level at which a given proportion of the population does not notice any distortion in the compressed image. To obtain accurate and diverse results, the PJND must be determined for a large number of subjects and images. This is particularly important when experimental PJND data are used to train deep learning models that can predict a probability distribution model of the PJND for a new image. To date, such subjective studies have been carried out in laboratory environments. However, the number of participants and images in all existing PJND studies is very small because of the challenges involved in setting up laboratory experiments. To address this limitation, we develop a framework to conduct PJND assessments via crowdsourcing. We use a new technique based on slider adjustment and a flicker test to determine the PJND. A pilot study demonstrated that our technique could decrease the study duration by 50% and double the perceptual sensitivity compared to the standard binary search approach that successively compares a test image side by side with its reference image. Our framework includes a robust and systematic scheme to ensure the reliability of the crowdsourced results. Using 1,008 source images and distorted versions obtained with JPEG and BPG compression, we apply our crowdsourcing framework to build the largest PJND dataset, KonJND-1k (Konstanz just noticeable difference 1k dataset). A total of 503 workers participated in the study, yielding 61,030 PJND samples that resulted in an average of 42 samples per source image. The KonJND-1k dataset is available at http://database.mmsp-kn.de/konjnd-1kdatabase.html
    Funding: TRR 161 (Project A05).
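
    For contrast with the slider-and-flicker technique, the standard binary search baseline can be sketched as follows, with the subject's side-by-side judgement abstracted as a callback; all names are hypothetical, and monotone responses are assumed.

        def pjnd_binary_search(subject_notices, levels):
            """Smallest distortion level the subject can perceive.
            subject_notices(level) -> True if a difference is visible.
            levels: distortion levels ordered from weakest to strongest."""
            lo, hi = 0, len(levels) - 1
            pjnd = None
            while lo <= hi:
                mid = (lo + hi) // 2
                if subject_notices(levels[mid]):
                    pjnd = levels[mid]    # visible: try weaker distortion
                    hi = mid - 1
                else:
                    lo = mid + 1          # invisible: try stronger distortion
            return pjnd                   # None if nothing was ever noticed

        # Toy subject who notices any JPEG quality below 70.
        levels = list(range(100, 0, -5))  # quality 100 (weak) .. 5 (strong)
        print(pjnd_binary_search(lambda q: q < 70, levels))   # 65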

A full listing of Raouf Hamzaoui's publications and outputs is available.

Key research outputs

  • Liu, H., Yuan, H., Hou, J., Hamzaoui, R., Gao, W., PUFA-GAN: A frequency-aware generative adversarial network for 3D point cloud upsampling, IEEE Transactions on Image Processing, vol. 31, pp. 7389-7402, 2022, doi: 10.1109/TIP.2022.3222918.
  • Liu, Q., Yuan, H., Hou, J., Hamzaoui, R., Su, H., Model-based joint bit allocation between geometry and color for video-based 3D point cloud compression, IEEE Transactions on Multimedia, vol. 23, pp. 3278-3291, 2021, doi: 10.1109/TMM.2020.3023294.
  • Ahmad, S., Hamzaoui, R., Al-Akaidi, M., Adaptive unicast video streaming with rateless codes and feedback, IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, pp. 275-285, Feb. 2010.
  • Röder, M., Cardinal, J., Hamzaoui, R., Efficient rate-distortion optimized media streaming for tree-structured packet dependencies, IEEE Transactions on Multimedia, vol. 9, pp. 1259-1272, Oct. 2007.
  • Röder, M., Hamzaoui, R., Fast tree-trellis list Viterbi decoding, IEEE Transactions on Communications, vol. 54, pp. 453-461, Mar. 2006.
  • Röder, M., Cardinal, J., Hamzaoui, R., Branch and bound algorithms for rate-distortion optimized media streaming, IEEE Transactions on Multimedia, vol. 8, pp. 170-178, Feb. 2006.
  • Stankovic, V., Hamzaoui, R., Xiong, Z., Real-time error protection of embedded codes for packet erasure and fading channels, IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, pp. 1064-1072, Aug. 2004.
  • Stankovic, V., Hamzaoui, R., Saupe, D., Fast algorithm for rate-based optimal error protection of embedded codes, IEEE Transactions on Communications, vol. 51, pp. 1788-1795, Nov. 2003.
  • Hamzaoui, R., Saupe, D., Combining fractal image compression and vector quantization, IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 197-208, 2000.
  • Hamzaoui, R., Fast iterative methods for fractal image compression, Journal of Mathematical Imaging and Vision, vol. 11, no. 2, pp. 147-159, 1999.

Research interests/expertise

  • Image and Video Compression
  • Multimedia Communication
  • Error Control Systems
  • Image and Signal Processing
  • Machine Learning
  • Pattern Recognition
  • Algorithms

Areas of teaching

Signal Processing

Image Processing

Data Communication

Media Technology

Qualifications

Master’s in Mathematics (Faculty of Sciences of Tunis), 1986

MSc in Mathematics (University of Montreal), 1993

Dr.rer.nat. (University of Freiburg), 1997

Habilitation in Computer Science (University of Konstanz), 2004

Courses taught

Digital Signal Processing

Mobile Communication 

Communication Networks

Signal Processing

Multimedia Communication

Digital Image Processing

Mobile Wireless Communication

Research Methods

Pattern Recognition

Error Correcting Codes

Honours and awards

Outstanding Associate Editor Award, IEEE Transactions on Multimedia, 2020

Certificate of Merit for outstanding editorial board service, IEEE Transactions on Multimedia, 2018

Best Associate Editor award, IEEE Transactions on Circuits and Systems for Video Technology, 2014

Best Associate Editor award, IEEE Transactions on Circuits and Systems for Video Technology, 2012

Membership of professional associations and societies

IEEE Senior Member

IEEE Signal Processing Society

IEEE Multimedia Communications Technical Committee 

British Standards Institute (BSI) IST/37 committee 

Current research students

Sergun Ozmen, part-time PhD student since July 2019

Mohammad Al-Ibaisi, part-time PhD student since January 2017


Professional esteem indicators


Guest Editor, IEEE Open Journal of Circuits and Systems, Special Section on IEEE ICME 2020.

Guest Editor, IEEE Transactions on Multimedia, Special Issue on Hybrid Human-Artificial Intelligence for Multimedia Computing.

Editorial Board Member, Frontiers in Signal Processing (2021-).

Editorial Board Member, IEEE Transactions on Multimedia (2017-2021).

Editorial Board Member, IEEE Transactions on Circuits and Systems for Video Technology (2010-2016).

Co-organiser, Special Session on 3D Point Cloud Acquisition, Processing and Communication (3DPC-APC), 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 13-16, 2022, Suzhou, China.

Co-organiser, 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis, ACM Multimedia 2022, Lisbon, October 2022.

Area Chair, IEEE International Conference on Image Processing (ICIP) 2024, Abu Dhabi, October 2024.

Area Chair, IEEE International Conference on Multimedia and Expo (ICME) 2024, Niagara Falls, July 2024.

Area Chair, IEEE International Conference on Image Processing (ICIP) 2023, Kuala Lumpur, October 2023.

Area Chair, IEEE International Conference on Multimedia and Expo (ICME) 2023, Brisbane, July 2023.

Area Chair, IEEE International Conference on Image Processing (ICIP) 2022, Bordeaux, October 2022.

Area Chair for Multimedia Communications, Networking and Mobility, IEEE International Conference on Multimedia and Expo (ICME) 2022, Taipei, July 2022.

Area Chair, IEEE International Conference on Image Processing (ICIP) 2021, Anchorage, September 2021.

Area Chair for Multimedia Communications, Networking and Mobility, IEEE International Conference on Multimedia and Expo (ICME) 2021, Shenzhen, July 2021.

Workshops Co-Chair, IEEE International Conference on Multimedia and Expo (ICME) 2020, London, July 2020.

Technical Program Committee Co-Chair, IEEE MMSP 2017, London-Luton, October 2017.