Professor Raouf Hamzaoui

Job: Professor in Media Technology

Faculty: Technology

School/department: School of Engineering and Sustainable Development

Research group(s): Centre for Electronic and Communications Engineering (CECE)

Address: De Montfort University, The Gateway, Leicester, LE1 9BH

T: +44 (0)116 207 8096

E: rhamzaoui@dmu.ac.uk

W: http://www.tech.dmu.ac.uk/~hamzaoui/

 

Personal profile

Raouf Hamzaoui received the MSc degree in mathematics from the University of Montreal, Canada, in 1993, the Dr. rer. nat. degree from the University of Freiburg, Germany, in 1997, and the Habilitation degree in computer science from the University of Konstanz, Germany, in 2004. He was an Assistant Professor with the Department of Computer Science of the University of Leipzig, Germany, and with the Department of Computer and Information Science of the University of Konstanz. In September 2006, he joined DMU, where he is a Professor in Media Technology and Head of the Signal Processing and Communications Systems Group in the Institute of Engineering Sciences. Raouf Hamzaoui is an IEEE Senior Member. He is a member of the Editorial Board of the IEEE Transactions on Multimedia. He has published more than 80 research papers in books, journals, and conferences. His research has been funded by the EU, DFG, Royal Society, and industry, and has received best paper awards (ICME 2002, PV'07, CONTENT 2010, MESM'2012).

Research group affiliations

Institute of Engineering Sciences (IES)

Context, Intelligence and Interaction Research Group (CIIRG)

Publications and outputs 

  • Satisfied user ratio prediction with support vector regression for compressed stereo images
    Fan, Chunling; Zhang, Yun; Hamzaoui, Raouf; Ziou, Djemel; Jiang, Qingshan. We propose the first method to predict the Satisfied User Ratio (SUR) for compressed stereo images. The method consists of two main steps. First, considering binocular vision properties, we extract three types of features from stereo images: image quality features, monocular visual features, and binocular visual features. Then, we train a Support Vector Regression (SVR) model to learn a mapping function from the feature space to the SUR values. Experimental results on the SIAT-JSSI dataset show excellent prediction accuracy, with a mean absolute SUR error of only 0.08 for H.265 intra coding and only 0.13 for JPEG2000 compression.
  • Subjective assessment of global picture-wise just noticeable difference
    Lin, Hanhe; Jenadeleh, Mohsen; Chen, Guangan; Reips, Ulf-Dietrich; Hamzaoui, Raouf; Saupe, Dietmar. The picture-wise just noticeable difference (PJND) for a given image and a compression scheme is a statistical quantity giving the smallest distortion that a subject can perceive when the image is compressed with the compression scheme. The PJND is determined with subjective assessment tests for a sample of subjects. We introduce and apply two methods of adjustment where the subject interactively selects the distortion level at the PJND using either a slider or keystrokes. We compare the results and times required to those of the adaptive binary search type approach, in which image pairs with distortions that bracket the PJND are displayed and the difference in distortion levels is reduced until the PJND is identified. For the three methods, two images are compared using the flicker test, in which the displayed images alternate at a frequency of 8 Hz. Unlike previous work, our goal is a global one, determining the PJND not only for the original pristine image but also for a sequence of compressed versions. Results for the MCL-JCI dataset show that the PJND measurements based on adjustment are comparable with those of the traditional approach using binary search, yet significantly faster. Moreover, we conducted a crowdsourcing study with side-by-side comparisons and forced choice, which suggests that the flicker test is more sensitive than a side-by-side comparison.
  • Coarse to fine rate control for region-based 3D point cloud compression
    Liu, Qi; Yuan, Hui; Hamzaoui, Raouf; Su, Honglei. We modify the video-based point cloud compression standard (V-PCC) by mapping the patches to seven regions and encoding the geometry and color video sequences of each region. We then propose a coarse to fine rate control algorithm for this scheme. The algorithm consists of two major steps. First, we allocate the target bitrate between the geometry and color information. Then, we optimize in turn the geometry and color quantization steps for the video sequences of each region using analytical models for the rate and distortion. Experimental results for eight point clouds showed that the average percent bitrate error of our algorithm is only 3.7%, and its perceptual reconstruction quality is better than that of V-PCC.
  • Feature learning for human activity recognition using convolutional neural networks: A case study for inertial measurement unit and audio data
    Cruciani, Federico; Vafeiadis, Anastasios; Nugent, Chris; Cleland, Ian; McCullagh, Paul; Votis, Konstantinos; Giakoumis, Dimitrios; Tzovaras, Dimitrios; Chen, Liming; Hamzaoui, Raouf. The use of Convolutional Neural Networks (CNNs) as a feature learning method for Human Activity Recognition (HAR) is becoming more and more common. Unlike conventional machine learning methods, which require domain-specific expertise, CNNs can extract features automatically. On the other hand, CNNs require a training phase, making them prone to the cold-start problem. In this work, a case study is presented where the use of a pre-trained CNN feature extractor is evaluated under realistic conditions. The case study consists of two main steps: (1) different topologies and parameters are assessed to identify the best candidate models for HAR, thus obtaining a pre-trained CNN model. (2) The pre-trained model is then employed as a feature extractor and evaluated on a large-scale real-world dataset. Two CNN applications were considered: Inertial Measurement Unit (IMU) and audio-based HAR. For the IMU data, balanced accuracy was 91.98% on the UCI-HAR dataset, and 67.51% on the real-world Extrasensory dataset. For the audio data, the balanced accuracy was 92.30% on the DCASE 2017 dataset, and 35.24% on the Extrasensory dataset.
  • SUR-FeatNet: Predicting the Satisfied User Ratio Curve for Image Compression with Deep Feature Learning
    Lin, Hanhe; Hosu, Vlad; Fan, Chunling; Zhang, Yun; Mu, Yuchen; Hamzaoui, Raouf; Saupe, Dietmar. The satisfied user ratio (SUR) curve for a lossy image compression scheme, e.g., JPEG, characterizes the complementary cumulative distribution function of the just noticeable difference (JND), the smallest distortion level that can be perceived by a subject when a reference image is compared to a distorted one. A sequence of JNDs can be defined with a suitable successive choice of reference images. We propose the first deep learning approach to predict SUR curves. We show how to apply maximum likelihood estimation and the Anderson-Darling test to select a suitable parametric model for the distribution function. We then use deep feature learning to predict samples of the SUR curve and apply the method of least squares to fit the parametric model to the predicted samples. Our deep learning approach relies on a Siamese convolutional neural network, transfer learning, and deep feature learning, using pairs consisting of a reference image and a compressed image for training. Experiments on the MCL-JCI dataset showed state-of-the-art performance. For example, the mean Bhattacharyya distances between the predicted and ground truth first, second, and third JND distributions were 0.0810, 0.0702, and 0.0522, respectively, and the corresponding average absolute differences of the peak signal-to-noise ratio at the medians of these distributions were 0.58, 0.69, and 0.58 dB. Further experiments on the JND-Pano dataset showed that the method transfers well to high resolution panoramic images viewed on head-mounted displays.
  • Two-Dimensional Convolutional Recurrent Neural Networks for Speech Activity Detection
    Vafeiadis, Anastasios; Fanioudakis, Eleftherios; Potamitis, Ilyas; Votis, Konstantinos; Giakoumis, Dimitrios; Tzovaras, Dimitrios; Chen, Liming; Hamzaoui, Raouf. Speech Activity Detection (SAD) plays an important role in mobile communications and automatic speech recognition (ASR). Developing efficient SAD systems for real-world applications is a challenging task due to the presence of noise. We propose a new approach to SAD where we treat it as a two-dimensional multilabel image classification problem. To classify the audio segments, we compute their Short-time Fourier Transform spectrograms and classify them with a Convolutional Recurrent Neural Network (CRNN), traditionally used in image recognition. Our CRNN uses a sigmoid activation function, max-pooling in the frequency domain, and a convolutional operation as a moving average filter to remove misclassified spikes. On the development set of Task 1 of the 2019 Fearless Steps Challenge, our system achieved a decision cost function (DCF) of 2.89%, a 66.4% improvement over the baseline. Moreover, it achieved a DCF score of 3.318% on the evaluation dataset of the challenge, ranking first among all submissions.
  • Comparing CNN and Human Crafted Features for Human Activity Recognition
    Cruciani, Federico; Vafeiadis, Anastasios; Nugent, Chris; Cleland, Ian; McCullagh, Paul; Votis, Konstantinos; Giakoumis, Dimitrios; Tzovaras, Dimitrios; Chen, Liming; Hamzaoui, Raouf. Deep learning techniques such as Convolutional Neural Networks (CNNs) have shown good results in activity recognition. One of the advantages of using these methods resides in their ability to generate features automatically. This ability greatly simplifies the task of feature extraction, which usually requires domain-specific knowledge, especially when using big data, where data-driven approaches can lead to anti-patterns. Despite the advantage of this approach, very little work has been undertaken on analyzing the quality of extracted features, and more specifically on how model architecture and parameters affect the ability of those features to separate activity classes in the final feature space. This work focuses on identifying the optimal parameters for recognition of simple activities, applying this approach to signals from both inertial and audio sensors. The paper provides the following contributions: (i) a comparison of automatically extracted CNN features with gold standard Human Crafted Features (HCF), and (ii) a comprehensive analysis of how architecture and model parameters affect the separation of target classes in the feature space. Results are evaluated using publicly available datasets. In particular, we achieved a 93.38% F-Score on the UCI-HAR dataset, using 1D CNNs with three convolutional layers and a kernel size of 32, and a 90.5% F-Score on the DCASE 2017 development dataset, simplified to three classes (indoor, outdoor, and vehicle), using 2D CNNs with two convolutional layers and a 2×2 kernel size.
  • Image-based Text Classification using 2D Convolutional Neural Networks
    Merdivan, Erinç; Vafeiadis, Anastasios; Kalatzis, Dimitrios; Hanke, Sten; Kropf, Johannes; Votis, Konstantinos; Giakoumis, Dimitrios; Tzovaras, Dimitrios; Chen, Liming; Hamzaoui, Raouf; Geist, Matthieu. We propose a new approach to text classification in which we consider the input text as an image and apply 2D Convolutional Neural Networks to learn the local and global semantics of the sentences from the variations of the visual patterns of words. Our approach demonstrates that it is possible to get semantically meaningful features from images with text without using optical character recognition and sequential processing pipelines, techniques that traditional natural language processing algorithms require. To validate our approach, we present results for two applications: text classification and dialog modeling. Using a 2D Convolutional Neural Network, we were able to outperform the state-of-the-art accuracy results for a Chinese text classification task and achieved promising results for seven English text classification tasks. Furthermore, our approach outperformed the memory networks without match types when using out-of-vocabulary entities from Task 4 of the bAbI dialog dataset.
  • SUR-Net: Predicting the Satisfied User Ratio Curve for Image Compression with Deep Learning
    Fan, Chunling; Lin, Hanhe; Hosu, Vlad; Zhang, Yun; Jiang, Qingshan; Hamzaoui, Raouf; Saupe, Dietmar. The Satisfied User Ratio (SUR) curve for a lossy image compression scheme, e.g., JPEG, characterizes the probability distribution of the Just Noticeable Difference (JND) level, the smallest distortion level that can be perceived by a subject. We propose the first deep learning approach to predict such SUR curves. Instead of the direct approach of regressing the SUR curve itself for a given reference image, our model is trained on pairs of images, original and compressed. Relying on a Siamese Convolutional Neural Network (CNN), feature pooling, a fully connected regression head, and transfer learning, we achieved a good prediction performance. Experiments on the MCL-JCI dataset showed a mean Bhattacharyya distance between the predicted and the original JND distributions of only 0.072.
  • Picture-level just noticeable difference for symmetrically and asymmetrically compressed stereoscopic images: Subjective quality assessment study and datasets
    Fan, Chunling; Zhang, Yun; Zhang, Huan; Hamzaoui, Raouf; Jiang, Qingshan. The Picture-level Just Noticeable Difference (PJND) for a given image and compression scheme reflects the smallest distortion level that can be perceived by an observer with respect to a reference image. Previous work has focused on the PJND of images and videos. In this paper, we study the PJND of symmetrically and asymmetrically compressed stereoscopic images for JPEG2000 and H.265 intra coding. We conduct interactive subjective quality assessment tests to determine the PJND point using both a pristine image and a distorted image as a reference. We find that the PJND points are highly dependent on the image content. In asymmetric compression, there exists a perceptual threshold in the quality difference between the left and right views due to the binocular masking effect. We generate two PJND-based stereo image datasets (one for symmetric compression and one for asymmetric compression) and make them accessible to the public.

A full listing of Raouf Hamzaoui's publications and outputs is available.

Key research outputs

  • Ahmad, S., Hamzaoui, R., Al-Akaidi, M., Adaptive unicast video streaming with rateless codes and feedback, IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, pp. 275-285, Feb. 2010.
  • Röder, M., Cardinal, J., Hamzaoui, R., Efficient rate-distortion optimized media streaming for tree-structured packet dependencies, IEEE Transactions on Multimedia, vol. 9, pp. 1259-1272, Oct. 2007.  
  • Röder, M., Hamzaoui, R., Fast tree-trellis list Viterbi decoding, IEEE Transactions on Communications, vol. 54, pp. 453-461, March 2006.
  • Röder, M., Cardinal, J., Hamzaoui, R., Branch and bound algorithms for rate-distortion optimized media streaming, IEEE Transactions on Multimedia, vol. 8, pp. 170-178, Feb. 2006.
  • Stankovic, V., Hamzaoui, R., Xiong, Z., Real-time error protection of embedded codes for packet erasure and fading channels, IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, pp. 1064-1072, Aug. 2004.
  • Stankovic, V., Hamzaoui, R., Saupe, D., Fast algorithm for rate-based optimal error protection of embedded codes, IEEE Transactions on Communications, vol. 51, pp. 1788-1795, Nov. 2003.
  • Hamzaoui, R., Saupe, D., Combining fractal image compression and vector quantization, IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 197-208, 2000.
  • Hamzaoui, R., Fast iterative methods for fractal image compression, Journal of Mathematical Imaging and Vision, vol. 11, no. 2, pp. 147-159, 1999.

 

Research interests/expertise

  • Image and Video Compression
  • Multimedia Communication
  • Error Control Systems
  • Image and Signal Processing
  • Pattern Recognition
  • Algorithms

Areas of teaching

Signal Processing

Image Processing

Data Communication

Media Technology

Qualifications

Master’s in Mathematics (Faculty of Sciences of Tunis), 1986

MSc in Mathematics (University of Montreal), 1993

Dr. rer. nat. (University of Freiburg), 1997

Habilitation in Computer Science (University of Konstanz), 2004

Courses taught

Digital Signal Processing

Mobile Communication

Communication Networks

Signal Processing

Multimedia Communication

Digital Image Processing

Mobile Wireless Communication

Research Methods

Pattern Recognition

Error Correcting Codes

Membership of professional associations and societies

IEEE Senior Member

IEEE Signal Processing Society

IEEE Multimedia Communications Technical Committee 

Current research students

Mohamed Al-Ibaisi, PT, PhD student since January 2017

Thaeer Kobbaey, FT, PhD student since April 2014

Professional esteem indicators

Editorial Board Member IEEE Transactions on Multimedia (since 2017)

Technical Program Committee Co-Chair, IEEE MMSP 2017, London-Luton, Oct. 2017.

Editorial Board Member IEEE Transactions on Circuits and Systems for Video Technology (2010-2016)
