- Considerations for a Micromirror Array Optimized for Compressive Sensing (VIS to MIR) in Space Applications
- A Mathematical Model for Wind Velocity Field Reconstruction and Visualization Taking into Account the Topography Influence
- Anatomical Characteristics of Cervicomedullary Compression on MRI Scans in Children with Achondroplasia
- Evaluating Brain Tumor Detection with Deep Learning Convolutional Neural Networks Across Multiple MRI Modalities
Journal Description
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: CiteScore - Q1 (Computer Graphics and Computer-Aided Design)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 18.3 days after submission; acceptance to publication takes 3.3 days (median values for papers published in this journal in the second half of 2024).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 2.7 (2023); 5-Year Impact Factor: 3.0 (2023)
Latest Articles
The Aesthetic Appreciation of Multi-Stable Images
J. Imaging 2025, 11(4), 111; https://doi.org/10.3390/jimaging11040111 - 4 Apr 2025
Abstract
Does the quality that renders multi-stable images fascinating, the sudden perceptual reorganization, the switching from one interpretation to another, also make these images appear beautiful? Or is the aesthetic quality of multi-stable figures unrelated to the ease with which they switch? Across two experiments, we presented multi-stable images and manipulated their perceptual stability. We also presented their unambiguous components in isolation. In the first experiment, this manipulation targeted the inherent stimulus stability through properties like figural size and composition. The second experiment added an instruction for observers to actively control the stability, by attempting to either enhance or prevent perceptual switches as best they could. We found that higher stability was associated with higher liking, positive valence, and lower arousal. This increase in appreciation was mainly driven by inherent stimulus properties. The stability instruction only increased the liking of figures that had been comparatively stable to begin with. We conclude that the fascinating feature of multi-stable images does not contribute to their aesthetic liking. In fact, perceptual switching is detrimental to it. Processing fluency can explain this counterintuitive finding. We also discuss the role of ambiguity in the aesthetic quality of multi-stable images.
Full article
(This article belongs to the Special Issue Next-Gen Visual Stimulators: Smart Human-Machine Interfaces for Visual Perception Assessment)
Open Access Review
Artificial Intelligence in Placental Pathology: New Diagnostic Imaging Tools in Evolution and in Perspective
by Antonio d’Amati, Giorgio Maria Baldini, Tommaso Difonzo, Angela Santoro, Miriam Dellino, Gerardo Cazzato, Antonio Malvasi, Antonella Vimercati, Leonardo Resta, Gian Franco Zannoni and Eliano Cascardi
J. Imaging 2025, 11(4), 110; https://doi.org/10.3390/jimaging11040110 - 3 Apr 2025
Abstract
Artificial intelligence (AI) has emerged as a transformative tool in placental pathology, offering novel diagnostic methods that promise to improve accuracy, reduce inter-observer variability, and positively impact pregnancy outcomes. The primary objective of this review is to summarize recent developments in AI applications tailored specifically to placental histopathology. Current AI-driven approaches include advanced digital image analysis, three-dimensional placental reconstruction, and deep learning models such as GestAltNet for precise gestational age estimation and automated identification of histological lesions, including decidual vasculopathy and maternal vascular malperfusion. Despite these advancements, significant challenges remain, notably dataset heterogeneity, interpretative limitations of current AI algorithms, and issues regarding model transparency. We critically address these limitations by proposing targeted solutions, such as augmenting training datasets with annotated artifacts, promoting explainable AI methods, and enhancing cross-institutional collaborations. Finally, we outline future research directions, emphasizing the refinement of AI algorithms for routine clinical integration and fostering interdisciplinary cooperation among pathologists, computational researchers, and clinical specialists.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Validation of Quantitative Ultrasound and Texture Derivative Analyses-Based Model for Upfront Prediction of Neoadjuvant Chemotherapy Response in Breast Cancer
by Adrian Wai Chan, Lakshmanan Sannachi, Daniel Moore-Palhares, Archya Dasgupta, Sonal Gandhi, Rossanna Pezo, Andrea Eisen, Ellen Warner, Frances C. Wright, Nicole Look Hong, Ali Sadeghi-Naini, Mia Skarpathiotakis, Belinda Curpen, Carrie Betel, Michael C. Kolios, Maureen Trudeau and Gregory J. Czarnota
J. Imaging 2025, 11(4), 109; https://doi.org/10.3390/jimaging11040109 - 3 Apr 2025
Abstract
This work was conducted in order to validate a pre-treatment quantitative ultrasound (QUS) and texture derivative analyses-based prediction model proposed in our previous study to identify responders and non-responders to neoadjuvant chemotherapy in patients with breast cancer. The validation cohort consisted of 56 breast cancer patients diagnosed between the years 2018 and 2021. Among all patients, 53 were treated with neoadjuvant chemotherapy and three had unplanned changes in their chemotherapy cycles. Radio frequency (RF) data were collected volumetrically prior to the start of chemotherapy. In addition to the tumour core region, a 5 mm tumour margin was also chosen for parameter estimation. The prediction model, which was developed previously based on quantitative ultrasound, texture derivative, and tumour molecular subtypes, was used to identify responders and non-responders. The actual response, which was determined by clinical and pathological assessment after lumpectomy or mastectomy, was then compared to the predicted response. The sensitivity, specificity, positive predictive value, negative predictive value, and F1 score for determining chemotherapy response of all patients in the validation cohort were 94%, 67%, 96%, 57%, and 95%, respectively. Removing patients who had unplanned changes in their chemotherapy resulted in a sensitivity, specificity, positive predictive value, negative predictive value, and F1 score of 94%, 100%, 100%, 50%, and 97%, respectively. Explanations for the misclassified cases included unplanned modifications made to the type of chemotherapy during treatment, inherent limitations of the predictive model, the presence of DCIS in the tumour structure, and an ill-defined tumour border in a minority of cases. For the first time, the model was validated in an independent cohort of patients to predict tumour response to neoadjuvant chemotherapy using quantitative ultrasound, texture derivative, and molecular features in breast cancer. Further research is needed to improve the positive predictive value and to evaluate whether the treatment outcome can be improved in predicted non-responders by switching to other treatment options.
Full article
(This article belongs to the Section AI in Imaging)
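The reported sensitivity, specificity, and predictive values all follow from a 2×2 confusion matrix of predicted versus actual response. As a point of reference, here is a minimal Python sketch of the standard definitions (illustrative only, not the authors' analysis code):

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard screening metrics from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)   # true positive rate (responders caught)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "npv": npv, "f1": f1}
```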
Open Access Article
Local Binary Pattern–Cycle Generative Adversarial Network Transfer: Transforming Image Style from Day to Night
by Abeer Almohamade, Salma Kammoun and Fawaz Alsolami
J. Imaging 2025, 11(4), 108; https://doi.org/10.3390/jimaging11040108 - 31 Mar 2025
Abstract
Transforming images from day style to night style is crucial for enhancing perception in autonomous driving and smart surveillance. However, existing CycleGAN-based approaches struggle with texture loss, structural inconsistencies, and high computational costs. To overcome these challenges, we developed LBP-CycleGAN, a modification of CycleGAN that exploits the Local Binary Pattern (LBP), which extracts texture detail, unlike traditional CycleGAN, which relies heavily on color transformations. Our model leverages LBP-based single-channel inputs, ensuring sharper, more consistent night-time textures. We evaluated three model variations: (1) LBP-CycleGAN with a self-attention mechanism in both the generator and discriminator, (2) LBP-CycleGAN with a self-attention mechanism in the discriminator only, and (3) LBP-CycleGAN without a self-attention mechanism. Our results demonstrate that the LBP-CycleGAN model without self-attention outperformed the other models, achieving superior texture quality while significantly reducing the training time and computational overhead. This work opens up new possibilities for efficient, high-fidelity night-time image translation in real-world applications, including autonomous driving and low-light vision systems.
Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)
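For context on the single-channel input representation, the sketch below shows one way to turn an RGB frame into an LBP texture map with scikit-image; the operator and parameters are standard, but the paper's exact preprocessing pipeline may differ:

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import local_binary_pattern

def lbp_channel(rgb: np.ndarray, points: int = 8, radius: float = 1.0) -> np.ndarray:
    """Encode local texture as a single-channel LBP map (e.g., as GAN input)."""
    gray = rgb2gray(rgb)  # discard color, keep structure
    lbp = local_binary_pattern(gray, points, radius, method="uniform")
    return (lbp / lbp.max()).astype(np.float32)  # normalize to [0, 1]
```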
Open Access Article
Optimizing Digital Image Quality for Improved Skin Cancer Detection
by Bogdan Dugonik, Marjan Golob, Marko Marhl and Aleksandra Dugonik
J. Imaging 2025, 11(4), 107; https://doi.org/10.3390/jimaging11040107 - 31 Mar 2025
Abstract
The rising incidence of skin cancer, particularly melanoma, underscores the need for improved diagnostic tools in dermatology. Accurate imaging plays a crucial role in early detection, yet challenges related to color accuracy, image distortion, and resolution persist, leading to diagnostic errors. This study addresses these issues by evaluating color reproduction accuracy across various imaging devices and lighting conditions. Using a ColorChecker test chart, color deviations were measured through Euclidean distances (ΔE*, ΔC*), and nonlinear color differences (ΔE00, ΔC00), while the color rendering index (CRI) and television lighting consistency index (TLCI) were used to evaluate the influence of light sources on image accuracy. Significant color discrepancies were identified among mobile phones, DSLRs, and mirrorless cameras, with inadequate dermatoscope lighting systems contributing to further inaccuracies. We demonstrate practical applications, including manual camera adjustments, grayscale reference cards, post-processing techniques, and optimized lighting conditions, to improve color accuracy. This study provides applicable solutions for enhancing color accuracy in dermatological imaging, emphasizing the need for standardized calibration techniques and imaging protocols to improve diagnostic reliability, support AI-assisted skin cancer detection, and contribute to high-quality image databases for clinical and automated analysis.
Full article
(This article belongs to the Special Issue Novel Approaches to Image Quality Assessment)
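The Euclidean color-difference metrics mentioned above have simple closed forms in CIELAB; a minimal sketch follows (ΔE00 and ΔC00 require the full CIEDE2000 formula and are omitted here):

```python
import numpy as np

def delta_E_ab(lab1: np.ndarray, lab2: np.ndarray) -> float:
    """CIE76 color difference: Euclidean distance between two CIELAB values."""
    return float(np.linalg.norm(lab1 - lab2))

def delta_C_ab(lab1: np.ndarray, lab2: np.ndarray) -> float:
    """Chroma difference, where C* = sqrt(a*^2 + b*^2)."""
    return float(np.hypot(lab2[1], lab2[2]) - np.hypot(lab1[1], lab1[2]))
```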
Open Access Article
Lightweight U-Net for Blood Vessels Segmentation in X-Ray Coronary Angiography
by Jesus Salvador Ramos-Cortez, Dora E. Alvarado-Carrillo, Emmanuel Ovalle-Magallanes and Juan Gabriel Avina-Cervantes
J. Imaging 2025, 11(4), 106; https://doi.org/10.3390/jimaging11040106 - 30 Mar 2025
Abstract
Blood vessel segmentation in X-ray coronary angiography (XCA) plays a crucial role in diagnosing cardiovascular diseases, enabling a precise assessment of arterial structures. However, segmentation is challenging due to a low signal-to-noise ratio, interfering background structures, and vessel bifurcations, which hinder the accuracy of deep learning models. Additionally, deep learning models for this task often require high computational resources, limiting their practical application in real-time clinical settings. This study proposes a lightweight variant of the U-Net architecture using a structured kernel pruning strategy inspired by the Lottery Ticket Hypothesis. The pruning method systematically removes entire convolutional filters from each layer based on a global reduction factor, generating compact subnetworks that retain key representational capacity. This results in a significantly smaller model without compromising the segmentation performance. This approach is evaluated on two benchmark datasets, demonstrating consistent improvements in segmentation accuracy compared to the vanilla U-Net. Additionally, model complexity is significantly reduced from 31 M to 1.9 M parameters, improving efficiency while maintaining high segmentation quality.
Full article
(This article belongs to the Section Medical Imaging)
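As a rough illustration of structured filter pruning, the PyTorch sketch below keeps only the highest-L1-norm filters of a convolution. This is a generic magnitude-based variant, not the paper's Lottery-Ticket-inspired procedure, and the following layer's input channels must be reduced to match:

```python
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_frac: float) -> nn.Conv2d:
    """Keep the fraction of output filters with the largest L1 norms."""
    n_keep = max(1, int(conv.out_channels * keep_frac))
    norms = conv.weight.detach().abs().flatten(1).sum(dim=1)  # one norm per filter
    keep = torch.topk(norms, n_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned
```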
Open Access Article
Inspection of Defective Glass Bottle Mouths Using Machine Learning
by Daiki Tomita and Yue Bao
J. Imaging 2025, 11(4), 105; https://doi.org/10.3390/jimaging11040105 - 29 Mar 2025
Abstract
In this study, we propose a method for detecting chips in the mouths of glass bottles using machine learning. In recent years, Japanese cosmetic glass bottles have gained attention for their advanced manufacturing technology and eco-friendliness through the use of recycled glass, leading to an increase in glass bottle exports overseas. Although cosmetic bottles are subject to strict safety-driven quality inspections, the complicated shape of the bottle mouth makes automated inspection difficult, and visual inspection has been the norm. Visual inspection by workers is problematic because judgment standards differ from worker to worker and inspection accuracy deteriorates over long working hours. To address these issues, inspection systems for glass bottles based on image processing and machine learning have been actively developed. While conventional image processing methods can detect chips, they target bottles without screw threads; for the threaded bottles considered in this study, light from the source is diffusely reflected by the screw threads, resulting in a loss of accuracy. Additionally, machine learning-based inspection methods are generally limited to the body and bottom of the bottle, excluding the mouth from analysis. To overcome these challenges, this study proposes a method that extracts only the screw thread regions from the bottle image using a dedicated machine learning model and performs defect detection on them. To evaluate the effectiveness of the proposed approach, accuracy was assessed by training models on images of both the entire mouth and just the screw threads. Experimental results showed that the accuracy of the model trained on images of the entire mouth was 98.0%, while the accuracy of the model trained on images of the screw threads was 99.7%, indicating that the proposed method improves the accuracy by 1.7%. In a demonstration experiment using data obtained at a factory, the corresponding accuracies were 99.7% and 100%, indicating that the proposed system can be used to detect chips in factories.
Full article
(This article belongs to the Section Image and Video Processing)
Open Access Article
Evaluating Super-Resolution Models in Biomedical Imaging: Applications and Performance in Segmentation and Classification
by Mario Amoros, Manuel Curado and Jose F. Vicent
J. Imaging 2025, 11(4), 104; https://doi.org/10.3390/jimaging11040104 - 29 Mar 2025
Abstract
Super-resolution (SR) techniques have gained traction in biomedical imaging for their ability to enhance image quality. However, it remains unclear whether these improvements translate into better performance in clinical tasks. In this study, we provide a comprehensive evaluation of state-of-the-art SR models—including CNN- and Transformer-based architectures—by assessing not only visual quality metrics (PSNR and SSIM) but also their downstream impact on segmentation and classification performance for lung CT scans. Using U-Net and ResNet architectures, we quantify how SR influences diagnostic tasks across different datasets, and we evaluate model generalization in cross-domain settings. Our findings show that advanced SR models such as SwinIR preserve diagnostic features effectively and, when appropriately applied, can enhance or maintain clinical performance even in low-resolution contexts. This work bridges the gap between image quality enhancement and practical clinical utility, providing actionable insights for integrating SR into real-world biomedical imaging workflows.
Full article
(This article belongs to the Special Issue Tools and Techniques for Improving Radiological Imaging Applications)
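PSNR and SSIM, the visual-quality metrics used above, are available off the shelf; a minimal scikit-image sketch (assuming 2D grayscale float images scaled to [0, 1]):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def sr_quality(reference, upscaled):
    """Compare a super-resolved image against its ground-truth counterpart."""
    psnr = peak_signal_noise_ratio(reference, upscaled, data_range=1.0)
    ssim = structural_similarity(reference, upscaled, data_range=1.0)
    return psnr, ssim
```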
Open Access Article
Riemannian Manifolds for Biological Imaging Applications Based on Unsupervised Learning
by Ilya Larin and Alexander Karabelsky
J. Imaging 2025, 11(4), 103; https://doi.org/10.3390/jimaging11040103 - 29 Mar 2025
Abstract
The development of neural networks has made the introduction of multimodal systems inevitable. Computer vision methods are still not widely used in biological research, despite their importance. It is time to recognize the significance of advances in feature extraction and real-time analysis of information from cells. Unsupervised learning for the image clustering task is of great interest, particularly for the clustering of single cells. This study evaluates the feasibility of using latent representations and clustering of single cells in various applications in medicine and biotechnology. Of particular interest are embeddings that relate to the morphological characterization of cells. Studies of C2C12 cells using neural networks will reveal more about aspects of muscle differentiation. This work focuses on analyzing the applicability of the latent space for extracting morphological features. Like many researchers in this field, we note that obtaining high-quality latent representations for phase-contrast or bright-field images opens new frontiers for creating large visual-language models. Graph structures are the main approach to non-Euclidean manifolds. Graph-based segmentation has a long history, e.g., the normalized cuts algorithm treated segmentation as a graph partitioning problem, but only recently have such ideas merged with deep learning in an unsupervised manner. Recently, a number of works have shown the advantages of hyperbolic embeddings in vision tasks, including clustering and classification based on the Poincaré ball model. One area worth highlighting is unsupervised segmentation, which we believe is undervalued, particularly in the context of non-Euclidean spaces. With this approach, we aim to mark the beginning of our future work on integrating the visual information and biological aspects of individual cells into a multimodal space for comparative in vitro studies.
Full article
(This article belongs to the Section AI in Imaging)
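For concreteness, the Poincaré ball model mentioned above carries a closed-form geodesic distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))); a minimal NumPy sketch:

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq / denom))
```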
Open Access Article
Hierarchical Knowledge Transfer: Cross-Layer Distillation for Industrial Anomaly Detection
by Junning Xu and Sanxin Jiang
J. Imaging 2025, 11(4), 102; https://doi.org/10.3390/jimaging11040102 - 28 Mar 2025
Abstract
There are two problems with traditional knowledge distillation methods in industrial anomaly detection: first, traditional methods mostly use feature alignment between the same layers; second, similar or even identical structures are usually used to build teacher-student models, limiting the ability to represent anomalies in multiple ways. To address these issues, this work proposes a Hierarchical Knowledge Transfer (HKT) framework for detecting industrial surface anomalies. First, HKT utilizes the deep knowledge of the highest feature layer in the teacher’s network to guide student learning at every level, thus enabling cross-layer interactions. Multiple projectors are built inside the model to facilitate the teacher in transferring knowledge to each layer of the student. Second, the teacher-student structural symmetry is decoupled by embedding Convolutional Block Attention Modules (CBAM) in the student network. Finally, based on HKT, a more powerful anomaly detection model, HKT+, is developed. By adding two additional convolutional layers to the teacher and student networks of HKT, HKT+ achieves enhanced detection capabilities at the cost of a relatively small increase in model parameters. Experiments on the MVTec AD and BeanTech AD (BTAD) datasets show that HKT+ achieves state-of-the-art performance with average area under the receiver operating characteristic curve (AUROC) scores of 98.69% and 94.58%, respectively, outperforming most current state-of-the-art methods.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
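To make the cross-layer idea concrete, the sketch below projects every student feature map to the teacher's top-layer feature space and penalizes dissimilarity; the 1×1-convolution projectors and cosine loss are illustrative assumptions, not the paper's exact HKT losses:

```python
import torch.nn as nn
import torch.nn.functional as F

class CrossLayerDistill(nn.Module):
    """Distill the teacher's top-layer features into every student level."""
    def __init__(self, student_channels: list[int], teacher_channels: int):
        super().__init__()
        self.projectors = nn.ModuleList(
            nn.Conv2d(c, teacher_channels, kernel_size=1) for c in student_channels)

    def forward(self, student_feats, teacher_top):
        loss = 0.0
        for proj, feat in zip(self.projectors, student_feats):
            # Project, then resize to the teacher's spatial resolution.
            p = F.interpolate(proj(feat), size=teacher_top.shape[-2:],
                              mode="bilinear", align_corners=False)
            loss += (1 - F.cosine_similarity(p.flatten(1),
                                             teacher_top.flatten(1), dim=1)).mean()
        return loss / len(self.projectors)
```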
Open Access Article
Dual-Stream Contrastive Latent Learning Generative Adversarial Network for Brain Image Synthesis and Tumor Classification
by Junaid Zafar, Vincent Koc and Haroon Zafar
J. Imaging 2025, 11(4), 101; https://doi.org/10.3390/jimaging11040101 - 28 Mar 2025
Abstract
Generative adversarial networks (GANs) prioritize pixel-level attributes over capturing the entire image distribution, which is critical in image synthesis. To address this challenge, we propose a dual-stream contrastive latent projection generative adversarial network (DSCLPGAN) for the robust augmentation of MRI images. The dual-stream generator in our architecture incorporates two specialized processing pathways: one is dedicated to local feature variation modeling, while the other captures global structural transformations, ensuring a more comprehensive synthesis of medical images. We used a transformer-based encoder–decoder framework for contextual coherence and the contrastive learning projection (CLP) module integrates contrastive loss into the latent space for generating diverse image samples. The generated images undergo adversarial refinement using an ensemble of specialized discriminators, where discriminator 1 (D1) ensures classification consistency with real MRI images, discriminator 2 (D2) produces a probability map of localized variations, and discriminator 3 (D3) preserves structural consistency. For validation, we utilized a publicly available MRI dataset which contains 3064 T1-weighted contrast-enhanced images with three types of brain tumors: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices). The experimental results demonstrate state-of-the-art performance, achieving an SSIM of 0.99, classification accuracy of 99.4% for an augmentation diversity level of 5, and a PSNR of 34.6 dB. Our approach has the potential of generating high-fidelity augmentations for reliable AI-driven clinical decision support systems.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Systematic Review
A Systematic Review of Medical Image Quality Assessment
by H. M. S. S. Herath, H. M. K. K. M. B. Herath, Nuwan Madusanka and Byeong-Il Lee
J. Imaging 2025, 11(4), 100; https://doi.org/10.3390/jimaging11040100 - 27 Mar 2025
Abstract
Medical image quality assessment (MIQA) is vital in medical imaging and directly affects diagnosis, patient treatment, and general clinical results. Accurate and high-quality imaging is necessary to make accurate diagnoses, efficiently design treatments, and consistently monitor diseases. This review summarizes forty-two research studies on diverse MIQA approaches and their effects on diagnostic performance, patient outcomes, and process efficiency. It contrasts subjective (manual assessment) and objective (rule-driven) evaluation methods, underscores the growing promise of machine intelligence and machine learning (ML) in MIQA automation, and describes the existing MIQA challenges. AI-powered tools are revolutionizing MIQA with automated quality checks, noise reduction, and artifact removal, producing consistent and reliable imaging evaluation. Across the reviewed studies, enhanced image quality is shown to improve diagnostic precision and support clinical decision making. However, challenges remain, such as variability in image quality, variability in human ratings, and small datasets that hinder standardization; these must be addressed with better-quality data, low-cost labeling, and standardized protocols. Ultimately, this paper reinforces the need for high-quality medical imaging and the potential of AI-powered MIQA; continued research in this area is crucial to advancing healthcare.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion
by Zhaoyu Shou, Yanjun Lin, Jianwen Mo and Ziyong Wu
J. Imaging 2025, 11(4), 99; https://doi.org/10.3390/jimaging11040099 - 27 Mar 2025
Abstract
The complexity of the various factors influencing online learning makes it difficult to characterize learning concentration, and accurately estimating students’ gaze points during video learning sessions is a critical scientific challenge in assessing and enhancing the attentiveness of online learners. However, current appearance-based gaze estimation models lack a focus on extracting essential features and fail to effectively model the spatio-temporal relationships among the head, face, and eye regions, which limits their ability to achieve lower angular errors. This paper proposes an appearance-based gaze estimation model (RSP-MCGaze). The model constructs a feature extraction backbone network for gaze estimation (ResNetSC) by integrating ResNet and SCConv; this integration enhances the model’s ability to extract important features while reducing spatial and channel redundancy. Based on the ResNetSC backbone, the method for video gaze estimation was further optimized by jointly locating the head, eyes, and face. The experimental results demonstrate that our model achieves significantly higher performance than existing baseline models on public datasets, fully confirming the superiority of our method in the gaze estimation task. The model achieves a detection error of 9.86 on the Gaze360 dataset and a detection error of 7.11 on the detectable face subset of Gaze360.
Full article
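Gaze benchmarks such as Gaze360 conventionally report the mean angular error between predicted and ground-truth 3D gaze vectors; a minimal sketch of that metric (assuming batches of direction vectors):

```python
import numpy as np

def mean_angular_error_deg(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean angle (degrees) between predicted and true gaze vectors, shape (N, 3)."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * target, axis=1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())
```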
Open Access Article
Assessing Cancer Presence in Prostate MRI Using Multi-Encoder Cross-Attention Networks
by Avtantil Dimitriadis, Grigorios Kalliatakis, Richard Osuala, Dimitri Kessler, Simone Mazzetti, Daniele Regge, Oliver Diaz, Karim Lekadir, Dimitrios Fotiadis, Manolis Tsiknakis, Nikolaos Papanikolaou, ProCAncer-I Consortium and Kostas Marias
J. Imaging 2025, 11(4), 98; https://doi.org/10.3390/jimaging11040098 - 26 Mar 2025
Abstract
Prostate cancer (PCa) is currently the second most prevalent cancer among men. Accurate diagnosis of PCa can provide effective treatment for patients and reduce mortality. Previous works have merely focused on either lesion detection or lesion classification of PCa from magnetic resonance imaging (MRI). In this work we focus on a critical, yet underexplored task of the PCa clinical workflow: distinguishing cases with cancer presence (pathologically confirmed PCa patients) from conditions with no suspicious PCa findings (no cancer presence). To this end, we conduct large-scale experiments for this task for the first time by adopting and processing the multi-centric ProstateNET Imaging Archive which contains more than 6 million image representations of PCa from more than 11,000 PCa cases, representing the largest collection of PCa MR images. Bi-parametric MR (bpMRI) images of 4504 patients alongside their clinical variables are used for training, while the architectures are evaluated on two hold-out test sets of 975 retrospective and 435 prospective patients. Our proposed multi-encoder-cross-attention-fusion architecture achieved a promising area under the receiver operating characteristic curve (AUC) of 0.91. This demonstrates our method’s capability of fusing complex bi-parametric imaging modalities and enhancing model robustness, paving the way towards the clinical adoption of deep learning models for accurately determining the presence of PCa across patient populations.
Full article
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)
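As a sketch of what multi-encoder cross-attention fusion can look like, the module below lets tokens from one bpMRI sequence attend to tokens from another before classification; the token shapes, sequence pairing, and pooling are illustrative assumptions, not the consortium's architecture:

```python
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse token sequences from two modality-specific encoders."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 1)  # cancer-presence logit

    def forward(self, tokens_a, tokens_b):
        # Tokens from encoder A query those from encoder B; both (batch, seq, dim).
        fused, _ = self.attn(tokens_a, tokens_b, tokens_b)
        return self.head(fused.mean(dim=1))  # mean-pool, then classify
```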
Open Access Article
Self-Supervised Multi-Task Learning for the Detection and Classification of RHD-Induced Valvular Pathology
by Lorna Mugambi, Ciira wa Maina and Liesl Zühlke
J. Imaging 2025, 11(4), 97; https://doi.org/10.3390/jimaging11040097 - 25 Mar 2025
Abstract
Rheumatic heart disease (RHD) poses a significant global health challenge, necessitating improved diagnostic tools. This study investigated the use of self-supervised multi-task learning for automated echocardiographic analysis, aiming to predict echocardiographic views, diagnose RHD conditions, and determine severity. We compared two prominent self-supervised learning (SSL) methods: DINOv2, a vision-transformer-based approach known for capturing implicit features, and simple contrastive learning representation (SimCLR), a ResNet-based contrastive learning method recognised for its simplicity and effectiveness. Both models were pre-trained on a large, unlabelled echocardiogram dataset and fine-tuned on a smaller, labelled subset. DINOv2 achieved accuracies of 92% for view classification, 98% for condition detection, and 99% for severity assessment. SimCLR demonstrated good performance as well, achieving accuracies of 99% for view classification, 92% for condition detection, and 96% for severity assessment. Embedding visualisations, using both Uniform Manifold Approximation and Projection (UMAP) and t-distributed Stochastic Neighbor Embedding (t-SNE), revealed distinct clusters for all tasks in both models, indicating the effective capture of the discriminative features of the echocardiograms. This study demonstrates the potential of self-supervised multi-task learning for automated echocardiogram analysis, offering a scalable and efficient approach to improving RHD diagnosis, especially in resource-limited settings.
Full article
(This article belongs to the Section Medical Imaging)
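SimCLR's training signal is the NT-Xent loss computed over two augmented views of each image; a minimal PyTorch sketch (temperature and batch layout are illustrative):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """NT-Xent contrastive loss over two views with embeddings of shape (N, d)."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d), unit norm
    sim = z @ z.t() / tau                                # pairwise similarities
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device),
                     float("-inf"))                      # exclude self-pairs
    # Row i's positive is the other view of the same image: index (i + n) mod 2n.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```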
Open Access Article
Identification of Textile Fibres Using a Near Infra-Red (NIR) Camera
by Fariborz Eghtedari, Leszek Pecyna, Rhys Evans, Alan Pestell, Stuart McLeod and Shan Dulanty
J. Imaging 2025, 11(4), 96; https://doi.org/10.3390/jimaging11040096 - 25 Mar 2025
Abstract
Accurate detection of textile composition is a major challenge for textile reuse and recycling. This paper investigates the feasibility of identifying textile materials using a Near Infra-Red (NIR) camera. A transportable metric has been defined that is capable of identifying and distinguishing between cotton and polyester. The NIR camera provides a single data value in the form of the “intensity” of the exposed light at each pixel across its 2D pixel array. The feasibility of textile material identification was investigated using a combination of statistical methods to evaluate the output images from the NIR camera when a bandpass filter was attached to the camera’s lens. A repeatable and stable metric was identified and was shown to be independent of both the camera’s exposure setting and the physical illumination spread over the textiles. The average value of the identified metric for the most suitable bandpass filter was found to be 0.68 for cotton, with a maximum deviation of 2%, and 1.0 for polyester, with a maximum deviation of 1%. It was further shown that carbon black dye, a known challenge in the industry, was easily detectable by the system, and, using the technique proposed in this paper, areas not covered by carbon black dye can be identified and analysed.
Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)
Open Access Article
Analysis of Physical Features Affecting Glossiness and Roughness Alteration in Image Reproduction and Image Features for Their Recovery
by Midori Tanaka, Hideyuki Ajiki and Takahiko Horiuchi
J. Imaging 2025, 11(4), 95; https://doi.org/10.3390/jimaging11040095 - 25 Mar 2025
Abstract
Digital imaging can cause the perception of an appearance that is different from the real object. This study first confirmed that the glossiness and roughness of reproduced images are altered by directly comparing real and colorimetrically reproduced images (CRIs). Then, psychophysical experiments comparing real and modulated images were performed, and the physical features that influence the alteration of the real object were analyzed. Furthermore, we analyzed the image features to recover the altered glossiness and roughness by image reproduction. In total, 67 samples belonging to 11 material categories, including metals, resins, etc., were used as stimuli. Analysis of the physical surface roughness of real objects showed that the low skewness and high kurtosis of samples were associated with alterations in glossiness and roughness, respectively. It was shown that these can be recovered by modulating the contrast for glossiness and the angular second moment in the gray level co-occurrence matrix for roughness, reproducing perceptually equivalent images. These results suggest that although the glossiness and roughness of real objects and their CRIs are perceived differently, reproducing perceptually equivalent glossiness and roughness may be facilitated by measuring the physical features of real objects and reflecting them in image features.
Full article
(This article belongs to the Special Issue Color in Image Processing and Computer Vision)
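The angular second moment used for the roughness recovery is a standard gray-level co-occurrence matrix (GLCM) statistic; a minimal scikit-image sketch (distances, angles, and quantization are illustrative, not the authors' settings):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def angular_second_moment(gray_u8: np.ndarray) -> float:
    """ASM (texture uniformity) of an 8-bit grayscale image via its GLCM."""
    glcm = graycomatrix(gray_u8, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    return float(graycoprops(glcm, "ASM").mean())
```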
Open Access Article
Large-Scale Coastal Marine Wildlife Monitoring with Aerial Imagery
by Octavio Ascagorta, María Débora Pollicelli, Francisco Ramiro Iaconis, Elena Eder, Mathías Vázquez-Sano and Claudio Delrieux
J. Imaging 2025, 11(4), 94; https://doi.org/10.3390/jimaging11040094 - 24 Mar 2025
Abstract
Monitoring coastal marine wildlife is crucial for biodiversity conservation, environmental management, and the sustainable use of tourism-related natural assets. Conducting in situ censuses and population studies in extensive and remote marine habitats often faces logistical constraints, necessitating the adoption of advanced technologies to enhance the efficiency and accuracy of monitoring efforts. This study investigates the use of aerial imagery and deep learning methodologies for the automated detection, classification, and enumeration of marine-coastal species. A comprehensive dataset of high-resolution images, captured by drones and aircraft over southern elephant seal (Mirounga leonina) and South American sea lion (Otaria flavescens) colonies in the Valdés Peninsula, Patagonia, Argentina, was curated and annotated. Using this annotated dataset, a deep learning framework was developed and trained to identify and classify individual animals. The resulting model may help produce automated, accurate population metrics that support the analysis of ecological dynamics, and it achieved F1 scores between 0.7 and 0.9, depending on the type of individual. Among its contributions, this methodology provided essential insights into the impacts of emergent threats, such as the outbreak of the highly pathogenic avian influenza virus H5N1 during the 2023 austral spring season, which caused significant mortality in these species.
Full article
Open Access Article
A Color-Based Multispectral Imaging Approach for a Human Detection Camera
by Shuji Ono
J. Imaging 2025, 11(4), 93; https://doi.org/10.3390/jimaging11040093 - 21 Mar 2025
Abstract
In this study, we propose a color-based multispectral approach using four selected wavelengths (453, 556, 668, and 708 nm) from the visible to near-infrared range to separate clothing from the background. Our goal is to develop a human detection camera that supports real-time processing, particularly under daytime conditions and for common fabrics. While conventional deep learning methods can detect humans accurately, they often require large computational resources and struggle with partially occluded objects. In contrast, we treat clothing detection as a proxy for human detection and construct a lightweight machine learning model (multi-layer perceptron) based on these four wavelengths. Without relying on full spectral data, this method achieves an accuracy of 0.95, precision of 0.97, recall of 0.93, and an F1-score of 0.95. Because our color-driven detection relies on pixel-wise spectral reflectance rather than spatial patterns, it remains computationally efficient. A simple four-band camera configuration could thus facilitate real-time human detection. Potential applications include pedestrian detection in autonomous driving, security surveillance, and disaster victim searches.
Full article
(This article belongs to the Special Issue Color in Image Processing and Computer Vision)
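Because detection here is pixel-wise over four spectral bands, the classifier can be very small; the sketch below trains a scikit-learn multi-layer perceptron on stand-in data (the real training pixels, labels, and network size are not specified in the abstract):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Stand-in data: per-pixel reflectance at 453, 556, 668, and 708 nm.
rng = np.random.default_rng(0)
X = rng.random((10_000, 4))              # placeholder labelled pixel spectra
y = (X[:, 3] > X[:, 2]).astype(int)      # placeholder clothing/background labels

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=300, random_state=0)
clf.fit(X, y)

# Inference: classify every pixel of an (H, W, 4) multispectral frame.
frame = rng.random((480, 640, 4))
mask = clf.predict(frame.reshape(-1, 4)).reshape(480, 640)
```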
Open Access Article
Synergistic Multi-Granularity Rough Attention UNet for Polyp Segmentation
by Jing Wang and Chia S. Lim
J. Imaging 2025, 11(4), 92; https://doi.org/10.3390/jimaging11040092 - 21 Mar 2025
Abstract
Automatic polyp segmentation in colonoscopic images is crucial for the early detection and treatment of colorectal cancer. However, complex backgrounds, diverse polyp morphologies, and ambiguous boundaries make this task difficult. To address these issues, we propose the Synergistic Multi-Granularity Rough Attention U-Net (S-MGRAUNet), which integrates three key modules: the Multi-Granularity Hybrid Filtering (MGHF) module for extracting multi-scale contextual information, the Dynamic Granularity Partition Synergy (DGPS) module for enhancing polyp-background differentiation through adaptive feature interaction, and the Multi-Granularity Rough Attention (MGRA) mechanism for further optimizing boundary recognition. Extensive experiments on the ColonDB and CVC-300 datasets demonstrate that S-MGRAUNet significantly outperforms existing methods while achieving competitive results on the Kvasir-SEG and ClinicDB datasets, validating its segmentation accuracy, robustness, and generalization capability, all while effectively reducing computational complexity. This study highlights the value of multi-granularity feature extraction and attention mechanisms, providing new insights and practical guidance for advancing multi-granularity theories in medical image segmentation.
Full article
(This article belongs to the Special Issue Advances in Biomedical Image Processing and Artificial Intelligence for Computer-Aided Diagnosis in Medicine)
Topics
Topic in Applied Sciences, Computation, Entropy, J. Imaging, Optics
Color Image Processing: Models and Methods (CIP: MM)
Topic Editors: Giuliana Ramella, Isabella Torcicollo. Deadline: 30 July 2025
Topic in Applied Sciences, Bioengineering, Diagnostics, J. Imaging, Signals
Signal Analysis and Biomedical Imaging for Precision Medicine
Topic Editors: Surbhi Bhatia Khan, Mo Saraee. Deadline: 31 August 2025
Topic in Animals, Computers, Information, J. Imaging, Veterinary Sciences
AI, Deep Learning, and Machine Learning in Veterinary Science Imaging
Topic Editors: Vitor Filipe, Lio Gonçalves, Mário Ginja. Deadline: 31 October 2025
Topic in Applied Sciences, Electronics, MAKE, J. Imaging, Sensors
Applied Computer Vision and Pattern Recognition: 2nd Edition
Topic Editors: Antonio Fernández-Caballero, Byung-Gyu Kim. Deadline: 31 December 2025
Special Issues
Special Issue in J. Imaging
Object Detection in Video Surveillance Systems
Guest Editors: Jesús Ruiz-Santaquiteria Alegre, Juan Antonio Álvarez García, Harbinder Singh. Deadline: 30 April 2025
Special Issue in J. Imaging
Recent Advancements in 3D Imaging
Guest Editor: Guofeng Mei. Deadline: 30 April 2025
Special Issue in J. Imaging
Explainable AI in Computer Vision
Guest Editor: Bas Van der Velden. Deadline: 30 April 2025
Special Issue in J. Imaging
Novel Approaches to Image Quality Assessment
Guest Editors: Luigi Celona, Hanhe Lin. Deadline: 30 April 2025