Search Results (1,836)

Search Parameters:
Journal = J. Imaging

28 pages, 4886 KiB  
Article
The Aesthetic Appreciation of Multi-Stable Images
by Levin Saracbasi and Heiko Hecht
J. Imaging 2025, 11(4), 111; https://doi.org/10.3390/jimaging11040111 - 4 Apr 2025
Viewed by 126
Abstract
Does the quality that renders multi-stable images fascinating, the sudden perceptual reorganization, the switching from one interpretation to another, also make these images appear beautiful? Or is the aesthetic quality of multi-stable figures unrelated to the ease with which they switch? Across two experiments, we presented multi-stable images and manipulated their perceptual stability. We also presented their unambiguous components in isolation. In the first experiment, this manipulation targeted the inherent stimulus stability through properties like figural size and composition. The second experiment added an instruction for observers to actively control the stability, by attempting to either enhance or prevent perceptual switches as best they could. We found that higher stability was associated with higher liking, positive valence, and lower arousal. This increase in appreciation was mainly driven by inherent stimulus properties. The stability instruction only increased the liking of figures that had been comparatively stable to begin with. We conclude that the fascinating feature of multi-stable images does not contribute to their aesthetic liking. In fact, perceptual switching is detrimental to it. Processing fluency can explain this counterintuitive finding. We also discuss the role of ambiguity in the aesthetic quality of multi-stable images.

18 pages, 4882 KiB  
Review
Artificial Intelligence in Placental Pathology: New Diagnostic Imaging Tools in Evolution and in Perspective
by Antonio d’Amati, Giorgio Maria Baldini, Tommaso Difonzo, Angela Santoro, Miriam Dellino, Gerardo Cazzato, Antonio Malvasi, Antonella Vimercati, Leonardo Resta, Gian Franco Zannoni and Eliano Cascardi
J. Imaging 2025, 11(4), 110; https://doi.org/10.3390/jimaging11040110 - 3 Apr 2025
Viewed by 61
Abstract
Artificial intelligence (AI) has emerged as a transformative tool in placental pathology, offering novel diagnostic methods that promise to improve accuracy, reduce inter-observer variability, and positively impact pregnancy outcomes. The primary objective of this review is to summarize recent developments in AI applications tailored specifically to placental histopathology. Current AI-driven approaches include advanced digital image analysis, three-dimensional placental reconstruction, and deep learning models such as GestAltNet for precise gestational age estimation and automated identification of histological lesions, including decidual vasculopathy and maternal vascular malperfusion. Despite these advancements, significant challenges remain, notably dataset heterogeneity, interpretative limitations of current AI algorithms, and issues regarding model transparency. We critically address these limitations by proposing targeted solutions, such as augmenting training datasets with annotated artifacts, promoting explainable AI methods, and enhancing cross-institutional collaborations. Finally, we outline future research directions, emphasizing the refinement of AI algorithms for routine clinical integration and fostering interdisciplinary cooperation among pathologists, computational researchers, and clinical specialists.
(This article belongs to the Section Medical Imaging)

16 pages, 5365 KiB  
Article
Validation of Quantitative Ultrasound and Texture Derivative Analyses-Based Model for Upfront Prediction of Neoadjuvant Chemotherapy Response in Breast Cancer
by Adrian Wai Chan, Lakshmanan Sannachi, Daniel Moore-Palhares, Archya Dasgupta, Sonal Gandhi, Rossanna Pezo, Andrea Eisen, Ellen Warner, Frances C. Wright, Nicole Look Hong, Ali Sadeghi-Naini, Mia Skarpathiotakis, Belinda Curpen, Carrie Betel, Michael C. Kolios, Maureen Trudeau and Gregory J. Czarnota
J. Imaging 2025, 11(4), 109; https://doi.org/10.3390/jimaging11040109 - 3 Apr 2025
Viewed by 40
Abstract
This work was conducted in order to validate a pre-treatment quantitative ultrasound (QUS) and texture derivative analyses-based prediction model proposed in our previous study to identify responders and non-responders to neoadjuvant chemotherapy in patients with breast cancer. The validation cohort consisted of 56 breast cancer patients diagnosed between the years 2018 and 2021. Among all patients, 53 were treated with neoadjuvant chemotherapy and three had unplanned changes in their chemotherapy cycles. Radio frequency (RF) data were collected volumetrically prior to the start of chemotherapy. In addition to the tumour region (core), a 5 mm tumour margin was also chosen for parameter estimation. The prediction model, which was developed previously based on quantitative ultrasound, texture derivative, and tumour molecular subtypes, was used to identify responders and non-responders. The actual response, which was determined by clinical and pathological assessment after lumpectomy or mastectomy, was then compared to the predicted response. The sensitivity, specificity, positive predictive value, negative predictive value, and F1 score for determining chemotherapy response of all patients in the validation cohort were 94%, 67%, 96%, 57%, and 95%, respectively. Removing patients who had unplanned changes in their chemotherapy resulted in a sensitivity, specificity, positive predictive value, negative predictive value, and F1 score of 94%, 100%, 100%, 50%, and 97%, respectively. Explanations for the misclassified cases included unplanned modifications made to the type of chemotherapy during treatment, inherent limitations of the predictive model, the presence of DCIS in the tumour structure, and an ill-defined tumour border in a minority of cases. This is the first validation, in an independent patient cohort, of a model that predicts tumour response to neoadjuvant chemotherapy using quantitative ultrasound, texture derivative, and molecular features in patients with breast cancer. Further research is needed to improve the positive predictive value and evaluate whether the treatment outcome can be improved in predicted non-responders by switching to other treatment options.
(This article belongs to the Section AI in Imaging)
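The figures reported above are standard confusion-matrix statistics. As a minimal sketch, the helper below computes them from raw counts; the example counts are hypothetical values chosen only to illustrate the arithmetic and are not taken from the paper:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Summary statistics of a binary classifier from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true responders detected
    specificity = tn / (tn + fp)          # fraction of non-responders detected
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    npv = tn / (tn + fn)                  # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return sensitivity, specificity, ppv, npv, f1

# Hypothetical counts for a 56-patient cohort (illustrative only):
sens, spec, ppv, npv, f1 = diagnostic_metrics(tp=47, fp=2, tn=4, fn=3)
```

Note how a small number of true non-responders (tn + fp) makes specificity and NPV volatile: a single misclassified non-responder shifts them by double-digit percentage points, which is one reason the paper highlights improving the positive predictive value.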

18 pages, 4664 KiB  
Article
Local Binary Pattern–Cycle Generative Adversarial Network Transfer: Transforming Image Style from Day to Night
by Abeer Almohamade, Salma Kammoun and Fawaz Alsolami
J. Imaging 2025, 11(4), 108; https://doi.org/10.3390/jimaging11040108 - 31 Mar 2025
Viewed by 102
Abstract
Transforming images from day style to night style is crucial for enhancing perception in autonomous driving and smart surveillance. However, existing CycleGAN-based approaches struggle with texture loss, structural inconsistencies, and high computational costs. To overcome these challenges, we propose LBP-CycleGAN, a modification of CycleGAN that exploits the Local Binary Pattern (LBP) to extract texture detail, unlike traditional CycleGAN, which relies heavily on color transformations. Our model leverages LBP-based single-channel inputs, ensuring sharper, more consistent night-time textures. We evaluated three model variations: (1) LBP-CycleGAN with a self-attention mechanism in both the generator and discriminator, (2) LBP-CycleGAN with a self-attention mechanism in the discriminator only, and (3) LBP-CycleGAN without a self-attention mechanism. Our results demonstrate that the LBP-CycleGAN model without self-attention outperformed the other models, achieving a superior texture quality while significantly reducing the training time and computational overhead. This work opens up new possibilities for efficient, high-fidelity night-time image translation in real-world applications, including autonomous driving and low-light vision systems.
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)
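The LBP operator underlying this approach is simple to state: threshold each pixel's neighbours against the pixel itself and pack the results into a bit code. A minimal pure-Python sketch, assuming the basic 8-neighbour, non-rotation-invariant variant (the paper's exact LBP configuration is not specified here):

```python
def lbp_codes(img):
    """Basic 8-neighbour Local Binary Pattern on a 2D list of grey values.

    Each interior pixel is encoded by thresholding its 8 neighbours against
    the centre value and packing the bits clockwise from the top-left.
    """
    h, w = len(img), len(img[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            out[y][x] = code
    return out

patch = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
codes = lbp_codes(patch)  # codes[1][1] encodes which neighbours are >= 50
```

Because the code depends only on local intensity ordering, it is invariant to monotonic illumination changes, which is precisely why it carries texture information that survives a day-to-night style transfer better than raw colour.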

22 pages, 5756 KiB  
Article
Optimizing Digital Image Quality for Improved Skin Cancer Detection
by Bogdan Dugonik, Marjan Golob, Marko Marhl and Aleksandra Dugonik
J. Imaging 2025, 11(4), 107; https://doi.org/10.3390/jimaging11040107 - 31 Mar 2025
Viewed by 89
Abstract
The rising incidence of skin cancer, particularly melanoma, underscores the need for improved diagnostic tools in dermatology. Accurate imaging plays a crucial role in early detection, yet challenges related to color accuracy, image distortion, and resolution persist, leading to diagnostic errors. This study addresses these issues by evaluating color reproduction accuracy across various imaging devices and lighting conditions. Using a ColorChecker test chart, color deviations were measured through Euclidean distances (ΔE*, ΔC*) and nonlinear color differences (ΔE00, ΔC00), while the color rendering index (CRI) and television lighting consistency index (TLCI) were used to evaluate the influence of light sources on image accuracy. Significant color discrepancies were identified among mobile phones, DSLRs, and mirrorless cameras, with inadequate dermatoscope lighting systems contributing to further inaccuracies. We demonstrate practical applications, including manual camera adjustments, grayscale reference cards, post-processing techniques, and optimized lighting conditions, to improve color accuracy. This study provides applicable solutions for enhancing color accuracy in dermatological imaging, emphasizing the need for standardized calibration techniques and imaging protocols to improve diagnostic reliability, support AI-assisted skin cancer detection, and contribute to high-quality image databases for clinical and automated analysis.
(This article belongs to the Special Issue Novel Approaches to Image Quality Assessment)
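For reference, the Euclidean ΔE*ab mentioned above is simply the straight-line distance between two colours in CIELAB coordinates (the CIEDE2000 variant, ΔE00, adds perceptual weighting and is considerably more involved). A sketch with illustrative Lab values, not data from the study:

```python
import math

def delta_e_76(lab1, lab2):
    """CIE76 colour difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))

# A reference grey patch vs. a slightly greenish, lighter capture of it:
d = delta_e_76((50.0, 0.0, 0.0), (52.0, -3.0, 1.0))
```

A ΔE*ab of roughly 2 to 3 is commonly cited as a just-noticeable difference for an average observer, so the example above represents a deviation a trained dermatologist could plausibly perceive.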

12 pages, 1100 KiB  
Article
Lightweight U-Net for Blood Vessels Segmentation in X-Ray Coronary Angiography
by Jesus Salvador Ramos-Cortez, Dora E. Alvarado-Carrillo, Emmanuel Ovalle-Magallanes and Juan Gabriel Avina-Cervantes
J. Imaging 2025, 11(4), 106; https://doi.org/10.3390/jimaging11040106 - 30 Mar 2025
Viewed by 68
Abstract
Blood vessel segmentation in X-ray coronary angiography (XCA) plays a crucial role in diagnosing cardiovascular diseases, enabling a precise assessment of arterial structures. However, segmentation is challenging due to a low signal-to-noise ratio, interfering background structures, and vessel bifurcations, which hinder the accuracy of deep learning models. Additionally, deep learning models for this task often require high computational resources, limiting their practical application in real-time clinical settings. This study proposes a lightweight variant of the U-Net architecture using a structured kernel pruning strategy inspired by the Lottery Ticket Hypothesis. The pruning method systematically removes entire convolutional filters from each layer based on a global reduction factor, generating compact subnetworks that retain key representational capacity. This results in a significantly smaller model without compromising the segmentation performance. This approach is evaluated on two benchmark datasets, demonstrating consistent improvements in segmentation accuracy compared to the vanilla U-Net. Additionally, model complexity is significantly reduced from 31 M to 1.9 M parameters, improving efficiency while maintaining high segmentation quality.
(This article belongs to the Section Medical Imaging)
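To see why uniform filter pruning shrinks the model so sharply, note that a convolution's parameter count scales with the product of its input and output channel counts, so keeping a quarter of the filters in every layer cuts each conv roughly 16-fold. A back-of-envelope sketch, assuming the classic U-Net encoder widths (64 up to 1024); the paper's exact configuration and its Lottery-Ticket-style filter selection are not reproduced here:

```python
def conv_params(c_in, c_out, k=3):
    """Parameters of one k x k convolution layer (weights + biases)."""
    return c_out * (c_in * k * k + 1)

def unet_encoder_params(widths, in_ch=1):
    """Two 3x3 convolutions per encoder level, as in the vanilla U-Net."""
    total, prev = 0, in_ch
    for w in widths:
        total += conv_params(prev, w) + conv_params(w, w)
        prev = w
    return total

full = unet_encoder_params([64, 128, 256, 512, 1024])
# A global reduction factor of 4 keeps one quarter of the filters per layer:
pruned = unet_encoder_params([16, 32, 64, 128, 256])
```

With these assumed widths the encoder alone drops from about 18.8 M to about 1.2 M parameters, a roughly 16x reduction, which is consistent in magnitude with the paper's reported 31 M to 1.9 M for the whole network.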

19 pages, 9360 KiB  
Article
Inspection of Defective Glass Bottle Mouths Using Machine Learning
by Daiki Tomita and Yue Bao
J. Imaging 2025, 11(4), 105; https://doi.org/10.3390/jimaging11040105 - 29 Mar 2025
Viewed by 76
Abstract
In this study, we proposed a method for detecting chips in the mouth of glass bottles using machine learning. In recent years, Japanese cosmetic glass bottles have gained attention for their advancements in manufacturing technology and eco-friendliness through the use of recycled glass, leading to an increase in the volume of glass bottle exports overseas. Although cosmetic bottles are subject to strict quality inspections from the standpoint of safety, the complicated shape of the glass bottle mouths makes automated inspections difficult, and visual inspections have been the norm. Visual inspections conducted by workers have become problematic because it has become clear that the standard of judgment differs from worker to worker and that inspection accuracy deteriorates after long hours of work. To address these issues, the development of inspection systems for glass bottles using image processing and machine learning has been actively pursued. While conventional image processing methods can detect chips in glass bottles, they target bottles without screw threads; for the threaded bottles considered in this study, light from the source is diffusely reflected by the screw threads, reducing accuracy. Additionally, machine learning-based inspection methods are generally limited to the body and bottom of the bottle, excluding the mouth from analysis. To overcome these challenges, this study proposed a method to extract only the screw thread regions from the bottle image, using a dedicated machine learning model, and perform defect detection. To evaluate the effectiveness of the proposed approach, accuracy was assessed by training models using images of both the entire mouth and just the screw threads. Experimental results showed that the accuracy of the model trained using the image of the entire mouth was 98.0%, while the accuracy of the model trained using the image of the screw threads was 99.7%, indicating that the proposed method improves the accuracy by 1.7%. In a demonstration experiment using data obtained at a factory, the accuracy of the model trained using images of the entire mouth was 99.7%, whereas the accuracy of the model trained using images of screw threads was 100%, indicating that the proposed system can be used to detect chips in factories.
(This article belongs to the Section Image and Video Processing)

11 pages, 1088 KiB  
Article
Evaluating Super-Resolution Models in Biomedical Imaging: Applications and Performance in Segmentation and Classification
by Mario Amoros, Manuel Curado and Jose F. Vicent
J. Imaging 2025, 11(4), 104; https://doi.org/10.3390/jimaging11040104 - 29 Mar 2025
Viewed by 137
Abstract
Super-resolution (SR) techniques have gained traction in biomedical imaging for their ability to enhance image quality. However, it remains unclear whether these improvements translate into better performance in clinical tasks. In this study, we provide a comprehensive evaluation of state-of-the-art SR models—including CNN- and Transformer-based architectures—by assessing not only visual quality metrics (PSNR and SSIM) but also their downstream impact on segmentation and classification performance for lung CT scans. Using U-Net and ResNet architectures, we quantify how SR influences diagnostic tasks across different datasets, and we evaluate model generalization in cross-domain settings. Our findings show that advanced SR models such as SwinIR preserve diagnostic features effectively and, when appropriately applied, can enhance or maintain clinical performance even in low-resolution contexts. This work bridges the gap between image quality enhancement and practical clinical utility, providing actionable insights for integrating SR into real-world biomedical imaging workflows.
(This article belongs to the Special Issue Tools and Techniques for Improving Radiological Imaging Applications)
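Of the two visual quality metrics named above, PSNR is the simpler: a log-scaled inverse of the mean squared error between the reference and the reconstructed image. A minimal sketch on flat pixel lists, with toy values rather than study data:

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized greyscale
    images, given as flat sequences of pixel values."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

# Two 4-pixel "images" that differ by 5 grey levels everywhere:
value = psnr([100, 120, 140, 160], [105, 125, 145, 165])
```

Because PSNR is a pure pixel-error measure, a high score does not by itself guarantee that diagnostically relevant structure survived, which is exactly the gap between image quality and clinical utility this study probes.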

13 pages, 5340 KiB  
Article
Riemannian Manifolds for Biological Imaging Applications Based on Unsupervised Learning
by Ilya Larin and Alexander Karabelsky
J. Imaging 2025, 11(4), 103; https://doi.org/10.3390/jimaging11040103 - 29 Mar 2025
Viewed by 227
Abstract
The development of neural networks has made the introduction of multimodal systems inevitable. Computer vision methods are still not widely used in biological research, despite their importance. It is time to recognize the significance of advances in feature extraction and real-time analysis of information from cells. Unsupervised learning for the image clustering task, and in particular the clustering of single cells, is of great interest. This study evaluates the feasibility of using latent representations and clustering of single cells in various applications in medicine and biotechnology. Of particular interest are embeddings that relate to the morphological characterization of cells. Studies of C2C12 cells will reveal more about aspects of muscle differentiation by using neural networks. This work focuses on analyzing the applicability of the latent space to extract morphological features. Like many researchers in this field, we note that obtaining high-quality latent representations for phase-contrast or bright-field images opens new frontiers for creating large visual-language models. Graph structures are the main approach to non-Euclidean manifolds. Graph-based segmentation has a long history, e.g., the normalized cuts algorithm treated segmentation as a graph partitioning problem, but only recently have such ideas merged with deep learning in an unsupervised manner. Recently, a number of works have shown the advantages of hyperbolic embeddings in vision tasks, including clustering and classification based on the Poincaré ball model. One area worth highlighting is unsupervised segmentation, which we believe is undervalued, particularly in the context of non-Euclidean spaces. With this approach, we aim to mark the beginning of our future work on integrating the visual information and biological aspects of individual cells into a multimodal space in comparative in vitro studies.
(This article belongs to the Section AI in Imaging)
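The Poincaré ball model mentioned above has a closed-form geodesic distance, which is what clustering and classification in hyperbolic space typically operate on. A sketch of the standard formula (the paper's actual embedding pipeline is not reproduced here):

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    def sq_norm(x):
        return sum(c * c for c in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)

# Distance from the origin to (0.5, 0); in this model it equals 2 * artanh(0.5) = ln 3.
d_origin = poincare_distance((0.0, 0.0), (0.5, 0.0))
```

Distances blow up as points approach the boundary of the ball, which is what lets hyperbolic embeddings represent tree-like hierarchies (such as differentiation lineages) far more compactly than Euclidean space.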

15 pages, 2497 KiB  
Article
Hierarchical Knowledge Transfer: Cross-Layer Distillation for Industrial Anomaly Detection
by Junning Xu and Sanxin Jiang
J. Imaging 2025, 11(4), 102; https://doi.org/10.3390/jimaging11040102 - 28 Mar 2025
Viewed by 148
Abstract
There are two problems with traditional knowledge distillation methods in industrial anomaly detection: first, traditional methods mostly use feature alignment between the same layers; second, similar or even identical structures are usually used to build teacher-student models, limiting the ability to represent anomalies in multiple ways. To address these issues, this work proposes a Hierarchical Knowledge Transfer (HKT) framework for detecting industrial surface anomalies. First, HKT utilizes the deep knowledge of the highest feature layer in the teacher's network to guide student learning at every level, thus enabling cross-layer interactions. Multiple projectors are built inside the model to facilitate the teacher in transferring knowledge to each layer of the student. Second, the teacher-student structural symmetry is decoupled by embedding Convolutional Block Attention Modules (CBAM) in the student network. Finally, based on HKT, a more powerful anomaly detection model, HKT+, is developed. By adding two additional convolutional layers to the teacher and student networks of HKT, HKT+ achieves enhanced detection capabilities at the cost of a relatively small increase in model parameters. Experiments on the MVTec AD and BeanTech AD (BTAD) datasets show that HKT+ achieves state-of-the-art performance with average area under the receiver operating characteristic curve (AUROC) scores of 98.69% and 94.58%, respectively, which outperforms most current state-of-the-art methods.
(This article belongs to the Section Computer Vision and Pattern Recognition)

16 pages, 5387 KiB  
Article
Dual-Stream Contrastive Latent Learning Generative Adversarial Network for Brain Image Synthesis and Tumor Classification
by Junaid Zafar, Vincent Koc and Haroon Zafar
J. Imaging 2025, 11(4), 101; https://doi.org/10.3390/jimaging11040101 - 28 Mar 2025
Viewed by 188
Abstract
Generative adversarial networks (GANs) prioritize pixel-level attributes over capturing the entire image distribution, which is critical in image synthesis. To address this challenge, we propose a dual-stream contrastive latent projection generative adversarial network (DSCLPGAN) for the robust augmentation of MRI images. The dual-stream generator in our architecture incorporates two specialized processing pathways: one is dedicated to local feature variation modeling, while the other captures global structural transformations, ensuring a more comprehensive synthesis of medical images. We used a transformer-based encoder–decoder framework for contextual coherence, and the contrastive learning projection (CLP) module integrates contrastive loss into the latent space for generating diverse image samples. The generated images undergo adversarial refinement using an ensemble of specialized discriminators, where discriminator 1 (D1) ensures classification consistency with real MRI images, discriminator 2 (D2) produces a probability map of localized variations, and discriminator 3 (D3) preserves structural consistency. For validation, we utilized a publicly available MRI dataset which contains 3064 T1-weighted contrast-enhanced images with three types of brain tumors: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices). The experimental results demonstrate state-of-the-art performance, achieving an SSIM of 0.99, a classification accuracy of 99.4% for an augmentation diversity level of 5, and a PSNR of 34.6 dB. Our approach has the potential to generate high-fidelity augmentations for reliable AI-driven clinical decision support systems.
(This article belongs to the Section Medical Imaging)

33 pages, 22075 KiB  
Systematic Review
A Systematic Review of Medical Image Quality Assessment
by H. M. S. S. Herath, H. M. K. K. M. B. Herath, Nuwan Madusanka and Byeong-Il Lee
J. Imaging 2025, 11(4), 100; https://doi.org/10.3390/jimaging11040100 - 27 Mar 2025
Viewed by 159
Abstract
Medical image quality assessment (MIQA) is vital in medical imaging and directly affects diagnosis, patient treatment, and general clinical results. Accurate and high-quality imaging is necessary to make accurate diagnoses, efficiently design treatments, and consistently monitor diseases. This review summarizes forty-two research studies on diverse MIQA approaches and their effects on performance in diagnostics, patient results, and efficiency in the process. It contrasts subjective (manual assessment) and objective (rule-driven) evaluation methods, underscores the growing promise of artificial intelligence (AI) and machine learning (ML) in MIQA automation, and describes the existing MIQA challenges. AI-powered tools are revolutionizing MIQA with automated quality checks, noise reduction, and artifact removal, producing consistent and reliable imaging evaluation. Across the reviewed studies, enhanced image quality improves diagnostic precision and supports clinical decision making. However, challenges still exist, such as variability in image quality, inconsistency in human ratings, and small datasets, all of which hinder standardization. These must be addressed with better-quality data, low-cost labeling, and standardization. Ultimately, this paper reinforces the need for high-quality medical imaging and the potential of MIQA with the power of AI. It is crucial to advance research in this area to advance healthcare.
(This article belongs to the Section Medical Imaging)

14 pages, 3064 KiB  
Article
A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion
by Zhaoyu Shou, Yanjun Lin, Jianwen Mo and Ziyong Wu
J. Imaging 2025, 11(4), 99; https://doi.org/10.3390/jimaging11040099 - 27 Mar 2025
Viewed by 81
Abstract
The complexity of the various factors influencing online learning makes it difficult to characterize learning concentration, and accurately estimating students' gaze points during video learning sessions is a critical scientific challenge in assessing and enhancing the attentiveness of online learners. However, current appearance-based gaze estimation models lack a focus on extracting essential features and fail to effectively model the spatio-temporal relationships among the head, face, and eye regions, which limits their ability to achieve lower angular errors. This paper proposes an appearance-based gaze estimation model (RSP-MCGaze). The model constructs a feature extraction backbone network for gaze estimation (ResNetSC) by integrating ResNet and SCConv; this integration enhances the model's ability to extract important features while reducing spatial and channel redundancy. Based on the ResNetSC backbone, the method for video gaze estimation is further optimized by jointly locating the head, eyes, and face. The experimental results demonstrate that our model achieves significantly higher performance than existing baseline models on public datasets, fully confirming the superiority of our method in the gaze estimation task. The model achieves an angular error of 9.86 on the Gaze360 dataset and 7.11 on the detectable-face subset of Gaze360.
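Angular errors like those reported for Gaze360 are conventionally the angle between the predicted and ground-truth 3D gaze direction vectors. A sketch of that metric, assuming unnormalized direction vectors as input:

```python
import math

def angular_error_deg(g_pred, g_true):
    """Angle in degrees between predicted and ground-truth 3D gaze vectors."""
    dot = sum(a * b for a, b in zip(g_pred, g_true))
    norm = (math.sqrt(sum(a * a for a in g_pred))
            * math.sqrt(sum(b * b for b in g_true)))
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

# Ground truth tilted 10 degrees upward relative to a straight-ahead prediction:
err = angular_error_deg(
    (0.0, 0.0, -1.0),
    (0.0, math.sin(math.radians(10)), -math.cos(math.radians(10))),
)
```

The clamp matters in practice: for nearly identical vectors, rounding can push the cosine marginally above 1, and an unguarded `acos` would raise a domain error.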

24 pages, 11715 KiB  
Article
Assessing Cancer Presence in Prostate MRI Using Multi-Encoder Cross-Attention Networks
by Avtantil Dimitriadis, Grigorios Kalliatakis, Richard Osuala, Dimitri Kessler, Simone Mazzetti, Daniele Regge, Oliver Diaz, Karim Lekadir, Dimitrios Fotiadis, Manolis Tsiknakis, Nikolaos Papanikolaou, ProCAncer-I Consortium and Kostas Marias
J. Imaging 2025, 11(4), 98; https://doi.org/10.3390/jimaging11040098 - 26 Mar 2025
Viewed by 231
Abstract
Prostate cancer (PCa) is currently the second most prevalent cancer among men. Accurate diagnosis of PCa can provide effective treatment for patients and reduce mortality. Previous works have merely focused on either lesion detection or lesion classification of PCa from magnetic resonance imaging (MRI). In this work we focus on a critical, yet underexplored task of the PCa clinical workflow: distinguishing cases with cancer presence (pathologically confirmed PCa patients) from conditions with no suspicious PCa findings (no cancer presence). To this end, we conduct large-scale experiments for this task for the first time by adopting and processing the multi-centric ProstateNET Imaging Archive which contains more than 6 million image representations of PCa from more than 11,000 PCa cases, representing the largest collection of PCa MR images. Bi-parametric MR (bpMRI) images of 4504 patients alongside their clinical variables are used for training, while the architectures are evaluated on two hold-out test sets of 975 retrospective and 435 prospective patients. Our proposed multi-encoder-cross-attention-fusion architecture achieved a promising area under the receiver operating characteristic curve (AUC) of 0.91. This demonstrates our method’s capability of fusing complex bi-parametric imaging modalities and enhancing model robustness, paving the way towards the clinical adoption of deep learning models for accurately determining the presence of PCa across patient populations.
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)

29 pages, 9142 KiB  
Article
Self-Supervised Multi-Task Learning for the Detection and Classification of RHD-Induced Valvular Pathology
by Lorna Mugambi, Ciira wa Maina and Liesl Zühlke
J. Imaging 2025, 11(4), 97; https://doi.org/10.3390/jimaging11040097 - 25 Mar 2025
Viewed by 204
Abstract
Rheumatic heart disease (RHD) poses a significant global health challenge, necessitating improved diagnostic tools. This study investigated the use of self-supervised multi-task learning for automated echocardiographic analysis, aiming to predict echocardiographic views, diagnose RHD conditions, and determine severity. We compared two prominent self-supervised learning (SSL) methods: DINOv2, a vision-transformer-based approach known for capturing implicit features, and simple contrastive learning representation (SimCLR), a ResNet-based contrastive learning method recognised for its simplicity and effectiveness. Both models were pre-trained on a large, unlabelled echocardiogram dataset and fine-tuned on a smaller, labelled subset. DINOv2 achieved accuracies of 92% for view classification, 98% for condition detection, and 99% for severity assessment. SimCLR demonstrated good performance as well, achieving accuracies of 99% for view classification, 92% for condition detection, and 96% for severity assessment. Embedding visualisations, using both Uniform Manifold Approximation and Projection (UMAP) and t-distributed Stochastic Neighbor Embedding (t-SNE), revealed distinct clusters for all tasks in both models, indicating the effective capture of the discriminative features of the echocardiograms. This study demonstrates the potential of using self-supervised multi-task learning for automated echocardiogram analysis, offering a scalable and efficient approach to improving RHD diagnosis, especially in resource-limited settings.
(This article belongs to the Section Medical Imaging)
