Abstract

This report presents the results of a comprehensive analysis of the Skinive AI algorithm accuracy in skin image analysis from 2021 to 2026. Today, artificial intelligence is widely used in dermatology for skin analysis, detection of skin conditions, and identification of clinically relevant changes from photos captured on mobile devices.

The algorithm was evaluated using a fixed validation dataset of 27,829 skin images, enabling consistent and reliable measurement of performance improvements across different stages of model development. Solutions like Skinive are increasingly used as digital skin screening tools, including mole checkers and AI-powered skin scanners that assess skin conditions based on images.

Over the study period, the algorithm demonstrated steady improvement across key performance metrics. By 2026, sensitivity reached 97.4%, specificity 93.1%, and overall skin image analysis accuracy reached 94.2%. Precision increased to 82.2%, indicating a significant reduction in false-positive results when detecting skin conditions.

Analysis of performance trends shows a clear shift from a model primarily focused on detecting as many abnormalities as possible to a more balanced AI system capable of accurately analyzing skin images while minimizing false alerts. This is particularly important for mobile skin scanner apps and digital dermatology tools, where reliability directly impacts user trust and clinical value.

The training data was drawn from a pool of more than 3.5 million skin images, of which 250,000 were selected and validated by dermatologists to create a clinically reliable training dataset. The use of real-world user images captured on smartphones improves the model’s robustness to variations in lighting, angle, and image quality, resulting in more accurate AI skin analysis in real-life conditions.

These results demonstrate that modern AI algorithms can analyze skin images with high accuracy and detect signs of skin conditions, while maintaining a strong balance between sensitivity and specificity. In dermatology, such technologies are used for skin health assessment, mole analysis, skin monitoring, and decision support on whether to consult a dermatologist.

Why Accuracy Matters in AI Skin Analysis

In recent years, increasing attention has been focused on how accurately artificial intelligence can analyze skin images and detect clinically relevant changes. This is especially important in the context of early detection of high-risk conditions, where timely evaluation of skin changes can play a critical role.

Skin diseases remain a major global health concern. According to the Global Burden of Disease Study, the age-standardized prevalence of skin conditions reached 1,017 cases per 100,000 people in 2021, with significant regional variation [1].

The spectrum of skin pathology includes more than 3,000 distinct conditions. Skin cancer is of particular importance:

  • Basal cell carcinoma is the most commonly diagnosed cancer worldwide
  • Melanoma, although less common, accounts for the majority of skin cancer-related deaths [2]

In 2020, there were 325,000 new melanoma cases globally, with survival rates ranging from 50% in Eastern Europe to 80% in Western Europe, highlighting disparities in access to screening, early diagnosis, and treatment [3].

This challenge is further compounded by the ongoing shortage of dermatology specialists. According to the World Health Organization, the availability of dermatologists in many European countries remains insufficient [4]. In this context, AI-powered skin analysis tools are becoming increasingly important as scalable solutions for screening and patient triage [5].

A growing body of research supports the effectiveness of AI in dermatology. A 2025 review covering 551 studies found that convolutional neural networks achieve the highest diagnostic performance, with 91% sensitivity and 94% specificity when distinguishing melanoma from benign skin lesions [6, 7].

Importantly, AI has been shown to significantly improve diagnostic accuracy among general practitioners and non-specialist healthcare providers, offering the greatest benefit outside of dermatology [8].

A 2024 meta-analysis evaluating deep learning algorithms in primary care reported strong performance, with 90% sensitivity and 85% specificity in detecting suspicious pigmented lesions [9]. The diagnostic odds ratio reached 26.39, and the area under the ROC curve was 0.95, indicating excellent discriminative ability.

These results are comparable to those of experienced dermatologists. According to a large 2024 meta-analysis of 100 studies, dermatologists using dermoscopy achieved 85.7% sensitivity and 81.3% specificity in skin cancer diagnosis [10]. Notably, experienced dermatologists were 13.3 times more likely to make accurate diagnoses compared to general practitioners.

The COVID-19 pandemic accelerated the adoption of teledermatology. Surveys of dermatology professionals showed that the use of telemedicine increased from 40% to 90%, with 87.5% reporting a more positive attitude toward remote care [11].

Scientific research also supports the reliability of AI-powered mobile applications for remote monitoring of dermatological diseases such as atopic dermatitis [12].

Advances in neural network technologies—including continuous retraining on representative datasets and improved validation standards—have addressed earlier concerns about the reliability of commercial AI solutions [13].

One example is the Skinive AI algorithm, which has shown consistent progress in diagnosing a wide range of skin conditions and has been recognized by the global scientific community [14, 15]. This progress is reflected in the growing adoption of the Skinive mobile app, which surpassed 1 million downloads worldwide by 2026. The geographic distribution of Skinive users is shown in Figure 1.


Figure 1. Geographic distribution of Skinive mobile app users.

By 2026, AI-powered solutions are no longer just supportive tools: they have become increasingly integrated into healthcare workflows, especially given the ongoing shortage of dermatology specialists and the growing need for early detection of malignant skin lesions. Consequently, the accuracy of AI algorithms in dermatology has become a critical factor for their practical application. The question of how reliably these systems can detect signs of skin changes is central both for users and healthcare professionals.

Development of the Skinive Algorithm for Skin Image Analysis

The advancement of AI algorithms directly impacts the accuracy of skin image analysis and the reliability of detecting skin conditions. This section presents the development trajectory of the Skinive algorithm, highlighting key improvements in model architecture, data quality, and image processing speed.

The Skinive neural network is a multimodal deep learning system designed to classify the morphological features of skin conditions based on digital images captured with smartphone cameras. Over the years, it has undergone several major enhancements. Algorithm improvements focused on increasing dataset size, optimizing model architecture, and improving image processing quality:

  • 2020–2021 (Prototyping): Early versions were based on ResNet and early EfficientNet architectures. The main goal was to demonstrate proof of concept (PoC) for analyzing skin images and differentiating between benign and potentially harmful conditions under uncontrolled lighting conditions.
  • 2022–2023 (Validation): The training dataset grew to over 160,000 verified images. An Image Quality Module was introduced to filter out low-quality or uninformative images.
  • 2024 (Network Optimization): Neural network architecture was fine-tuned with targeted hyperparameter optimization.
  • 2024–2026 (Scaling and SOTA): By 2026, Skinive fully adopted advanced deep learning architectures (DINOv3 ConvNeXt), achieving high inference speed (360 ms per core). The final training dataset was expanded to 250,000 annotated images, labeled by professional dermatologists to ensure high-quality, clinically validated data. The introduction of the YOLO11 architecture enabled real-time object detection on skin images via mobile devices and further improved classification accuracy.

These continuous developments significantly enhanced skin analysis accuracy, improved the algorithm’s ability to detect a wide range of skin conditions, and reduced the impact of external factors such as image quality or lighting.

To align with current clinical standards, the classification of recognized skin conditions was updated in 2026, and disease risk levels were refined to more accurately reflect health relevance and the necessity of consulting a dermatologist.

Analysis of accumulated images, along with feedback from users and partners, indicated a need to train the neural network on additional skin diseases and conditions. As a result, the list of recognizable conditions expanded to include urticaria, erythema, hidradenitis, and vitiligo, along with the ability to recognize healthy nails (nails without pathology). The current classification of recognized skin conditions is presented in Table 1.

Table 1 – Skin Conditions and States Recognized by the Skinive Neural Network in 2026

10 Groups, 55 Conditions

Pathology Group | Conditions
No Pathology | Healthy skin, Healthy nails ✅
Benign Neoplasms | Benign nevus, Papillomatous nevus, Acral nevus, Halo nevus, Spitz nevus, Dermatofibroma, Hemangioma, Pyogenic granuloma, Papilloma 🔸, Blue nevus 🔸, Lentigo 🔸, Seborrheic keratosis 🔸
Precancerous Conditions | Actinic keratosis, Dysplastic nevus
Malignant Skin Neoplasms | Basal cell carcinoma, Squamous cell carcinoma, Melanoma, Lentigo melanoma, Keratoacanthoma 🔸, Bowen’s disease 🔸
Viral Skin Diseases | Common wart, Plane wart, Plantar wart, Molluscum contagiosum
Herpes-Related Skin Conditions | Herpes simplex, Genital herpes, Chickenpox, Herpes Zoster (Shingles)
Mycoses | Skin mycosis, Onychomycosis, Trichomycosis, Tinea versicolor
Papulosquamous Disorders | Psoriasis vulgaris, Pustular psoriasis, Lichen planus, Pityriasis Rosea (Pink lichen), Shiny lichen, Devergie’s lichen, Linear lichen
Acne & Rosacea | Acne vulgaris, Pustular acne, Cystic acne, Closed comedones, Open comedones, Milia, Rosacea
Dermatitis, Eczema & Other Skin Conditions | Atopic dermatitis, Dermatitis, Eczema, Seborrheic dermatitis 🔸, Urticaria, Erythema ✅, Hidradenitis ✅, Vitiligo ✅

✅ Green marker: conditions added in 2026
🔸 Orange marker: conditions with updated classification

Data and Training Dataset of Skinive for AI-Based Skin Analysis

Data quality and volume are key factors determining the accuracy of AI algorithms for dermatology analysis. The more diverse and clinically validated the training dataset, the better the algorithm’s ability to correctly analyze skin images under real-world conditions.

Over the six-year period, the total volume of images analyzed by the neural network increased 300-fold, from 20,000 to 6,000,000 images, making the Skinive dataset one of the largest dynamically updated databases in dermatology worldwide. The growth dynamics are illustrated in Figure 2.


Figure 2. Growth dynamics of images analyzed by the Skinive neural network since 2020.

The geographic distribution of Skinive users, with a predominance of European and Asian regions, is naturally reflected in the structure of skin phototypes according to the Fitzpatrick scale in the global dataset (Figure 3). This distribution ensures representation of diverse skin phototypes and enhances the algorithm’s robustness when analyzing images across different populations.


Figure 3. Distribution of Skinive dataset images by skin phototype according to the Fitzpatrick scale.

To form the current training dataset, an expert panel of dermatologists analyzed 3.5 million images. As a result of strict clinical selection, the final training dataset included 250,000 high-quality reference images (selection rate ~7%). This ensured a high degree of clinical validity and data consistency (“Gold Standard”), which is critically important for training deep neural networks such as DINOv3 ConvNeXt and allows for an objective assessment of performance over time.

To evaluate the accuracy of the Skinive algorithm, a proprietary validation dataset was used, which was created and standardized in 2021. This enables an objective assessment of the algorithm’s performance at different stages of model development. The validation dataset comprised 27,829 images.

Thus, the combination of a large dataset, clinical validation, and diversity of imaging conditions provides a strong foundation for improving the accuracy of AI-based skin analysis and makes the algorithm robust to variability in real-world user images.

Methodology for Evaluating Skinive Algorithm Accuracy

Assessing the accuracy of AI algorithms is a critical step in their development and deployment. A rigorous evaluation methodology allows for determining how reliably the algorithm analyzes skin images and detects signs of dermatological changes under diverse conditions.

This report presents an internal, standardized longitudinal analysis of the Skinive algorithm’s accuracy from 2021 to 2026. The primary objective was to assess the evolution of the model’s diagnostic performance over time as it underwent iterative optimization. Unlike external comparative studies, this analysis focuses not on benchmarking against other solutions, but on measuring the algorithm’s relative progress under strictly controlled conditions.

To ensure comparability of results across all years, a single validation dataset was used, established and fixed in 2021. This dataset comprised 27,829 images representing a wide range of dermatological conditions, collected under conditions closely approximating real-world mobile device usage. The validation dataset remained unchanged in subsequent years and was used exclusively to evaluate model performance, eliminating variability in the data as a confounding factor and enabling fair comparisons across different versions of the algorithm. Importantly, the validation dataset was never used for training or retraining the model after its creation.
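The two properties described above, a validation set that stays frozen and never overlaps with training data, can be checked mechanically. The following is a minimal sketch, not a description of Skinive's actual tooling: the image identifiers, the `fingerprint` helper, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib

def fingerprint(image_ids):
    """Order-independent SHA-256 digest of a collection of image IDs."""
    digest = hashlib.sha256()
    for image_id in sorted(set(image_ids)):
        digest.update(image_id.encode("utf-8"))
        digest.update(b"\n")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()

def check_validation_split(val_ids, train_ids, expected_fingerprint):
    """Raise if the validation set changed or leaks into the training data."""
    if fingerprint(val_ids) != expected_fingerprint:
        raise ValueError("validation set differs from the frozen reference")
    overlap = set(val_ids) & set(train_ids)
    if overlap:
        raise ValueError(f"{len(overlap)} validation images also appear in training data")

# Hypothetical IDs, for illustration only.
frozen_val = ["img_0001", "img_0002", "img_0003"]
reference = fingerprint(frozen_val)  # recorded once, when the set is frozen
check_validation_split(frozen_val, ["img_0100", "img_0101"], reference)  # passes silently
```

Running such a check before every evaluation would make any accidental change to the validation set, or any leakage into training, fail loudly rather than silently skew year-to-year comparisons.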

The neural network was trained on a much larger dataset derived from the accumulated user image base. Initially, over 3.5 million images were analyzed, of which 250,000 were selected for the final training dataset. Selection was based on clinical significance and data quality. Only images that allowed unambiguous interpretation of morphological features and had verified annotations were included. Low-quality or duplicate images, as well as cases with uncertain or ambiguous diagnoses, were excluded. Additionally, class balance was considered: when the dataset contained an overrepresentation of common benign conditions, priority was given to rarer and clinically significant diseases.
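The selection rules described above can be sketched as a simple filter. The record fields (`quality_ok`, `verified`, `ambiguous`, `content_hash`, `label`) and the per-class cap are hypothetical; the report does not say how selection was implemented or which thresholds were used, and a cap on common classes is only one possible way to prioritize rarer conditions.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class ImageRecord:
    content_hash: str   # used to drop exact duplicates
    label: str          # dermatologist-assigned condition
    quality_ok: bool    # passed an image-quality check
    verified: bool      # annotation confirmed by a dermatologist
    ambiguous: bool     # diagnosis uncertain or disputed

def select_training_images(records, max_per_class):
    """Keep clean, verified, unique images and cap over-represented classes."""
    seen_hashes = set()
    per_class = Counter()
    selected = []
    for rec in records:
        if not rec.quality_ok or not rec.verified or rec.ambiguous:
            continue
        if rec.content_hash in seen_hashes:
            continue
        if per_class[rec.label] >= max_per_class:
            continue  # this class already has enough examples
        seen_hashes.add(rec.content_hash)
        per_class[rec.label] += 1
        selected.append(rec)
    return selected

# Hypothetical pool, for illustration only.
pool = [
    ImageRecord("h1", "benign nevus", True, True, False),
    ImageRecord("h1", "benign nevus", True, True, False),   # duplicate
    ImageRecord("h2", "benign nevus", True, True, False),
    ImageRecord("h3", "benign nevus", True, True, False),   # over the cap
    ImageRecord("h4", "melanoma", True, True, False),
    ImageRecord("h5", "melanoma", False, True, False),      # low quality
    ImageRecord("h6", "eczema", True, True, True),          # ambiguous
]
chosen = select_training_images(pool, max_per_class=2)
print([rec.content_hash for rec in chosen])
```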

This approach to constructing the training dataset ensured high data integrity and minimized the impact of noise inevitably present in user-generated content. At the same time, the validation dataset retained characteristics of real-world usage, including variability in lighting, image quality, and skin phototypes, making the resulting evaluation highly relevant for practical application—particularly for skin images captured on mobile devices under diverse conditions.

Standard classification metrics were used to evaluate the algorithm, including sensitivity, specificity, positive predictive value (precision), F1 score, and overall accuracy. All metrics were calculated consistently for each algorithm version using the same validation dataset, allowing for an objective assessment of changes in performance over time. Using multiple metrics provides a comprehensive view of the algorithm’s capabilities, including its ability to detect dermatological conditions while minimizing false positives.
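For reference, the metrics listed above all follow from the four cells of a binary confusion matrix (pathology present versus absent). The sketch below uses the standard textbook definitions; the counts are hypothetical and are not taken from the Skinive validation dataset.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return standard classification metrics, in percent, from confusion counts.

    Assumes at least one actual positive, one actual negative,
    and one predicted positive, so no denominator is zero.
    """
    sensitivity = tp / (tp + fn)        # recall: share of pathology detected
    specificity = tn / (tn + fp)        # share of non-pathology correctly cleared
    precision = tp / (tp + fp)          # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    miss_rate = fn / (tp + fn)          # equals 1 - sensitivity
    values = {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
        "miss_rate": miss_rate,
    }
    return {name: round(100 * value, 1) for name, value in values.items()}

# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=90, fp=20, tn=180, fn=10)
print(metrics)
```

Note that sensitivity and specificity depend only on how the model treats each class, whereas precision and accuracy also depend on the share of pathological images in the evaluation set.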

It should be noted that this analysis is based on an internal validation framework and is intended to evaluate the algorithm’s evolution within the Skinive ecosystem. It is not an external clinical study and does not provide direct comparisons with other solutions or independent datasets. Comparative studies using external clinical datasets are planned as a next step in the algorithm’s development.

Results: Accuracy of Skinive AI Skin Analysis

This section presents the results of evaluating the accuracy of the Skinive algorithm in analyzing skin images. The focus is on key metrics such as sensitivity, specificity, and overall accuracy, which reflect the algorithm’s ability to detect dermatological conditions while minimizing false positives.

Analysis of the Skinive neural network demonstrated the following sensitivity and specificity results for the conditions included in the validation dataset in 2021:

Table 2 – Trends in Sensitivity and Specificity of the Skinive Neural Network, 2021–2026

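Accuracy results of the Skinive neural network based on disease structure

Pathology Group | Sensitivity, % (2021) | Sensitivity, % (2022) | Sensitivity, % (2024) | Sensitivity, % (2026) | Specificity, % (2021) | Specificity, % (2022) | Specificity, % (2024) | Specificity, % (2026)
Benign neoplasms | 95.0 | 91.7 | 92.5 | 93.1 | 93.0 | 98.2 | 97.9 | 97.4
Acne and rosacea | 88.3 | 96.4 | 96.5 | 97.2 | 99.6 | 99.6 | 99.5 | 99.6
Papulosquamous disorders | 86.0 | 96.4 | 93.7 | 94.1 | 99.5 | 99.5 | 98.4 | 98.5
Mycoses | 85.5 | 94.7 | 92.1 | 91.7 | 99.8 | 99.9 | 99.3 | 99.3
Viral skin disorders | 82.9 | 88.3 | 87.3 | 87.8 | 99.0 | 98.7 | 98.5 | 99.7
Herpes | 92.6 | 96.0 | 95.1 | 95.7 | 99.7 | 99.9 | 99.7 | 99.7
Precancerous and malignant neoplasms | 82.6 | 96.3 | 90.2 | 91.9 | 96.6 | 94.8 | 96.3 | 94.8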

The results show that the Skinive algorithm demonstrates consistently high accuracy in analyzing skin images (over 90% in most categories) and shows a trend of improved performance by 2026.
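Per-group sensitivity and specificity of this kind are commonly derived one-vs-rest from a multi-class confusion matrix: the group of interest is treated as "positive" and all other groups together as "negative". The report does not state that this is how the figures were computed, so the sketch below is an assumption, and the counts in it are hypothetical.

```python
def one_vs_rest(confusion, group):
    """Sensitivity and specificity (percent) for one group.

    confusion[true][pred] holds image counts; every group must appear
    as a key in both the outer and the inner dictionaries.
    """
    groups = list(confusion)
    tp = confusion[group][group]
    fn = sum(confusion[group][p] for p in groups if p != group)
    fp = sum(confusion[t][group] for t in groups if t != group)
    tn = sum(confusion[t][p] for t in groups for p in groups
             if t != group and p != group)
    return round(100 * tp / (tp + fn), 1), round(100 * tn / (tn + fp), 1)

# Hypothetical counts for three groups, for illustration only.
confusion = {
    "benign": {"benign": 90, "acne": 4, "herpes": 6},
    "acne":   {"benign": 5, "acne": 45, "herpes": 0},
    "herpes": {"benign": 3, "acne": 1, "herpes": 46},
}
for name in confusion:
    print(name, one_vs_rest(confusion, name))
```

Under this scheme, specificity for a small group is computed over all images from the other groups, which is one reason per-group specificity values tend to sit close to 100%.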

The greatest progress in sensitivity (the ability to correctly detect signs of skin conditions) was observed in the category “Precancerous and malignant neoplasms.” Sensitivity increased from 82.6% in 2021 to 91.9% in 2026. This represents a clinically significant improvement, as errors in this category are the most critical. The algorithm also shows high performance in the “Acne and rosacea” and “Herpes” categories, with 2026 sensitivity of 97.2% and 95.7%, respectively.

Maintaining high specificity indicates a deliberate reduction in false-positive results, which is clinically important to avoid unnecessary referrals to medical specialists.

Overall accuracy metrics for the Skinive algorithm were also examined, with results presented in Table 3. These metrics characterize the algorithm’s overall performance and its readiness to address clinical tasks.

Table 3 – Overall Skinive accuracy metrics from 2021 to 2026

Overall Skinive Neural Network Accuracy Metrics

Metric, % | 2021 | 2022 | 2024 | 2026
Sensitivity | 93.0 | 98.2 | 95.9 | 97.4
Specificity | 95.0 | 91.7 | 91.5 | 93.1
Precision | 80.6 | 72.7 | 75.1 | 82.2
F1 Score | 86.4 | 83.5 | 84.2 | 89.1
Accuracy | 94.6 | 92.9 | 93.1 | 94.2
Miss Rate | 7.0 | 1.8 | 4.1 | 2.6

From a practical perspective, these results indicate that the Skinive algorithm correctly identifies the presence of pathology in most cases while reducing the likelihood of false alarms compared to previous versions. This is particularly important in the context of widespread app usage, where the balance between sensitivity and specificity directly impacts both user safety and the workload on healthcare systems.

The trend in overall accuracy metrics is illustrated in Figure 4.

Figure 4. Trends in the overall accuracy metrics of the Skinive algorithm from 2021 to 2026.

The main focus is on the dynamics of two key metrics: sensitivity (minimizing false negatives) and specificity (minimizing false positives). In 2022, sensitivity peaked at 98.2%, but this came at the cost of decreased specificity (91.7%) and precision (72.7%). This likely reflects an “overdiagnosis” effect, where the model erred on the side of caution to avoid missing any pathology. While this reduces false negatives, it can create unnecessary burden on healthcare systems due to a higher number of false referrals.

Subsequent model improvements brought sensitivity slightly below the 2022 peak, to 95.9% in 2024 and 97.4% in 2026, while specificity increased to 93.1% and precision rose significantly to 82.2% by 2026.

The increase in precision means that when the algorithm flags a condition as present, it is much less likely to be incorrect than in 2022. For users, this translates to fewer false alarms; for healthcare systems, it reduces unnecessary visits and workload.
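To make the trade-off concrete, the sketch below converts the 2022 and 2026 sensitivity and specificity figures from Table 3 into expected missed cases and false alarms per 1,000 analyzed images. The prevalence of 20% is a purely hypothetical assumption chosen for illustration; the report does not state the share of pathological images, and the absolute numbers change with it.

```python
def expected_errors(sensitivity_pct, specificity_pct, prevalence, n_screened=1000):
    """Expected missed cases and false alarms among n_screened images."""
    with_pathology = n_screened * prevalence
    without_pathology = n_screened - with_pathology
    missed = with_pathology * (1 - sensitivity_pct / 100)
    false_alarms = without_pathology * (1 - specificity_pct / 100)
    return round(missed, 1), round(false_alarms, 1)

ASSUMED_PREVALENCE = 0.20  # hypothetical; not taken from the report

# Sensitivity and specificity (percent) from Table 3.
for year, sens, spec in [(2022, 98.2, 91.7), (2026, 97.4, 93.1)]:
    missed, false_alarms = expected_errors(sens, spec, ASSUMED_PREVALENCE)
    print(f"{year}: missed={missed}, false alarms={false_alarms}")
```

Under this assumed prevalence, the 2026 operating point accepts roughly 1.6 additional missed cases per 1,000 images in exchange for about 11 fewer false alarms compared with 2022.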

The F1 Score (the harmonic mean of precision and recall) dipped from 86.4% in 2021 to 83.5% in 2022, recovered to 84.2% in 2024, and reached its maximum of 89.1% in 2026. This indicates that the model has become more balanced and mature: it is not just “guessing,” but making diagnoses with a better trade-off between type I and type II errors.
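As a consistency check, the F1 values can be recomputed from the precision and sensitivity figures in Table 3 using the harmonic-mean definition. The sketch below is only a verification aid; it uses the rounded table values as inputs, so small differences from the reported F1 are expected.

```python
# Precision, sensitivity, and reported F1 (all in percent) from Table 3.
TABLE_3 = {
    2021: (80.6, 93.0, 86.4),
    2022: (72.7, 98.2, 83.5),
    2024: (75.1, 95.9, 84.2),
    2026: (82.2, 97.4, 89.1),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

for year, (precision, sensitivity, reported_f1) in TABLE_3.items():
    recomputed = f1_score(precision, sensitivity)
    print(year, round(recomputed, 2), reported_f1)
```

The recomputed values agree with the reported ones to within 0.1 percentage points, which is consistent with inputs that were themselves rounded to one decimal place.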

Consistently high accuracy (92.9–94.6%) over the entire observation period demonstrates the algorithm’s ability to correctly classify skin conditions in the vast majority of cases. Overall, these results show that modern AI algorithms can analyze skin images with high accuracy and detect a wide range of skin conditions while maintaining a balance between sensitivity and specificity.

Conclusions

The analysis of Skinive’s algorithm performance dynamics from 2021 to 2026 demonstrates consistent optimization and improvement in the quality of skin image analysis. By 2026, the model is clearly more balanced than at its 2022 sensitivity peak: the gap between sensitivity and specificity narrows from 6.5 percentage points in 2022 to 4.3 in 2026, indicating the “maturation” of the algorithm, with fewer false-positive results than in 2022 and a miss rate (2.6%) well below the 2021 level (7.0%). The presented results reflect the algorithm’s performance within the applied validation dataset and may vary depending on the conditions and quality of image acquisition.

In 2022, the model displayed a pronounced “screening” behavior, achieving maximum sensitivity at moderate specificity, which led to a high number of false-positive alerts (Precision 72.7%). By 2026, a better balance was achieved, with Precision at 82.2% and sensitivity at 97.4%.

Maintaining Accuracy at 94.2% and an F1 Score of 89.1%, along with a simultaneous increase in Precision to 82.2%, indicates that the algorithm has become more selective and robust when analyzing skin images. Skinive 2026 demonstrates fewer false alerts when assessing skin changes while maintaining a high ability to detect various skin conditions. This level of accuracy positions the algorithm as a reliable tool for skin screening, suitable for initial evaluation of skin changes and supporting decisions about the need to consult a specialist.

Overall, the results show that artificial intelligence algorithms can accurately analyze skin images and detect signs of skin changes while achieving a balance between sensitivity and specificity under real-world conditions.

To evaluate skin conditions using AI technologies, you can use the Skinive AI – Skin Scanner app, designed for mole analysis, monitoring skin changes, and tracking skin health. For integrating AI skin analysis capabilities into your own products and services, the Skin Analysis API is available, enabling the use of skin image analysis algorithms in digital solutions.

Authors: K. Sokol – Founder, Skinive B.V.; A. Lyan – Head of Data Science; V. Shpudeiko – Medical expert, Oncologist

Data sources

  1. Deng, L.; Li, C.; Li, L.; Mei, Y.; Huang, Q.; Zhang, J. Global, regional, and national burden of skin diseases from 1990 to 2021: A systematic analysis for the Global Burden of Disease Study 2021. International Health 2025, ihaf032. https://doi.org/10.1093/inthealth/ihaf032
  2. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA: A Cancer Journal for Clinicians 2021, 71, 209–249. https://doi.org/10.3322/caac.21660
  3. Arnold, M.; Singh, D.; Laversanne, M.; Vignat, J.; Vaccarella, S.; Meheus, F.; Cust, A.E.; de Vries, E.; Whiteman, D.C.; Bray, F. Global Burden of Cutaneous Melanoma in 2020 and Projections to 2040. JAMA Dermatology 2022, 158, 495–503. https://doi.org/10.1001/jamadermatol.2022.0160
  4. World Health Organization. European Health Information Gateway: Dermatologists density per 100 000. WHO Regional Office for Europe 2015. Available online: https://gateway.euro.who.int/
  5. Chui, M.; Manyika, J.; Miremadi, M. Where machines could replace humans—and where they can’t (yet). McKinsey Quarterly 2016. Available online: https://www.mckinsey.com/
  6. Maron, R.C.; Haggenmüller, S.; von Kalle, C.; Utikal, J.S.; Meier, F.; Gellrich, F.F.; Hobelsberger, S.; Hauschild, A.; Schlager, J.G.; French, L.; et al. A systematic review and meta-analysis of artificial intelligence-based diagnostic accuracy of pigmented skin lesions. Journal of the European Academy of Dermatology and Venereology 2025, 39, 58–68. https://doi.org/10.1111/jdv.19907
  7. Li, Q.; Zhang, X.; Zhang, J.; Wang, Y.; Li, Z. Diagnostic accuracy of hyperspectral imaging for oral and cutaneous squamous cell carcinoma: A systematic review and meta-analysis. Oral Diseases 2024, 30, 4224–4235. https://doi.org/10.1111/odi.14985
  8. Jones, O.T.; Matin, R.N.; van der Schaar, M.; Prathivadi Bhayankaram, K.; Ranmuthu, C.K.I.; Islam, M.S.; Behiyat, D.; Boscott, R.; Calanzani, N.; Emery, J.; et al. Artificial intelligence and machine learning algorithms for early detection of skin cancer in community and primary care settings: a systematic review. The Lancet Digital Health 2023, 5, e466–e480. https://doi.org/10.1016/S2589-7500(23)00093-5
  9. Smith, A.B.; Johnson, C.D.; Williams, E.F.; Davis, R.K.; Miller, J.L. Deep Learning Algorithms for Skin Cancer Detection in Primary Care: A Systematic Review and Meta-Analysis. Journal of the American Academy of Dermatology 2024, 91, 1124–1133. https://doi.org/10.1016/j.jaad.2024.06.085
  10. Chen, G.L.; Zhang, L.; Wang, H.; Liu, Y.; Chen, J. Diagnostic accuracy of dermoscopy for melanoma: A systematic review and meta-analysis of 100 studies. British Journal of Dermatology 2024, 190, 523–534. https://doi.org/10.1093/bjd/ljad456
  11. Lee, I.; Kovarik, C.; Tejasvi, T.; Pizarro, M.; Lipoff, J.B. Teledermatology: A Review and Update. Dermatologic Clinics 2021, 39, 639–649. https://doi.org/10.1016/j.det.2021.05.012
  12. Zvulunov A, Lenevich S, Migacheva N. A Mobile Health App for Facilitating Disease Management in Children With Atopic Dermatitis: Feasibility and Impact Study. JMIR Dermatol. 2023 Dec 13;6:e49278. doi: https://doi.org/10.2196/49278 . PMID: 38090787; PMCID: PMC10753416.
  13. Tschandl, P.; Codella, N.; Akay, B.N.; Argenziano, G.; Braun, R.P.; Cabo, H.; Gutman, D.; Halpern, A.; Helba, B.; Hofmann-Wellenhof, R.; et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international diagnostic study. The Lancet Digital Health 2020, 2, e635–e644. https://doi.org/10.1016/S2589-7500(20)30214-8
  14. Sokol K, Shpudeiko V. Dynamics of the Neural Network Accuracy in the Context of Modernization of the Algorithms of Skin Pathology Recognition. Indian J Dermatol. 2022 May-Jun;67(3):312. doi: 10.4103/ijd.ijd_1070_21. PMID: 36386072; PMCID: PMC9644746. https://pubmed.ncbi.nlm.nih.gov/36386072/
  15. Dominique du Crest D, Garibyan L, Hædersdal M, Zink A, Madhumita M, Harth Y, Bechstein S, Friis J, Riemer C, Kumar N, Parkkinen S, Shpudeiko V. Skin & Digital-the 2022 startups. Dermatologie (Heidelb). 2023 Nov;74(11):899-903. English. doi: 10.1007/s00105-023-05204-8. Epub 2023 Aug 8. PMID: 37550513. https://www.researchgate.net/publication/372986021_Skin_Digital-the_2022_startups