About our Data

When building dermcheck.ai, we wanted to train our model on a diverse and expansive set of data. 

Data we used to train our model

Stanford

The Diverse Dermatology Images (DDI) dataset was curated to assess potential biases in algorithm performance in skin disease detection. It is the first publicly available, deeply curated, and pathologically confirmed image dataset with diverse skin tones. The DDI images were retrospectively selected by reviewing pathology reports from Stanford Clinics from 2010 to 2020.

Data blending

Most readily available skin disease image data is heavily biased towards lighter skin tones. To combat this, we used image blending, overlaying one image on another, with the goal of darkening the skin tones in our dataset. To do this, we drew on the darkest category in the Fitzpatrick classification of skin phenotypes, and 33% of the images in our final dataset fall into this category.
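
For readers curious what overlaying two images can look like in practice, here is a minimal sketch of a linear (alpha) blend in plain Python. It is an illustration only, not our production pipeline: the function names, the `alpha` weight of 0.5, and the representation of an image as nested lists of RGB tuples are all assumptions made for this example. In real use, an imaging library would normally handle the pixel arithmetic.

```python
from typing import List, Tuple

# Hypothetical image representation for this sketch:
# a list of rows, each row a list of (R, G, B) tuples with values 0-255.
Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def blend_pixels(base: Pixel, overlay: Pixel, alpha: float) -> Pixel:
    """Linearly blend two RGB pixels; alpha is the weight of the overlay."""
    return tuple(
        round((1.0 - alpha) * b + alpha * o) for b, o in zip(base, overlay)
    )


def blend_images(base: Image, overlay: Image, alpha: float = 0.5) -> Image:
    """Overlay one image on another of the same size.

    alpha = 0 returns the base image unchanged, alpha = 1 returns the
    overlay. The default of 0.5 is an assumed value, not a tuned one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    same_height = len(base) == len(overlay)
    if not same_height or any(
        len(row_b) != len(row_o) for row_b, row_o in zip(base, overlay)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [blend_pixels(b, o, alpha) for b, o in zip(row_b, row_o)]
        for row_b, row_o in zip(base, overlay)
    ]


# A one-pixel example: a lighter tone blended with a darker overlay.
lighter = [[(200, 160, 140)]]
darker = [[(60, 40, 30)]]
blended = blend_images(lighter, darker, alpha=0.5)
```

A higher `alpha` pulls the result closer to the darker overlay, so the weight controls how strongly the skin tone is shifted.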
