Olfaction, the sense of smell, is the least understood of our senses. We use it constantly in our daily lives: to choose food that is not spoiled, to get early warning of a gas leak or a fire, and to enjoy perfume and wine. Despite many centuries of thought about how smell “works,” we still have no way to predict what a molecule will smell like. Given a chemical structure, the only way to determine the olfactory percept it produces is to smell it. Predicting a percept from the physical stimulus is known as the stimulus-percept problem; it was solved long ago for color vision and tone hearing, but not for olfaction.
The goal of the DREAM Olfaction Prediction Challenge is to find models that can predict how a molecule smells from its physical and chemical features. A model that allows us to predict a smell from a molecule will provide fundamental insights into how odor chemicals are transformed into a smell percept in the brain. Further, being able to predict how a chemical smells will greatly accelerate the design of new molecules to be used as fragrances. Currently, fragrance chemists synthesize many molecules to obtain a new ingredient, but most of these will not have the desired qualities.
For this challenge, we are providing a large unpublished data set based on extensive smell-testing, in which 49 human subjects were asked to sniff 476 different odor chemicals. Subjects were asked to rate how pleasant each odor is, how strong it is, and how well the smell percept matches each of 19 descriptors. To complement these perceptual data, we provide physical-chemical information about each odor molecule that our subjects smelled. Challenge participants will be tasked with analyzing these data to solve the following two sub-challenges:
Sub-challenge 1:
Build models that predict odor intensity (at 1/1000 dilution), as well as odor valence (pleasantness/unpleasantness) and the matrix of 19 odor descriptors (at high intensity/low dilution) for each of 49 subjects.
Sub-challenge 2:
Build models that predict odor intensity (at 1/1000 dilution), as well as odor valence (pleasantness/unpleasantness) and the matrix of 19 odor descriptors (at high intensity/low dilution) for the average of all 49 subjects. Also predict the standard deviations of these ratings across subjects.
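To make the relationship between the two sub-challenges concrete, the following is a minimal sketch of how the sub-challenge 2 targets (mean and standard deviation across subjects) could be derived from per-subject ratings. It is an illustration under stated assumptions, not the challenge's official scoring or data format: the array layout (subjects × molecules × attributes), the 0 to 100 rating scale, the use of NaN for missing ratings, the function name `population_targets`, and the choice of population standard deviation (`ddof=0`) are all hypothetical. The data are synthetic.

```python
import numpy as np

N_SUBJECTS = 49
N_ATTRIBUTES = 21  # intensity, valence, and the 19 descriptors


def population_targets(ratings, ddof=0):
    """Collapse per-subject ratings into across-subject mean and std.

    ratings: array-like of shape (subjects, molecules, attributes).
    NaN marks a missing rating (an assumption of this sketch).
    ddof=0 gives the population standard deviation; whether the
    challenge uses ddof=0 or ddof=1 is not stated in the text.
    """
    ratings = np.asarray(ratings, dtype=float)
    mean = np.nanmean(ratings, axis=0)           # shape: (molecules, attributes)
    std = np.nanstd(ratings, axis=0, ddof=ddof)  # same shape as mean
    return mean, std


# Synthetic example: 49 subjects, 5 molecules, ratings on an assumed 0-100 scale.
rng = np.random.default_rng(0)
n_molecules = 5
ratings = rng.uniform(0, 100, size=(N_SUBJECTS, n_molecules, N_ATTRIBUTES))

mean, std = population_targets(ratings)
print(mean.shape, std.shape)
```

In this framing, a sub-challenge 1 model would predict one row of `ratings` per subject, while a sub-challenge 2 model would predict `mean` and `std` directly for each molecule.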