Modern metabolomics, proteomics, and natural product datasets now contain millions of tandem mass (MS/MS) spectra. At this scale, manual interpretation of every spectrum is impractical. MS/MS spectral library search can match spectra automatically, but the limited size of available spectral libraries keeps identification rates in the single-digit percentages. In addition, experimental MS/MS data are not commonly shared between researchers. What is needed is a searchable way to organize both identified and unidentified spectra into structurally related molecular families.

Researchers at UC San Diego have developed an invention that determines the source of a molecule. That is, it infers the actual origin of a molecule found in nature from the mass spectrum of that molecule, without having to identify the molecule itself. Optionally, it can model an entire sample based on the contexts of its constituent molecules. The source could be an animal product, a food, a personal care product, a medication formulation, a building material, etc.


This invention can be used in the agricultural industry to determine sources of contamination. In drug discovery, it can be used to find high-productivity sources of scarce but highly desirable molecules. In medicine, it can be used to determine the source of a harmful substance or irritant in order to improve health.


Prior inventions relied on the molecule's structure having previously been identified. That structural information was then used with external structure databases to track the source. This invention, in contrast, does not require knowledge of the structure and works purely from the data and spectral matching.
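
As an illustration only, the idea of tracing a source through spectral matching alone can be sketched in a few lines of Python. This is a simplified sketch and not the inventors' implementation: the binned cosine score, the 0.7 threshold, the function names, and the example spectra and source labels are all assumptions made for the example. Each reference spectrum carries only a source label, never a structure, and the query is assigned the sources of the reference spectra it resembles.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.1):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_score(peaks_a, peaks_b, bin_width=0.1):
    """Cosine similarity between two binned spectra, from 0.0 to 1.0."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sources(query, reference, min_score=0.7):
    """Count matching reference spectra per source label, most frequent first."""
    tally = defaultdict(int)
    for source, peaks in reference:
        if cosine_score(query, peaks) >= min_score:
            tally[source] += 1
    return sorted(tally.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data: (m/z, intensity) peak lists labeled only by sample source.
query = [(85.03, 40.0), (127.04, 100.0), (163.06, 60.0)]
reference = [
    ("coffee", [(85.03, 35.0), (127.04, 100.0), (163.06, 55.0)]),
    ("coffee", [(85.02, 45.0), (127.04, 90.0), (163.06, 70.0)]),
    ("shampoo", [(91.07, 100.0), (119.09, 30.0)]),
    ("tea", [(85.03, 50.0), (127.04, 100.0), (163.06, 50.0)]),
]

print(rank_sources(query, reference))
```

With this made-up data the query matches two coffee spectra and one tea spectrum, so coffee ranks first. A practical search would typically add precursor mass filtering and peak tolerance handling, which this sketch leaves out.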

This invention is at the working prototype stage. It can search a query spectrum against all MS/MS spectra from the Mass Spectrometry Interactive Virtual Environment (MassIVE) repository on the Global Natural Products Social Molecular Networking (GNPS) analysis platform.

This technology is patent pending and available for licensing and/or research sponsorship.

