Technology used

Metabolomic profiling, GC-MS techniques, data mining and analysis techniques.

Project description

Cancer remains one of the world’s leading causes of death, with approximately 14 million new cases each year. Population estimates indicate that the number of new cases is likely to increase by 70% in the coming decades, reaching 24 million cases by 2035.

In 2017, lung cancer was the leading cause of cancer deaths in Europe, followed by colon cancer, breast cancer and prostate cancer. In Spain, cancer was the second leading cause of death, close behind cardiovascular diseases (27.4% compared to 29.5%).

In Spain, according to the latest estimates from 2015, pancreatic cancer ranks eleventh in incidence in men and seventh in women, with some 6914 cases per year (3513 cases in men and 3401 cases in women), and an increasing trend in incidence rates over the last 20 years.

Pancreatic cancer-specific survival has not changed significantly over the last 40 years, irrespective of the stage of the disease. Patients with advanced disease continue to have a 5-year survival of 2% or less. Retrospective analyses of resected pancreatic cancer patients reveal the same survival rate in the 2000s as in the 1980s.

For all these reasons, it has become essential to undertake projects to improve current methods of prognosis and the configuration of chemotherapy treatments according to their potential effectiveness.

The overall objective of the project is to develop a tool capable of predicting the response of pancreatic cancer to chemotherapy treatments, based on clinical data and the expression pattern of volatile organic compounds (VOCs), in order to facilitate individualised treatment for each patient.


Hequipa Description
Hequipa Response

Reply from Encore-Lab

  • Integration of the model into the tool
  • Programming of the ICT tool functionalities
  • Implementation of information security and encryption protocols within the tool
  • Development of APIs for handling information from open databases
  • Data analysis technician, technical engineer and/or mathematician
  • Coordination of statistical analysis, bioinformatics and modelling tasks
  • Collaboration in the definition of data anonymisation protocols
  • Definition of the characteristics of the application to be developed
  • Definition of the data architecture so that it is open
  • Preliminary analysis of medical data and VOC analysis data
  • Model validation
  • Validation of the ICT tool
  • Definition of strategies for dealing with missing or anomalous data
  • Data quality analysis
  • Processing and preparation of the dataset for training and cross-validation
  • Model development
  • Continuous evaluation of the models developed
  • Search for relationships between data and results
  • Integration of the model into the ICT tool
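
Several of the tasks above (strategies for missing or anomalous data, and preparing the dataset for training and cross-validation) can be illustrated with a minimal Python sketch. The project does not state which strategies it will adopt, so median imputation, Tukey's interquartile-range rule and a shuffled k-fold split are assumptions chosen only for illustration, and all function names are hypothetical.

```python
import random
import statistics
from typing import Optional


def impute_median(values: list[Optional[float]]) -> list[float]:
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def flag_anomalies(values: list[float], k: float = 1.5) -> list[bool]:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def k_fold_indices(n_samples: int, n_folds: int, seed: int = 0):
    """Return n_folds (train, test) index pairs over a shuffled range(n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    pairs = []
    for held_out in range(n_folds):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        pairs.append((train, folds[held_out]))
    return pairs
```

In this sketch, each column of a patient table would be imputed and screened independently before the rows are split, so every record is held out exactly once during cross-validation.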


The project proposes two independent lines of action that will be merged into a predictive tool that will integrate:

• Analysis of the clinical and analytical data of patients with pancreatic cancer available in the San Pedro hospital databases using data mining techniques.

Volatile organic compounds (VOCs) are the end products of cellular metabolism and can be detected in both breath and body fluids. VOCs may reflect metabolic changes in response to external factors and to intrinsic factors such as inflammation, necrosis, altered microbiota and, of course, cancer.

This is the first time that this technique has been proposed for the identification of biomarkers for pancreatic adenocarcinoma, although it has been proposed for the detection of other types of cancer such as melanoma.

The CRISP-DM methodology, an open standard process model that describes the common approaches used in data mining, will be used for data analysis. It is widely regarded as the most commonly used methodology for data mining projects.

Initially, the demographic, analytical and clinical data available at the San Pedro hospital for patients treated with chemotherapy for pancreatic cancer will be analysed (retrospective analysis). These data will be used to develop algorithms that classify patients according to the efficacy of chemotherapy treatment (prospective analyses).
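
As a rough illustration of classifying patients from retrospective records, the sketch below uses a k-nearest-neighbours vote with leave-one-out evaluation. The project does not name the algorithms it will develop, so this technique is a stand-in, and the function names, feature vectors and labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Predict each record from all the others and return the fraction correct."""
    hits = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, x, k) == y
    return hits / len(xs)


# Toy, made-up feature vectors (NOT patient data): two scaled features per record.
toy_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
toy_y = ["responder"] * 3 + ["non-responder"] * 3
```

Calling `knn_predict(toy_x, toy_y, (0.2, 0.2))` assigns the query to the group of its closest past records. A real model would need far more records, feature scaling and validation on data not used for training.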

The tool will predict which chemotherapy treatment for pancreatic cancer is the most appropriate, combining data from metabolomic profile analysis with models learned, using data mining techniques, from the clinical histories of previous patients. Doctors will be able to use the tool in their practices by connecting to the server that hosts the application. The tool will process the information they enter and classify each patient into one of several chemotherapy response groups.
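
The server-side step described above, in which information entered by a doctor is processed and the patient is assigned to a response group, might look roughly like the following sketch. The JSON field names (`clinical`, `voc_profile`), the group labels and the score cut-offs are all hypothetical, and `model` stands for whatever trained predictor the project eventually produces.

```python
import json
from typing import Callable, Sequence


def score_to_group(score: float) -> str:
    """Map a model score in [0, 1] to a response group (illustrative cut-offs)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score >= 0.66:
        return "likely responder"
    if score >= 0.33:
        return "uncertain response"
    return "likely non-responder"


def handle_request(body: str, model: Callable[[Sequence[float]], float]) -> str:
    """Validate a JSON request and return a JSON reply with the response group."""
    try:
        payload = json.loads(body)
        # Concatenate clinical values and VOC measurements into one feature vector.
        features = [float(v) for v in payload["clinical"]]
        features += [float(v) for v in payload["voc_profile"]]
    except (KeyError, TypeError, ValueError) as exc:
        # json.JSONDecodeError is a ValueError, so malformed JSON lands here too.
        return json.dumps({"error": f"invalid request: {exc!r}"})
    return json.dumps({"response_group": score_to_group(model(features))})
```

Keeping validation and classification in one pure function makes the logic easy to test before it is placed behind a web framework with the security and encryption layers the project plans.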

Hequipa Result