Data processing joins fight to treat cancer
Laboratory technicians battling cancer want to improve diagnosis and treatment of the disease. But they are drowning in data from modern biological techniques. New Web-based software – validated in three European oncology hospitals – can extract potentially life-saving knowledge from such data in minutes.
An integrated software platform
In Europe, cancer is the second leading cause of death; worldwide it accounts for 23.5 per cent of all deaths. The race to beat this disease increasingly depends on groundbreaking bioinformatics research. Welcome as they are, the various techniques being developed in this field generate massive amounts of data.
“Bioinformatics faces several challenges,” says Philippe Boutruche, coordinator of the IST project HKIS. “Life scientists need to access data from many different sources and in a variety of formats.” He adds that they lack standards for cross-referencing all this data, which covers everything from human DNA to genomes, and that they may spend weeks doing so manually.
Under HKIS, the five partners developed an integrated software platform for processing biological and biomedical data in oncology. “It was built around Amadea, software used by banks and marketers for processing, crossing and transforming data,” says Boutruche. “We saw its potential for handling the huge volumes of patient data generated from cancer-research techniques.”
The basic interactive platform is just 20 MB in size. Aimed at medical and biological professionals, it can connect to all data types saved in any form or structure. It can integrate and analyse new data sources from public and private databases much faster than more labour-intensive solutions.
The platform needs no programming, can be accessed on the Internet and may be used by people with different expertise levels. Thanks to a cache memory management system and special algorithms, it provides graphical output for each analysis stage in real time, even if data is stored on another server.
“We want to provide doctors, bioinformaticians and clinicians with a common environment to build data-driven experiments,” says Boutruche. “The project’s platform is homogeneous, so there is no need to export or configure data from one format to another. Being integrated, it allows a continuous workflow with raw data saved in XML format. Users can run statistical mining or algorithms, which may show why the genes of some patients are more susceptible to cancer.”
Trials prove successful
Successful trials were conducted in 2003 at three specialist cancer hospitals: the Ulm Medicine University, the Curie Institute and the European Oncology Institute. Two of them used real medical data from their own databases, while the third focused on data mining. “Our platform helped to define some predictive diagnostic genes for identifying genes of interest in bladder and pancreas cancer,” notes the coordinator.
He believes the project’s technology could benefit a variety of other medical and biological domains. Among them are genetic diseases, therapeutic targets and drug discovery, genotyping and biotechnologies in general. Another is the management of genetic databases, where the software could enable quality assessment and automation.
By mid-2005, the partners will have a commercial product for biology labs, adding a specialised bio-pack to the original software. This pack will integrate the project’s major results, including the ability to access data from different databases and to upgrade the platform.
Tara Morris | alfa