Islandinfo: Mauritius in your hands

11 Feb 2011
High-energy physicists try to preserve and reuse data

It may seem odd that particle physicists would ever want to look back at decades-old experiments as they forge ahead with newer, bigger hardware. However, with updated theories and perspectives, physicists can extract new results from old data. Siegfried Bethke, the head of the Max Planck Institute for Physics in Munich, Germany, managed to publish over a dozen papers when he reexamined data from his days as a young physicist at DESY, a high-energy physics lab in Germany.

Bethke participated in an experiment called JADE that finished in 1986. Since then, physicists have learned more about the strong coupling strength, a phenomenon better studied at JADE's lower collision energies than at those of modern colliders. Old JADE data thus let Bethke glean important information about the strong interactions that bind quarks and gluons together into hadrons (the constituents of atomic nuclei, such as protons and neutrons).

While the end result was great, Bethke had a tough time obtaining and using the old data, which was disorganized and scattered from Heidelberg to Tokyo. One data set existed only as printed ASCII text. Beyond data rescue, software was another issue: modern computers no longer run code written in languages like MORTRAN, so a graduate student spent a year recreating the old code before the data could be fully analyzed.

Bethke’s story highlights a significant problem in the field of high-energy physics and serves as a cautionary tale for current work. For example, each experiment performed at CERN’s Large Hadron Collider (LHC) is a costly affair that produces large amounts of data. That data will be carefully collected and analyzed by a team of scientists. However, if LHC’s scientists follow the trend in particle physics, their data will become orphaned once they’ve completed their work, meaning it will be extremely difficult for future generations to reuse data from experiments that cannot be readily reproduced.

CERN scientists and researchers from several other facilities have banded together to preserve data by creating DPHEP (Data Preservation in High Energy Physics). DPHEP recommends that research budgets provide for a data archivist position; the archivist would preserve data along with the key supplementary information needed to interpret it and put it in perspective for future generations. The group also recommends virtualized software that simulates today's computers, so that whatever programs current physicists use for their data workup can still run long after present technology expires.

Overlapping astronomical and medical data analysis tools

Medicine and astronomy are completely different fields, but scientists have found that they can share data visualization tools. Collaborations between astronomers and medical experts have benefited both sides, and two particular examples are showcased in the same issue of Science.

In one case, Alyssa Goodman, an astronomer at Harvard University, needed a way to visualize data obtained by surveying star-forming regions in 3D. Radiologist Michael Halle of Brigham and Women’s Hospital in Boston had developed 3D Slicer, visualization software that builds 3D images from MRIs and other medical scans.

Goodman and Halle collaborated to adapt 3D Slicer for astronomical data. The adaptation added a third dimension, velocity, to her 2D images. As a result, astronomers can more easily spot events like the jets of gas ejected from newborn stars. While using 3D Slicer, they have also developed algorithms that could apply to medicine, such as visualizing coronary arteries.
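The idea of treating velocity as a third axis can be illustrated with a small sketch using synthetic data (this is an illustration of the concept only, not the actual 3D Slicer pipeline): each sky pixel carries an intensity per velocity channel, and keeping that third axis separates structures that a flat 2D image would blend together.

```python
import numpy as np

# Synthetic position-position-velocity (PPV) cube: for each sky pixel (x, y)
# we record emission intensity across velocity channels, as a survey of a
# star-forming region would.
nx, ny, nv = 32, 32, 16
cube = np.zeros((nx, ny, nv))

# A hypothetical "jet": emission along a line on the sky whose velocity
# channel shifts with position, mimicking gas ejected from a newborn star.
for i in range(nx):
    cube[i, i % ny, (i * nv) // nx] = 1.0

# Collapsing over velocity recovers the flat 2D image ...
image_2d = cube.sum(axis=2)

# ... while keeping the velocity axis lets a 3D viewer separate structures
# that overlap on the sky but move at different speeds.
voxels = np.argwhere(cube > 0.5)   # (x, y, v) coordinates of bright voxels
print(voxels.shape)
```

In a 3D viewer such as 3D Slicer, those (x, y, v) voxels would be rendered as a volume; the collapsed `image_2d` shows what is lost when the velocity axis is discarded.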

In another case, astronomers applied image-analysis software used for picking out details from large telescope surveys to find cancerous cells dispersed among healthy ones. Current methods for cancer cell detection include staining tissues for certain biomarkers, followed by visually inspecting the stained tissue for cancer cells based on the intensity and abundance of staining.

Human inspection of stained biomarkers is time consuming and subjective, so there is substantial room for error. Nicholas Walton at the University of Cambridge applies computer algorithms, normally used for finding galaxies and stars, to spotting cancerous cells in a project called PathGrid.
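The astronomy-style approach Walton applies can be sketched in miniature: threshold an image well above its background, then group connected bright pixels into labelled "sources", exactly as a galaxy catalogue pipeline would. This is a hedged illustration with synthetic data, not the actual PathGrid code.

```python
import numpy as np
from scipy import ndimage

# Synthetic "stained tissue" image: a low, noisy background plus a few
# bright blobs standing in for strongly stained cells.
rng = np.random.default_rng(0)
image = rng.normal(loc=0.1, scale=0.02, size=(64, 64))
for cx, cy in [(10, 12), (30, 40), (50, 20)]:
    image[cx - 2:cx + 3, cy - 2:cy + 3] += 1.0   # a 5x5 bright blob

# Astronomy-style source extraction: threshold well above background,
# then label connected bright pixels as individual "sources".
threshold = image.mean() + 5 * image.std()
mask = image > threshold
labels, n_sources = ndimage.label(mask)

# Per-source statistics (here, pixel area), as a source catalogue would list.
areas = ndimage.sum(mask, labels, index=range(1, n_sources + 1))
print(n_sources, list(areas))
```

Replacing visual inspection with this kind of fixed threshold-and-label rule is what removes the human subjectivity the text describes: the same image always yields the same catalogue of detections.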

PathGrid can screen hundreds of images for a particular biomarker in just a few minutes, whereas it would take a human pathologist a few hours. PathGrid should also be more accurate, as it removes human subjectivity from the process. The faster screening would also speed up the validation process for new biomarkers, enabling them to go into clinical practice more rapidly. Walton hopes that hospitals will adopt automated screening like PathGrid within three years.

Public competition for data analysis

The scientist who collects data may not be the best person to analyze it. That’s the premise of Kaggle, a company founded by Anthony Goldbloom. Kaggle charges companies or individuals a fee to run data prediction competitions. The organizer provides the data to be analyzed, and competitors use it to build models that predict particular outcomes. Competitions are open to the public, and signing up is free.

Kaggle has run just over a dozen competitions so far, with modest prizes of $150 to $25,000, but each of its competitions has resulted in a better prediction model than the one that was originally in place. For example, in a three-month competition for predicting how well a patient with HIV might respond to a cocktail of antiretroviral drugs based on the patient’s DNA sequences, the Kaggle winner came up with a model that is 78 percent accurate. The best model before that had 70 percent accuracy.
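The competition mechanics reduce to a simple scoring loop: the organizer withholds the true outcomes and ranks each submission by its prediction accuracy against them. A minimal sketch, in which every label and number is invented for illustration:

```python
# Hypothetical scoring routine of the kind a prediction contest uses:
# the organizer keeps the true outcomes hidden and ranks submissions
# by accuracy against them. All data here is illustrative only.

def accuracy(predictions, truth):
    """Fraction of cases predicted correctly."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)

# Hidden test labels (e.g. did the patient respond to the drug cocktail?)
truth = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]

baseline = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]   # incumbent model's predictions
entry    = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]   # a competitor's predictions

print(accuracy(baseline, truth), accuracy(entry, truth))
```

Because the truth labels stay on the organizer's side, competitors improve their models only through the accuracy scores they receive back, which is what makes the open-competition format workable.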

Kaggle’s popularity in the science community will depend on researchers' willingness to expose their data to the public. According to Goldbloom, most researchers refuse to participate in Kaggle and almost have a "visceral reaction" to the idea. Opening data analysis to the public, where people are more likely to think outside the box of a given field, can benefit many research projects, but concerns about the security and privacy of their data could keep scientists from giving it a try.
