If you are working with cancer genomics, we have some good news!
The Genomic Data Commons (GDC) (https://gdc.nci.nih.gov), which launches on June 1st, 2016, will standardise and reanalyse a large collection of public cancer datasets, including TCGA (The Cancer Genome Atlas) for adult cancers, TARGET (Therapeutically Applicable Research to Generate Effective Treatments) for childhood cancers, and more. You will be able to query the reprocessed data across the different datasets, download it in its reprocessed form, and access it programmatically via an API.
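To give a feel for what programmatic access might look like, here is a minimal sketch in Python that only builds a query URL for listing projects in the TCGA and TARGET programs; it does not contact any server. The base URL, the `/projects` endpoint, the field names and the filter layout are assumptions based on the GDC API's general conventions rather than anything stated in this post, and `build_projects_query` is a hypothetical helper, so please check the official GDC API documentation before relying on them.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs

# Assumption: base URL of the GDC API (not given in this post).
GDC_API_BASE = "https://api.gdc.cancer.gov"


def build_projects_query(programs, size=10):
    """Hypothetical helper: build a URL that lists projects for the given programs."""
    # Assumption: filters are a JSON object with an operator and a content block.
    filters = {
        "op": "in",
        "content": {"field": "program.name", "value": list(programs)},
    }
    params = {
        "filters": json.dumps(filters),
        # Assumption: these field names exist on the projects endpoint.
        "fields": "project_id,name,program.name",
        "size": str(size),
    }
    # urlencode percent-encodes the JSON so it is safe in a query string.
    return f"{GDC_API_BASE}/projects?{urlencode(params)}"


if __name__ == "__main__":
    print(build_projects_query(["TCGA", "TARGET"]))
```

The resulting URL could then be passed to any HTTP client. Keeping URL construction separate from the network call makes the query easy to inspect and adjust once the real API is documented.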
You will be able to access the GDC with your dbGaP account, so although you will still need to go through the application process (if you have not done so already) to access these datasets, you will benefit from direct access to all the NCI cancer data in a one-stop shop.
The GDC will not only save you the time it takes to download all this data (approximately 2 months!), but also the time spent reformatting all the files to fit the same standard and curating all the phenotype information to follow the same syntax and ontologies. Future cancer datasets will be added to the GDC as they become available. For instance, the Exceptional Responders study and other smaller studies are already lined up to be included.
The launch of the GDC is a welcome innovation for the whole cancer research community, and it will cure a long-standing headache for many cancer research teams around the world. The cancer patients who have participated in these studies will also be happy to know that the data they have contributed will be used more effectively to speed up cancer research.