Experimental Data Parsers

clinicalParser.py

parser(projectId, type='clinical')
project_parser(projectId, config, directory)
experimental_design_parser(projectId, config, directory)
clinical_parser(projectId, config, clinical_directory)
parse_dataset(projectId, configuration, dataDir, key='project')

This function parses clinical data from the subjects in the project.
Input: URI of the clinical data file. Format: subjects as rows, clinical variables as columns.
Output: pandas DataFrame in the same format as the input, but with the clinical variables mapped to the right ontology (defined in config), i.e. type = -40 -> SNOMED CT.
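The mapping step described above can be illustrated with a minimal sketch. It is not the parser's actual implementation: the configuration keys (`subject_column`, `ontology_mapping`), the helper name `map_clinical_variables`, and the placeholder codes `CODE_A` and `CODE_B` are all assumptions made for illustration, and the real config may be organised quite differently.

```python
import pandas as pd

# Hypothetical configuration. The key names and the codes are invented for
# this example; they are NOT taken from the real parser configuration.
CLINICAL_EXAMPLE_CONFIG = {
    "subject_column": "subject id",
    "ontology_mapping": {
        "body mass index": "CODE_A",          # placeholder ontology code
        "systolic blood pressure": "CODE_B",  # placeholder ontology code
    },
}


def map_clinical_variables(clinical_df, config):
    """Rename clinical variable columns to ontology codes.

    Returns the renamed copy plus the list of variable columns that have
    no entry in the mapping, so they can be reviewed rather than lost.
    """
    mapping = config["ontology_mapping"]
    unmapped = [
        column
        for column in clinical_df.columns
        if column != config["subject_column"] and column not in mapping
    ]
    # rename() leaves columns without a mapping entry untouched.
    mapped = clinical_df.rename(columns=mapping)
    return mapped, unmapped


# Subjects as rows, clinical variables as columns (same layout as the input file).
clinical = pd.DataFrame(
    {
        "subject id": ["S1", "S2"],
        "body mass index": [22.4, 27.1],
        "smoker": ["no", "yes"],
    }
)

mapped, unmapped = map_clinical_variables(clinical, CLINICAL_EXAMPLE_CONFIG)
```

The output keeps the input layout, which matches the description above: only the column labels change. Returning the unmapped columns separately is a design choice of this sketch, not something the source documents.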

extract_project_info(project_data)
extract_responsible_rels(project_data, separator='|')
extract_participant_rels(project_data, separator='|')
extract_project_tissue_rels(project_data, separator='|')
extract_project_disease_rels(project_data, separator='|')
extract_project_intervention_rels(project_data, separator='|')
extract_project_rels(project_data, separator='|')
extract_timepoints(project_data, separator='|')
extract_project_subject_rels(projectId, design_data)
extract_subject_identifiers(design_data)
extract_biosample_identifiers(design_data)
extract_analytical_sample_identifiers(design_data)
extract_biological_sample_subject_rels(design_data)
extract_biological_sample_analytical_sample_rels(design_data)
extract_biological_samples_info(clinical_data)
extract_analytical_samples_info(data)
extract_biosample_analytical_sample_relationship_attributes(clinical_data)
extract_biological_sample_timepoint_rels(clinical_data)
extract_biological_sample_tissue_rels(clinical_data)
extract_subject_disease_rels(clinical_data, separator='|')
extract_subject_intervention_rels(clinical_data, separator='|')
extract_biological_sample_clinical_variables_rels(clinical_data)

proteomicsParser.py

wesParser.py

parser(projectId)
parseWESDataset(projectId, configuration, dataDir)
loadWESDataset(uri, configuration)

This function gets the molecular data from a Whole Exome Sequencing (WES) experiment.
Input: URI of the processed file resulting from the WES analysis pipeline, that is, the Annovar-annotated VCF file from Mutect (sampleID_mutect_annovar.vcf).
Output: pandas DataFrame with the columns and filters defined in config.py.
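As a rough illustration of "columns and filters defined in config", the sketch below reads a small tab-separated variant table and applies a configuration to it. Everything here is an assumption: the configuration keys (`columns`, `filters`), the helper name `load_annotated_variants`, and the example rows are invented, and a plain tab-separated table stands in for the annotated VCF so that the example stays self-contained. It should not be read as how loadWESDataset is implemented.

```python
import io

import pandas as pd

# Hypothetical configuration. The real config.py is not shown in this
# document, so these key names and values are assumptions.
WES_EXAMPLE_CONFIG = {
    "columns": ["Chr", "Start", "Ref", "Alt", "Gene.refGene"],
    "filters": {"Func.refGene": ["exonic", "splicing"]},
}

# Made-up example rows standing in for an annotated variant table.
EXAMPLE_TABLE = (
    "Chr\tStart\tRef\tAlt\tFunc.refGene\tGene.refGene\n"
    "chr1\t12345\tA\tG\texonic\tGENE1\n"
    "chr2\t67890\tC\tT\tintronic\tGENE2\n"
    "chr3\t13579\tG\tA\tsplicing\tGENE3\n"
)


def load_annotated_variants(handle, config):
    """Read a tab-separated variant table, then filter rows and select columns."""
    data = pd.read_csv(handle, sep="\t", dtype=str)
    # Keep only rows whose value in each filter column is in the allowed list.
    for column, allowed_values in config["filters"].items():
        data = data[data[column].isin(allowed_values)]
    # Keep only the configured columns, in the configured order.
    return data[config["columns"]].reset_index(drop=True)


variants = load_annotated_variants(io.StringIO(EXAMPLE_TABLE), WES_EXAMPLE_CONFIG)
```

Filtering before selecting columns lets a filter use a column (here `Func.refGene`) that is not kept in the output.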

extractWESRelationships(data, configuration)