Uploading Sample and Data Relationship Format for Proteomics (SDRF-Proteomics) to CKG as Experimental Design and Clinical Data.

Here, we show how to upload a file in the SDRF-Proteomics standard format to CKG by converting it into the experimental design and clinical data that CKG requires.

[1]:
from ckg.graphdb_builder import builder_utils

Read SDRF file and convert it to CKG format

[6]:
# Convert the example SDRF file into a CKG-formatted DataFrame
df = builder_utils.convert_sdrf_file_to_ckg('../assets/example.sdrf')
[7]:
df.head()
[7]:
  | subject external_id | biological_sample external_id | analytical_sample external_id | grouping1 | tissue | Alkaline phosphatase measurement (88810008) | Aspartate aminotransferase measurement (45896001) | Bilirubin level (302787001) | Body mass index (60621009) | Fasting blood glucose level (271062006) | Low density lipoprotein cholesterol measurement (113079009) | Alanine aminotransferase measurement (34608000) | Waist circumference (276361009)
0 | 31 | 31 | 31_C6 | Healthy | blood plasma | 54.0 | 30.0 | 15.0 | 27.774423 | 5.07 | 2.1 | 24.0 | 108.0
1 | 32 | 32 | 32_C7 | Healthy | blood plasma | 27.0 | 28.0 | 17.0 | 28.727377 | 6.09 | 4.3 | 27.0 | 108.0
2 | 33 | 33 | 33_C8 | Healthy | blood plasma | 69.0 | 21.0 | 9.0 | 28.841532 | 4.93 | 4.1 | 18.0 | 90.0
3 | 34 | 34 | 34_C9 | Healthy | blood plasma | 101.0 | 26.0 | 12.0 | 42.056933 | 5.33 | 4.8 | 22.0 | 134.0
4 | 35 | 35 | 35_C10 | Healthy | blood plasma | 61.0 | 25.0 | 8.0 | 29.434851 | 4.80 | 3.9 | 18.0 | 102.0
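
Before going further, it can be worth confirming that the conversion produced the identifier and grouping columns seen in the output above. The following is a minimal sketch only: `missing_columns` and `REQUIRED_COLS` are hypothetical names introduced here for illustration, not part of CKG, and the column names are copied from the output above rather than from any CKG specification.

```python
# Columns we expect after conversion (assumption: taken from the output above).
REQUIRED_COLS = [
    'subject external_id',
    'biological_sample external_id',
    'analytical_sample external_id',
    'grouping1',
]


def missing_columns(columns, required=REQUIRED_COLS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [col for col in required if col not in present]


# Stand-in for list(df.columns); abbreviated to a few of the names shown above.
example_columns = REQUIRED_COLS + ['tissue', 'Body mass index (60621009)']

missing = missing_columns(example_columns)
if missing:
    print('Missing columns:', missing)
else:
    print('All expected columns are present.')
```

With the converted table, the same check would be `missing_columns(df.columns)`.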

Extract Experimental Design

[8]:
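# Identifier and grouping columns that make up the experimental design
experimental_design_cols = ['subject external_id', 'biological_sample external_id', 'analytical_sample external_id', 'grouping1']
# Keep only those columns from the converted table
experimental_design_data = df[experimental_design_cols]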

experimental_design_data.head()
[8]:
  | subject external_id | biological_sample external_id | analytical_sample external_id | grouping1
0 | 31 | 31 | 31_C6 | Healthy
1 | 32 | 32 | 32_C7 | Healthy
2 | 33 | 33 | 33_C8 | Healthy
3 | 34 | 34 | 34_C9 | Healthy
4 | 35 | 35 | 35_C10 | Healthy
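
The columns left over after selecting the experimental design (tissue and the clinical measurements) could be picked out in a similar way. This is a sketch under assumptions: `clinical_columns` and `DESIGN_COLS` are hypothetical names, not part of CKG, and whether CKG expects the identifier columns to be repeated in the clinical data is not confirmed here.

```python
# Columns already used for the experimental design.
DESIGN_COLS = [
    'subject external_id',
    'biological_sample external_id',
    'analytical_sample external_id',
    'grouping1',
]


def clinical_columns(columns, design_cols=DESIGN_COLS):
    """Return the columns not used by the experimental design, in original order."""
    excluded = set(design_cols)
    return [col for col in columns if col not in excluded]


# Stand-in for list(df.columns); abbreviated to a few of the names in the converted table.
example_columns = DESIGN_COLS + [
    'tissue',
    'Body mass index (60621009)',
    'Waist circumference (276361009)',
]

clinical_cols = clinical_columns(example_columns)
print(clinical_cols)
```

With the converted table, `df[clinical_columns(df.columns)]` would select the remaining columns. To keep the identifiers alongside them (an assumption about what is needed downstream), prepend the three `external_id` column names to the list before indexing.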