STEP 2

In this step, each dataset is checked individually for possible normalization issues and batch effects. Batch effects are assessed by testing each sample's distribution for normality with the Shapiro-Wilk test and by checking whether the center of each distribution is similar to those of the other samples. Most datasets did not require any additional normalization (the pre-processing steps used in the respective publications are described in Supplementary Table S1). Only Geiger et al. (2012) and Guo et al. (2012) required additional quantile normalization to account for slight deviations of individual samples from a normal distribution.
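The center check and the quantile normalization described above could look roughly like the following. This is a minimal standard-library sketch, not the code in wp_step2_code.py: the function names and the toy data are hypothetical, every sample is assumed to have the same number of values with no missing data, and ties are broken by original order. The Shapiro-Wilk test is not reimplemented here; SciPy's scipy.stats.shapiro could be applied to each sample for that part.

```python
from statistics import mean, median


def sample_medians(samples):
    """Return the median of each sample, for comparing distribution centers."""
    return {name: median(values) for name, values in samples.items()}


def quantile_normalize(samples):
    """Map every sample onto a shared reference distribution.

    The reference is the mean of the k-th smallest value across samples.
    Assumes equal-length samples without missing values (a simplification).
    """
    names = list(samples)
    n = len(samples[names[0]])
    if any(len(samples[name]) != n for name in names):
        raise ValueError("all samples must have the same length")

    # Reference distribution: rank-wise mean over the sorted samples.
    sorted_cols = [sorted(samples[name]) for name in names]
    reference = [mean(col[k] for col in sorted_cols) for k in range(n)]

    normalized = {}
    for name in names:
        values = samples[name]
        # Indices of the values ordered from smallest to largest.
        order = sorted(range(n), key=lambda i: values[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized[name] = out
    return normalized


# Hypothetical toy data: three samples with four measurements each.
samples = {
    "sample_a": [5.0, 2.0, 3.0, 4.0],
    "sample_b": [4.0, 1.0, 4.5, 2.0],
    "sample_c": [3.0, 4.0, 6.0, 8.0],
}
normalized = quantile_normalize(samples)
print(sample_medians(samples))
print(sample_medians(normalized))
```

After quantile normalization all samples share the same set of values, so their medians coincide by construction; comparing the medians before and after is a quick sanity check.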

This step additionally involves normalizing the abundances of complex-associated subunits to the trimmed mean of the complexes (as previously described in Ori et al., 2013; Ori et al., 2016). Briefly, proteins belonging to the same complex were normalized by the respective trimmed mean (or interquartile mean) of the complex subunits across all individuals/samples. For proteins involved in multiple complexes, the average of the values from all corresponding complexes was used. From the complex-normalized abundances, the variance of each subunit in a given complex was then calculated. The output directory of this step summarizes these results.
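The complex-based normalization can be illustrated with a short sketch. It makes several assumptions that the description above does not confirm: abundances are on a log scale, so normalization is done by subtraction; the interquartile mean is computed per sample over the subunits of a complex; trimming drops floor(n/4) values from each end; and the variance is the sample variance across samples. All names and values are hypothetical.

```python
from statistics import mean, variance


def interquartile_mean(values):
    """Mean after dropping the lowest and highest quarter of the values.

    Simplified trimming: floor(n / 4) values are removed from each end.
    """
    ordered = sorted(values)
    k = len(ordered) // 4
    return mean(ordered[k:len(ordered) - k])


def normalize_to_complexes(abundance, complexes):
    """Normalize each subunit to the interquartile mean of its complex.

    abundance: {protein: [log-abundance per sample]} (assumed log scale)
    complexes: {complex_name: [subunit proteins]}
    A protein found in several complexes receives the average of its
    per-complex normalized values.
    """
    n_samples = len(next(iter(abundance.values())))
    per_protein = {}  # protein -> list of per-complex normalized profiles
    for subunits in complexes.values():
        members = [p for p in subunits if p in abundance]
        if not members:
            continue
        # One complex center per sample.
        centers = [
            interquartile_mean([abundance[p][s] for p in members])
            for s in range(n_samples)
        ]
        for p in members:
            profile = [abundance[p][s] - centers[s] for s in range(n_samples)]
            per_protein.setdefault(p, []).append(profile)
    # Average the profiles of proteins that belong to multiple complexes.
    return {
        p: [mean(column) for column in zip(*profiles)]
        for p, profiles in per_protein.items()
    }


def subunit_variances(normalized):
    """Sample variance of each subunit's normalized abundance across samples."""
    return {p: variance(values) for p, values in normalized.items()}


# Hypothetical toy data: five proteins measured in three samples.
abundance = {
    "P1": [10.0, 11.0, 12.0],
    "P2": [10.0, 11.0, 12.0],
    "P3": [12.0, 13.0, 14.0],
    "P4": [12.0, 13.0, 17.0],
    "P5": [14.0, 15.0, 15.0],
}
complexes = {
    "complex_1": ["P1", "P2", "P3", "P4"],
    "complex_2": ["P4", "P5"],  # P4 is shared between both complexes
}
normalized = normalize_to_complexes(abundance, complexes)
variances = subunit_variances(normalized)
print(normalized)
print(variances)
```

In this toy example P1 tracks its complex exactly and therefore has zero variance, while P4 deviates in the third sample and shows the largest variance. If the abundances were on a linear scale, dividing by the complex center instead of subtracting it would be the analogous operation.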


wp_step2_code.py

Python code for checking the normalization of the datasets and for the complex-based normalization.


complex_filtered.zip (41MB)

Result files after filtering for complex-related proteins only.

complex_stoichiometry.zip (52.6MB)

Result files after filtering for complex-related proteins, with abundances normalized to the respective complex.