Yeast Proteome Dynamics from Single Cell Imaging and Automated Analysis


Resource


Authors Yolanda T. Chong, Judice L.Y. Koh, ..., Charles Boone, Brenda J. Andrews

Correspondence [email protected]

In Brief High-content, high-throughput imaging and automated analysis enable quantification of the abundance and localization of the GFP-tagged ORFeome in yeast, including how protein dynamics shift over time and respond to different perturbations.

Highlights

• Combined SGA technology, high-throughput imaging, and automated analysis using SVM
• Quantified in vivo abundance and localization of 4,100 GFP fusion proteins
• Identified proteome changes over time in rapamycin, hydroxyurea, and rpd3Δ strain
• Images and abundance and localization data for all screens in CYCLoPs database

Chong et al., 2015, Cell 161, 1413–1424, June 4, 2015 ©2015 Elsevier Inc. http://dx.doi.org/10.1016/j.cell.2015.04.051

Yolanda T. Chong,1,4,6 Judice L.Y. Koh,1,4,7 Helena Friesen,1 Kaluarachchi Duffy,1,2 Michael J. Cox,1,2 Alan Moses,3 Jason Moffat,1,2,5 Charles Boone,1,2,5 and Brenda J. Andrews1,2,5,*

1The Donnelly Centre
2Department of Molecular Genetics
3Department of Cell and Systems Biology
University of Toronto, Toronto, ON M5S3E1, Canada
4Co-first author
5Co-senior author
6Present address: Cellular Pharmacology, Discovery Sciences, Janssen Pharmaceutical Companies, Johnson & Johnson, 30 Turnhoutseweg, Beerse 2340, Belgium
7Present address: Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, #02-01 Genome, Singapore 138672, Singapore
*Correspondence: [email protected]

SUMMARY

Proteomics has proved invaluable in generating large-scale quantitative data; however, the development of systems approaches for examining the proteome in vivo has lagged behind. To evaluate protein abundance and localization on a proteome scale, we exploited the yeast GFP-fusion collection in a pipeline combining automated genetics, high-throughput microscopy, and computational feature analysis. We developed an ensemble of binary classifiers to generate localization data from single-cell measurements and constructed maps of approximately 3,000 proteins connected to 16 localization classes. To survey proteome dynamics in response to different chemical and genetic stimuli, we measured proteome-wide abundance and localization and identified changes over time. We analyzed >20 million cells to identify dynamic proteins that redistribute among multiple localizations in hydroxyurea, rapamycin, and in an rpd3Δ background. Because our localization and abundance data are quantitative, they provide the opportunity for many types of comparative studies, single cell analyses, modeling, and prediction.

INTRODUCTION

The regulation of most biological processes involves changes in protein abundance and localization. However, because of an absence of quantitative data describing global protein localization and abundance, our understanding of proteome responses to perturbations remains rudimentary. Methods for rapid acquisition of in vivo quantitative data about the dynamic proteome are needed to enable the identification of members of protein complexes and coregulated proteins, give global information about post-translational modifications, protein stability and kinetics, and suggest dependency relationships.

One of the most comprehensive reagent sets available for proteome-wide surveys of cell biological phenotypes such as protein localization is the budding yeast open reading frame (ORF)-GFP collection (Huh et al., 2003). Each strain in the ORF-GFP collection carries a unique fusion gene construct in which an ORF is fused to the GFP gene, generating a full-length protein with a COOH-terminus GFP fusion, whose expression is driven by the endogenous ORF promoter. The ORF-GFP collection contains 4,100 strains (of a possible 5,797; Saccharomyces Genome Database http://www.yeastgenome.org/) that gave a GFP signal above background in standard growth conditions, and for the majority of proteins, the GFP moiety appears to have little effect on protein function (Huh et al., 2003).

The ORF-GFP fusion collection has been analyzed using widefield microscopy and manual inspection to assign 70% of the proteome to subcellular locations (Huh et al., 2003). Additional studies have examined the collection in various conditions and several have quantified abundance of the GFP-tagged proteins (Benanti et al., 2007; Breker et al., 2013; Mazumder et al., 2013), but all have relied on manual assessment of localization, which is non-quantitative and suboptimal since assignments made by different individuals can be subject to biases and often show poor agreement (for review, see Chong et al., 2012).

A few recent studies have used automated approaches to analyze protein localization in yeast. Unsupervised clustering of subcellular localization patterns identified colocalized proteins (Handfield et al., 2013; Loo et al., 2014) and a classification approach assessed a mixture of spatial patterns to identify changes in protein localization in a microchemostat array (Dénervaud et al., 2013). However, while these approaches are information-rich, they have not attempted to computationally assign a specific localization for each protein and thus are of limited use to biologists. Here, we describe a high-content screening and machine learning approach to measure protein abundance and localization changes in a systematic and quantitative fashion on a genome scale.

RESULTS AND DISCUSSION

Automated Assessment of Protein Abundance and Localization

Our goal was to create an automated platform to quantify the abundance and localization of the 4,100 visible fusion proteins in the ORF-GFP collection. To do this efficiently, we used yeast synthetic genetic array (SGA) technology (Tong et al., 2001) to introduce a cytosolic red fluorescent protein (RFP), a marker of cell boundaries, into the ORF-GFP array (a list of all strains assayed is shown in Table S1). SGA technology allows introduction of any mutation of interest into the array, thus enabling us to combine high-throughput genetics and imaging. Our experimental pipeline also includes high-throughput microscopy, automated image analysis, and pattern classification through machine learning.

To develop and validate our screening protocol, we imaged our wild-type collections during early log-phase growth, capturing data for >200 cells on average for each individual strain (3 million cells total, n = 3). Cell boundaries were defined in the red channel and features from both the red and the green channel were extracted for each cell using CellProfiler (Carpenter et al., 2006) (Experimental Procedures; Supplemental Experimental Procedures; CellProfiler pipeline available at http://cyclops.ccbr.utoronto.ca/DOWNLOAD/Download.html). As an initial benchmark of the quality of our data, we extrapolated protein abundance from the mean GFP intensity (Ig) for each protein in the collection (Table S1A–S1D).
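
As a rough illustration of the benchmark described above, the sketch below averages single-cell GFP intensities into one Ig value per strain and reports it on a log2 scale. The input layout (pairs of strain name and per-cell intensity), the function names, and all example numbers are assumptions made for illustration; they are not taken from the CellProfiler pipeline or from Table S1. Pearson correlation is likewise an assumed choice for comparing per-protein means between data sets.

```python
import math
from collections import defaultdict


def mean_intensity_per_strain(cells):
    """Average per-cell GFP intensity into one Ig value per strain.

    `cells` is an iterable of (strain, gfp_intensity) pairs, one per
    segmented cell (hypothetical input layout).
    """
    per_strain = defaultdict(list)
    for strain, intensity in cells:
        per_strain[strain].append(intensity)
    return {s: sum(v) / len(v) for s, v in per_strain.items()}


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up intensities for two strains, in arbitrary units.
cells = [("UTP4", 0.0012), ("UTP4", 0.0016), ("GLE1", 0.0005), ("GLE1", 0.0007)]
ig = mean_intensity_per_strain(cells)
log2_ig = {strain: math.log2(value) for strain, value in ig.items()}
print(log2_ig)

# Made-up per-protein means from two replicates, compared with r.
replicate_1 = [0.0014, 0.0006, 0.0031]
replicate_2 = [0.0015, 0.0006, 0.0029]
print(round(pearson_r(replicate_1, replicate_2), 3))
```

Working on the log2 scale mirrors the Log2(Ig) axis used in the figures; the averaging step itself is independent of that choice.
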
Our measurements agreed with other studies that estimated protein abundance in strains from the original ORF-GFP collection using flow cytometry (r ≈ 0.9 for 2,172 proteins; Figure S1A) (Newman et al., 2006), western blot analysis of TAP-tagged strains (r ≈ 0.64 for 2,595 proteins; Figure S1B) (Ghaemmaghami et al., 2003), and mass spectrometry (r ≈ 0.66 for 2,684 proteins; Figure S1C) (Kulak et al., 2014), and GFP intensities across all of our biological replicates were highly correlated (r > 0.98; Figure S1D). These findings validate our image acquisition and feature extraction pipelines for studying global protein abundance at the single cell level.

To extract information about protein localization patterns from our images, we adopted a machine learning approach. Individual cells were segmented based on their cytosolic RFP, and texture measurements from both the RFP and GFP channels were extracted. These measurements, from a representative subset of strains known to localize to single subcellular compartments (Huh et al., 2003), were used to generate training sets for automated classification of remaining proteins. To discriminate among multiple compartments, we found that we needed an ensemble of 60 binary classifiers, which we developed from training sets that comprised >70,000 cell instances hand-picked from images from one of our wild-type screens (Data S1; Supplemental Experimental Procedures). We were able to computationally distinguish among 16 subcellular compartments (Supplemental Experimental Procedures); examples of cells with an ORF-GFP corresponding to each localization are shown in Figure 1A. The classifiers were used to generate a set of 16 quantitative localization scores (LOC scores) for each protein that reflects the proportion of cells assigned to each of the 16 defined subcellular compartments under a given condition (Experimental Procedures; Table S2A).

We assessed the performance of our classifiers in several ways. First, we compared our computationally derived pattern assignments to visually assigned localization annotations (Huh et al., 2003) and found 94% agreement among the set of 1,097 proteins assigned to a single compartment by both methods (Figure S1E, upper). Second, we assessed localization of proteins assigned to one or more of seven major localization classes covering 91% of the proteome by comparison to the manual assessment of the yeast ORF-GFP collection (Huh et al., 2003) and found that our computational approach achieved >70% in recall and precision (Figures S1E, lower, and S1F).

We explored instances in which our classifier ensemble incorrectly assigned a localization. We found that misclassified proteins tended to be of lower abundance (median Ig for correctly classified proteins = 1.1 × 10⁻³, median Ig for incorrectly classified proteins = 6.6 × 10⁻⁴; p < 2.2 × 10⁻¹⁶, Wilcoxon rank sum test). Most false positives also reflected mis-assignment of localizations to compartments that look highly similar. For example, the vacuole and nucleus are both large round compartments and our computational approach misclassified a fraction of vacuolar proteins as localizing to the nucleus or to both the nucleus and the vacuole (35% of the proteins assigned to the vacuole in Huh et al., 2003) (Figure S1E, lower). In contrast, our automated imaging approach had high precision for the nucleus (70%), with only 0.7% of our nuclear proteins assigned to the vacuole in Huh et al. (2003). Likewise, small punctate compartments are difficult to distinguish computationally and we observed occasional proteins misclassified as cortical patches, spindle pole, peroxisome, or nucleolus.
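
The LOC score defined above is a per-protein proportion, and the comparison with manual annotations is a precision and recall calculation. The sketch below shows both under stated assumptions: each cell is taken to have already received a single compartment label (the ensemble of binary classifiers itself is not reproduced here), and the 0.25 cutoff used to turn scores into assignments is a hypothetical value, not the authors' threshold. Function names are illustrative.

```python
from collections import Counter

# The 16 localization classes, named as in Figure 1A.
COMPARTMENTS = [
    "Bud", "Budneck", "Budsite", "Cell Periphery", "Cortical Patches",
    "Cytoplasm", "ER", "Endosome", "Golgi", "Mitochondria",
    "Nuclear Periphery", "Nucleolus", "Nucleus", "Peroxisomes",
    "Spindle Pole", "Vacuole/Vac Memb",
]


def loc_scores(cell_labels):
    """Fraction of a strain's cells assigned to each compartment."""
    if not cell_labels:
        raise ValueError("no classified cells for this strain")
    counts = Counter(cell_labels)
    total = len(cell_labels)
    return {c: counts.get(c, 0) / total for c in COMPARTMENTS}


def assign(scores, threshold=0.25):
    """Compartments whose LOC score reaches a (hypothetical) cutoff."""
    return {c for c, s in scores.items() if s >= threshold}


def precision_recall(predicted, reference):
    """Micro-averaged precision and recall over protein-compartment pairs.

    Both arguments map protein name -> set of compartments; only proteins
    present in `reference` are evaluated.
    """
    tp = fp = fn = 0
    for protein, ref in reference.items():
        pred = predicted.get(protein, set())
        tp += len(pred & ref)
        fp += len(pred - ref)
        fn += len(ref - pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: three of four cells called nuclear, one cytoplasmic.
scores = loc_scores(["Nucleus", "Nucleus", "Nucleus", "Cytoplasm"])
print(scores["Nucleus"], scores["Cytoplasm"], sorted(assign(scores)))
```

Keeping the scores as proportions, rather than collapsing each protein to a single label, is what allows a protein to be linked to more than one compartment.
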
The original visual assignments of the ORF-GFP collection were validated using secondary assays for 700 proteins whose localization was difficult to determine (Huh et al., 2003); our computational approach achieved a relatively high level of precision and recall based on a single experiment using automated analysis (Figures S1E and S1F). To display the results from our genome-wide screens in a way that would facilitate the visualization of functional information about the proteome, LOC scores, and protein abundance information were extrapolated from GFP signals to generate an ‘‘abundance localization map’’ or ALM (Figure 1B; ALMs 1–18 in Data S2). The wild-type ALM consists of 2,834 proteins (nodes) connected to one or more of 16 localization classes (hubs). Because of auto-fluorescence of yeast cells, our approach could not distinguish between cells with protein of very low abundance and a complete absence of the protein (see Experimental Procedures); 646 proteins with abundance below our threshold are not included in the ALM (Table S1) as well as 664 proteins that were below the threshold for significant LOC scores. In addition to localization information, absolute protein abundance measurements are represented in the ALM by node color (Figure 1B). A large cohort of proteins (1,029) localized to both the cytoplasm and the nucleus; in the ALM, these are divided into two groups: proteins whose localization is predominantly cytosolic and proteins whose localization is predominantly nuclear. By representing these data as an ALM, we are able to visualize proteins with a common localization. The expanded views of

A

Bud

Budneck

Budsite

Cell Periphery

Cortical Patches

Cytoplasm

ER

Endosome

Golgi

Mitochondria

Nuclear Periphery

Nucleolus

Nucleus

Peroxisomes

Spindle Pole

Vacuole/Vac Memb

B

Intensity Molecules Log2 (Ig) per cell

YKL075C GSG1

NAT3 VAM7

CEX1

TTI1

DPH2 YKL215C

YMR102C

RTG2

STB6

YIR014W

RTC5 MKS1

SLF1

SGN1

YBP1 TSR1

ACF4 BRR1

GDT1

PRP2 P

YDR239C

DOM34 SPT2

MUK1

PRP45

HDA1 SAL1

RMT2

NGR1

PRP40

YVH1

YGR203W

MSN2

CUE5

ATG16 QNS1

PCL7

ARD1 YOL098C

MUD1

CDC123

PTP3 APN1

HAP5

IPI1

BUL2

PIB1

CWC22

FIS1 RIM20

YAP3

KTR4

TAF8

AIM20

RPP2A

SPP41 CYK3

OCA1

GYL1 GIM5

MTR4

MGE1

PHO2

FKH2

FMN1

YDR266C

CLF1

BER1 KRE5 TRM9 RAV1

OPI10

BRN1

HOS3

TFB3 PDR17

SMC5

TFC3

YGR017W

SET1

SRB7 RPN14

YLR063W MET10

LYS14

SKP1

HAL9

RTR2 SDS22 RAD1

SCJ1

CAB2

SHG1

YPK2

UPF3

DDI1

GRX2 TYE7 SCC2

CKI1

UBP15

APC5

NTC20

Y U2 YJ

PPH22

VTS1

SPT20

HST2 MRP8

KOG1

EAF5

BDP1

UFD4 CTS2

SRB5

PHA2 DOT6

SKI8

PSK2 K

MDJ2

LSM8

GID8

JLP2 P

CAB4

HRK1

SNF5 ARP2

ECM21

HSF1

YTA6 TRM7

IRC24 HUA1

CDD1

MAF1

MED11

ZDS2 EPL1 ADD37

BST1 CRZ1

CDC36

HEM12

HEM13

SIC1

CDC7

TRM44

RTG1

MTC2

ERG28

HSM3

PRP9

HDA3

JHD2

SPT3

YRB2 ECM14

PUF4 DTD1 ECM25

PDR1

YPR022C

ISY1

AAC3 SNF4

SOM1

VID24

OTU2

MSB1

ORC6

HIF1 TAF3

MCD1

SPP382

RPR2 R

ATG18

KIC1

NOB1

PRS4 S TAH1

WHI4

REG1 TFC8

LSM12 RHB1 BUD14

PDC2

CUL3

URE2 LPP1

PIP2 BUD17

SUT1

SRX1

PLB3 SKS1

PRP3

VPS71

PML1

AFT1

DOG2

RTT101 YGR210C

TRM11

CCL1

RNR2

DFR1 YMR259C

PRP22

SUA5

UBX4

SIP1

GIM4 SNP1

NPY1

YNL040W GCD6

NYV1 SDS24 RPN10

DBF20 CSE2

ICY2

YAT2

KAP114

RKM3

OTU1 SNM1

STB4

GCD2

MAK3

SAP155 SRL1

AMD1 ELP2

XKS1

SLH1

SDS3 HFI1

YOL022C

DYN1

ROX1 YBP2

ECM17

YFL034W

PLP2

SIZ1

DCD1 HRR25 NMA1

SLT2

ECM32

YRB30

MPM1

PSK1

SRS2

KCS1

CDC23

GRE2

CNS1

MAK10

PAN3

K S1 KS

SOK1

GDS1 SHE2

YGL046W

PGM3 TFC1

LSM3

JJJ3 J

YJR116W

IWR1

POC4

NPC2

RPH1

YAP7

SPC110

TFB5 SYH1 RPS28B

BEM4

HSP42

GDB1

GLN3

SCP1

SOK2

MSH6

SYF1

XDJ1 PPH3

MET4

RPL16A

RGT1

NPL4

SNT1 RIB4 APM2

PBS2 S

RAD3

SAP185

YDC1

PMD1

PEX19 MMR1

RIM11

MOD5 RGM1

YGR250C

BNA3

RPS29A

CIN2

SDC1

YHC1

DOT5

BUL1

APC1

YPL009C NFT1 MEU1

SPE2 E

RPS0A 0

SET3

HUL5

NAS6 RAV2

RNT1

URK1

PRP21

CRH1

MHP1 MET18

DPB3 TEA1

TRM12

RIO2

NPR1

CSM1

MTQ2

IES4

CUP9

PYK2 SSN2

IMD2

HST3 BDF2 CDC25 RGA2

VPS72

FZF1 YPS7

YBR028C OCA2 CHS7

COX12

YTH1 MRI1

NAT5

Utp4

-9.5

299

Utp5

-9.2

413

Utp8

-9.1

418

Utp9

-9.2

408

Utp10

-8.7

630

Utp15

-9.8

224

Nan1

-9.1

415

Nup82

-9.6

274

Nup159

-9.5

312

Gle1

-10.7

75

HSH49

DEG1

RUB1

SMY2

UBA4

SWC5

SEN1

COX23

SSK2 K

YJR085C

TRL1

PRS3

PRP28

YCG1 PAP2

KIN1

SKN7

HDA2

MED2

SNF11

YHR122W

CCR4 RPS10A

RNA14 YPK9 YNL311C GCN3

CAF40

UBC8 SPT8

PAT1

FSH2

SNT309

THG1 PCL6

GYP5

IMH1

SMC6

DPH1

PGD1

PPH21 TFB2

NMD4

GSF2

UBP3

YML096W

SWP82

HRQ1

NTH1

RTT102

ZRG8

YDR049W COX19

PCS60

ADI1 YNR014W

SNF12

CST6

VHS1

TFB1

NIT3

ULS1

CUS2

PSY SY4

KTR1

VTC3

PPM2

SWI4

RCN2

OCA5

TFB4

SWI3

YLR225C BEM3

ATG2

CMK2

IRE1

RTG3

MAG2

PRP19

KAR4 HIT1

YDR539W

RAD26

TFC6

PFS2

YMR1

CAC2

MMS21

CDC43

GPB1

GCD10

AIR2

DIP2

AFG2

MPH1

CMP2 GFD1 UBC5

UBP8 VID30

CAF16

LTV1 SSD1

HIR2

CWC27

PCF11

ZTA1

PBP1

GIS1

NAS2

UFD2

GBP2

YBR238C MUP3

HER1

SMD1

YAE1

YKL069W

YPT32

DUS3

MOT3

TIP41 MDM20

DSE4

ARG80

RFC5

RXT2

GDA1

BRF1

IES2

GDE1 SMF3

TRI1

ROX3

MPE1

PRP42 MOB1 TAN1

YJL070C

ERG12

NMD5

TAF1

LDB19 RPS18A

YML079W

PSP2 P

AOS1

MVB12

NDL1 PIN4

SRB4 B

RUP1

YFR006W TSL1

MTR10

DOA1

PIN3

SWI6

NTH2

DBF2

MIG2

HBS1

DUG2

ARO7

YOR131C NST1

MBP1

GCN5

HSP31

KEL3

NPL6

MIG1

RAD50

RPN7

EAF3

YGL039W VID27

CET1

FYV6

SSL2

PAP1 HAC1

YKE2 EDC2

LSM1

SUA7 HOS2

PAH1

TMA108

WRS R 1

LEU3

CUP1-2

VHR1

SMB1

HSH155

SCD5 SKY1

PRP11

IKI1

MDH3

RKM4 INO1

TRZ1 HMS2

YLR118C

SIW14 YOR289W

PBP4 P

YLR126C

YOR296W

YKL023W

PRP6 PRP5

DUS1

TOF1

CDC16

ERP5 P

GSH1 POP1

NGL2

RCY1

GIS4 PSP1

GLR1

STR2

NAM7

RTT106 CAB1

SPE4 E

DIG1

CTI6

MOG1

SSK1

BPL1

RFA1

BAS1 FHL1

RFC1

YGR117C

MET7

CTF4

UME6

MCM1

YBR197C

ECM30

EAF6

APE2

PHO23

RAD2

CBF1

IRA2 AHC2 SRP1

FRA2

MPT5

YMR244C-A

NBA1

RDH54

PHO91

PAN5

PDR3

YBL010C

LSM4

MCA1 DPH5

LOT6

MED1

AZF1

ALY1

POL31

RNQ1

TAF6

IOC4

TPM2 YBL104C

RRP1

YJL012C-A

ALG14

GCR1

YJR011C

RFC3 RIX1

CSI2 CWC25

MDM35 CUE3

RAD23

PRP39

YNL143C RRD1 LSG1

YGR151C

RBS1

URM1

YMR196W YMR074C

STB3

TFA1

URA6

SGT1

YPR045C STE7 YNL313C

SER33

YHL039W

EDC1

ABD1 MSH3

SKI3

DOS2

MMS2

TMA20

GRE3

SQS1

DST1

YGR093W

YPR115W

ART5

GCN2 PRB1

UBA2

TRP1

YOR262W

APD1

SGT2

RKM1

SNU71

HIS2

AKL1

POP5

YGR205W

LSB1

TIF4632

YHR009C YNL157W

RAD27

POL12

YLR218C YDR170W-A LSM2

ASF1

DAL81

BUR2 MFT1

JJJ1

MOT2

S E2 SS E

MOT1

EPS P 1

WTM2

MSL5

PFD1

RKI1

GGA1 STE5

DET1

YPL191C

TDH1

IST3

RET1 PAC10

EAP1

TRM3

KSP1

GYP7

FOL1

RPC17

BUD13

CWC24

YMR178W TYR1

NOT3

GAT1

DYS1

PGM2

YGL079W

LSM7

SAS5

XPT1

NGG1

DHH1

SPB4 B RPD3

NRK1

PRS2

ELP3

YLR108C

PTA1

LUC7

Nucleus

YBR242W VPS64 DUG1

MSB3

IKI3

OCA6

UFD1

SAP A 1

MRS6 YPL067C

MAP1

TPS1

MYO2

NSE3

CNA1 YMR315W UTR4

GIR2 UBP7

STE11

PUS7

RAD52

YBR225W

AIM7

CTK3 RTT107

ADD66 TMA22

SMM1

YDR186C

TFC4 LAP2

IML2 POP2

RBG1

NAT1

YLR345W YGR111W

YDR391C

BNI1

KIN28 YIR035C

YGL101W

SNF8

ATG13

PRS5 YOR006C

SNF1

MPP6

GSP2

RIM101 RPS18B

UBC1

SGF11

PRY1

GRX3 MRC1

ACF2 YML081W HYR1

YFR017C

YNL247W

REI1

STR3 CUP1-1

PUF3 BRE5

SCH9

RTC3

GLC8

CPR6

YJL055W

TOP2 HPR1

GLO2

STE12

ROD1

SGF73

YOL134C

SRP68

YOR214C

RPC53

ARP7

TAF9

SMX3 ISW1

PTI1 RSC2

RFA3

IES1

BUR6

LEA1

TFG1

REF2

NOP13

UBP2 LSB5

DSK2

IES3

UGP1

CHZ1

CWP1 RRP43

NOG2

ISW2

HNT2 YNG2

YER079W

MSO1

ABZ1

BUD27

MXR1

GLN4

RMD1 BCK1

MDY2 DAK1

YAR1

HHO1

BRE1

SMC4

GPM3

MED4 POL5 YPR172W

MED7

YBR139W

RAD9

ATG1

YPL225W

LAP4

RPE1

YOL057W

MUQ1

DDR48

PHO81

IFH1

NRD1

BUD31 SFH1

CYC1

HPA3

REH1

HEM2 RBG2

SER3 R NNT1 SES1

PRO3 TPS3

STH1

YLR455W MET13

MDR1

YER163C

SUB1 NMA2

Cytoplasm

VIP1

FEN2

KTI12

ARF2 HTS1

GET3

TRM112

FES1

ADE16

RBK1

GCS1

CDC37

GVP36 TMA7

REE1 HIR3 PRM5

VPS9

YBL028C

RFA2 RFC4

ESS S 1 HPC2

GAL11 YHB1

CTK1 BNA6

TMA17 PCM1

ABF1 DAT1 SFH5

ECM15 NRP1

YIL161W

YLR143W

POP8

YEL014C

CBC2

SNF2

TAE1

YGR269W

ITC1 YER084W

YER152C

TPA1 POL2

YAR009C VPS75

CDC8

PUS1 RVB1

HSL7

YCR090C

SWC4 VID22 MVD1 MAD2 POP3

YHL008C YKR023W

SMX2

TPK2

FSH1

SNU114

PRD1 ROM2

ADA2

RPB8

PPT1

RFC2

SGV1

YCK2 HSN1

KAP120

YKL169C

RPL19B

NCB2

YCS4 FRS1

CGR1

PAN6

SKT5

SFL1 NMD3

SCD6 YDL144C

GCD1

STO1

SFP1

COS6

RPS17A

LTP1

HUG1

YPD1

YER134C PUF2 LTE1

HOM3 SDO1

PRO1

CCA1

GCN1

STF2

YNK1

NPT1

RPT4

POL1

YPT7

ADH6

RPL1A

RSC8

SPT4 T

YDL085C-A

CDC53

RPB9

STI1 CSE1

PTC3 GET4 RNA1

RAI1

RPL23A

PUT3

SIP5

CFT2

RPL12A QRI1

RPL18B IZH1

JIP5 SYC1

TIM17

YPL229W

HAM1

THI7 OCA4

YBL055C

RPL21A

CAM1

LAP3

SSA4 A

RPT6

SMC2

YEL047C

INP51

HTL1

SUR1

ECM29

SAP30 SPO12

SAK1 RPL14A

TRM82

INM1

GDI1

SPT15

REB1

SET2

YNL010W

EFT1

CYS4

WWM1

RAT1 CKA1

YML082W CTR9

NAF1 RBA50 RPS6B RRP12

A 1 AAH NUP116

YDR250C

YSA1

GIM3

MRT4 SMC1 FPR3 R

GRX1 KES1

ATP12

FRE6

SIT1

RAM1

TRP4

THP2

RIB2 YGR126W

RPL34B

CFT1

GUS1

ERG10 BAR1

TPK3

PHO13 SKI2

MRN1

ATP5

HRB1

UBC4

RSF2

RPO31

DSS4 POB3

THI80

TAH11

NCS2

BCP1

NHP6B

RGR1

TOS1

HOS4

PTC2 P E1 PS

ATO3

DUG3

MES1

NUG1

KTI11 RPL5 GSY2 Y CLU1

BUD20 RPL40B

RAS1

PUB1

RPS4B

WAR1 CTR86

ARO1

MDS3

AGA1

YLR179C

AHA1

RPL6A

KEM1

HHF1

PSY SY2

TBF1 ASG1

GAL83

ATG3

IMD3

LHP1 ATG26 TPS2 RPL11B

KAE1

NMA111

NOG1

NCL1 GPX2

RPS19B

SPT7

YOR238W TPD3

THR1

ARB1

TIF1

CCS1

URH1

RPN12

KAP104

LOT5

RPB10

RRP4 P

NAT4 ZWF1

FPR2 R

SSZ1

YLR132C

YCP4 REX2 X RPL17B CWC21

NOC3

TFA2

WTM1

SBP1 MCK1

IDS2

RTT10

RPS9B

YBR096W

HSP12 SOL3

RPL36B

DPS1

TAL1

IOC2

CAR2

YPL108W

TRP3

ERG13

ARP5

SHQ1

SPT16

NBP35

GYP1

ECM1 FAP7

YHR131C RPL33B

RPS10B

RPS30B SBE22 YPT31

PYC1

CPR5

FRA1

YGR219W

PRE8

URA5 WHI5

IDI1 RIB1

TRM8 TIF11

YBL036C

UTR1

YAP1

TIF4631 FUN12

RPL15B

RPO21

FCP1

ATX1

YOR283W

IES5

PMI40

ALA1

NUT2

NOP53

SGF29

HOG1

PDE2

RPS6A 6 RPO41

RPN8

SOH1

RPL21B

YMR226C

GCN20

RIB3

LYS20 YNL190W

YCR018C-A UBI4

ARO10 IMD4

OLA1

HRP1

STE2

SPL2

LYS1 CDC24 PRP8

RPL9A

HTA2 PLP1 FAS2

PBI2 ACE2

TYS1 RPL35B

RPS16A

YRA1

MCM6

YML108W

CYS3 CDC39

RPL19A

RXT3

YPL260W

SCW4 W

YMR291W

BNA1

RLI1 TMA16 RPS24B

RPN1

MCM7

UBP12

RPL40A

SUI2

FCY1

PRT1

YLR363W-A

RPL22B

FPR1

MSB2

YKL102C RPL14B

DRE2

MTD1

GPD2

SSB1

BCY1

YLR301W

NHP6A

HPT1

HTB2

MSN5

CKS1

RDI1

HXK1

RIX7

RPL43B YHR132W-A

YTA7

YKR043C CNB1

SIN4

WHI3 SVF1

YOR051C

DLS1

MCM4

RPS26B

DAS2

CDC28

YNL155W

RPS0B SEM1

TRM5

GIS2 ARG1

LYS21 RAD6

YNL108C

RPS25B YBR090C

SRB6

RPS7B

SNO1

ADE5,7

PGK1

CDC34

HTB1

TUB4 DIS3

YKR096W

RPL37B

YLR257W

BRR2 R

YNL058C

CDC60 APT1

YOR302W

PBY1

RPL36A

YGR001C YNL134C ZUO1

YCR051W

MET22

SDS23 RPG1

UBA1

SAM4 URA2

PGM1 YNL035C

CKB2

SNA3 SSE1 RPL20B

PKH2

RRP17

NUT1

YJL169W

DDP1

RPL42A

TOS9

GRS1

POP6 ADO1

RPL2A

TUB1

RPS24A

ENO1

SCL1

CKB1

SUM1 YLL032C

TIF5

RPN2

YMR122W-A

PRE5

YFL006W PMU1 HOR2

SUP45

ADE17

RAS2 TRP2 GUK1 PGI1 S F2 SY TRM10

YPT52

BMH1

RPN6

TOM1

ACB1

HXK2

RPL34A

DCS1 BUD32

SRO9

PRS1

RPL4B

YIH1

TCI1

RPL24A

MSH2

KAP122

HEM3 THO2 PRE9

ADE8

UBC9 CPR7

SUP35

TIF6

ARO3

PRP38

ARO9 CAF20 RPL20A

YLR287C

RPS21A

RSR1 RPS17B

TIF3

RRP40

RRP45

SYN8

CDC33

YER156C NAM8

VPS74

RPL7B RPL27B

MCM2

MIC17

LYS2

SBA1

SMC3

URA4

YCK1

RPS4A S4 4

YPR1

ADE6 RPS28A

YSH1

TRX1

TIF2

RPN11

SLX9

SOD1 NEW1

TSR2

NPL3

AAP1

VPS21

RPS8B

DPB2 CDC9 VTC1

LIA1

FUN30 NPA3

HIS7

ADE1

TMA19

RPS27A RPS9A S9

RPL38

MED6

URA7 AIM29 RPL11A

IOC3

UBC13

CRM1 SWI1

YKR018C

NAB2

SPP381

HMT1

IPP1

ACS2

ADE2 SET5

RRP6

RPN5

FUS3

PRE10 HSC82

STM1

SFA1

SAM1

VAS1

THS1

PRE6

PFY1 APA1

FAS1

RRP42

ARG4

NIP1 CKA2 YKL071W

HCH1

PRO2

RPO26

HOM6

PYC2 TRX2

EGD1

CSL4

RPN9

GLY1

YDL063C

SEC14

GPM1 RPL41A

RPS29B

APT2 RPL37A

DEF1 DUR1,2

CDC15

YER034W

SER1

ILS1

NOT5 UGA1 CPR1

SER2 R ARO4

INO80

ALB1

RNR4

MFA1

BAT2

RPL6B

TRP5

NIP7

SAM2

TKL1

GPH1

ELF1

KAP123

PDS5

LYS9

CIC1 BDF1

YOR385W PRE4 E

RPP1A

BDH1 MAK21 ARX1

RPL26A EGD2

SPT6 T CDC55 RPL41B

RPB2

HRT2 CPA1 OYE2

YBR085C-A

HSP82

TSA1

NSA2

PPX1

GDH2

RPS22A YDL124W

MBF1

YIL091C YIL108W RPL13B

ADH5

BRX R 1 RPS22B RHO2

RPP1B

ADK1

TMA23

RPS1B

RPL1B

SOL1

RRP5

PXR1

THR4

FOL3 RPB11

SGD1 SML1 TPI1 HOM2

UTP13 ADE12

FPR4 R

MAK5

UTP6

MEF1 MAP2

RPL43A ASN2

TDH2 RPP2B

SIS1 HNT1 CYC8

BFR2 R

HSP104

RPL7A

RPS19A LEU1

SRP40 ENP1

DBP8

URA1 ASN1

PFK2 DBP6

ARO8 DBP7 UTP30 ARC1

YCR016W NCE103

CAR1

ZPR1 PUF6

BMH2 DLD3

RRT14

YCR087C-A

HGH1

RPS21B

HTZ1

RPC19

RRP46 GDH1 PAF1

TMT1

RPL10

LEO1

NOP7 RSA3

RPB4 B BUD21 UTP14

UTP18 SMT3 ARG3 RPL24B

RPS26A

UTP22

RPS23A NOP4

COX17

TRR1

RCL1

SSB2 B SHM2 RSC9 SSF2

ALD6

YGR251W SRM1 RPL26B NSA1 RPS27B URB1 TAF12

REX4 X

NOC2

MAK11

PSA1

UBP10

NOP15

RTF1

TEF4

SSF1

RPS8A S8

EFG1

RPC40

DRS1

RPL3 RRP8 CHD1 HIS4 DBP3 SSA1 CDC73

FYV7

RPC37 ECM16

RPC82

PFK1

RTR1

RPL29 RPL9B

GLK1

MKT1

ESF2 F

RPB3

Budneck

ADE3

RPL12B

RPS25A

EFT2

UTP4

NAN1

NOC4

RSC58 URB2 TOF2 UTP8

UTP15 RRP9 ESF1 TOP1 RRP15 FAL1

UTP5

YGR283C

NOP16 NOP2 YTM1

KAR3

UTP10

MTR3 SPT5 T

UTP9

NOP12

BUB1 TUP1

NOP8 SKI6

MYO1

BNI5

Nucleolus

MLC2

ELM1

BNR1

CDC14

FOB1 MOB2 SIS2

NAP1

RHR2

FAA3

YPL150W

GLN1

CSR1 YLR407W VHS2

SST2 T TEF1

LSB3 ASK1

RPA34 HAA1 MET6

ZDS1 YEH2

NUF2

ADE4

FPK1

GAR1

GND1 PRP43 RTS1

GIP3

AIM44

RPL8A

GIN4 DBP2

RPA12

KIN4

RRP7

SPC105

YPK1 SOF1

MUP1 MCH1 ASK10 AME1

ALR1

MRPL6

NOP10

NKP2

HMO1

SLN1

Spindle Pole

YEF3 VTH1

TEF2 ALY2 OPY1

RRB1 NSG1 BEM2 SIR3 EMP47 PCT1 ITR2 SPO14 GIP4 TRM1

RBL2 SAC7 SIR2 YOR304C-A

MDM1 CTR1

OPI1 CEP3

CAF120

SEC2

AHP1

AVL9 ENT4

ACK1 RGD1

YDL025C

JSN1

TDH3

MSS4

WHI2

BNI4

PXL1

STE20

SSA2 A CDC11 CDC10

CLA4

KIN2 YVC1 SHS1 RCK2 YKR045C

BUD3 TAT1 YLR422W CPA2 AXL2

BUD22 SPC24

PWP1 PWP2 P

RPA43

NSL1

TID3

RPS14B

DAD4

NET1

DAD2

DSN1 RPA49

NOP9

NOP6 MRPL24 CBF2

TSC13 RPA14

SLK19

SPC25 STU2

HCA4

YJL018W

NVJ1

KRI1

NHP2

DAD1 ZIM17

NOP58 NOP56

NSR1

ERG6

CBF5

MDG1

RPA135

PTK2

YLR326W

RPA190

BBP1 MDH2 BUD2

YDR034W-B TGL4

YFR016C

Cell Periphery

Budsite

YKL105C

TAT2

YDL173W

CAB3

PKC1

OSH6

YPP1

BOR1

ERG25

YJL171C BSD2

MRPL38

YOR285W

DAP1

BIG1

ARE1

VPS13 ACC1

RCR1

ANT1

NUD1

CNM67

DAD3

BIM1

SPC97

SPC98 SPC29

Peroxisome

TEM1

MTW1

SPC34

TGL3

PEX3 X

PEX5 X

BIK1

AXL1

DFG5 RHO5

CDC48

SQT1 SCS7

SRO7

UBP1 ZRT2 YLR194C

PIS1

BUD4 KAP95

TYW1

AQR1 SRP14

PIR1

TPO3

GAS3 OPI3 MNR2

SCS3

VCX1

DNF3

FAA1 SWP1 PHO90

CHO2 HLJ1 S P1 SY

ILV2 YLR050C

YDR090C SKG1

ALG13

SAC1 PEX14 APM4

PHS1

EMC6

PEX13 FLC1

AVT7

VPH2

BAP2

ISU2 ATP3

RER1 HXT2 PEX25

APL1

YGL140C YGL010W

SEC66

PEA2

SCP160

YOP1

APL3 VTC2

ACO2 ATP1

TPN1

ITR1 ECM40

SKG6

VMA22 SRP21

DCP1

PDR16 YMR124W PMT1

ATF1 GCV2 PEX10

RTN1

ATP14 SMY1 MRH1

GCV3

LSC2

BEM1

MAE1

ENT2

LYS4

FAR3

HXT6 SRP72 SEC3

PEX1 PEX17

CAB5 YAP1802 GCV1 ACP1

ENT1

AVT3

PMT2

EPT1 BOI2

YOR1

LAS17

AIM45

EFR3 R

TIM44 YGL108C

YAP1801 YEH1

SEC5 MDN1

YDL012C

PEX12

BUD6

ENA5 LCB4

ENA1

PST1

DGK1

PEX6 X

VHT1

ALD4

ISU1

QCR7

USA1 BOI1 VPS55

Bud

FET3

SNQ2 FMP27

RAX2 YNL146W

ATR1 EGT2

FLC3

ENA2 CRP1 GNP1

YER053C-A

YDR348C YBL029C-A A YGR266W

FAR7

PDR12

GPA2

HIP1 YLR413W END3 YMR295C FAR11

TCB3

UIP5 QDR2 FCY2 FUI1

TCB2 TPO1

SFK1

HXT3

MID2

QDR3

RFS1

MEP1

PDR5 TSC10

HOL1

EXO84

NFS1

ERG27 PRX1

SEC6

SEC15

MIR1

CRN1 KEL1

MCX1

PUT2

ILV1 MSS116

SEC10 ALD5

RGA1

DLD2

EXO70

CTP1 MRPL32

ISD11

YGR235C

SEC7 YDR061W

PDB1 MSS51

YGR026W

NIS1

PEX30

YHM2 NDI1

FTR1 ATP2 MDM38 CHS5 LRG1 YTA12

ER

UFE1 ALG2

TIP20

GTB1

NNF2

ESA1

PBN1

PSD1 FAR8 SLP1

MNS1

HNM1

CYT1

GRX5 IDP1

TIM23

Mitochondria

SEY1

MRP13

AIM42

YMR031C TPO4 IST2 DNF2

YGR130C SLM1

LEM3 ARF3

VPS8 SLA2 BSP1

VPS3

YDR210W

PAM16 TUF1 YOR356W COX5A MIS1 ABF2

TOM20

CBK1

SVL3

COY1

MNP1

YLH47

COX7

SOD2 KGD1

ERG7

YHR003C

MDH1

STF1

COX6 MRPL13

NAT2

MDJ1

SEC20

PDA1

MAS2

CAX4 YJR111C SLY41

FPS1 KIP1

YBR016W

ZRG17

VPS16

MSC3

YLR414C

MRPL8 PEX21

PER1

CEM1

YHR162W

TIM50

CCZ1

YSC85 GUP1

RIM1

GEA2

SKG3

ILV6

NUM1

RVS161

GET2 DCP2

ARC15

MYO3

RSN1

YSY6

RGD2 SHO1

FLC2

MYO5

PPZ1

DSL1

PRK1

AIM5

EDE1 YCR043C

TCB1 UBX2

INP52

MRPL25

LEU9

GLO3

NNF1

SED4

ALG5

SDP1

BZZ1 IST1 SHE3

ABP1

R E1 RS

PUT1

COX14

MCM22

TWF1

WSC2

YMR086W

TOM70

ALO1

YMR253C

SSM4

PAM17 YKL027W

GGA2

ERG26 SEC12

ATP15 SNX41 PIN2

CBP3

RMD9

ALG12

ATP19 RUD3

TNA1

SPC3

ARG8

LEU4

UTP23

DJP1 ENT5 SEC39 TIM54

ALG1

ATP7

CAP1 AUR1

COR1 CLC1

SEC11 CMD1

COX15

YDR367W ARG2

YHR045W W PIL1 MSC2 ALE1 SCT1 CUE4

YDR476C SVP26

YGR031W

Cortical Patches Nuclear Endosome Vacuole/Vacuolar Periphery Membrane

MRPL19

FUM1 RBD2 YMR010W

IRC23

YMR134W

ORM2

NEO1

LAS21

SUR2 R Y U3 YJ

RIC1 GRH1

BCS1 CBP6 KEX2 YGR021W

OSH3

OSH2

YIM1

OXA1

SFB3

GPI8 CYB5

YNR040W YGR207C

BCH1

ARC35

ORM1

EMC4

APL2 ATP11 MRPL50

LAA1

AIM37

PAN1

LAC1

KEX1

SYS1

PST2

RSM25

CYC3

EHT1

SPC1

ADP1

TIM21 NFU1

RVS167

APQ12

NHA1

AIM21

NCE102

CBR1

VRP1

NSG2

DNF1

COG8

MHR1 ARG5,6 MSC6

USO1 ATP17

MEP3

YBR287W

SEC24 ERP3

YPR071W SEC31

YNL181W

ERP2 P

YEL043W YDR307W

NCP1 YGR125W PEX29

PNC1

AIM41 QCR2 YNL122C IMG1 YER078C ABC1

MDL2

COG6 BUD7

GTS1

DPL1

MRPS17

YMR148W

SDH2

SHM1 DID4 AIM38 COX8 MRP1

UPS1

YGR102C NAM9

MBA1

AIM17

YPL107W MRPL10 YJR003C

TOM71

YML6

MTM1

COG5

RSM26

MRPL31

MSF1 MNE1 MSP1

MRPL15 YFH1

NCS6

MRP4 MRPL20

SRV RV2

CWH43 LCB2

UPS2

COG3

AIP1

ARC18

SEC16

GPI14

TMN3

MYO4

ARC40

YJL021C

GWT1

TOS7

FYV4

SLA1

SAC6

MSC7

RCR2

GOT1

NHX1

SFB2

YDL119C MRP7 ARH1

YMR31

MRPL35

IFM1

PIM1

BET3

HOC1

NTE1 CPT1

FMP52

YHR192W

AIM13

ERP4 RP

YPR063C

ODC1 FZO1

LIP5 MRP49

SEN54 RSM7

RSM27 YME1

MSY1

RPT3

COX20

SNX4

PGA2

GPT2

GPI13

SLY1

SWA2 GAS1

TOS6

YLR290C

NCA2

MSE1

TRS130

LCB1

EMC2

ISA1

MRPS35

PET127 VPS5

MRPL11

PET100

AIM24

ISA2 ADH3

PET10 SWS2 S

MNN9

AFG3

MTF2 ECM19 PDX1 SSQ1 CDC50 RTC6

MNN10

FMP21

DSS1 AIM34

MAS1

OM45

VPS17

CYM1

PET123

MRPL51

ATG20

YDL121C FAA4

YNR021W PHO86

PGA3

DCW1 KRE27 CCW14

UTR2

ARF1

ABP140 SBH2

YIP3

MRPL1

AIM36 VPS52 RSM23

MMT1

CIT1 MRPL16

QCR10

AIM9

MRPL39

MAD1

DPM1

KRE1

QCR8

ATM1 SGM1

HEM15

MRPL4

PEP5 P

YML007C-A

CHA1

PNT1

PEP8 P

BFR1 SNL1 WBP1 SEC63

YGR150C

MRPL37 COX18

MRPL23 VPS53

MLP2

SEC13

AAC1

ERV41

01.Oct

MRPL27

SLC1

STT3

COQ9 SSH1

ERV29 SLG1

CBP2

SEN15 RSM24

COX4

HOR7

STE24

ANP1 COQ1

EMC1

BCH2 EMP24

AIM31

MSK1

LYS12

PIC2

MRPL28

VPS35

SEC18

ATG21

ISM1

YLR064W

MIP1 OST1

SNA2

CPD1

ERG5 KRE11

ERG3

MRPL36 YME2

MRPL9

FMP41 TIM11

SMI1 TGL5 MIA40

TVP38 FEN1

CCW12 ELO1

COG1

EHD3

MSH1

YDR056C SOP4 SNX3

PRY3

VAC14

SPF1 YMD8 PKR1 VPS41 IFA38 SHP1

VPS60

TCO89

DIP5 DFM1

ERG11

SUR4 R

PMR1

CTR2

MRL1

EAR1 YLR001C

SEC72

ATG27 ICE2

TIP1

BPT1 GTR2 ERV25 SHR3

GLE2

PIB2

GTR1

CWP2

OLE1

VPS68

ALG9 PMP1

RPM2

Golgi

DNM1 CCP1

YPS30 AFG1 SAC3 YOL107W

VPS54

YHL017W

OMS1

TRS31 HEM1

YLR253W

PHB2 YER077C

PHB1 HSE1

IDH1

OSH1 YGR012W

YMC2 STV1

CAF4

KGD2

YKL063C CST26 DOA4

TCA17 MMF1

LOS1

SFT2

NTA1 VAC8

LAG1

YKL061W

IRC22

NUP170

MEX67

NDE1 SCY1

MEH1 APC11

OST4

MRP20 VMA10

TMS1

GPI11

YBT1

ESBP6 P

MRS4

GEA1

COA1

MAM3 FET5

APM3

YET3

VBA4 V BA4 YDR352 W

HMG1

YLL023C

MDV1

YER128W

YOL092W

MFB1

VPS15

NUP188

VMA4

LSC1

FSF1 RIF1 VMA5

PMP2

PTH2

SIR4

SLM4 ERG1

FTH1 SEC22

COT1

MRPL49

PHO88

TRS20 TUB3

VAM6 IMG2 SYG1

BNA4

BTS1 EMP70 GPI17

YML018C

YPT35

MRP51 RTC1 JAC1

PEP3 P BRO1

YCF1 YKL077W DBP5

MAM33 ARL3

VTA1

RSM22

MVP1

YBR241C

RFU1 YJL062W-A VPH1 TRS120 KIP3

TFP1

AAT1

SEC21

SEC28 SNA4

APM1 KRE33 VMA2 DOP1

IDH2 GYP6 MRPL22

VPS45

MRPS5

VPS1 VPS33

TVP15 NUP159

POL3 CHC1

APL5

Log 2 (Intensity Ig)

NUP82

FRQ1

COP1

APS P 3

AKR1 GLE1 NUP60

NUP1

VRG4

NUP53

TRS23 POM34

SEC27 NUP85

NUP133

PGA1

TVP18 SRC1

NUP157 PEP1 ASM4

NUP84 YPR174C

ARL1

NUP57

OAC1 NIC96 MON2

NUP100 SEC26

NUP49 NUP145

MTC1 DRS2

ULP1 HEH2 RET2 NDC1 ESC1 SSP120

APL6

NUP192

INP53 GTT3

HMG2

MLP1

-11

-4

Figure 1. Abundance Localization Map of a Wild-Type Strain (A) Images of single cells, showing typical proteins assigned by our classifiers to each localization. Each strain produces a cytosolic RFP to mark cell boundaries and a different GFP-tagged protein. (B) Left: abundance Localization Map (ALM) illustrating protein abundance and localization information for 2,834 proteins (nodes) connected to one or more of 16 colored hubs that represent distinct cellular locations. Absolute protein abundance measurements are represented on a gray-green-yellow scale for each protein. Right: expanded views of components of the network are shown in blue and red boxes. Micrographs of cells expressing proteins are shown, whose localization is typical of the assigned compartment (nucleolus [blue box] and nuclear periphery [red box]) with protein abundance (in molecules per cell), estimated from the GFP intensity values. See also Figure S1, Tables S1, S2, and S3, and Data S2.

components of the network in Figure 1B illustrate members of known protein complexes that colocalize, with representative micrographs. For example, our automated classifiers correctly assigned all seven components of the t-UTP complex to the nucleolus (Figure 1B, blue box) and 25 of the 27 components of the Nup82 nuclear pore complex available in the GFP

[Figure 2 labels. (A) Summary network of the 16 localization classes: ER, Budneck, Budsite, Bud, Cortical Patches, Endosome, Golgi, Cell Periphery, Vacuole/Vacuolar Membrane, Cytoplasm, Nucleus, Spindle Pole, Peroxisome, Nuclear Periphery, Nucleolus, Mitochondria; color key: Log2(Intensity Ig). (B) Functional categories and protein properties (essential, phosphorylated, conserved in eukaryotes) against the same localization classes and against single versus multiple localizations; color key: Log(P-value).]

Figure 2. Thematic Diagram and Functional Enrichment Analysis of Shared Protein Localization (A) A summary network of protein localization based on the data illustrated in Figure 1B is shown. (B) Functional enrichment analysis of proteins assigned to various cell localizations using our automated classifiers. Log (p values) indicating the significance of the functional enrichments are color-coded (key at top of graph). See also Figure S2.

collection (Aitchison and Rout, 2012) to the nuclear periphery (some of which are shown in Figure 1B, red box). Two components of the Nup82 complex, Sec13 and Nup116, appeared to behave differently because they were not assigned to the nuclear periphery; our computational localization assignments for these proteins correspond with published manual annotations (Huh et al., 2003). These examples illustrate the utility of ALMs for providing in vivo validation of protein complexes identified through biochemical or other means (see below for more discussion).

To visualize broader trends in our data, we summarized the ALM into a network where each compartment is represented by a node whose size indicates the number of proteins with that localization (Figure 2A). The edges connecting nodes represent proteins that occupy more than one subcellular compartment, and edge thickness indicates the number of proteins shared between compartments (Figure 2A). Node and edge colors reflect mean protein abundance (Figure 2A; Data S2). Depicting our ALM data in this manner makes it easy to visualize the proportion of shared localization patterns between subcellular compartments, as well as the average abundances of shared proteins. For example, the largest fraction of proteins with multiple localizations was shared between the nucleus and the cytoplasm (67% of 1,538 colocalized proteins), consistent with the well-characterized shuttling of many proteins involved in transcription, RNA metabolism, and DNA repair (Aitchison and Rout, 2012). We saw strong colocalization of proteins to cortical patches, bud, budsite, and cell periphery (38% of the 264 localization assignments to these four compartments), largely due to the localization of cell polarity proteins associated with the actin cytoskeleton to each of these compartments at different times in the cell cycle.
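The collapse from a protein-level map to a compartment-level summary network can be sketched in a few lines. This is only an illustration of the counting involved, not the authors' pipeline: the function name `summarize_localizations`, the input layout (protein name mapped to a set of compartments), and the toy proteins are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def summarize_localizations(assignments):
    """Collapse protein-level assignments into a compartment-level network.

    assignments: dict mapping protein name -> set of compartment names
    (hypothetical input layout). Returns (node_sizes, edge_weights):
    the number of proteins per compartment, and the number of proteins
    shared by each unordered pair of compartments.
    """
    node_sizes = Counter()
    edge_weights = Counter()
    for compartments in assignments.values():
        node_sizes.update(compartments)
        # Every pair of compartments a protein occupies adds one to that edge.
        for pair in combinations(sorted(compartments), 2):
            edge_weights[pair] += 1
    return node_sizes, edge_weights

# Hypothetical toy input, not taken from the data set.
toy = {
    "ProtA": {"nucleus", "cytoplasm"},
    "ProtB": {"nucleus"},
    "ProtC": {"bud", "cell periphery", "cortical patches"},
    "ProtD": {"cytoplasm", "nucleus"},
}
nodes, edges = summarize_localizations(toy)
```

Node size then corresponds to `nodes[compartment]` and edge thickness to `edges[(a, b)]`; coloring by mean abundance would need an additional per-protein abundance table, which is left out here.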

We saw enrichment of proteins with annotated roles associated with specific organelles in cognate cellular compartments, e.g., proteins with annotated roles in cytoskeletal organization, budding, and cytokinesis were highly enriched in the bud-associated compartments and cortical patches (Figure 2B). Essential proteins were enriched in nucleus, nucleolus, and spindle pole and were more frequently associated with multiple localization classes (Figure 2B), and physical interactions among proteins were enriched significantly within each organelle except for the cytoplasm (Figure S2A). Finally, in agreement with previous studies (Huh et al., 2003), the majority of the localizations were to the cytoplasm (37%) and nucleus (28%; Figures 1B, 2A, and S2B), with most of these proteins assigned to both classes (Figures 2B and S2C). However, even outside of the shared nuclear and cytoplasmic proteins, 18% of the proteome (511 proteins) showed multiple localizations in up to five different compartments (Figures S2C and S2D), emphasizing that the proteome is far more complex than a mutually exclusive organization of subcellular units. Proteins that localize to multiple compartments were enriched in roles that involve regulation, such as cell cycle, signaling, cytokinesis, and budding, and were more likely to be phosphoproteins, consistent with a role for phosphorylation in regulating subcellular movement (Figure 2B).

Analysis of total GFP fluorescence also provided useful proteome-wide information about protein abundance. For example, the most abundant proteins in the cell were enriched in the cytoplasm (some of which colocalized at the cell periphery) and within the ER, Golgi, vacuole/vacuolar membrane, and nucleolus, which function in part to maintain the integrity of the intracellular environment (Figures 2A and S2E). Indeed, highly abundant proteins were enriched for roles in cell homeostasis (vacuole/vacuolar membrane, p < 5.9 × 10⁻⁵) and transport (cell periphery, p < 4.5 × 10⁻⁸; ER, p < 2.0 × 10⁻⁶; Golgi, p < 2.7 × 10⁻⁵), as well as translation (cytoplasm, p < 3.4 × 10⁻⁶⁹) and metabolism (cytoplasm, p < 1.5 × 10⁻⁶) (Figures 2A and S2F). Our automated analysis allowed us to distinguish proteins that varied in concentration over three orders of magnitude, from an estimated lower limit of 40 molecules per cell for Are1, an ER protein that was undetectable by western blot as a TAP-tagged protein, to an upper limit of 19,000 molecules per cell for Gln1, glutamine synthetase, one of the highest abundance proteins in the TAP data set (Ghaemmaghami et al., 2003) (Figure 1B; Table S1).

Our data set can also be mined using more detailed single cell analyses. For example, we compared single cells containing two GFP-tagged proteins that are known to relocalize between nucleus and cytoplasm in a cell cycle-dependent manner, Mcm2 (Yan et al., 1993) and Whi5 (Costanzo et al., 2004), and two proteins that are localized to the nucleus but are degraded in a cell cycle-specific manner, Sic1 (Visintin et al., 1998) and Far1 (Blondel et al., 2000). For each protein, we plotted the fraction of single cells assigned to cytoplasm, assigned to nucleus, or whose protein was below our threshold of detection for abundance (Supplemental Experimental Procedures). The Mcm2 and Whi5 proteins in all of the single cells were classified as localizing to either nucleus or cytoplasm (Figure S2G), with a larger fraction of cells containing nuclear Mcm2 than nuclear Whi5, consistent with their cell-cycle biology. In contrast, for Sic1 and Far1, most of the cells that had detectable protein were classified as nuclear, while the majority of cells had undetectable protein levels (Figure S2G), suggesting that the protein had been degraded. Thus, processing single cell data offers the potential to reveal complex regulatory mechanisms governing protein localization and abundance.
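The per-protein single-cell summary described above (fraction of cells called cytoplasmic, nuclear, or below the detection threshold) amounts to a thresholded tally. The following is a minimal sketch under stated assumptions: the tuple layout `(mean_gfp_intensity, predicted_class)`, the function name `class_fractions`, and the threshold value are hypothetical, and the actual threshold is defined in the paper's Supplemental Experimental Procedures.

```python
from collections import Counter

def class_fractions(cells, detection_threshold):
    """Fraction of single cells in each class for one GFP-tagged protein.

    cells: iterable of (mean_gfp_intensity, predicted_class) tuples, one per
    segmented cell (hypothetical layout). Cells whose intensity falls below
    detection_threshold are counted as "undetectable" regardless of the
    classifier call.
    """
    counts = Counter()
    for intensity, predicted in cells:
        label = predicted if intensity >= detection_threshold else "undetectable"
        counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}

# Hypothetical cells for a protein that is degraded in part of the cell cycle.
toy_cells = [
    (5.0, "nucleus"),
    (0.2, "nucleus"),
    (0.1, "cytoplasm"),
    (4.0, "nucleus"),
    (3.0, "cytoplasm"),
]
fractions = class_fractions(toy_cells, detection_threshold=1.0)
```

A protein such as Sic1 or Far1 would be expected to show a large "undetectable" fraction in this kind of tally, whereas Mcm2 or Whi5 would split between nucleus and cytoplasm.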
The increased sensitivity afforded by our approach allowed us to reliably assign a quantitative localization for 52 of the 156 proteins previously annotated as “ambiguous” (Table S3). For example, we confidently assigned Itr2, a myo-inositol transporter, Yeh1, a steryl ester hydrolase, and Bor1, a boron efflux carrier (http://www.yeastgenome.org/), to the cell periphery. Orthogonal functional genomic data sets supported our novel assignments of several proteins to subcellular compartments. We assigned the uncharacterized protein Ymr010w to the Golgi, and YMR010W shows strong synthetic lethal interactions with genes involved in transport/retention of proteins in the ER and Golgi (Costanzo et al., 2010; Koh et al., 2010). We also assigned Yir003w (Aim21) to cortical patches, and YIR003W has many genetic interactions with known actin patch genes (Costanzo et al., 2010). Together, these data suggest that our classifiers efficiently capture localization patterns and abundance information for most proteins and that these localizations can reveal important biology and contribute to predictions of protein function.

Quantitative Analysis of Proteome Abundance Changes in Response to Environmental and Genetic Perturbations

We exploited the ALMs to explore the response of the proteome to either environmental or genetic perturbation in an unbiased and quantitative fashion. We chose two chemical perturbations, rapamycin and hydroxyurea (HU) treatment, both of which compromise yeast proliferation and have been used extensively as reagents to study nutrient deprivation and DNA replication stress, respectively. Rapamycin is an inhibitor of the TORC1 kinase complex, which controls cell growth by affecting protein synthesis and metabolism (Loewith and Hall, 2011), while HU inhibits DNA replication through deoxyribonucleotide depletion (Alvino et al., 2007). As a proof-of-principle for analysis of a genetic perturbation, we also examined proteome abundance and dynamics in a mutant lacking the catalytic subunit of a lysine deacetylase (KDAC), Rpd3 (Yang and Seto, 2008), which we had previously analyzed in systematic genetic screens (Kaluarachchi Duffy et al., 2012). Rpd3-mediated lysine deacetylation is a dynamic posttranslational modification with a well-defined role in negatively regulating gene expression through modification of histones. Rpd3 also deacetylates many non-histone proteins, regulating properties such as protein stability and protein-protein interactions (Kaluarachchi Duffy et al., 2012; Henriksen et al., 2012). Our quantitative approach is ideal for identifying changes in proteins that are not related to transcriptional changes.

Because of the automated nature of our analysis, we were able to examine 15 different screening conditions (three times for HU treatment; nine times for rapamycin treatment; three rpd3Δ screens) and extract features from a total population of >20 million cells. With our classifiers, we assigned localization patterns for the GFP-fusion proteins expressed in 95% of cells imaged. Our compendium of high-resolution images as well as quantitative information on protein abundance and localization in both wild-type and perturbed cells is available as a searchable web-accessible interface called Collection of Yeast Cells and Localization Patterns (CYCLoPs) (http://cyclops.ccbr.utoronto.ca).
Protein abundance information revealed 674 unique changes across the 15 conditional screens (Table S4; proteins with abundance changes passing a stringent 2-fold cutoff, dPL < -1 (red) or dPL > 1 (green), are shown in Figure 3 and Table S5). Importantly, we detected protein abundance changes consistent with known biology and with other studies. For example, deletion of RPD3 largely resulted in increases in protein abundance (Figure 3), consistent with the well-characterized role of Rpd3 as a repressor of gene expression. We also observed 22 proteins, including several ribosome biogenesis and ribosomal proteins, that were downregulated in rapamycin and upregulated in rpd3Δ (Figure 3, center right). These findings reflect that inhibition of TORC1 by rapamycin leads to repression of Ribi (ribosome biogenesis) and RP (ribosomal protein) coding genes in a manner dependent on Rpd3 (for review, see Broach, 2012). Inhibition of TORC1 by rapamycin also causes repression of energy-consuming processes (Dechant and Peter, 2008). Our imaging assay showed a time-dependent decrease in abundance of regulators of protein translation by rapamycin (Figures 3, bottom right, and S3A), while levels of proteins involved in vacuolar protein degradation (e.g., Prc1 peptidase) and energy regeneration (e.g., Cox4 subunit of cytochrome c oxidase) were increased (Figures 3, top right, and S3A). Interestingly, there was very little overlap between our HU and rapamycin screens with respect to protein abundance changes (Figure 3). This difference largely appears to reflect activation of the Environmental Stress Response (ESR) (Gasch et al., 2000) following rapamycin treatment and growth inhibition (209/355

Figure 3. Protein Abundance Changes following Perturbations Detected Using Automated Imaging Hierarchical clustergram of GFP proteins showing a >2-fold change in protein level (GFP intensity). dPL values were calculated from treated over untreated data sets and clustered using average linkage with uncentered correlation. Parts of the clustergram showing distinct trends in the data are boxed in yellow. To the right are shown expanded views of the boxed regions, illustrating groups of proteins that have increased abundance in rapamycin (top), increased abundance in rpd3Δ together with decreased abundance in rapamycin (middle), and decreased abundance in rapamycin (bottom). See also Figure S3 and Tables S4 and S5.

of total changes occurred in proteins encoded by ESR genes; Table S5) but not following HU treatment (only 28/91 total changes occurred in proteins encoded by ESR genes; Table S5) (Dubacq et al., 2006). This difference may be explained by the effect of HU versus rapamycin on growth of the cells: HU treatment activates the intra-S phase checkpoint, causing cycling cells to accumulate in S/G2 (Alvino et al., 2007), whereas

rapamycin causes inhibition of growth (Loewith and Hall, 2011), which is associated with the ESR (Brauer et al., 2008). The changes in protein abundance in cells treated with rapamycin correlated closely with published protein abundance changes assessed by mass spectrometry (Fournier et al., 2010) (Figure S3B, upper). To ask if protein abundance changes correlated with changes in transcript levels, we compared the protein abundance values from our rapamycin experiment with published data sets measuring mRNA levels. We saw that rapamycin caused a delayed relationship between mRNA and protein levels, with maximal correlation between mRNA expression at 120 min and protein levels at 620 min of rapamycin treatment (R = 0.7), similar to the findings by mass spectrometry (Fournier et al., 2010) (Figure S3B, lower). To ask if transcript and protein abundance were also correlated following a genetic perturbation, we considered any protein whose abundance changed detectably in the rpd3Δ strain (no stringent cut-off applied). Of all the proteins that increased in abundance, 79% were associated with genes whose transcript levels increased in the absence of RPD3 (p < 3.3 × 10⁻¹²) (Figure S3C) (Fazzio et al., 2001), and this set of proteins was enriched for components of the cellular stress response (Figure S3A), consistent with Rpd3 biology (Alejandro-Osorio et al., 2009). While we see generally good agreement between our protein levels and published mRNA levels, proteins that change in abundance in the absence of changes in the corresponding transcript are of particular interest as candidates for post-translational regulation. For example, of the seven proteins that changed in abundance >5-fold with no

corresponding change in published mRNA levels (Table S4) (Fazzio et al., 2001), three have been found to be hyperacetylated in an rpd3Δ strain by mass spectrometry (Henriksen et al., 2012): Pbi2, Tsl1, and Hyr1. Acetylation of these proteins may be regulating their abundance, consistent with known roles for acetylation in negatively regulating protein stability in yeast (Robert et al., 2011). Because our data are quantitative, we can identify unexpected changes in the proteome, even with a high background of protein abundance changes resulting from Rpd3 repressing transcription.

Flux Networks Map Localization and Abundance Changes in Environmental and Genetic Perturbations

As described above, ALMs provide a useful means of displaying proteome-scale information about protein abundance and localization in living cells. We reasoned that comparative analysis of ALMs derived from perturbed cell populations would enable us to examine protein dynamics in an unbiased and quantitative manner. To do this, we used LOC scores derived from wild-type and perturbed ALMs to define a “z-LOC score.” The z-LOC score quantifies changes in protein localization between subcellular compartments; a negative z-LOC value denotes the exit of a protein from a particular compartment, while a positive z-LOC value represents movement toward a specific location. For example, in the rpd3Δ strain, Pct1, an enzyme involved in phospholipid biosynthesis (Howe et al., 2002), showed a negative z-LOC score for the nuclear periphery and a positive z-LOC score for the nucleus, indicating displacement from the nuclear periphery to the nucleus (Figures 4A–4D). To display z-LOC score data in a biologically useful way, we generated “flux networks,” which are directional, quantitative vectors that define the relationship between two ALMs and represent the dynamic localization and abundance changes following a perturbation.
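To make the sign convention concrete, the sketch below turns per-compartment localization scores into directed edges of the kind a flux network displays. It rests on loud assumptions: the exact z-LOC definition is not given in this passage, so the code assumes a plain z-score of the LOC difference (perturbed minus wild-type) across all proteins within a compartment; the cutoff of 2.0, the function names, and the protein "Mover" are hypothetical.

```python
from statistics import mean, pstdev

def z_loc(loc_wt, loc_pert):
    """Assumed z-LOC: standardized LOC difference within one compartment.

    loc_wt, loc_pert: dict protein -> LOC score (fraction of cells assigned
    to the compartment) in wild-type and perturbed cells.
    """
    proteins = sorted(set(loc_wt) & set(loc_pert))
    deltas = [loc_pert[p] - loc_wt[p] for p in proteins]
    mu, sigma = mean(deltas), pstdev(deltas)
    if sigma == 0:
        return {p: 0.0 for p in proteins}
    return {p: (d - mu) / sigma for p, d in zip(proteins, deltas)}

def flux_edges(z_by_compartment, cutoff):
    """Build directed (protein, source, destination) edges from z-LOC scores.

    source is None when a protein enters a compartment without a scored exit.
    """
    exits, entries = {}, {}
    for compartment, scores in z_by_compartment.items():
        for protein, z in scores.items():
            if z <= -cutoff:
                exits.setdefault(protein, []).append(compartment)
            elif z >= cutoff:
                entries.setdefault(protein, []).append(compartment)
    edges = []
    for protein, destinations in sorted(entries.items()):
        for dst in sorted(destinations):
            for src in sorted(exits.get(protein, [])) or [None]:
                edges.append((protein, src, dst))
    return edges

# Hypothetical LOC scores: "Mover" leaves the nuclear periphery for the nucleus.
others = ["P2", "P3", "P4", "P5", "P6"]
wt_periphery = {"Mover": 0.7, **{p: 0.1 for p in others}}
pert_periphery = {"Mover": 0.1, **{p: 0.1 for p in others}}
wt_nucleus = {"Mover": 0.2, **{p: 0.5 for p in others}}
pert_nucleus = {"Mover": 0.8, **{p: 0.5 for p in others}}

z_scores = {
    "nuclear periphery": z_loc(wt_periphery, pert_periphery),
    "nucleus": z_loc(wt_nucleus, pert_nucleus),
}
edges = flux_edges(z_scores, cutoff=2.0)
```

In this toy case the only edge runs from the nuclear periphery to the nucleus, mirroring the Pct1 pattern described above. A protein scored only as entering a compartment, as described for Dig2, would yield an edge with no source.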
In the flux networks, large nodes represent subcellular locations while small nodes are moving proteins, whose abundance relative to unperturbed cells is color-coded. The vectors illustrate the direction of movement of the protein and the thickness of the vector corresponds to the proportion of cells in the population with the indicated change in localization. The flux networks for our rpd3Δ, HU, and rapamycin experiments are shown in Figures 4C, S4A, and 5A. Some proteins, such as Dig2, a repressor of mating-specific gene expression (Figures 4C and 4D), are computationally scored with a specific localization pattern only in the perturbed situation (Dig2 retains some cytoplasmic localization but is enriched in the nucleus in the rpd3Δ strain). In this case, the vector on the rpd3Δ flux network shows movement into a compartment with no specific departure localization (Figure 4C). Interestingly, both Pct1 (see above) and Dig2 are acetylated proteins in vivo and both are hyperacetylated in an rpd3Δ strain (Henriksen et al., 2012), suggesting that Rpd3-mediated deacetylation may regulate their localization. In general, flux networks provide a quantitative snapshot of protein changes within the cell following environmental or genetic perturbation. A recent screen visually identified localization changes in the GFP collection treated with HU (Tkach et al., 2012), allowing us an opportunity to benchmark our computational method for detecting protein localization changes. Our computational analysis

identified 54 of the 97 proteins for which we could visually identify the changes reported by Tkach et al. in our images (Figure S4A; Table S6A; Supplemental Experimental Procedures). For another 36 of the 97 proteins, our computational approach identified the relevant localization in wild-type and HU but the change was not statistically significant; generally for these proteins a small fraction of the cells showed the localization change. Our computational approach also identified a novel set of 40 significant localization changes that had not been scored in the previous manual screen (Figure S4A; Table S6B). The majority of these (34) were proteins that had a mixed localization pattern in wild-type (for example nucleus and cytoplasm), which translates to more subtle quantitative localization changes in HU that would be challenging to score manually. Thus our computational approach is able to efficiently and quantitatively detect changes in protein localization with high sensitivity, capturing dynamic information that cannot be reasonably scored by assessment of thousands of images by eye. Integrating protein interaction information into flux networks highlights co-translocation of protein complexes or functionally related proteins. For instance, HU treatment resulted in relocalization of several proteins that form P-bodies or cytoplasmic granules, which sequester untranslated mRNAs under certain stresses like glucose deprivation (Teixeira and Parker, 2007; Tkach et al., 2012). We observe nine P-body proteins recruited from a diffuse cytoplasmic pattern to foci (Figures S4A and S4B), which expands upon similar observations (Tkach et al., 2012; Dénervaud et al., 2013).

Proteins Change in Localization or Abundance but Usually Not Both

Flux networks can also be used to explore the general relationship between protein abundance and localization changes.
Following 80 min of HU treatment, 28 localization and 40 abundance changes (dPL > 1) were observed in the yeast proteome (Tables S2 and S4). Remarkably, only one protein, Smf3, an iron transporter, changed both localization and abundance using our criteria. Treatment with HU induces activation of the Aft regulon, which controls iron mobilization (Dubacq et al., 2006), leading to an increase in SMF3 transcript abundance; however, relocalization of Smf3 has not been previously observed. Similar trends were seen after prolonged exposure to HU and in cells perturbed by rapamycin treatment or deletion of RPD3, suggesting that the response of the proteome to genetic or chemical insults involves a change in protein abundance or localization but rarely both (Figure S4C). Our rapamycin flux networks (Figure 5) also uncover intriguing time-dependent changes in the proteome that recapitulate known and meaningful biology. For example, our flux network displaying changes after 140 min of rapamycin treatment (RAP140) (Figure 5A) revealed translocation of transcription factors Sfp1, which promotes RP gene transcription, and Stp1, which is involved in amino acid sensing, from the nucleus to the cytoplasm. In contrast, Stb3, a repressor of RP gene expression, moved from the cytoplasm to the nucleus, reflecting the need to inhibit protein translation in starvation conditions, which are mimicked by rapamycin treatment (Figure 5A) (Broach, 2012). We also detected subcellular movement of two other


transcription factors, Gln3 and Gat1, whose translocation to the nucleus is a hallmark of TORC1 inhibition (Figure 5A) (Broach, 2012). Observing cells over time following rapamycin treatment allowed us to track proteins that move transiently between compartments.

Figure 4. Automated Assessment of the Proteome in an RPD3 Deletion Mutant (A) Changes in protein localization to the nucleus in an rpd3Δ strain. The graph illustrates the distribution of z-LOC scores of all GFP-fusion proteins scored using our nuclear classifier in an rpd3Δ strain relative to wild-type. One outlier, Pct1, with increased nuclear localization is highlighted by the arrow. (B) Pairwise comparison between z-LOC scores for the nucleus versus those for nuclear periphery localization in rpd3Δ relative to wild-type. Pct1 is highlighted as an example of a protein showing obvious relocalization from the nuclear periphery to a more general nuclear localization in the rpd3Δ strain. (C) Flux network illustrating abundance and localization changes in the proteome in an rpd3Δ strain. The wild-type and rpd3Δ ALMs were used to generate the flux network where large nodes represent cell locations while the small nodes are moving proteins. (D) Representative micrographs showing Pct1-GFP and Dig2-GFP localization in wild-type and rpd3Δ cells. See also Figure S4.

A striking example is Rts3, a PP2A and Sit4 phosphatase-associated protein of unknown function (Gavin et al., 2006; Breitkreutz et al., 2010). Rts3 redistributed from the nucleus to the cytoplasm at early time points (Figure 5B), with a re-concentration in the nucleus after prolonged rapamycin treatment, which excludes it from the flux network in Figure 5A. This type of movement is often associated with signaling molecules that respond to a change in a stimulus and then adapt and return to their steady-state localization, e.g., the Hog1 MAP kinase in response to high osmolarity (Ferrigno et al., 1998). Our flux network also revealed an increase in Rts3-GFP abundance, which we confirmed using a TAP-tagged version of Rts3 (Figure 5B). The transient relocalization and the abundance change suggest that Rts3 may play a key role in response to TORC1 inhibition. Indeed, we and others have found that Rts3 is required for cell survival in rapamycin (Figure 5B) (Parsons et al., 2004; Xie et al., 2005; Hood-DeGrenier, 2011). Thus, flux network-based analysis of the proteome over time can highlight unusual proteome dynamics in response to environmental or genetic stress and facilitate attribution of protein function. Several proteins that translocated out of the nucleolus following exposure of cells to rapamycin for 140 min also shared physical


Figure 5. Proteome Dynamics in Rapamycin-Treated Cells over Time (A) Flux network illustrating protein localization and abundance changes after 140 min of rapamycin treatment. Proteins annotated as having physical interactions (BioGrid) are indicated with red edges. Representative micrographs of cells expressing GFP-fusions of transcription factors whose localization is influenced by rapamycin treatment are shown below: untreated cells (RAP0) and cells treated with rapamycin for 140 min (RAP140). (B) Micrographs showing transient relocalization of Rts3 from largely nuclear in untreated cells to diffusely cytoplasmic within 15 min of rapamycin treatment, with a re-concentration in the nucleus after prolonged rapamycin treatment (120 min). At bottom left is a western blot showing Rts3-TAP, following treatment with rapamycin for times indicated, with Swi6 loading control. Bottom right shows spot dilutions revealing sensitivity of an rts3Δ strain to rapamycin. (C) Pairwise comparison between z-LOC scores for the nucleus versus those for the nucleolus in cells treated with rapamycin for 60 min. Components of the exosome are highlighted with red arrows while proteins involved in ribosome biogenesis are highlighted with blue arrows. Rapamycin treatment induces exosome complex redistribution from being concentrated in the nucleolus to becoming dispersed throughout the nucleus as early as 60 min. (D) Representative micrographs of untreated and rapamycin-treated cells expressing GFP-fusions to exosome components (left; green images), a Sik1-RFP nucleolar marker (middle; red images) and an overlay of the two images (right).

interactions (Figure 5A, protein interactions indicated by red edges). Rrp43, Rrp42, Rrp4, Ski6, and Dis3 are members of the exosome, a complex involved in processing of rRNA, small nucleolar RNAs (snoRNAs), small nuclear RNAs (snRNAs), tRNAs, and cryptic unstable transcripts (CUTs) (Lykke-Andersen et al., 2009). A re-distribution from the nucleolus to the nucleus was

observed for eight out of ten components of the exosome after a 60 min exposure to rapamycin (Figure 5C). We confirmed this movement of the exosome using colocalization of GFP-tagged exosome components in cells expressing the nucleolar marker Sik1-RFP (Figure 5D). Subcellular movement of the exosome during rapamycin treatment has not been observed previously,

and suggests a significant nuclear role for the complex and perhaps the need for specific RNA regulation or processing events during the cellular response to rapamycin. Thus, integrating flux networks with protein interaction data can provide new insight into protein dynamics and point to new compartment-specific functions for protein complexes.

PERSPECTIVE

Although incredibly powerful, functional genomic approaches that explore gene expression (Kemmeren et al., 2014), protein-protein interactions (Rees et al., 2011), and genetic interactions (Costanzo et al., 2010) fail to yield the spatiotemporal resolution required to understand biological processes as complex dynamic systems. Here, we describe a high-content screening system that allows us to rapidly survey proteome dynamics in living cells. This approach is designed to define proteome flux using a computational image-based method and provide a proof-of-principle for both a technical and a conceptual platform that ought to be easily adapted to other systems. This study is our first attempt at a quantitative exploration of abundance and localization of the proteome. We used a high quality, available resource, the ORF-GFP collection, which includes 71% of the full proteome, with 1,492 strains excluded that failed to yield a GFP signal above background in the original manual assessment of the collection under standard growth conditions (Huh et al. [2003] and Saccharomyces Genome Database http://www.yeastgenome.org/). Our experimental approach can be readily modified to include technical improvements that should enable more comprehensive analysis of the proteome.
A number of the “missing” proteins are detectable by western blot when C-terminally tagged with the TAP moiety (344 proteins) (Ghaemmaghami et al., 2003) and peptides corresponding to another 543 proteins have been identified in at least one mass spectrometry study (Nagaraj et al., 2012; Kulak et al., 2014); these proteins are clearly produced under standard conditions. Some proteins may be undetectable due to destabilization caused by the C-terminal tag; for these, insertion of the tag at the N terminus of the protein may produce useful alleles. In general, use of yeast-optimized versions of GFP that outperform the original GFP(S65T) for brightness, photostability, and function as fusion proteins (Lee et al., 2013) is likely to allow visualization of more low abundance proteins. In addition, low abundance proteins may be detectable using a recently developed protein tag known as the SunTag, which can recruit multiple copies of an antibody-GFP fusion protein and permits imaging of single molecules (Tanenbaum et al., 2014). The remaining proteins, which are likely not expressed under standard conditions, will have to be characterized on a case-by-case basis; growth in other conditions, chosen from transcription data or that affect growth of the corresponding mutant, may enable image-based analysis of these proteins. Automated image analysis of the proteome offers biological insight at several different levels. First, our compendium of images offers a qualitative view of the abundance and localization of each protein. In contrast to some other imaging data sets, we provide data for an average of 80 cells per image, allowing

assessment of general trends for each protein, including cell-cycle distribution and variability within a population. Second, these data reveal previously unidentified regulatory events and offer many opportunities for generation of particular hypotheses. Third, although our data set contains many interesting individual findings, we predict that this type of data will be of particular value for comparative analyses using quantitative methodologies. Comparison of image compendia from different mutant strains or conditions with relevant proteomic data sets promises to provide insight into regulated proteolysis, post-translational modifications, and the regulation of protein-protein interactions. We also demonstrate that we can use single cell measurements to identify and distinguish two types of changes in cell-cycle-regulated proteins, and many other types of single-cell analysis should be possible with our data. Finally, quantitative data sets are particularly important for the development of models for pathway activity and cellular function. Just as transcriptome data have revolutionized our understanding of gene regulation, quantitative proteomic data, with the dimension of localization added to more straightforward abundance measures, will change the way researchers understand the cellular consequences of genetic and environmental stress. We suggest that integrating condition-specific quantitative abundance and localization data for the proteome will allow even greater predictive power and represent an essential component underlying modeling approaches.

EXPERIMENTAL PROCEDURES

Generation of Yeast Strains and Growth Conditions

S. cerevisiae strains used in this experiment are listed in Table S1. Using a modified SGA protocol (Tong et al., 2001), RPL39pr-tdTomato and marked alleles were introduced into the arrayed yeast GFP collection (Huh et al., 2003).
Haploid MATa strains from SGA were grown and imaged in low fluorescence synthetic medium (Sheff and Thorn, 2004) supplemented with methionine, NAT, and 2% glucose.

Image Acquisition, Analysis, and Pattern Classification

Large-scale acquisition of fluorescent micrographs was conducted using a high-throughput spinning-disc confocal microscope (Opera, PerkinElmer). For co-localization analysis, images were acquired using a spinning-disc confocal (WaveFX, Quorum Technologies) connected to a DMI 600B fluorescence microscope (Leica Microsystems) and ImagEM charge-coupled device camera (Hamamatsu C9100-13, Hamamatsu Photonics). Image analysis, segmentation, and acquisition of measurements (Figure S5A) were conducted using CellProfiler (Carpenter et al., 2006). Mean GFP intensity was used to represent the protein abundance in the cell, and changes of more than 2-fold compared to wild-type were defined as significant.

To generate the training sets, objects from the WT1 screen were visually inspected using CellProfiler Analyst (Jones et al., 2008) to select objects suitable for training different patterns (see Data S1) representing distinct locations and stages of the cell cycle. The resulting handpicked “designer” training sets consisted of over 70,000 objects. Using a supervised machine learning approach, an ensemble of 60 binary classifiers was generated to identify cells in 3 stages of the cell cycle, 16 localization patterns, ghost objects, and dead cells (Figures S5B–S5D). Classifiers were generated using SVM (Platt, 1998) (sequential minimal optimization, PolyKernel with 1 as exponent), and accuracy was improved through bootstrap aggregation (Breiman, 1996). An automated classification system was developed using WEKA (Hall et al., 2009) and in-house statistical programs to generate training models from WT1 and to apply classifiers to all single cell instances across 18 screens.
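As a rough illustration of bootstrap aggregation applied to one binary classifier, the sketch below trains several linear SVMs on bootstrap resamples and combines them by majority vote. It is a minimal sketch under stated assumptions and not the pipeline used in this study: a polynomial kernel with exponent 1 is treated as a linear kernel, the model is fitted by simple hinge-loss subgradient steps rather than WEKA's sequential minimal optimization solver, and the two-feature toy data, the function names, and all parameter values are hypothetical.

```python
import random

# Hypothetical toy data: two features per object, label +1 (pattern present)
# or -1 (pattern absent). Real training objects carry many more features.
POSITIVE = [(2.0 + 0.1 * i, 2.0 + 0.05 * i) for i in range(10)]
NEGATIVE = [(-x1, -x2) for x1, x2 in POSITIVE]
TRAIN_X = POSITIVE + NEGATIVE
TRAIN_Y = [1] * len(POSITIVE) + [-1] * len(NEGATIVE)


def decision(w, b, x):
    """Signed score of a linear model: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(samples, labels, epochs=50, lr=0.1, lam=0.01, seed=0):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss.

    Assumption: this stands in for an SMO-trained SVM with a degree-1
    polynomial kernel; it is not the solver named in the text.
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * decision(w, b, x)
            # L2 regularization shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Objects inside the margin (or misclassified) pull the boundary.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def bagged_svm(samples, labels, n_models=11, seed=0):
    """Train n_models SVMs, each on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(samples)
    models = []
    for m in range(n_models):
        picks = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        xs = [samples[i] for i in picks]
        ys = [labels[i] for i in picks]
        models.append(train_linear_svm(xs, ys, seed=seed + m))
    return models


def predict(models, x):
    """Majority vote over the bagged models (n_models is odd, so no ties)."""
    votes = sum(1 if decision(w, b, x) >= 0 else -1 for w, b in models)
    return 1 if votes > 0 else -1


MODELS = bagged_svm(TRAIN_X, TRAIN_Y)
```

Calling `predict(MODELS, features)` returns +1 or -1 for a new object. In a multi-class setting, one such bagged classifier per pattern could be applied to every cell, which is one plausible reading of an ensemble of binary classifiers.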

Alvino, G.M., Collingwood, D., Murphy, J.M., Delrow, J., Brewer, B.J., and Raghuraman, M.K. (2007). Replication in hydroxyurea: it’s a matter of time. Mol. Cell. Biol. 27, 6396–6406.

Benanti, J.A., Cheung, S.K., Brady, M.C., and Toczyski, D.P. (2007). A proteomic screen reveals SCFGrr1 targets that regulate the glycolytic-gluconeogenic switch. Nat. Cell Biol. 9, 1184–1191.

Defining Changes in Localization

For a protein i, the LOC score reflects the proportion of cells that are assigned to a specific localization class j under condition k:

$$\mathrm{LOC}_{ijk} = \frac{n_{ijk}}{\sum_{j} n_{ijk}}$$
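The LOC score can be sketched in a few lines of code, together with the per-localization cutoff used in this section to assign a strain to a localization. This is a minimal sketch under stated assumptions: n_ijk is read as the number of cells of protein i assigned to class j under condition k, "passing" a cutoff is taken to mean greater than or equal to it, and the class names, cell counts, and cutoff values are hypothetical (the cutoffs actually used are those in Table S7).

```python
from collections import Counter


def loc_scores(assigned_classes):
    """LOC scores for one protein under one condition.

    assigned_classes holds one localization class label per classified cell.
    Returns {class: fraction of cells assigned to that class}.
    """
    counts = Counter(assigned_classes)
    total = sum(counts.values())
    if total == 0:
        return {}  # no classified cells, so no score can be computed
    return {cls: n / total for cls, n in counts.items()}


def assigned_localizations(scores, cutoffs):
    """Classes whose LOC score reaches the cutoff defined for that class.

    Assumption: classes without a cutoff entry are never assigned.
    """
    return sorted(
        cls for cls, score in scores.items()
        if cls in cutoffs and score >= cutoffs[cls]
    )


# Hypothetical example: 80 classified cells for one GFP strain.
cells = ["nucleus"] * 60 + ["cytoplasm"] * 20
scores = loc_scores(cells)

# Hypothetical cutoffs, one per localization class.
cutoffs = {"nucleus": 0.5, "cytoplasm": 0.5}
calls = assigned_localizations(scores, cutoffs)
```

Because the scores for one protein and condition sum to 1 across classes, a protein can pass the cutoff for more than one localization when the cutoffs are low enough, which allows multi-compartment assignments.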

For each localization a cutoff was determined such that each strain was assigned to that localization if the fraction of cells assigned to that localization passed the cutoff (Table S7).

Data Visualization

The abundance localization maps (ALMs) and flux networks were generated using Cytoscape (Cline et al., 2007). Two types of node were defined in the ALMs: 16 localization classes as “hub” nodes, and individual proteins as “protein” nodes. For the ALMs in Data S2, node positions were determined by edge-weighted spring-embedded layout, and the visual distance of the edge between the hub and protein node depicts the quantitative localization membership of the protein to the specific organelle (i.e., shorter distance for higher LOC score). For the WT1 ALM shown in Figure 1B, the positions of the localization clusters were determined by edge-weighted spring-embedded layout and protein nodes were manually shifted to more clearly show abundance groupings. The node colors in gray-green-yellow scale are proportional to the Ig value (e.g., darker node for more abundant protein).

SUPPLEMENTAL INFORMATION

Supplemental Information includes Supplemental Experimental Procedures, five figures, seven tables, and two data sets and can be found with this article online at http://dx.doi.org/10.1016/j.cell.2015.04.051.

AUTHOR CONTRIBUTIONS

B.J.A., C.B., and J.M. designed and supervised the project. Y.T.C., M.J.C., and S.K.D. carried out and analyzed experiments. J.L.Y.K., Y.T.C., H.F., and A.M. performed large-scale analysis and interpretation. B.J.A., C.B., J.M., J.L.Y.K., and Y.T.C. created figures. B.J.A., C.B., J.M., Y.T.C., J.L.Y.K., and H.F. wrote the manuscript.

ACKNOWLEDGMENTS

We thank Jeff Liu, David Wade-Farley, Quaid Morris, Lee Zamparo, Harm Van Bakel, Kyle Tsui, and Corey Nislow for technical assistance and advice. We thank Michael Costanzo for helpful discussions and comments on the manuscript. This work was supported by grant MOP-97939 from the Canadian Institutes for Health Research to B.A. and C.B., and with funds from the Ontario Institute for Cancer Research to J.M. Infrastructure for high-content imaging was acquired with funds from the Canadian Foundation for Innovation and the Ministry of Research and Innovation (Ontario) (grant 21745 to B.A. and C.B.). B.A., J.M., and C.B. are Senior Fellows in the Genetic Networks program of the Canadian Institute for Advanced Research.

Blondel, M., Galan, J.M., Chi, Y., Lafourcade, C., Longaretti, C., Deshaies, R.J., and Peter, M. (2000). Nuclear-specific degradation of Far1 is controlled by the localization of the F-box protein Cdc4. EMBO J. 19, 6085–6097.

Brauer, M.J., Huttenhower, C., Airoldi, E.M., Rosenstein, R., Matese, J.C., Gresham, D., Boer, V.M., Troyanskaya, O.G., and Botstein, D. (2008). Coordination of growth rate, cell cycle, stress response, and metabolic activity in yeast. Mol. Biol. Cell 19, 352–367.

Breiman, L. (1996). Bagging predictors. Mach. Learn. 24, 123–140.

Breitkreutz, A., Choi, H., Sharom, J.R., Boucher, L., Neduva, V., Larsen, B., Lin, Z.Y., Breitkreutz, B.J., Stark, C., Liu, G., et al. (2010). A global protein kinase and phosphatase interaction network in yeast. Science 328, 1043–1046.

Breker, M., Gymrek, M., and Schuldiner, M. (2013). A novel single-cell screening platform reveals proteome plasticity during yeast stress responses. J. Cell Biol. 200, 839–850.

Broach, J.R. (2012). Nutritional control of growth and development in yeast. Genetics 192, 73–105.

Carpenter, A.E., Jones, T.R., Lamprecht, M.R., Clarke, C., Kang, I.H., Friman, O., Guertin, D.A., Chang, J.H., Lindquist, R.A., Moffat, J., et al. (2006). CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 7, R100.

Chong, Y.T., Cox, M.J., and Andrews, B. (2012). Proteome-wide screens in Saccharomyces cerevisiae using the yeast GFP collection. Adv. Exp. Med. Biol. 736, 169–178.

Cline, M.S., Smoot, M., Cerami, E., Kuchinsky, A., Landys, N., Workman, C., Christmas, R., Avila-Campilo, I., Creech, M., Gross, B., et al. (2007). Integration of biological networks and gene expression data using Cytoscape. Nat. Protoc. 2, 2366–2382.

Costanzo, M., Nishikawa, J.L., Tang, X., Millman, J.S., Schub, O., Breitkreuz, K., Dewar, D., Rupes, I., Andrews, B., and Tyers, M. (2004). CDK activity antagonizes Whi5, an inhibitor of G1/S transcription in yeast. Cell 117, 899–913.

Costanzo, M., Baryshnikova, A., Bellay, J., Kim, Y., Spear, E.D., Sevier, C.S., Ding, H., Koh, J.L., Toufighi, K., Mostafavi, S., et al. (2010). The genetic landscape of a cell. Science 327, 425–431.

Dechant, R., and Peter, M. (2008). Nutrient signals driving cell growth. Curr. Opin. Cell Biol. 20, 678–687.

Dénervaud, N., Becker, J., Delgado-Gonzalo, R., Damay, P., Rajkumar, A.S., Unser, M., Shore, D., Naef, F., and Maerkl, S.J. (2013). A chemostat array enables the spatio-temporal analysis of the yeast proteome. Proc. Natl. Acad. Sci. USA 110, 15842–15847.

Dubacq, C., Chevalier, A., Courbeyrette, R., Petat, C., Gidrol, X., and Mann, C. (2006). Role of the iron mobilization and oxidative stress regulons in the genomic response of yeast to hydroxyurea. Mol. Genet. Genomics 275, 114–124.

Received: November 7, 2014 Revised: February 5, 2015 Accepted: April 9, 2015 Published: June 4, 2015

Fazzio, T.G., Kooperberg, C., Goldmark, J.P., Neal, C., Basom, R., Delrow, J., and Tsukiyama, T. (2001). Widespread collaboration of Isw2 and Sin3-Rpd3 chromatin remodeling complexes in transcriptional repression. Mol. Cell. Biol. 21, 6450–6460.

REFERENCES

Ferrigno, P., Posas, F., Koepp, D., Saito, H., and Silver, P.A. (1998). Regulated nucleo/cytoplasmic exchange of HOG1 MAPK requires the importin beta homologs NMD5 and XPO1. EMBO J. 17, 5606–5614.

Aitchison, J.D., and Rout, M.P. (2012). The yeast nuclear pore complex and transport through it. Genetics 190, 855–883.

Alejandro-Osorio, A.L., Huebert, D.J., Porcaro, D.T., Sonntag, M.E., Nillasithanukroh, S., Will, J.L., and Gasch, A.P. (2009). The histone deacetylase Rpd3p is required for transient changes in genomic expression in response to stress. Genome Biol. 10, R57.

Fournier, M.L., Paulson, A., Pavelka, N., Mosley, A.L., Gaudenz, K., Bradford, W.D., Glynn, E., Li, H., Sardiu, M.E., Fleharty, B., et al. (2010). Delayed correlation of mRNA and protein expression in rapamycin-treated cells and a role for Ggc1 in cellular sensitivity to rapamycin. Mol. Cell. Proteomics 9, 271–284.

Gasch, A.P., Spellman, P.T., Kao, C.M., Carmel-Harel, O., Eisen, M.B., Storz, G., Botstein, D., and Brown, P.O. (2000). Genomic expression programs in the response of yeast cells to environmental changes. Mol. Biol. Cell 11, 4241–4257.

Lykke-Andersen, S., Brodersen, D.E., and Jensen, T.H. (2009). Origins and activities of the eukaryotic exosome. J. Cell Sci. 122, 1487–1494.

Gavin, A.C., Aloy, P., Grandi, P., Krause, R., Boesche, M., Marzioch, M., Rau, C., Jensen, L.J., Bastuck, S., Dümpelfeld, B., et al. (2006). Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 631–636.

Mazumder, A., Tummler, K., Bathe, M., and Samson, L.D. (2013). Single-cell analysis of ribonucleotide reductase transcriptional and translational response to DNA damage. Mol. Cell. Biol. 33, 635–642.

Ghaemmaghami, S., Huh, W.K., Bower, K., Howson, R.W., Belle, A., Dephoure, N., O’Shea, E.K., and Weissman, J.S. (2003). Global analysis of protein expression in yeast. Nature 425, 737–741.

Nagaraj, N., Kulak, N.A., Cox, J., Neuhauser, N., Mayr, K., Hoerning, O., Vorm, O., and Mann, M. (2012). System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol. Cell. Proteomics 11, M111.013722.

Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I.H. (2009). The WEKA data mining software: an update. SIGKDD Explor. 11, 10–18.

Handfield, L.F., Chong, Y.T., Simmons, J., Andrews, B.J., and Moses, A.M. (2013). Unsupervised clustering of subcellular protein expression patterns in high-throughput microscopy images reveals protein complexes and functional relationships between proteins. PLoS Comput. Biol. 9, e1003085.

Henriksen, P., Wagner, S.A., Weinert, B.T., Sharma, S., Bacinskaja, G., Rehman, M., Juffer, A.H., Walther, T.C., Lisby, M., and Choudhary, C. (2012). Proteome-wide analysis of lysine acetylation suggests its broad regulatory scope in Saccharomyces cerevisiae. Mol. Cell. Proteomics 11, 1510–1522.

Hood-DeGrenier, J.K. (2011). Identification of phosphatase 2A-like Sit4-mediated signalling and ubiquitin-dependent protein sorting as modulators of caffeine sensitivity in S. cerevisiae. Yeast 28, 189–204.

Howe, A.G., Zaremberg, V., and McMaster, C.R. (2002). Cessation of growth to prevent cell death due to inhibition of phosphatidylcholine synthesis is impaired at 37 degrees C in Saccharomyces cerevisiae. J. Biol. Chem. 277, 44100–44107.

Huh, W.K., Falvo, J.V., Gerke, L.C., Carroll, A.S., Howson, R.W., Weissman, J.S., and O’Shea, E.K. (2003). Global analysis of protein localization in budding yeast. Nature 425, 686–691.

Jones, T.R., Kang, I.H., Wheeler, D.B., Lindquist, R.A., Papallo, A., Sabatini, D.M., Golland, P., and Carpenter, A.E. (2008). CellProfiler Analyst: data exploration and analysis software for complex image-based screens. BMC Bioinformatics 9, 482.

Kaluarachchi Duffy, S., Friesen, H., Baryshnikova, A., Lambert, J.P., Chong, Y.T., Figeys, D., and Andrews, B. (2012). Exploring the yeast acetylome using functional genomics. Cell 149, 936–948.

Kemmeren, P., Sameith, K., van de Pasch, L.A., Benschop, J.J., Lenstra, T.L., Margaritis, T., O’Duibhir, E., Apweiler, E., van Wageningen, S., Ko, C.W., et al. (2014). Large-scale genetic perturbations reveal regulatory networks and an abundance of gene-specific repressors. Cell 157, 740–752.

Koh, J.L., Ding, H., Costanzo, M., Baryshnikova, A., Toufighi, K., Bader, G.D., Myers, C.L., Andrews, B.J., and Boone, C. (2010). DRYGIN: a database of quantitative genetic interaction networks in yeast. Nucleic Acids Res. 38, D502–D507.

Kulak, N.A., Pichler, G., Paron, I., Nagaraj, N., and Mann, M. (2014). Minimal, encapsulated proteomic-sample processing applied to copy-number estimation in eukaryotic cells. Nat. Methods 11, 319–324.

Lee, S., Lim, W.A., and Thorn, K.S. (2013). Improved blue, green, and red fluorescent protein tagging vectors for S. cerevisiae. PLoS ONE 8, e67902.

Loewith, R., and Hall, M.N. (2011). Target of rapamycin (TOR) in nutrient signaling and growth control. Genetics 189, 1177–1201.

Loo, L.H., Laksameethanasan, D., and Tung, Y.L. (2014). Quantitative protein localization signatures reveal an association between spatial and functional divergences of proteins. PLoS Comput. Biol. 10, e1003504.


Newman, J.R., Ghaemmaghami, S., Ihmels, J., Breslow, D.K., Noble, M., DeRisi, J.L., and Weissman, J.S. (2006). Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature 441, 840–846.

Parsons, A.B., Brost, R.L., Ding, H., Li, Z., Zhang, C., Sheikh, B., Brown, G.W., Kane, P.M., Hughes, T.R., and Boone, C. (2004). Integration of chemical-genetic and genetic interaction data links bioactive compounds to cellular target pathways. Nat. Biotechnol. 22, 62–69.

Platt, J. (1998). Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods—Support Vector Learning (MIT Press), pp. 185–208.

Rees, J.S., Lowe, N., Armean, I.M., Roote, J., Johnson, G., Drummond, E., Spriggs, H., Ryder, E., Russell, S., St Johnston, D., et al. (2011). In vivo analysis of proteomes and interactomes using Parallel Affinity Capture (iPAC) coupled to mass spectrometry. Mol. Cell. Proteomics 10, M110.002386.

Robert, T., Vanoli, F., Chiolo, I., Shubassi, G., Bernstein, K.A., Rothstein, R., Botrugno, O.A., Parazzoli, D., Oldani, A., Minucci, S., and Foiani, M. (2011). HDACs link the DNA damage response, processing of double-strand breaks and autophagy. Nature 471, 74–79.

Sheff, M.A., and Thorn, K.S. (2004). Optimized cassettes for fluorescent protein tagging in Saccharomyces cerevisiae. Yeast 21, 661–670.

Tanenbaum, M.E., Gilbert, L.A., Qi, L.S., Weissman, J.S., and Vale, R.D. (2014). A protein-tagging system for signal amplification in gene expression and fluorescence imaging. Cell 159, 635–646.

Teixeira, D., and Parker, R. (2007). Analysis of P-body assembly in Saccharomyces cerevisiae. Mol. Biol. Cell 18, 2274–2287.

Tkach, J.M., Yimit, A., Lee, A.Y., Riffle, M., Costanzo, M., Jaschob, D., Hendry, J.A., Ou, J., Moffat, J., Boone, C., et al. (2012). Dissecting DNA damage response pathways by analysing protein localization and abundance changes during DNA replication stress. Nat. Cell Biol. 14, 966–976.

Tong, A.H., Evangelista, M., Parsons, A.B., Xu, H., Bader, G.D., Pagé, N., Robinson, M., Raghibizadeh, S., Hogue, C.W., Bussey, H., et al. (2001). Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294, 2364–2368.

Visintin, R., Craig, K., Hwang, E.S., Prinz, S., Tyers, M., and Amon, A. (1998). The phosphatase Cdc14 triggers mitotic exit by reversal of Cdk-dependent phosphorylation. Mol. Cell 2, 709–718.

Xie, M.W., Jin, F., Hwang, H., Hwang, S., Anand, V., Duncan, M.C., and Huang, J. (2005). Insights into TOR function and rapamycin response: chemical genomic profiling by using a high-density cell array method. Proc. Natl. Acad. Sci. USA 102, 7215–7220.

Yan, H., Merchant, A.M., and Tye, B.K. (1993). Cell cycle-regulated nuclear localization of MCM2 and MCM3, which are required for the initiation of DNA synthesis at chromosomal replication origins in yeast. Genes Dev. 7, 2149–2160.

Yang, X.J., and Seto, E. (2008). The Rpd3/Hda1 family of lysine deacetylases: from bacteria and yeast to mice and men. Nat. Rev. Mol. Cell Biol. 9, 206–218.