A comparative and integrative approach identifies ATPase family, AAA domain containing 2 as a likely driver of cell proliferation in lung adenocarcinoma

Robert Fouret*1, Julien Laffaire*2, Paul Hofman3, Michèle Beau-Faller4, Julien Mazieres5, Pierre Validire6, Philippe Girard6, Sophie Camilleri-Bröet7, Fabien Vaylet8, François Leroy-Ladurie9, Jean-Charles Soria10, Pierre Fouret11

* These authors are first co-authors.

Affiliations:

1. DCom, Télécom ParisTech, Paris, France
2. Programme Carte d’Identité des Tumeurs, Ligue Nationale Contre le Cancer, Paris, France
3. CHU Nice, Nice, France
4. CHU Strasbourg, Strasbourg, France
5. CHU Toulouse, Toulouse, France
6. Institut Mutualiste Montsouris, Paris, France
7. Hôpital Européen George Pompidou, Paris, France
8. Hôpital d’instruction des armées Percy, Clamart, France
9. Centre Chirurgical Marie-Lannelongue, Le Plessis-Robinson, France
10. Institut Gustave-Roussy, Villejuif, France, and Université Paris XI, Le Kremlin-Bicêtre, France
11. INSERM Génétique des tumeurs, Villejuif, France, and Université Pierre et Marie Curie, Paris, France

Running title: ATAD2 drives cell proliferation in lung adenocarcinoma

Key-words: lung adenocarcinoma, cell proliferation, ATAD2, MYC, driver of cancer

Financial support (to P Fouret): Institut National du Cancer (Programme National d’Excellence Spécialisé Poumon); Ligue Nationale Contre le Cancer (Programme Carte d’Identité des Tumeurs); Association pour la Recherche sur le Cancer (grant number SFI20101201740).

Correspondence: Prof. Pierre Fouret, INSERM Génétique des tumeurs U985, Institut Gustave-Roussy, 114 rue E. Vaillant, 94805 Villejuif Cedex, France. Tel +33 (0)1.42.17.77.82 Fax +33 (0)1.42.17.77.77 email [email protected]

No conflict of interest

Word count: 4645
Figure: 4
Tables: 2
Supplementary material: 3 figures, 4 tables, 1 material and methods and supplementary figure legends

Author Manuscript Published OnlineFirst on August 22, 2012; DOI: 10.1158/1078-0432.CCR-12-0505. Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Downloaded from clincancerres.aacrjournals.org on November 24, 2020. © 2012 American Association for Cancer Research.

Statement of translational relevance

Our results suggest that the aberrant expression of MYC targets that participate in the program

responsible for uncontrolled proliferation may be attributed to ATAD2 deregulated expression. This

further suggests that ATAD2 levels may predict the MYC dependency of lung adenocarcinoma, which

should be exploited for therapeutic purposes. While MYC has been considered as a frequent and very

relevant therapeutic target in lung cancer, specific inhibition of MYC has not been achieved and no

MYC inhibitor is currently in the clinic. ATAD2 is worth investigating as a therapeutic target,

an approach that appears feasible given its ATPase activity and its bromodomain.

Abstract

Purpose: To identify genetic changes that could drive cancer pathogenesis in never and ever smokers

with lung adenocarcinoma.

Experimental Design: We analyzed the copy-number and gene expression profiles of lung

adenocarcinomas in 165 patients and related the alterations to smoking status. Having found

differences in the tumor profiles, we integrated copy-number and gene expression data from 80

paired samples.

Results: Amplifications at 8q24.12 overlapping MYC and ATAD2 were more frequent in ever smokers.

Unsupervised analysis of gene expression revealed two groups: in the group with mainly never

smokers the tumors expressed genes common to normal lung; in the group with more ever smokers

the tumors expressed ‘proliferative’ and ‘invasive’ gene clusters. Integration of copy-number and

gene expression data identified one module enriched in mitotic genes and MYC targets. Its main

associated modulator was ATAD2, a co-factor of MYC. A strong dose-response relationship between

ATAD2 and proliferation-related gene expression was noted in both never and ever smokers, which

was verified in two independent cohorts. Both ATAD2 and MYC expression correlated with 8q24.12

amplification and were higher in ever smokers. However, only ATAD2 overexpression, and not MYC

overexpression, explained the behavior of proliferation-related genes and predicted a worse

prognosis independently of disease stage in a large validation cohort.

Conclusions: The likely driving force behind the contribution of MYC to uncontrolled cell proliferation in

lung adenocarcinoma is ATAD2. Deregulation of ATAD2 is mainly related to gene amplification and is

more frequent in ever smokers.

The majority of lung cancers are caused by tobacco smoking. However, even in people who have

never smoked, lung cancer would rank as the seventh most common cause of cancer death

worldwide(1). In France in 2000, lung cancer in never smokers accounted for 17% of cancer deaths in

women and 4% in men(2).

Striking differences between never and ever smokers have been found in the molecular alterations of

four major genes involved in the pathogenesis of lung cancers: ALK, EGFR, KRAS and TP53(3)(4).

Molecular alterations include translocations for ALK or point

mutations for EGFR, KRAS and TP53(5)(6). In addition, copy-number changes contribute through

associated gene deregulation to the malignant phenotype. For instance, MYC is frequently amplified

and overexpressed in lung cancers(7). No study has reported definitive associations between

amplifications or deletions and smoking status(8).

We analyzed the copy-number and gene expression profiles of lung adenocarcinomas and related the

alterations to smoking status. Having found differences in the tumor profiles, we integrated copy-

number and gene expression data to identify genetic changes that could drive cancer pathogenesis.

The present study differed from previous studies in two respects. Firstly, the number of tumors from

never smokers was greater in our study than in previous studies(8)(9). Secondly, to control for

potential bias the ever smoker group was constructed by matching ever smokers to never smokers,

such that the whole cohort was enriched in never smokers and the group of ever smokers had clinical

characteristics (sex, disease stage) identical to never smokers.

Methods

Detailed information on patients, samples and methods used in copy-number, gene expression and

survival analyses is available as supplementary material.

Patients and samples

All 165 study patients were treated by surgery for lung adenocarcinoma without prior chemotherapy.

Fifty-eight patients received cisplatin-based adjuvant chemotherapy. Never smoker status was

defined by a lifetime exposure of less than 100 cigarettes. The tumors were classified according to

the TNM system in use at the time of diagnosis(10). The pathological diagnoses were reviewed

according to the current histological classification for lung carcinoma(11)(12). Cases in which doubt

remained about the primary site in the lung were excluded. All adenocarcinomas were invasive. A

bronchiolo-alveolar component was recorded when non-invasive lepidic growth was seen adjacent

to a component of invasive adenocarcinoma.

This study was part of the Lung Genes (LG) project, which was approved by the Institut National du

Cancer review board (Programme National d’Excellence Spécialisé Poumon). Informed consent was

obtained from patients for the use of their lung surgical samples.

Only cases with an average tumor cell content equal to or above 50% were included. Genomic DNA and

RNA were extracted and assessed for integrity and quantity following stringent quality control

criteria (cit.ligue-cancer.net).

Genomic DNA analysis

Genomic DNAs were hybridized on Illumina SNP HumanCNV370 chips (Illumina, San Diego, CA). The

GISTIC version 2.0 algorithm (www.broadinstitute.org/cancer/pub/GISTIC2) was used to identify

significant regions of amplification or deletion. The frequencies of aberrations contributing to

significant peak regions were compared using chi-square tests.
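
The frequency comparison described above can be illustrated with a minimal sketch of a Pearson chi-square test on a 2x2 table (no continuity correction). The function name and the counts are hypothetical placeholders, not study data, and whether a continuity correction was applied in the study is not stated.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one sample")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts: rows = ever / never smokers,
# columns = aberration present / absent in a given peak region.
stat, p = chi_square_2x2(40, 10, 30, 30)
```

Each significant peak region would yield one such table, and the resulting p-values would then be adjusted for multiple comparisons.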

Gene expression analysis

Total RNAs were hybridized to Affymetrix Human Genome U133 Plus 2.0 GeneChip arrays (Affymetrix,

Santa Clara, CA). Unsupervised hierarchical clustering analysis of tumor samples from the LG cohort

and normal lung samples from eleven female Asian never smokers (accession number: GSE19804)

was performed on the most variant probe sets. Differences between sample clusters were tested

using the chi-square test. Hypergeometric enrichment for Gene Ontology sets (GeneOntology.org)

and MYC target genes (www.myccancergene.org) was calculated with FDR correction of p-values.

Literature Vector Analysis (LitVAn) was used to infer gene cluster functionality with an evaluation of

the significance of their scores (litvan.bio.columbia.edu).
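
A hypergeometric enrichment test followed by FDR correction can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, as are the function names and the toy numbers.

```python
from math import comb

def hypergeom_enrichment_p(universe, set_size, draws, overlap):
    """P(X >= overlap) when `draws` genes are sampled without replacement from
    `universe` genes, of which `set_size` belong to the annotated gene set."""
    upper = min(set_size, draws)
    total = comb(universe, draws)
    tail = sum(comb(set_size, k) * comb(universe - set_size, draws - k)
               for k in range(overlap, upper + 1))
    return tail / total

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Toy example: a 5-gene cluster drawn from a 20-gene universe, all 5 in the annotated set.
p = hypergeom_enrichment_p(universe=20, set_size=5, draws=5, overlap=5)
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice one p-value is computed per gene set and the whole vector is adjusted at once.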

Both genomic and gene expression data were deposited in ArrayExpress database (accession

number: E-MTAB-923).

ATAD2 relative expression was measured by real-time RT-PCR using the Hs00204205 TaqMan® probe

(Applied Biosystems, Carlsbad, CA, USA).

Integration of copy-number and gene expression data

We used COpy Number and EXpression In Cancer (CONEXIC) to integrate matched copy number

(amplifications or deletions) and gene expression data from 80 paired samples(13).

As described by Akavia et al., CONEXIC is based on the following assumptions: (a) a driver mutation in

a “modulator” gene should be associated (correlated) with a group of genes that form a “module”;

(b) copy number aberrations often influence the expression of genes in the module via changes in

expression of the modulator.

The CONEXIC learning algorithm consists of three key steps:

1. Selection of candidate genes that are recurrently amplified or deleted in tumors.

2. A Single Modulator step that creates an initial association between the expression of candidate

drivers and the expression of gene modules.

3. An iterative Network Learning step to improve the initial model.

During the Single Modulator and the Network learning steps, the search is driven by the optimization

of a Bayesian scoring function similar to Module Networks(14). For each node, the driver-split

combination that achieves the highest score is selected as long as it is verified to be statistically

significant. Significance is tested using the permutation test of Lee et al.(15); up to three top-scoring

modulator genes are tried, and if none of them passes the permutation significance test no more splits

are added to the driver tree. In addition to significance testing, non-parametric bootstrap serves to

eliminate spurious correlations.

The output is a driver network that divides the expressed genes into modules, and associates each

module with a driver tree. Each node in the tree is associated with a driver gene (modulator gene)

and a threshold expression level (split value), and divides the expression values of the module’s

members into samples in which the modulator’s expression is below the threshold and those in

which the modulator’s expression is above the threshold. Each side of the split at the first root of the

tree (herein designated as the first-order split) can contain further splits (secondary splits) using

other modulator/expression threshold pairs.
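
The driver tree described above can be pictured with a small data-structure sketch. This is not CONEXIC code: the class, the function, the thresholds and the expression values are all hypothetical, and placing samples exactly at the threshold on the "low" side is an arbitrary choice made for the illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SplitNode:
    """One node of a driver tree: a modulator gene and a threshold expression level."""
    modulator: str
    threshold: float
    below: Optional["SplitNode"] = None  # secondary split for samples at or below the threshold
    above: Optional["SplitNode"] = None  # secondary split for samples above the threshold

def assign_leaves(node, samples, label="root"):
    """Partition samples (sample id -> {gene: expression}) into the leaves of the tree.
    Returns a dict mapping a leaf label to a sorted list of sample ids."""
    low = {s: e for s, e in samples.items() if e[node.modulator] <= node.threshold}
    high = {s: e for s, e in samples.items() if e[node.modulator] > node.threshold}
    leaves = {}
    for side, subset, child in (("low", low, node.below), ("high", high, node.above)):
        name = f"{label}/{node.modulator}:{side}"
        if child is None:
            leaves[name] = sorted(subset)
        else:
            leaves.update(assign_leaves(child, subset, name))
    return leaves

# Hypothetical tree: a first-order split on ATAD2 and a secondary split on TUBB3
# for the low-ATAD2 samples.
tree = SplitNode("ATAD2", 8.0, below=SplitNode("TUBB3", 6.0))
samples = {
    "s1": {"ATAD2": 9.1, "TUBB3": 5.0},
    "s2": {"ATAD2": 7.2, "TUBB3": 6.5},
    "s3": {"ATAD2": 7.9, "TUBB3": 5.5},
}
leaves = assign_leaves(tree, samples)
```

Each leaf then corresponds to one group of samples in which the module's genes are expected to behave similarly.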

A detailed description of the selection of candidate genes and of the specified parameters for the

Single Modulator and the Network learning steps as well as a discussion of the significance levels

under which modulators were identified are available as supplementary information (Supplementary

methods).

The modules and their modulators were visualized using Genatomy

(www.c2b2.columbia.edu/danapeerlab/html/genatomy).

To nominate the module associated with smoking status, we used gene set enrichment analysis

(GSEA)(16).
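
For readers unfamiliar with GSEA, the enrichment score is the maximum deviation from zero of a running sum computed along a ranked gene list. The sketch below implements only the unweighted (Kolmogorov-Smirnov-like) variant; the published GSEA method weights hits by their ranking metric and estimates significance by permutation, neither of which is shown. Gene names are placeholders.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score of `gene_set` along `ranked_genes`
    (ordered from most up-regulated to most down-regulated)."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("the gene set must contain some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up on a set member, step down otherwise; both directions sum to zero overall.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set concentrated at the top of the list
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # set concentrated at the bottom
```

A score near +1 indicates that the set's genes cluster at the top of the ranking, and a score near -1 that they cluster at the bottom.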

For validation, we used the publicly-available gene expression data from 68 lung adenocarcinomas

(accession number: GSE12667)(17) and from 391 lung adenocarcinomas (caarraydb.nci.nih.gov,

pId=1015945236141280)(18). The linear relationship between a modulator and its associated genes

was measured using the Pearson correlation coefficient.
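
A minimal sketch of this correlation follows. How the expression of the associated genes was summarized per sample is not stated in this section; taking their mean is an assumption made here for illustration, and the helper names and values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

def module_mean(expression_by_gene):
    """Per-sample mean across genes (assumed summary of a gene module)."""
    return [sum(col) / len(col) for col in zip(*expression_by_gene.values())]

# Hypothetical expression values for four samples.
modulator = [1.0, 2.0, 3.0, 4.0]
module = {"geneA": [2.0, 4.0, 6.0, 8.0], "geneB": [1.0, 3.0, 5.0, 7.0]}
r = pearson_r(modulator, module_mean(module))
```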

Survival analysis

The univariate overall survival analyses were performed using the Kaplan-Meier method and log-rank

tests. In the multivariate proportional hazard Cox overall survival analysis, ATAD2 expression was

studied together with age, sex and disease stage.
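
The Kaplan-Meier estimate mentioned above can be sketched in a few lines. The follow-up times and event indicators are hypothetical, the function name is illustrative, and the log-rank test and the Cox model are not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.
    times: follow-up times; events: True if death was observed, False if censored.
    Returns a list of (time, survival probability) at each distinct death time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up (years) for five patients; two are censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, False, True, True, False])
```

Survival rates at fixed time points, such as 3 or 4 years, are read off the last step of the curve at or before that time.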

Results

Different frequency of aberrations in significant peak regions between never and ever smokers

A total of 121 high-quality genomic profiles were obtained. Frequent aberrations (frequency >25%)

included gains on 1p, 1q, 5p, 5q, 6p, 7p, 7q, 8q, 14q, 16p, 17q, 20p, and 20q, and losses on 1p, 3p, 4q,

5q, 6q, 8p, 9p, 10q, 12p, 13q, 15q, 17p, and 18q. The GISTIC 2.0 algorithm was applied to identify

regions that were significantly amplified or deleted (figure 1). A total of 59 significant peak regions

with a frequency of 13% to 84% were identified, including 22 regions that were amplified and 37

regions that were deleted.

The frequency of amplifications or deletions in the 59 significant regions was compared between

never and ever smokers. After adjustment for multiple comparisons using the Bonferroni method, only

two regions were differentially amplified or deleted according to smoking status: amplifications were

more frequent in ever smokers (83%) than in never smokers (52%) at 8q24.12 (q-value 0.02),

whereas deletions were more frequent in ever smokers (50%) than in never smokers (13%) at

4q35.2 (q-value 0.0006).
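
With 59 regions tested, a Bonferroni adjustment multiplies each raw p-value by 59 and caps the result at 1. A minimal sketch, using hypothetical raw p-values rather than the study's own:

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each raw p-value times the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

N_REGIONS = 59  # number of significant peak regions compared in the text
# Largest raw p-value that still gives an adjusted value of 0.05 with 59 tests.
raw_threshold = 0.05 / N_REGIONS

# Hypothetical raw p-values for two tests.
adjusted = bonferroni([0.001, 0.5])
```

Under this adjustment, a region is retained at the 0.05 level only if its raw p-value is below roughly 8.5E-4.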

Two groups of tumors with distinctive gene expression clusters and different clinicopathological

annotations

Unsupervised hierarchical clustering of gene expression in 103 high-quality tumor profiles and 11

normal lung samples (GSE19804) revealed two groups of tumors (figure 2). The partition was stable

as assessed by resampling. The first group of tumors was characterized by the expression of a gene

cluster (cluster c) that was common to normal lung samples and mainly absent from tumors in the

second group. Two gene clusters (cluster f and cluster i) were overexpressed in the second group of

tumors, while they were both expressed at low levels in most tumors of the first group and in normal

lung samples.

Cluster f was enriched for the GO terms ‘cell cycle process’ (q-value 1.7E-17) and ‘mitotic cell cycle’

(q-value 1.8E-11) and was designated as the ‘proliferative’ cluster. Typical genes in the proliferative

cluster encoded cyclins (CCNA2, CCNB1, and CCNE2), the cyclin-dependent kinase CDK6, E2F

transcription factors (E2F7, E2F8) and twelve proteins involved in mitosis. Cluster i was enriched

for the GO term ‘extra-cellular matrix’ (7.0E-14). Typical genes in cluster i encoded members of the

disintegrin and metalloproteinase (ADAM12, ADAMDEC1, ADAMTS5) or matrix metalloproteinase

(MMP1, MMP3, MMP11, MMP12, MMP13) families.

Using LitVAn, significant terms associated with the proliferative cluster were ‘cyclin’ and ‘mitotic’ as

well as ‘spindle’, reflecting the enrichment for genes participating in the mitotic spindle (BUB1,

CENPF, KIF14, KIF15, NDC80, NEK2, NUF2, SKA1, SPC25, TPX2, TTK) (genome.ucsc.edu). LitVAn

significant terms for cluster i included ‘invasion’, favoring its designation as the ‘invasive’ gene cluster

(supplementary table S1).

The first group of tumors whose gene expression resembled normal lung comprised 35 never

smokers (74%) and 12 ever smokers, while the second group comprised 28 never smokers (50%) and

28 ever smokers (p-value 0.01). In the first group the tumors more frequently presented with a

bronchiolo-alveolar component (p-value 5E-6) or harbored an EGFR mutation (p-value 0.0002),

whereas in the second group they more often harbored a KRAS mutation (p-value 0.02).

ATAD2 as a likely driver of cell proliferation

Eight hundred and eighteen genes overlapped significant aberrations spanning less than a third of a

chromosome's length, including 350 genes overlapping 19 amplifications and 468 genes overlapping

34 deletions. Among these genes, 109 genes overlapped the two regions that were differentially

altered between never and ever smokers. The expression of 175 genes, including 35 that overlapped

the 8q24.12 and 4q35.2 smoking status related aberrations (table 1), was significantly altered (p-

value <0.05) by either their amplification status or deletion status.

Using CONEXIC, we found a model comprising 67 modules that were associated with 31 main

modulators (i.e. likely drivers at the first order split of the regulatory programs) explaining the

behavior of 10001 genes.

We wished to uncover the likely drivers of differentially expressed clusters, which were revealed by

the unsupervised analysis of gene expression in the whole cohort. We had two collections of gene

sets, one provided by the unsupervised analysis in 103 patients (the a to j gene clusters), the other by

CONEXIC in 78 patients with paired genomic and gene expression data (the 67 modules). Both

collections were determined without the help of clinical or biological annotations. We thus

conducted hypergeometric enrichment analysis to identify which of the gene sets overlapped in the

two datasets and thus indicate likely drivers of the overlapping clusters in the whole cohort. The

proliferative cluster (cluster f in the unsupervised analysis) intersected very significantly (q-

value=2.0E-37) with CONEXIC module 62 (figure 3, panel A). Twenty-eight of 46 genes of cluster f

were identified as module 62 genes. The proliferative cluster did not intersect with any other

CONEXIC modules. Module 62 genes were enriched in the GO term ‘cell cycle process’ (q-value=1.2E-

87). The main modulator associated with module 62 was ATPase family, AAA domain containing 2

(ATAD2), a gene located at 8q24.12.

A linear relationship between the expression of ATAD2 and the expression of proliferation-related

genes in both never and ever smokers

To verify the association between ATAD2 expression and genes in its module, the module 62

regulatory programs identified in the LG cohort were applied to gene expression data from 68 lung

adenocarcinomas of the Ding cohort(17). The profiles of module 62 genes were compared using

identical split expression values for the modulators. The relationship between high ATAD2 levels and

overexpression of module 62 genes was verified in the Ding cohort (supplementary figure S1, panel

A). When ATAD2 was low, however, the second order regulatory programs depending on TUBB3 did

not classify samples in the Ding cohort as well as in the LG cohort, suggesting that this secondary

regulator was not optimally chosen by CONEXIC. We replaced TUBB3 expression by ATAD2

expression, which improved the classification of samples in the Ding cohort (supplementary figure

S1, panel B) without substantially altering the original module 62 profiles in the LG cohort

(supplementary figure S1, panel C). These results suggested that ATAD2 expression alone could

explain the behavior of module 62 genes across datasets.

To confirm the linear relationship between ATAD2 and the proliferative cluster

(cluster f), Pearson correlation coefficients were calculated in the LG cohort, the Ding cohort and a

third independent cohort of 391 patients, the Shedden cohort(18). There was a strong dose-response relationship

between ATAD2 and the proliferative cluster in every cohort as the expression of genes belonging to

the proliferative cluster increased with higher ATAD2 levels in the LG (correlation coefficient 0.85),

Ding (correlation coefficient 0.77) and Shedden (correlation coefficient 0.75) cohorts.

In the LG cohort the expression of the proliferative cluster increased with higher ATAD2 levels in

never (correlation coefficient 0.83) and in ever smokers (correlation coefficient 0.87) (figure 3, panel

B). A similarly strong linear relationship was noted in both the Ding and the Shedden cohorts for

never and ever smokers and patients with unknown smoking status (table 2).

In the LG cohort the expression of ATAD2 strongly correlated with the expression of the proliferative

cluster in every subgroup defined by sex, disease stage, bronchiolo-alveolar component, EGFR or

KRAS status (table 2 and supplementary figure S2). Although high ATAD2 was less frequent among

tumors without 8q24.12 amplification, ATAD2 was differentially expressed and strongly correlated

with expression of the proliferative cluster in tumors with or without the amplification (figure 3,

panel C and table 2).

ATAD2 and module 62 relationships with 8q24.12 amplification and smoking status

Based on the array data, ATAD2 expression was associated with 8q24.12 amplification status (p-value

1.1E-4) (table 1) and was higher in ever smokers (p-value 0.0004). It was associated with neither

CDKN2A nor RB1 overlapping deletions.

In a subset of 76 patients with available RNA for real-time RT-PCR analysis, ATAD2 expression was

increased in ever smokers compared to never smokers (fold change 1.75, p-value 0.02) and in

patients with 8q24.12 amplification compared with no amplification (fold change 2.27, p-value

0.001).

As the number of ever smokers was higher in the group of tumors expressing the proliferative cluster

f, we tested the association of smoking status with each CONEXIC module using GSEA. The profiles in

ever smokers as compared to never smokers were significantly enriched in the module 62 gene set to

which the highest enrichment score was given (0.86 for the enrichment score, 1.67 for the

normalized enrichment score, 0.02 for the nominal p-value and 0.22 for the FDR).

Relationships between ATAD2, MYC and proliferation-related genes

Module 62 genes were enriched (q-value 3.2E-4) for genes of the MYC target database

(www.myccancergene.org). Three CONEXIC modules other than module 62 were associated with

ATAD2 as their main modulator, one of which was also enriched for MYC targets (q-value 0.005).

Enrichments for the proliferative cluster genes, GO terms containing ‘cell cycle process’ and MYC

targets were aligned only for module 62 (supplementary table S2).

Amplifications of 8q24.12 included both ATAD2 and MYC in every sample save one. Like ATAD2, MYC

expression was associated with 8q24.12 amplification (p-value 8.0E-5) (table 1) and was higher in

ever smokers compared to never smokers (p-value 0.002).

The correlation of the proliferative cluster with MYC (correlation coefficient 0.39) was less strong,

however, than with ATAD2 (correlation coefficient 0.85). Remarkably, the correlation of MYC targets

in the proliferative cluster was less strong with MYC (correlation coefficient 0.33) than with ATAD2

(correlation coefficient 0.83). Likewise, mitotic spindle genes correlated weakly with MYC, while they

correlated strongly with ATAD2 (supplementary table S3). Overexpression of the proliferative cluster

occurred in tumors with low MYC and it correlated with ATAD2 (Figure 3, panel D). The modest

correlation of the proliferative cluster with MYC was verified in both Ding and Shedden cohorts (table

2).

Survival of patients

In the 78 patients from the LG cohort with paired copy-number and gene expression array data, the

survival rates were 0.75 (95% confidence interval (95% CI) 0.65-0.88) at 3 years and 0.61 (95% CI 0.48-

0.77) at 4 years for the low ATAD2 group and 0.59 at 3 years (95% CI 0.43-0.83) and 0.44 (95% CI

0.28-0.71) at 4 years for the high ATAD2 group (figure 4, panel A). Survival was not significantly

different according to ATAD2 expression (p-value 0.14). Late disease stage was associated with a

shorter survival (p-value 0.01). None of the other clinical or biological variables, including 8q24.12

amplification (p-value 0.58) and MYC expression (p-value 0.77), was associated with survival.

In the 75 patients who were studied using PCR to measure ATAD2 expression, neither the PCR data

(p-value 0.41) nor the ATAD2 array data (p-value 0.43) were associated with survival.

In the 349 patients from the Shedden cohort, the survival rates were 0.74 (95% CI 0.68-0.81) at 3

years and 0.65 (95% CI 0.58-0.73) at 4 years for the low ATAD2 group and 0.58 (95% CI 0.51-0.66) at 3

years and 0.50 (95% CI 0.43-0.58) at 4 years for the high ATAD2 group (figure 4, panel B). Survival

time was longer in the low ATAD2 group compared to the high ATAD2 group (p-value 0.002). Late

disease stage was strongly associated with shorter survival (p-value 1E-16).
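The survival rates quoted at 3 and 4 years are read from Kaplan-Meier curves (figure 4). As a reminder of how such an estimate is built, here is a minimal product-limit sketch on invented follow-up data. The function names and the toy values are illustrative assumptions and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time of each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve

def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    prob = 1.0
    for time, s in curve:
        if time <= t:
            prob = s
    return prob

# Toy data: follow-up in years, 1 = death, 0 = censored (illustration only).
times = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
events = [1, 0, 1, 0, 1, 0, 0, 1]
curve = kaplan_meier(times, events)
rate_3y = survival_at(curve, 3.0)
rate_4y = survival_at(curve, 4.0)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple proportion of patients alive at a given time.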

Multivariate Cox proportional hazards models were tested to investigate the association of ATAD2

array data with survival and to adjust for age, sex and disease stage in the 78 patients from the LG

cohort and in the 349 patients from the Shedden cohort. In the LG cohort, the best model (likelihood

p-value 0.008) included ATAD2, age and disease stage. An older age (hazard ratio (HR) 2.18, 95% CI

1.04-4.55; p-value 0.04) and late disease stage (HR 2.32, 95% CI 1.17-4.6; p-value 0.02) were

associated with a shorter survival. High ATAD2 was not significantly associated with survival (HR 2.04,

95% CI 0.98-4.27; p-value 0.06). Removing ATAD2 slightly reduced the model likelihood (likelihood p-value 0.02).

In the Shedden cohort, the best model (likelihood p-value 1E-13) included ATAD2 and stage. High

ATAD2 (HR 1.68, 95% CI 1.22-2.32; p-value 0.002) and late disease stage (HR 3.86, 95% CI 2.74-5.42;

p-value 8E-15) were significantly associated with survival. Removing ATAD2 slightly reduced the

model likelihood (likelihood p-value 2E-12).
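The hazard ratios, confidence intervals and p-values reported for ATAD2 can be cross-checked against one another. The sketch below assumes a Wald-type interval, exp(beta ± 1.96 × SE), which is a common default for Cox models but is not stated in the text. Under that assumption, the log hazard ratio and its standard error can be recovered from the published interval and converted into a two-sided p-value. The function name is illustrative.

```python
import math

def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.96):
    """Recover beta, SE, z and a two-sided Wald p-value from HR and its 95% CI.

    Assumes the interval was computed as exp(beta +/- z_crit * SE).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p

# Hazard ratios and intervals for high ATAD2 as reported in the text.
_, _, z_lg, p_lg = wald_p_from_hr(2.04, 0.98, 4.27)  # LG cohort
_, _, z_sh, p_sh = wald_p_from_hr(1.68, 1.22, 2.32)  # Shedden cohort
print(f"LG cohort:      z = {z_lg:.2f}, p = {p_lg:.3f}")
print(f"Shedden cohort: z = {z_sh:.2f}, p = {p_sh:.4f}")
```

Under this assumption the recovered p-values are close to the reported 0.06 and 0.002, so the published figures appear internally consistent.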

Discussion

Our study reveals that ATAD2 is a likely driver of cell proliferation in lung adenocarcinoma. ATAD2

overexpression both explains the behavior of cell cycle genes and, most likely, results primarily from

amplification – thereby connecting proliferation of lung cancer cells to a unique genetic aberration.

Furthermore, our results suggest that the oncogene MYC, which is located 4.3 Mb distal to ATAD2

(http://genome.ucsc.edu/), is involved in that pathway as the ATAD2 associated proliferative

signature includes MYC targets involved in cell cycle. Before our study, it was known that ATAD2 is

upstream of MYC and that it can exert a role in the proliferation of normal and cancer cells, strongly

supporting our conclusions(19)(20)(21). The present study is the first to provide evidence suggesting

that amplified ATAD2 is the main driving force behind MYC contribution to uncontrolled cell

proliferation in lung adenocarcinoma. The crucial driving function shown here for ATAD2 may have

therapeutic implications. While MYC has been considered as a frequent and very relevant

therapeutic target in lung cancer, specific inhibition of MYC has not been achieved and no MYC

inhibitor is currently in the clinic. ATAD2 is worth investigating as a therapeutic target, which

appears feasible given its ATPase activity and its bromodomain(22). Moreover, ATAD2 expression

predicts the expression of mitotic spindle genes, whose products participate in a network vulnerable

to inhibition of SUMOylation(23)(24)(25).

A key factor in uncovering the contrasted phenotypes that are summarized by gene clusters in the

unsupervised analysis of gene expression is the comparison of normal lung and tumors, many of

which were from never smokers. Lung adenocarcinomas in never smokers present typically with a

bronchiolo-alveolar component with well-differentiated tumor cells, whereas in ever smokers growth

is usually ‘fully invasive’, i.e. consists exclusively of invasive components(11)(26). Consistent with

histology, gene expression in the group where never smokers were numerous resembled that of

normal lung. By contrast, the group with more ever smokers expressed proliferative or invasive gene

clusters.

Most of the frequent aberrations shown in this cohort have been previously reported(8). However,

the frequencies of aberrations in two significant regions of amplification or deletion differ according

to smoking status. These results raised the question of whether the deregulation of genes overlapping

differentially altered regions explains the differences in expression profiles between tumors.

To identify driving mutations and the processes they influence, we integrated copy-number and gene

expression data using the recently developed algorithm CONEXIC(13). With this approach the starting

list of candidate drivers includes only the genes within or near significant regions of copy-number

changes. As a result, CONEXIC would not detect drivers that are typically associated with point

mutations.

We identify ATAD2 as a likely driver whose expression explains the behavior of differentially

expressed proliferation-related genes. Indeed, an ATAD2-associated module output by CONEXIC

contained a majority of genes of the proliferative cluster identified in the unsupervised analysis of

gene expression, an enrichment very unlikely to be due to chance. A strong dose-response relationship

between ATAD2 levels and those of genes belonging to the proliferative cluster is shown in the LG

cohort and is verified in two independent validation cohorts(17)(18). The relationship between

ATAD2 and proliferation-related genes is neither affected by smoking status nor smoking status-

associated characteristics including KRAS or EGFR mutation.

ATAD2 is correctly associated by CONEXIC with genes that it is known to regulate. ATAD2 has been

identified as a co-factor for MYC-dependent transcription by Ciró and colleagues(19). Here, ATAD2-associated

genes were significantly enriched in MYC targets. Kalashnikova and colleagues, using ChIP

assays, demonstrated that ATAD2 occupies the proximal promoter regions of several key cell cycle

regulators (BUB1, CCNA2, KIF15, MCM10 and TOP2A), which we show to be linearly related to ATAD2

expression in lung adenocarcinomas(27).

ATAD2 overlaps the 8q24.12 region which was more frequently amplified in ever smokers. ATAD2

expression was associated with 8q24.12 amplification, suggesting that ATAD2 deregulation occurs

primarily through copy-number changes. Nevertheless, it was ATAD2 expression and not 8q24.12

amplification that correlated with the expression of proliferation-related genes. Discrepancies

between 8q24.12 amplification and ATAD2 overexpression point to additional genetic or epigenetic

events contributing to ATAD2 expression. It has been suggested that ATAD2 deregulation may be the

consequence of the loss of RB-mediated control in a subset of highly aggressive breast cancers(27).

We checked that ATAD2 expression was not associated with deletions targeting the RB pathway.

MYC and ATAD2 are frequently co-amplified in cancers (www.broadinstitute.org/tumorscape), a

consistent finding in this cohort. Co-amplification may be selected in tumors as a way to

concomitantly overexpress cooperating genes that lie close together. Like ATAD2, MYC was overexpressed

in ever smokers, and MYC overexpression was associated with 8q24.12 amplification. Increased

expression of MYC targets appears necessary for the association of ATAD2 with cell proliferation, as

other ATAD2-associated modules that were not enriched in MYC targets were not enriched in

proliferation-related genes. As compared to ATAD2, expression of MYC only weakly correlated,

however, with expression of the proliferative cluster, including MYC target genes and mitotic spindle

genes. These results suggest that ATAD2 levels through MYC activity are more important than MYC

levels to drive cell proliferation in lung adenocarcinoma. Assuming that MYC protein levels are

roughly equivalent to mRNA levels, it may seem surprising that MYC activity is not directly related to

MYC expression. However, using a novel in vivo model of Myc-induced tumorigenesis, Murphy and

colleagues reported that low levels of deregulated Myc are competent to drive ectopic proliferation

of somatic cells and lung oncogenesis(28).

High ATAD2 is associated with poor survival of patients with breast cancer(19)(27). Caron and

colleagues reported that high ATAD2 (E. and C. Brambilla, unpublished data) predicts a shorter

survival of patients with lung cancer(29). In the LG cohort, ATAD2 is not significantly associated with

survival, although there is a trend when the array data are adjusted for disease stage and age. There

is much uncertainty in the results in small groups. In the larger Shedden cohort, high ATAD2 predicts

a shorter survival, which is consistent with the reported prognostic value of a cluster of cell

proliferation-related genes(18). The prognostic value of ATAD2 is independent of disease stage in

that cohort.

ATAD2 is a co-activator, which can control MYC-dependent transcription(19)(27). Through MYC and

E2F transcription factors, ATAD2 increases the expression of proliferation-related and anti-apoptotic

genes in many different types of cancer, including hormone-dependent prostate or breast

carcinomas, estrogen-receptor negative breast carcinoma, cervical carcinoma, glioblastoma,

osteosarcoma, and non-small cell lung carcinoma(19)(20)(21)(27)(29)(30). Although these data

strongly support that ATAD2 may drive cell proliferation in various cancers, more experiments are

needed to investigate the mechanisms by which ATAD2 likely influences the biological consequences

of MYC deregulation in the context of lung cancer cells.

In summary, ATAD2 is identified by a comparative and integrative approach as a likely driver of cell

proliferation in lung adenocarcinoma. MYC is co-amplified with ATAD2 and, like ATAD2, is

overexpressed in ever smokers. However, it is ATAD2 and not MYC expression that is strongly related

to the expression of proliferation-related genes, especially mitotic spindle genes. These results

suggest that the aberrant expression of MYC targets that participate in the program responsible for

uncontrolled proliferation may be attributed to ATAD2 deregulated expression. This further suggests

that ATAD2 levels may predict a MYC dependency of lung adenocarcinoma, which should be

exploited as a priority target for therapeutic purposes.

Acknowledgments: The following investigators participated in the Lung Genes (LG) project:

Centre Chirurgical Marie-Lannelongue, Le Plessis-Robinson: P Dartevelle, E Dulmet, F Leroy-Ladurie, V

de Montpreville; Centre Hospitalier Intercommunal Créteil: I Monnet; Centre Hospitalo-Universitaire

Dijon: A Bernard, F Piard; Centre Hospitalo-Universitaire Hôtel-Dieu, Paris: M Alifano, S Camilleri-Broët,

D Damotte, JF Régnard; Centre Hospitalo-Universitaire Nice: P Hofman, V Hofman, J Mouroux;

Centre Hospitalo-Universitaire Saint-Louis, Paris: J Trédaniel; Centre Hospitalo-Universitaire

Strasbourg: M Beau-Faller, G Massard, A Neuville; Centre Hospitalo-Universitaire Tenon, Paris: M

Antoine, J Cadranel; Centre Hospitalo-Universitaire Toulouse: L Brouchet, J Mazières, I Rouquette;

DCom, Télécom ParisTech, Paris: R Fouret; Hôpital d’instruction des armées Percy, Clamart: P Saint-

Blancard, F Vaylet; Institut Gustave-Roussy, Villejuif: A Berhneim, P Dessen, F Dufour, N Dorvault, P

Fouret, B Job, L Lacroix, V Lazar, C Richon, V Roux, P Saulnier, JC Soria, E Taranchon, S Toujani, A

Valent; Institut Mutualiste Montsouris, Paris: P Girard, D Gossot, P Validire; Ligue Nationale Contre le

Cancer: J Laffaire, A de Reynès. We thank D Simon (Laboratoire Probabilités et Modèles Aléatoires,

Université Pierre et Marie Curie, Paris, France) for help in performing the bootstrap.

Grants:

To Pierre Fouret

- Institut National du Cancer (Programme National d’Excellence Spécialisé Poumon)

- Ligue Nationale Contre le Cancer (Programme Carte d’Identité des Tumeurs)

- Association pour la Recherche sur le Cancer (grant number SFI20101201740).

References

1. Subramanian J, Govindan R. Lung cancer in never smokers: a review. J. Clin. Oncol. 2007;25(5):561–570.

2. International Agency for Research on Cancer. Water, air, soil and food pollutants. In: Attributable Causes of Cancer in France in the Year 2000. Geneva: WHO Press; 2007. p. 97–102.

3. Sun S, Schiller JH, Gazdar AF. Lung cancer in never smokers--a different disease. Nat. Rev. Cancer 2007;7(10):778–790.

4. Shaw AT, Yeap BY, Mino-Kenudson M, Digumarthy SR, Costa DB, Heist RS, et al. Clinical features and outcome of patients with non-small-cell lung cancer who harbor EML4-ALK. J. Clin. Oncol. 2009;27(26):4247–4253.

5. Soda M, Choi YL, Enomoto M, Takada S, Yamashita Y, Ishikawa S, et al. Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. Nature 2007;448(7153):561–566.

6. Herbst RS, Heymach JV, Lippman SM. Lung cancer. N. Engl. J. Med. 2008;359(13):1367–1380.

7. Little CD, Nau MM, Carney DN, Gazdar AF, Minna JD. Amplification and expression of the c-myc oncogene in human lung cancer cell lines. Nature 1983;306(5939):194–196.

8. Weir BA, Woo MS, Getz G, Perner S, Ding L, Beroukhim R, et al. Characterizing the cancer genome in lung adenocarcinoma. Nature 2007;450(7171):893–898.

9. Wong MP, Fung L-F, Wang E, Chow W-S, Chiu S-W, Lam W-K, Ho K-K, Ma ESK, Wan TSK, Chung L-P. Chromosomal aberrations of primary lung adenocarcinomas in nonsmokers. Cancer 2003;97(5):1263–1270.

10. Hermanek P, Hutter R, Sobin L, Wagner G, Wittekind C, editors. TNM Atlas. Guide illustré de la classification TNM/pTNM des tumeurs malignes. 4th ed. Paris: Springer-Verlag France; 1998.

11. Travis WD, Brambilla E, Müller-Hermelink HK, Harris CC. International Agency for Research on Cancer. Pathology and genetics of tumours of the lung, pleura, thymus and heart. Lyon: IARC; 2004.

12. Travis WD, Brambilla E, Noguchi M, Nicholson AG, Geisinger KR, Yatabe Y, et al. International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society international multidisciplinary classification of lung adenocarcinoma. J. Thorac. Oncol. 2011;6(2):244–285.

13. Akavia UD, Litvin O, Kim J, Sanchez-Garcia F, Kotliar D, Causton HC, Pochanard P, Mozes E, Garraway LA, Pe’er D. An integrated approach to uncover drivers of cancer. Cell 2010;143(6):1005–1017.

14. Segal E, Shapira M, Regev A, Pe’er D, Botstein D, Koller D, et al. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat. Genet. 2003;34(2):166–176.

15. Lee S-I, Pe’er D, Dudley AM, Church GM, Koller D. Identifying regulatory mechanisms using individual variation reveals key role for chromatin modification. Proc. Natl. Acad. Sci. U.S.A. 2006;103(38):14062–14067.

16. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U.S.A. 2005;102(43):15545–15550.

17. Ding L, Getz G, Wheeler DA, Mardis ER, McLellan MD, Cibulskis K, et al. Somatic mutations affect key pathways in lung adenocarcinoma. Nature 2008;455(7216):1069–1075.

18. Shedden K, Taylor JMG, Enkemann SA, Tsao M-S, Yeatman TJ, Gerald WL, et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat. Med. 2008;14(8):822–827.

19. Ciró M, Prosperini E, Quarto M, Grazini U, Walfridsson J, McBlane F, et al. ATAD2 is a novel cofactor for MYC, overexpressed and amplified in aggressive tumors. Cancer Res. 2009;69(21):8491–8498.

20. Zou JX, Revenko AS, Li LB, Gemo AT, Chen H-W. ANCCA, an estrogen-regulated AAA+ ATPase coactivator for ERalpha, is required for coregulator occupancy and chromatin modification. Proc. Natl. Acad. Sci. U.S.A. 2007;104(46):18067–18072.

21. Zou JX, Guo L, Revenko AS, Tepper CG, Gemo AT, Kung H-J, et al. Androgen-induced coactivator ANCCA mediates specific androgen receptor signaling in prostate cancer. Cancer Res. 2009;69(8):3339–3346.

22. Hsia EY, Goodson ML, Zou JX, Privalsky ML, Chen H-W. Nuclear receptor coregulators as a new paradigm for therapeutic targeting. Adv. Drug Deliv. Rev. 2010;62(13):1227–1237.

23. Joukov V, Groen AC, Prokhorova T, Gerson R, White E, Rodriguez A, Walter JC, Livingston DM. The BRCA1/BARD1 heterodimer modulates ran-dependent mitotic spindle assembly. Cell 2006;127(3):539–552.

24. Kiyomitsu T, Obuse C, Yanagida M. Human Blinkin/AF15q14 is required for chromosome alignment and the mitotic checkpoint through direct interaction with Bub1 and BubR1. Dev. Cell 2007;13(5):663–676.

25. Kessler JD, Kahle KT, Sun T, Meerbrey KL, Schlabach MR, Schmitt EM, et al. A SUMOylation-dependent transcriptional subprogram is required for Myc-driven tumorigenesis. Science 2012;335(6066):348–353.

26. Miller VA, Kris MG, Shah N, Patel J, Azzoli C, Gomez J, et al. Bronchioloalveolar pathologic subtype and smoking history predict sensitivity to gefitinib in advanced non-small-cell lung cancer. J. Clin. Oncol. 2004;22(6):1103–1109.

27. Kalashnikova EV, Revenko AS, Gemo AT, Andrews NP, Tepper CG, Zou JX, et al. ANCCA/ATAD2 overexpression identifies breast cancer patients with poor prognosis, acting to drive proliferation and survival of triple-negative cells through control of B-Myb and EZH2. Cancer Res. 2010;70(22):9402–9412.

28. Murphy DJ, Junttila MR, Pouyet L, Karnezis A, Shchors K, Bui DA, et al. Distinct thresholds govern Myc’s biological output in vivo. Cancer Cell 2008;14(6):447–457.

29. Caron C, Lestrat C, Marsal S, Escoffier E, Curtet S, Virolle V, et al. Functional characterization of ATAD2 as a new cancer/testis factor and a predictor of poor prognosis in breast and lung cancers. Oncogene 2010;29(37):5171–5181.

30. Revenko AS, Kalashnikova EV, Gemo AT, Zou JX, Chen H-W. Chromatin loading of E2F-MLL complex by cancer-associated coregulator ANCCA via reading a specific histone mark. Mol. Cell. Biol. 2010;30(22):5260–5272.

Legends of figures 1 to 4

Figure 1. GISTIC 2.0 analysis of copy-number changes in 121 lung adenocarcinomas. Plots of the G-scores

(top) and q-values (bottom) with respect to amplifications (A) or deletions (B) over the entire

region analyzed. The significance level for the q-value is indicated by a vertical dotted line.

Chromosome positions are indicated along the y axis with centromere positions indicated by

horizontal dotted lines. The locations of the peak regions are indicated on the right of each panel.

Figure 2. Unsupervised analysis of gene expression in 103 lung adenocarcinomas from the LG cohort

and 11 normal lung samples from Asian female never smokers. In the heat map, each cell represents the

expression value for a probe in a sample. The largest expression values are in red, whereas the

lowest expression values are in green. The sample clusters shown at the top are colored in red or

blue for tumor samples and in green for normal lung samples, wherein the blue colored tumor

samples and the green colored normal lung samples are clustered together. Each box below the

sample clusters represents the value for a discrete clinicopathological annotation in a sample. A black

box denotes presence of a bronchiolo-alveolar component, ever smoker status, male sex, EGFR

mutation or KRAS mutation. The p-values associated with annotations are obtained by comparing the

two groups of tumor samples using chi-square tests. The gene clusters shown on the left of the heat

map are labeled ‘a’ to ‘j’.

Figure 3. Integrated analysis of paired copy-number and gene expression data in 80 lung

adenocarcinomas belonging to the LG cohort.

A. CONEXIC analysis. Genatomy module network view of CONEXIC module 62. Each row of the heat

map corresponds to the expression of a gene across the 80 samples. Gene names indicated at the

right of the heat map are sorted by alphabetical order. Sample names are indicated above the heat

map. Samples are ordered according to the regulatory programs found by CONEXIC and shown above

sample names. The modulators include ATAD2, TUBB3, SLC25A21 and KCNMB4, wherein ATAD2

increased expression at the first order split and at the right second order split is associated with

increased expression of genes in the module. Yellow dotted lines partition the samples according to

split values of the modulators.

B, C and D. Analysis of gene expression of the proliferative cluster (cluster f) in the LG cohort

according to ATAD2 expression and according to smoking status, 8q24.12 amplification or MYC

expression. Proliferative cluster genes are those included in the final set of genes after processing of

LG gene expression data during the CONEXIC procedure. Gene names indicated at the right of the

heat map are sorted by alphabetical order. B: Samples are sorted in ever (blue) or never smokers

(white), and then sorted within each smoking status category into four groups of increasing ATAD2

expression levels (from white to red). The four ATAD2 groups are sorted using the split values found

by CONEXIC in the analysis of the LG cohort. C: Samples are sorted according to 8q24.12

amplification, comparing amplification (blue) versus no amplification (white), then within each

category into four groups of increasing ATAD2 expression levels as above. D: Samples are sorted

according to increasing MYC expression levels using quartile values as split values (from light to dark

blue), then within each category into four groups of increasing ATAD2 expression levels as above.

Figure 4. Kaplan-Meier curves of overall survival rates according to ATAD2 levels in the LG and Shedden cohorts,

comparing high versus low ATAD2 levels. Shown is the log-rank p-value. A. LG cohort. B. Shedden

cohort.

Figure 1 (panels A and B)

Figure 2

Figure 3 (panels A, B, C and D)

Figure 4 (panels A and B). Y axis: survival rate. X axis: time (days) in panel A, time (months) in panel B. Curves: ATAD2 low and ATAD2 high. Log-rank p-value 0.14 in panel A and 0.002 in panel B.

Aberration      Cytoband   Gene name   Welch t-test p-value*
Amplification   8q24.12    ATAD2       1.1E-04
Amplification   8q24.12    DERL1       1.5E-04
Amplification   8q24.12    DSCC1       2.2E-05
Amplification   8q24.12    FAM83A      5.4E-08
Amplification   8q24.12    FAM91A1     1.5E-09
Amplification   8q24.12    KIAA0196    1.3E-05
Amplification   8q24.12    MRPL13      1.6E-05
Amplification   8q24.12    MYC         8.0E-05
Amplification   8q24.12    NDUFB9      2.3E-04
Amplification   8q24.12    NSMCE2      0.002
Amplification   8q24.12    RNF139      0.01
Amplification   8q24.12    SQLE        1.9E-04
Amplification   8q24.12    TATDN1      9.8E-05
Amplification   8q24.12    TMEM65      8.1E-06
Deletion        4q35.2     ACSL1       5.5E-06
Deletion        4q35.2     ANKRD37     0.002
Deletion        4q35.2     CCDC111     5.2E-04
Deletion        4q35.2     CDKN2AIP    2.2E-05
Deletion        4q35.2     CYP4V2      4.6E-04
Deletion        4q35.2     DCTD        8.0E-09
Deletion        4q35.2     F11         0.02
Deletion        4q35.2     FRG1        4.0E-04
Deletion        4q35.2     GALNT7      0.009
Deletion        4q35.2     GPM6A       0.03
Deletion        4q35.2     HMGB2       0.047
Deletion        4q35.2     HPGD        3.6E-07
Deletion        4q35.2     ING2        0.04
Deletion        4q35.2     IRF2        1.1E-04
Deletion        4q35.2     NEIL3       0.009
Deletion        4q35.2     RWDD4A      7.5E-08
Deletion        4q35.2     SNX25       1.7E-05
Deletion        4q35.2     SORBS2      0.01
Deletion        4q35.2     STOX2       0.04
Deletion        4q35.2     TLR3        1.6E-06
Deletion        4q35.2     UFSP2       3.5E-04

* Comparing amplification versus normal or deletion versus normal

Table 1. Association of gene expression with smoking status-related copy-number alterations in the LG cohort
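Table 1 relies on Welch's t-test, which compares two group means without assuming equal variances. A minimal sketch of the statistic and its Welch-Satterthwaite degrees of freedom follows. The expression values are invented for illustration, and the conversion to a p-value, which requires the Student t distribution, is left out.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom."""
    # Squared standard errors of the two group means (sample variances).
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical log2 expression of one gene (illustration only).
amplified = [9.1, 8.7, 9.4, 9.0, 8.8]
normal_copy = [7.9, 8.2, 7.6, 8.4, 8.0, 7.8, 8.1]
t_stat, df = welch_t(amplified, normal_copy)
```

The degrees of freedom are generally not an integer and fall between the smaller group size minus one and the pooled value, which is what makes the test robust to unequal variances and unequal group sizes.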

Robert Fouret, Julien Laffaire, Paul Hofman, et al. A comparative and integrative approach identifies ATPase family, AAA domain containing 2 as a likely driver of cell proliferation in lung adenocarcinoma. Clin Cancer Res. Published OnlineFirst August 22, 2012. doi: 10.1158/1078-0432.CCR-12-0505

Supplementary material: http://clincancerres.aacrjournals.org/content/suppl/2012/08/23/1078-0432.CCR-12-0505.DC1

Author manuscript: author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.
