Multi-omic data integration

Integrate multi-omic cancer data representing gene expression, methylation, copy number alterations, somatic mutations, and microRNA expression

RNA-seq co-expression

Identify clusters of co-expressed genes from RNA-seq data

Network inference

Identify gene regulatory networks from transcriptomic data

Recent & Upcoming Talks

Exploring drivers of gene expression in The Cancer Genome Atlas
Dec 4, 2018 3:30 PM
Co-expression analyses of RNA-seq data in practice with the R/Bioconductor package coseq
Jun 22, 2018 4:15 PM
Exploring drivers of gene expression in The Cancer Genome Atlas
Mar 28, 2018 8:30 AM



  • Albert, I., Ancelet, S., David, O., Denis, J.-B., Makowski, D., Parent, É., Rau, A., and Soubeyrand, S. (2015). Initiation à la statistique bayésienne : Bases théoriques et applications en alimentation, environnement, épidémiologie et génétique. Éditions Ellipses, collection références sciences. Code Publisher link ProdInra


  • Rau, A. (2017) Statistical methods and software for the analysis of transcriptomic data. HDR (Habilitation à diriger des recherches) thesis, Université d’Évry Val-d’Essonne. PDF Slides ProdInra
    Note: HDR is a high-level (post-PhD) degree granted by French universities that provides an accreditation to supervise research.

  • Rau, A. (2010) Reverse engineering gene networks using genomic time-course data. PhD thesis, Purdue University. PDF ProdInra

Journal Articles


  • Plasterer, C., Tsaih, S.-W., Lemke, A., Schilling, R., Dwinell, M., Rau, A., Auer, P., Rui, H., Flister, M.J. (2019) Identification of a rat mammary tumor risk locus that is syntenic with the commonly amplified 8q12.1 and 8q22.1 regions in human breast cancer patients. G3: Genes|Genomes|Genetics (accepted).


  • Rau, A., Flister, M. J., Rui, H. and Livermore Auer, P. (2018) Exploring drivers of gene expression in The Cancer Genome Atlas. Bioinformatics, bty551, doi: Preprint PDF Shiny app Code

  • Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2018) Clustering transformed compositional data using K-means, with applications in gene expression and bicycle sharing system data. Journal of Applied Statistics, 46(1):47-65. Preprint PDF Code ProdInra

  • Rau, A. and Maugis-Rabusseau, C. (2018) Transformation and model choice for RNA-seq co-expression analysis. Briefings in Bioinformatics, bbw128, Preprint PDF Code ProdInra

  • Verrier, E., Genet, C., Laloë, D., Jaffrézic, F., Rau, A., Esquerre, D., Dechamp, N., Ciobotaru, C., Hervet, C., Krieg, F., Quillet, E., Boudinot, P. (2018) Genetic and transcriptomic analyses provide new insights on the early antiviral response to VHSV in resistant and susceptible rainbow trout. BMC Genomics, 19:482. PDF ProdInra

  • Maroilley, T., Berri, M., Lemonnier, G., Esquerré, D., Chevaleyre, C., Mélo, S., Meurens, F., Coville, J.L., Leplat, J.J., Rau, A., Bed’hom, B., Vincent-Naulleau, S., Mercat, M.J., Billon, Y., Lepage, P., Rogel-Gaillard, C., and Estellé, J. (2018). Immunome differences between porcine ileal and jejunal Peyer’s patches revealed by global transcriptome sequencing of gut-associated lymphoid tissues. Scientific Reports, 8:9077. PDF ProdInra

  • Mondet, F., Rau, A., Klopp, C., Rohmer, M., Severac, D., Le Conte, Y., and Alaux, C. (2018). Transcriptome profiling of the honeybee parasite Varroa destructor provides new biological insights into the mite adult life cycle. BMC Genomics, 19:328. PDF ProdInra

  • He, B., Tjhung, K., Bennett, N., Chou, Y., Rau, A., Huang, J., and Derda, R. (2018). Compositional bias in naïve and chemically-modified phage-displayed libraries uncovered by paired-end deep sequencing. Scientific Reports, 8:1214. PDF ProdInra


  • Monneret, G., Jaffrézic, F., Rau, A., Zerjal, T. and Nuel, G. (2017) Identification of marginal causal relationships in gene networks from observational and interventional expression data. PLoS One 12(3): e0171142. PDF Code ProdInra

  • Sauvage, C., Rau, A., Aichholz, C., Chadoeuf, J., Sarah, G., Ruiz, M., Santoni, S., Causse, M., David, J., Glémin, S. (2017) Domestication rewired gene expression and nucleotide diversity patterns in tomato. The Plant Journal 91(4):631-645. PDF ProdInra


  • Rigaill, G., Balzergue, S., Brunaud, V., Blondet, E., Rau, A., Rogier, O., Caius, J., Maugis-Rabusseau, C., Soubigou-Taconnat, L., Aubourg, S., Lurin, C., Martin-Magniette, M.-L., and Delannoy, E. (2016) Synthetic datasets for the identification of key ingredients for RNA-seq differential analysis. Briefings in Bioinformatics, doi: PDF ProdInra


  • Gallopin, M., Celeux, G., Jaffrézic, F., Rau, A. (2015) A model selection criterion for model-based clustering of annotated gene expression data. Statistical Applications in Genetics and Molecular Biology, 14(5): 413-428. PDF Code ProdInra

  • Monneret, G., Jaffrézic, F., Rau, A., Nuel, G. (2015) Estimation d’effets causaux dans les réseaux de régulation génique : vers la grande dimension. Revue d’intelligence artificielle, 29(2): 205-227. PDF Code ProdInra

  • Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux, G. (2015) Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9): 1420-1427. Preprint PDF Code ProdInra


  • Rau, A., Marot, G. and Jaffrézic, F. (2014) Differential meta-analysis of RNA-seq data from multiple studies. BMC Bioinformatics, 15:91. PDF Code ProdInra

  • Endale Ahanda, M.-L., Zerjal, T., Dhorne-Pollet, S., Rau, A., Cooksey, A., and Giuffra, E. (2014) Impact of the genetic background on the composition of the chicken plasma miRNome in response to a stress. PLoS One, 9(12): e114598. PDF Code ProdInra


  • Nuel, G., Rau, A., and Jaffrézic, F. (2013) Using pairwise ordering preferences to estimate causal effects in gene expression from a mixture of observational and intervention experiments. Quality Technology and Quantitative Management 11(1):23-37. PDF ProdInra

  • Rau, A., Jaffrézic, F., and Nuel, G. (2013) Joint estimation of causal effects from observational and intervention gene expression data. BMC Systems Biology 7:111. PDF Code ProdInra

  • Gallopin, M., Rau, A., and Jaffrézic, F. (2013). A hierarchical Poisson log-normal model for network inference from RNA sequencing data. PLoS One 8(10): e77503. PDF ProdInra

  • Rau, A., Gallopin, M., Celeux, G., and Jaffrézic, F. (2013). Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29(17): 2146-2152. PDF Code ProdInra

  • Dillies, M.-A.¹, Rau, A.¹, Aubert, J.¹, Hennequet-Antier, C.¹, Jeanmougin, M.¹, Servant, N.¹, Keime, C.¹, Marot, G., Castel, D., Estelle, J., Guernec, G., Jagla, B., Jouneau, L., Laloë, D., Le Gall, C., Schaëffer, B., Charif, D., Le Crom, S.¹, Guedj, M.¹, and Jaffrézic, F.¹ (2013). A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Briefings in Bioinformatics, doi:10.1093/bib/bbs046. PDF ProdInra
    ¹These authors contributed equally to this work.

  • Brenaut, P., Lefevre, L., Rau, A., Laloë, D., Pisoni, G., Moroni, P., Bevilacqua, C. and Martin, P. (2013) Contribution of mammary epithelial cells to the immune response during early stages of a bacterial infection to Staphylococcus aureus. Veterinary Research, 45:16. PDF ProdInra

2012 and before

  • Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2012). Reverse engineering gene regulatory networks using approximate Bayesian computation. Statistics and Computing, 22: 1257-1271. Preprint PDF ProdInra

  • Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2010). An empirical Bayesian method for estimating biological networks from temporal microarray data. Statistical Applications in Genetics and Molecular Biology: Vol. 9: Iss. 1, Article 9. PDF Code ProdInra

  • Furth, A., Mandrekar, S., Tan, A., Rau, A., Felten, S., Ames, M., Adjei, A., Erlichman, C. and Reid, J. (2008). A limited sample model to predict area under the drug concentration curve for 17-(allylamino)-17-demethoxygeldanamycin and its active metabolite 17-(amino)-17-demethoxygeldanomycin. Cancer Chemotherapy and Pharmacology, 61(1): 39-45.

Book chapters

  • Martin-Magniette, M.-L., Maugis-Rabusseau, C. and Rau, A. (2017) Clustering of co-expressed genes. In: Model Choice and Model Aggregation. Ed. F. Bertrand, J.-J. Droesbeke, G. Saporta, C. Thomas-Agnan. Publisher link HAL

Submitted and in preparation

  • Revilla, M., Rau, A., Crespo-Piazuelo, D., Ramayo-Caldas, Y., Estellé, J., INIA, Ballester, M., Folch, J. M. (2019) An integrative gene network analysis of the genetic determination of pig fatty acid composition based on adipose tissue RNA sequencing. Submitted.

  • Foissac, S., Djebali, S., Munyard, K., Villa-Vialaneix, N., Rau, A., Muret, K., Esquerre, D., Zytnicki, M., Derrien, T., Bardou, P., Blanc, F., Cabau, C., Crisci, E., Dhorne-Pollet, S., Drouet, F., Gonzales, I., Goubil, A., Lacroix-Lamande, S., Laurent, F., Marthey, S., Marti-Marimon, M., Momal-Leisenring, R., Mompart, F., Quere, P., Robelin, D., San Cristobal, M., Tosser-Klopp, G., Vincent-Naulleau, S., Fabre, S., Pinard-Van der Laan, M.-H., Klopp, C., Tixier-Boichard, M., Acloque, H., Lagarrigue, S., Giuffra, E. (2018) Livestock genome annotation: transcriptome and chromatin structure profiling in cattle, goat, chicken, and pig. bioRxiv, doi: Submitted. Preprint ProdInra

  • Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2018) Multi-view cluster aggregation and splitting, with an application to multi-omic breast cancer data. Submitted. Preprint Code

  • Tsaih, S.-W., Plasterer, C., Lemke, A., Ran, S., Rau, A., Auer, P., Rui, H. and Flister, M. J. (2018) Genetic mapping of pathophysiological modifiers in the breast tumor microenvironment. Submitted.

  • Jehl, F., Klopp, C., Brenet, M., Rau, A., Désert, C., Boutin, M., Leroux, S., Muret, K., Esquerré, D., Gourichon, D., Burlot, T., Pitel, F., Zerjal, T., Lagarrigue, S. (2018) Phenotype and multi-tissue transcriptome response to diet energy change in laying hens. In preparation.


Software

  • maskmeans: Multi-view aggregation/splitting K-means clustering algorithm.
  • Edge in TCGA: An R/Shiny interactive web application for the exploration of drivers of gene expression in The Cancer Genome Atlas.
  • coseq: Co-expression analysis of sequencing data.
  • ICAL: Model selection for model-based clustering of annotated data.
  • metaRNASeq: Meta-analysis of RNA-seq data.
  • HTSDiff: Differential analysis for RNA-seq data.
  • HTSFilter: Filter for replicated high-throughput sequencing data.
  • HTSCluster: Clustering high-throughput sequencing data with Poisson mixture models.
  • ebdbNet: Empirical Bayes estimation for dynamic Bayesian networks.

Advising & Teaching

I am an adjunct instructor for the following graduate course at the Medical College of Wisconsin in Spring 2019:

  • MCW Physiological Genomics: Bioinformatics module

I was a teaching instructor for the following course at the University of Wisconsin-Milwaukee in Spring 2018:

  • UWM PH718: Data management and visualization


  • Dr. Gilles Monneret (2014-2018 Ph.D.): “Estimation of causal effects in gene networks from observational and intervention data” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Raphaëlle Momal-Leisenring (2017 M2 internship): “Integrative statistical analysis of multi-omics data”
  • Frédéric Jehl (2017 M2 internship): “Impact of heat stress on liver and blood transcriptomes of laying hens” (co-supervision with Tatiana Zerjal)
  • Dr. Manuel Revilla Sanchez (2016 3-month Ph.D. Erasmus+ Learning Mobility): “An integrative gene network analysis of the genetic determination of pig fatty acid composition” (co-supervision with Jordi Estelle and Yuliaxis Ramayo Caldas)
  • Babacar Ciss (2016 M2 internship): “Constructing predictive models for ovine production data” (co-supervision with Eli Sellem, Allice)
  • Dr. Mélina Gallopin (2012-2015 Ph.D.): “Clustering and network inference for RNA-seq data” (co-supervision with Gilles Celeux and Florence Jaffrézic). Currently Assistant Professor (maître de conférences) at I2BC, Université Paris-Saclay.
  • Audrey Hulot (2015 M1 internship): “Incorporating a priori biological knowledge into gene network inference from observational and intervention gene expression data” (with Florence Jaffrézic)
  • Meriem Benabbas (2015 M1 internship): “Identifying differentially expressed genes from RNA-seq data using mixture models”
  • Rémi Bancal (2012 M2 internship): “Gene network estimation by adaptive knockout experiments” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Mélina Gallopin (2012 M2 internship): “Gene network inference from RNA sequencing expression data” (co-supervision with Gilles Celeux and Florence Jaffrézic)


Find my full CV in PDF here.


Like many (most?) users of the ggplot2 visualization package, I often find myself (re-)looking up how to do specific tasks. In an effort to streamline my Googling and avoid searching over and over again for solutions to the same issues, this post will gather together some of the assorted tips and tricks that I’ve recently looked up. Including an inset graph: I found this tip here, using the cowplot package.


The start of a new year is always a nice time to look back and take stock of the past year, and look forward and set some goals for the coming year. I spent the entirety of 2018 as an AgreenSkills+ Visiting Scholar at UWM in Milwaukee, Wisconsin, which has been (and continues to be!) a very rich experience that has given me the chance to broaden my understanding of statistical genetics and genomics and expand my skill set.


This is a short post to provide details on how I created the visual CV that is included on my homepage. I got the idea for doing this from a tweet from the awesome Mara Averick about an R package called VisualResume by Nathaniel Phillips: OMG, I love this! (I miss Breaking Bad so much) 📦 “VisualResume: An R package for creating a visual resume” by @YaRrrBook #rstats pic.


tl;dr: Use I() to treat a numeric variable in a data.frame “as is” and avoid unintended conversion when mapping to transparency in a ggplot2 aesthetic. Today I ran into a ggplot2 plotting problem involving mapping the transparency aesthetic to a numeric variable – this drove me crazy until I figured it out. Here’s the basic set-up: I wanted to plot a scatterplot of two variables, but have the transparency of the points be controlled by a third (numeric) variable.


I recently decided that I wanted to move my professional homepage from a free page set up on WordPress to GitHub Pages using blogdown by Yihui Xie. There were basically two reasons for this: (1) Because I only sprang for the free WordPress site, there are gigantic, ugly ads that appear on every single page. I only recently realized this, as I was usually viewing my WordPress site while logged in, and apparently the ads only appear for other people.