Applications in human, animal, and plant genomics

Multi-omic data integration

Methodological contributions to integrative multi-omic analyses

RNA-seq co-expression

Identify and visualize clusters of co-expressed genes from RNA-seq data

Network inference

Identify gene regulatory networks from transcriptomic data

Recent & Upcoming Talks

Happy 20th Birthday, R!
May 18, 2020 10:00 AM
Exploring drivers of gene expression in The Cancer Genome Atlas
Dec 4, 2018 3:30 PM
Co-expression analyses of RNA-seq data in practice with the R/Bioconductor package coseq
Jun 22, 2018 4:15 PM
Exploring drivers of gene expression in The Cancer Genome Atlas
Mar 28, 2018 8:30 AM



  • Albert, I., Ancelet, S., David, O., Denis, J.-B., Makowski, D., Parent, É., Rau, A., and Soubeyrand, S. (2015). Initiation à la statistique bayésienne : Bases théoriques et applications en alimentation, environnement, épidémiologie et génétique. Éditions Ellipses, collection références sciences. Code Publisher link ProdInra


  • Rau, A. (2017) Statistical methods and software for the analysis of transcriptomic data. HDR (Habilitation à diriger des recherches) thesis, Université d’Évry Val-d’Essonne. PDF Slides ProdInra
    Note: HDR is a high-level (post-PhD) degree granted by French universities that provides an accreditation to supervise research.

  • Rau, A. (2010) Reverse engineering gene networks using genomic time-course data. PhD thesis, Purdue University. PDF ProdInra

Journal Articles


  • Cho, Y., Rau, A., Reiner, A., Auer, P. L. (2020) Mendelian randomization analysis with survival outcomes. Genetic Epidemiology (accepted).

  • Sellem, E., Marthey, S., Rau, A., Jouneau, L., Bonnet, A., Perrier, J.-P., Fritz, S., Le Danvic, C., Boussaha, M., Kiefer, H., Jammes, H., Schibler, L. (2020) A comprehensive overview of bull sperm-borne small non-coding RNAs and their diversity in six breeds. Epigenetics and Chromatin 13:19. doi: 10.1186/s13072-020-00340-0.

  • Rau, A., Manansala, R., Flister, M. J., Rui, H., Jaffrézic, F., Laloë, D.¹, and Auer, P. L.¹ (2020) Individualized multi-omic pathway deviation scores using multiple factor analysis. Biostatistics (accepted). Preprint Code Software PDF
    ¹These authors contributed equally to this work.

  • Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2020) Multi-view cluster aggregation and splitting, with an application to multi-omic breast cancer data. Annals of Applied Statistics, 14:2, 752-767. Preprint Code PDF ProdInra


  • Jehl, F., Désert, C., Klopp, C., Brenet, M., Rau, A., Leroux, S., Boutin, M., Muret, K., Blum, Y., Esquerré, D., Gourichon, D., Burlot, T., Collin, A., Pitel, F., Benani, A., Zerjal, T., Lagarrigue, S. (2019) Chicken adaptive response to low energy diet: main role of the hypothalamic lipid metabolism revealed by a phenotypic and multi-tissue transcriptomic approach. BMC Genomics, 20. doi: 10.1186/s12864-019-6384-8.

  • Foissac, S., Djebali, S., Munyard, K., Villa-Vialaneix, N., Rau, A., Muret, K., Esquerre, D., Zytnicki, M., Derrien, T., Bardou, P., Blanc, F., Cabau, C., Crisci, E., Dhorne-Pollet, S., Drouet, F., Gonzales, I., Goubil, A., Lacroix-Lamande, S., Laurent, F., Marthey, S., Marti-Marimon, M., Momal-Leisenring, R., Mompart, F., Quere, P., Robelin, D., San Cristobal, M., Tosser-Klopp, G., Vincent-Naulleau, S., Fabre, S., Pinard-Van der Laan, M.-H., Klopp, C., Tixier-Boichard, M., Acloque, H., Lagarrigue, S., Giuffra, E. (2019) Multi-species annotation of transcriptome and chromatin structure in domesticated animals. BMC Biology 17: 108. Preprint PDF ProdInra Website

  • Dhara, S., Rau, A., Flister, M., Recka, N., Laiosa, M., Auer, P., and Udvadia, A. (2019) Cellular reprogramming for successful CNS axon regeneration is driven by a temporally changing cast of transcription factors. Scientific Reports 9:14198, doi: 10.1038/s41598-019-50485-6. Preprint Shiny app Code PDF ProdInra

  • Rau, A., Dhara, S., Udvadia, A., and Auer, P. (2019) Regeneration Rosetta: An interactive web application to explore regeneration-associated gene expression and chromatin accessibility. G3: Genes|Genomes|Genetics, doi: 10.1534/g3.119.400729.
    Shiny app Code PDF ProdInra

  • Plasterer, C., Tsaih, S.-W., Lemke, A., Schilling, R., Dwinell, M., Rau, A., Auer, P., Rui, H., Flister, M.J. (2019) Identification of a rat mammary tumor risk locus that is syntenic with the commonly amplified 8q12.1 and 8q22.1 regions in human breast cancer patients. G3: Genes|Genomes|Genetics 9(5):1739-1743. doi: 10.1534/g3.118.200873. PDF ProdInra

  • Ramayo-Caldas, Y., Zingaretti, L., Bernard, A., Estellé, J., Popova, M., Pons, N., Bellot, P., Mach, N., Rau, A., Roume, H., Perez-Enciso, M., Faverdin, P., Edouard, N., Dusko, S., Morgavi, D.P. and Renand, G. (2019) Identification of rumen microbial biomarkers linked to methane emission in Holstein dairy cows. Journal of Animal Breeding and Genetics, doi: 10.1111/jbg.12427. PDF ProdInra

  • Rau, A., Flister, M. J., Rui, H. and Livermore Auer, P. (2018) Exploring drivers of gene expression in The Cancer Genome Atlas. Bioinformatics, 35(1): 62-68. doi: Preprint PDF Shiny app Code ProdInra


  • Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2018) Clustering transformed compositional data using K-means, with applications in gene expression and bicycle sharing system data. Journal of Applied Statistics, 46(1):47-65. Preprint PDF Code ProdInra

  • Rau, A. and Maugis-Rabusseau, C. (2018) Transformation and model choice for RNA-seq co-expression analysis. Briefings in Bioinformatics, bbw128. Preprint PDF Code ProdInra

  • Verrier, E., Genet, C., Laloë, D., Jaffrézic, F., Rau, A., Esquerre, D., Dechamp, N., Ciobataru, C., Hervet, C., Krieg, F., Quillet, E., Boudinot, P. (2018) Genetic and transcriptomic analyses provide new insights on the early antiviral response to VHSV in resistant and susceptible rainbow trout. BMC Genomics, 19:482. PDF ProdInra

  • Maroilley, T., Berri, M., Lemonnier, G., Esquerré, D., Chevaleyre, C., Mélo, S., Meurens, F., Coville, J.L., Leplat, J.J., Rau, A., Bed’hom, B., Vincent-Naulleau, S., Mercat, M.J., Billon, Y., Lepage, P., Rogel-Gaillard, C., and Estellé, J. (2018). Immunome differences between porcine ileal and jejunal Peyer’s patches revealed by global transcriptome sequencing of gut-associated lymphoid tissues. Scientific Reports, 8:9077. PDF ProdInra

  • Mondet, F., Rau, A., Klopp, C., Rohmer, M., Severac, D., Le Conte, Y., and Alaux, C. (2018). Transcriptome profiling of the honeybee parasite Varroa destructor provides new biological insights into the mite adult life cycle. BMC Genomics, 19:328. PDF ProdInra

  • He, B., Tjhung, K., Bennett, N., Chou, Y., Rau, A., Huang, J., and Derda, R. (2018). Compositional bias in naïve and chemically-modified phage-displayed libraries uncovered by paired-end deep sequencing. Scientific Reports, 8:1214. PDF ProdInra


  • Monneret, G., Jaffrézic, F., Rau, A., Zerjal, T. and Nuel, G. (2017) Identification of marginal causal relationships in gene networks from observational and interventional expression data. PLoS One 12(3): e0171142. PDF Code ProdInra

  • Sauvage, C., Rau, A., Aichholz, C., Chadoeuf, J., Sarah, G., Ruiz, M., Santoni, S., Causse, M., David, J., Glémin, S. (2017) Domestication rewired gene expression and nucleotide diversity patterns in tomato. The Plant Journal 91(4):631-645. PDF ProdInra


  • Rigaill, G., Balzergue, S., Brunaud, V., Blondet, E., Rau, A., Rogier, O., Caius, J., Maugis-Rabusseau, C., Soubigou-Taconnat, L., Aubourg, S., Lurin, C., Martin-Magniette, M.-L., and Delannoy, E. (2016) Synthetic datasets for the identification of key ingredients for RNA-seq differential analysis. Briefings in Bioinformatics, doi: PDF ProdInra


  • Gallopin, M., Celeux, G., Jaffrézic, F., Rau, A. (2015) A model selection criterion for model-based clustering of annotated gene expression data. Statistical Applications in Genetics and Molecular Biology, 14(5): 413-428. PDF Code ProdInra

  • Monneret, G., Jaffrézic, F., Rau, A., Nuel, G. (2015) Estimation d’effets causaux dans les réseaux de régulation génique : vers la grande dimension. Revue d’intelligence artificielle, 29(2): 205-227. PDF Code ProdInra

  • Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux, G. (2015) Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9): 1420-1427. Preprint PDF Code ProdInra


  • Rau, A., Marot, G. and Jaffrézic, F. (2014) Differential meta-analysis of RNA-seq data from multiple studies. BMC Bioinformatics, 15:91. PDF Code ProdInra

  • Endale Ahanda, M.-L., Zerjal, T., Dhorne-Pollet, S., Rau, A., Cooksey, A., and Giuffra, E. (2014) Impact of the genetic background on the composition of the chicken plasma miRNome in response to a stress. PLoS One, 9(12): e114598. PDF Code ProdInra


  • Nuel, G., Rau, A., and Jaffrézic, F. (2013) Using pairwise ordering preferences to estimate causal effects in gene expression from a mixture of observational and intervention experiments. Quality Technology and Quantitative Management 11(1):23-37. PDF ProdInra

  • Rau, A., Jaffrézic, F., and Nuel, G. (2013) Joint estimation of causal effects from observational and intervention gene expression data. BMC Systems Biology 7:111. PDF Code ProdInra

  • Gallopin, M., Rau, A., and Jaffrézic, F. (2013). A hierarchical Poisson log-normal model for network inference from RNA sequencing data. PLoS One 8(10): e77503. PDF ProdInra

  • Rau, A., Gallopin, M., Celeux, G., and Jaffrézic, F. (2013). Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29(17): 2146-2152. PDF Code ProdInra

  • Dillies, M.-A.¹, Rau, A.¹, Aubert, J.¹, Hennequet-Antier, C.¹, Jeanmougin, M.¹, Servant, N.¹, Keime, C.¹, Marot, G., Castel, D., Estelle, J., Guernec, G., Jagla, B., Jouneau, L., Laloë, D., Le Gall, C., Schaëffer, B., Charif, D., Le Crom, S.¹, Guedj, M.¹, and Jaffrézic, F.¹ (2013). A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Briefings in Bioinformatics, doi:10.1093/bib/bbs046. PDF ProdInra
    ¹These authors contributed equally to this work.

  • Brenault, P., Lefevre, L., Rau, A., Laloë, D., Pisoni, G., Moroni, P., Bevilacqua, C. and Martin, P. (2013) Contribution of mammary epithelial cells to the immune response during early stages of a bacterial infection to Staphylococcus aureus. Veterinary Research, 45:16. PDF ProdInra

2012 and before

  • Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2012). Reverse engineering gene regulatory networks using approximate Bayesian computation. Statistics and Computing, 22: 1257-1271. Preprint PDF ProdInra

  • Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2010). An empirical Bayesian method for estimating biological networks from temporal microarray data. Statistical Applications in Genetics and Molecular Biology: Vol. 9: Iss. 1, Article 9. PDF Code ProdInra

  • Furth, A., Mandrekar, S., Tan, A., Rau, A., Felten, S., Ames, M., Adjei, A., Erlichman, C. and Reid, J. (2008). A limited sample model to predict area under the drug concentration curve for 17-(allylamino)-17-demethoxygeldanamycin and its active metabolite 17-(amino)-17-demethoxygeldanamycin. Cancer Chemotherapy and Pharmacology, 61(1): 39-45. ProdInra

Book chapters

  • Martin-Magniette, M.-L., Maugis-Rabusseau, C. and Rau, A. (2017) Clustering of co-expressed genes. In: Model Choice and Model Aggregation. Ed. F. Bertrand, J.-J. Droesbeke, G. Saporta, C. Thomas-Agnan. Publisher link HAL ProdInra

Submitted works

  • Devogel, N., Auer, P. L., Manansala, R., Rau, A., and Wang, T. (2020) A unified linear mixed model for familial relatedness and population structure in genetic association studies. Submitted.

  • Cazals, A., Estellé, J., Bruneau, N., Coville, J.-L., Menanteau, P., Rossignol, M.-N., Jardet, D., Bevilacqua, C., Rau, A., Bed’Hom, B., Velge, P., and Calenge, F. (2020) Impact of host genetics on caecal microbiota composition and on Salmonella carriage in chicken. Submitted.

  • Mollandin, F., Rau, A., and Croiseau, P. (2020) An evaluation of the interpretability and predictive performance of the BayesR model for genomic prediction. Submitted.


  • padma: Pathway deviation scores using multiple factor analysis
  • Invest Astuces: An R/Shiny interactive web application for financial and real estate loan simulations
  • Regeneration Rosetta: An R/Shiny interactive web application to explore regeneration-associated gene expression and chromatin accessibility
  • maskmeans: Multi-view aggregation/splitting K-means clustering algorithm.
  • Edge in TCGA: An R/Shiny interactive web application for the exploration of drivers of gene expression in The Cancer Genome Atlas.
  • coseq: Co-expression analysis of sequencing data.
  • ICAL: Model selection for model based clustering of annotated data.
  • metaRNASeq: Meta-analysis of RNA-seq data.
  • HTSDiff: Differential analysis for RNA-seq data.
  • HTSFilter: Filter for replicated high-throughput sequencing data.
  • HTSCluster: Clustering high-throughput sequencing data with Poisson mixture models.
  • ebdbNet: Empirical Bayes estimation for dynamic Bayesian networks.

Advising & Teaching

  • Fanny Mollandin (2019-2022 Ph.D.): “Incorporating known functional annotations into Bayesian genomic prediction models” (co-supervision with Pascal Croiseau, co-funding from EU Horizon 2020 RIA grant GENE-SWitCH)


  • Dr. Gilles Monneret (2014-2018 Ph.D.): “Estimation of causal effects in gene networks from observational and intervention data” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Raphaëlle Momal-Leisenring (2017 M2 internship): “Integrative statistical analysis of multi-omics data”
  • Frédéric Jehl (2017 M2 internship): “Impact of heat stress on liver and blood transcriptomes of laying hens” (co-supervision with Tatiana Zerjal)
  • Dr. Manuel Revilla Sanchez (2016 3-month Ph.D. Erasmus+ Learning Mobility): “An integrative gene network analysis of the genetic determination of pig fatty acid composition” (co-supervision with Jordi Estelle and Yuliaxis Ramayo Caldas)
  • Babacar Ciss (2016 M2 internship): “Constructing predictive models for ovine production data” (co-supervision with Eli Sellem, Allice)
  • Dr. Mélina Gallopin (2012-2015 Ph.D.): “Clustering and network inference for RNA-seq data” (co-supervision with Gilles Celeux and Florence Jaffrézic). Currently Assistant Professor (maître de conférences) at I2BC, Université Paris-Saclay.
  • Audrey Hulot (2015 M1 internship): “Incorporating a priori biological knowledge into gene network inference from observational and intervention gene expression data” (with Florence Jaffrézic)
  • Meriem Benabbas (2015 M1 internship): “Identifying differentially expressed genes from RNA-seq data using mixture models”
  • Rémi Bancal (2012 M2 internship): “Gene network estimation by adaptive knockout experiments” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Mélina Gallopin (2012 M2 internship): “Gene network inference from RNA sequencing expression data” (co-supervision with Gilles Celeux and Florence Jaffrézic)

We are offering two Master’s level internships in early 2021 for candidates with a biostatistics/bioinformatics background. Please get in touch if you’re interested in either of these opportunities!

  • A 3-month M1 internship on “Knowledge transfer using multivariate gene expression projections onto a large-scale reference database” (location: Inrae Hauts-de-France research center in Estrées-Mons; full description here).
  • A 6-month M2 internship on “Improving and extending functionality for multi-omic outlier detection software” (location: Inrae Ile-de-France research center in Jouy-en-Josas or Inrae Hauts-de-France research center in Estrées-Mons; full description here).


Well, here I am again, exactly one year after my last year-in-review post (apparently January 3 is peak procrastination time when returning from the holiday break?). The past year represented a whirlwind of change for me and my family. In the first half of 2019, I wrapped up the final few months of my AgreenSkills+ sabbatical stay as a Visiting Scholar at UWM in Milwaukee, Wisconsin. My time at UWM was exciting, rich in new and continued collaborations, and a wonderful way to expand my research horizons while delving into the world of genomics applied to human health.


Like many (most?) users of the ggplot2 visualization package, I often find myself (re-)looking up how to do specific tasks. In an effort to streamline my Googling and avoid searching over and over again for solutions to the same issues, this post will gather together some of the assorted tips and tricks that I’ve recently looked up. Including an inset graph: I found this tip here, using the cowplot package.


The start of a new year is always a nice time to look back and take stock of the past year, and look forward and set some goals for the coming year. I spent the entirety of 2018 as an AgreenSkills+ Visiting Scholar at UWM in Milwaukee, Wisconsin, which has been (and continues to be!) a very rich experience that has given me the chance to broaden my understanding of statistical genetics and genomics and expand my skill set.


This is a short post to provide details on how I created the visual CV that is included on my homepage. I got the idea for doing this from a tweet from the awesome Mara Averick about an R package called VisualResume by Nathaniel Phillips: OMG, I love this! (I miss Breaking Bad so much) 📦 “VisualResume: An R package for creating a visual resume” by @YaRrrBook #rstats pic.



Find my full CV in PDF here.