Presentations
=============

- Variant calling and bcbio training for the `Harvard Chan Bioinformatics Core
  In Depth NGS Data Analysis Course
  <https://hbctraining.github.io/In-depth-NGS-Data-Analysis-Course/>`_
  (10 October 2018): `slides <https://github.com/chapmanb/bcbb/blob/master/talks/ngscourse2018_teaching/ngscourse2018_teaching.pdf>`_

- Building a diverse set of validations; lightning talk at `the GCCBOSC2018 Bioinformatics Community Conference
  <https://gccbosc2018.sched.com/>`_: `slides <https://github.com/chapmanb/bcbb/blob/master/talks/bosc2018_bcbio_validate/chapman_bcbio_validate.pdf>`_

- bcbio training at `the GCCBOSC2018 Bioinformatics Community Conference
  <https://gccbosc2018.sched.com/>`_, focusing on bcbio CWL integration with
  examples of variant calling analyses on Personal Genome Project data (26
  June 2018): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/bosc2018_bcbio_training/bosc2018_bcbio_training.pdf>`_;
  `video <https://www.youtube.com/watch?v=ukWhAetvNKE>`_

- Description of bcbio and Common Workflow Language integration with a focus on
  parallelization strategies. From a bcbio discussion with `Peter Park's lab at Harvard Medical School
  <https://compbio.hms.harvard.edu/index>`_ (26 January 2018): `slides <https://github.com/chapmanb/bcbb/blob/master/talks/park2018_bcbio/park2018_bcbio.pdf>`_

- In-depth description of bcbio and Common Workflow Language integration,
  including motivation and practical examples of running on clusters, DNAnexus,
  SevenBridges and Arvados. From the
  `Boston Bioinformatics Interest Group meeting <https://gist.github.com/chapmanb/8ee026fd85d07518570ac5a0cd7239f5>`_
  (2 November 2017):
  `slides <https://github.com/chapmanb/bcbb/blob/master/talks/big2017_bcbio_cwl/big2017_bcbio_cwl.pdf>`_;
  `video <https://youtu.be/nJEDS9Qol8M>`_

- bcbio practical interoperability with the Common Workflow Language at
  `BOSC 2017 <http://www.open-bio.org/wiki/BOSC_2017>`_
  (22 July 2017):
  `slides <https://github.com/chapmanb/bcbb/blob/master/talks/bosc2017_bcbio_interoperate/chapmanb_bcbio_interoperate.pdf>`_;
  `video <https://youtu.be/S7bu17GQHqk>`_

- :ref:`teaching` variant calling, bcbio and GATK4 validation at the `Summer 2017 NGS Data Analysis Course
  at Harvard Chan School <http://bioinformatics.sph.harvard.edu/training/>`_
  (6 July 2017): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/ngscourse2017_teaching/ngscourse2017_teaching.pdf>`_

- Training course for the `Cancer Genomics Cloud
  <http://www.cancergenomicscloud.org/>`_, describing how bcbio uses the Common
  Workflow Language to run in multiple infrastructures (1 May 2017): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/cgc2017_bcbio_cwl/cgc2017_bcbiocwl.pdf>`_

- `MIT Bioinformatics Interest Group
  <http://openwetware.org/wiki/BioMicroCenter:BIG_meeting#2016-2017_academic_year>`_
  about how Common Workflow Language
  `enables interoperability with multiple workflow engines <https://gist.github.com/chapmanb/f1ccdd2e2e23b0383b6e6857b59a431b>`_
  (3 November 2016): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/big2016_bcbio_cwl/big2016_bcbiocwl.pdf>`_
  and `video <https://youtu.be/375QSYmaidk>`_

- `Broad Institute <http://www.broadinstitute.org/>`_ software engineering
  seminar about bcbio validation and integration with Common Workflow Language
  and Workflow Definition Language (28 September 2016): `slides <https://github.com/chapmanb/bcbb/blob/master/talks/broad_engineering2016_bcbio/broad2016_bcbio.pdf>`_

- Materials from :ref:`teaching` at the `Summer 2016 NGS Data Analysis Course
  at Harvard Chan School <http://bioinformatics.sph.harvard.edu/training/>`_
  (11 August 2016): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/ngscourse2016b_teaching/ngscourse2016b_teaching.pdf>`_

- `Bioinformatics Open Source Conference (BOSC) 2016
  <http://www.open-bio.org/wiki/BOSC_2016>`_ lightning talk on bcbio and Common
  Workflow Language (8 July 2016): `slides
  <http://f1000research.com/slides/5-1639>`_ and `video <https://youtu.be/kMoAWjHhOVc>`_

- Materials from :ref:`teaching` from the `Spring 2016 NGS Data Analysis Course
  at Harvard Chan School
  <https://wiki.harvard.edu/confluence/display/hbctraining/NGS+Data+Analysis+Course+Application%2C+Spring+2016>`_
  (28 April 2016): `slides
  <https://github.com/chapmanb/bcbb/raw/master/talks/ngscourse2016_teaching/ngscourse2016_teaching.pdf>`_

- Statistical Genetics and Network Science Meeting at `Channing Division of
  Network Medicine
  <http://www.brighamandwomens.org/Research/depts/Medicine/Channing/default.aspx>`_
  (23 March 2016): `slides <https://github.com/chapmanb/bcbb/blob/master/talks/cdnm2016_bcbio/cdnm2016_bcbio.pdf>`_

- Presentation at `Curoverse <https://curoverse.com/>`_ Brown Bag Seminar on
  bcbio and in-progress integration work with `Common Workflow Language
  <http://www.commonwl.org/>`_ and `Arvados <https://arvados.org/>`_
  (11 January 2016):
  `slides <https://github.com/chapmanb/bcbb/blob/master/talks/curoverse2016bb_bcbio/curoverse2016bb_bcbio.pdf>`_

- Materials from :ref:`teaching` oriented example at Cold Spring Harbor
  Laboratory's `Advanced Sequencing Technology and Applications course
  <http://meetings.cshl.edu/courses.aspx?course=C-SEQTEC&year=15>`_
  (18 November 2015): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/cshl2015_bcbio/cshl2015_bcbio.pdf>`_

- Supporting the Common Workflow Language and Docker in bcbio at the
  `Bio in Docker symposium
  <http://core.brc.iop.kcl.ac.uk/events/compbio-docker-symposium-2015/>`_
  (9 November 2015): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/bioindocker2015_bcbio/chapman_bioindocker.pdf>`_

- Validation on human build 38, HLA typing, low frequency cancer calling and
  structural variation for `Boston Bioinformatics Interest Group (BIG) meeting
  <http://openwetware.org/wiki/BioMicroCenter:BIG_meeting>`_
  (5 November 2015):
  `slides <https://github.com/chapmanb/bcbb/blob/master/talks/big2015_bcbio/big2015_bcbio.pdf>`_

- Presentation on Research Scientist Careers for `Iowa State Bioinformatics
  Course <https://bcbio.las.iastate.edu/>`_ (23 September 2015): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/2015_iowast_career/chapman_career.pdf>`_

- Prioritization of structural variants based on known biological information at
  `BOSC 2015 <http://www.open-bio.org/wiki/BOSC_2015>`_ (10 July 2015): `slides
  <https://github.com/chapmanb/bcbb/blob/master/talks/bosc2015_bcbio_prioritize/bosc2015_bcbio_prioritize.pdf>`_;
  `video <https://www.youtube.com/watch?v=JZnF_6UnajY&feature=youtu.be>`_

- Overview of variant calling for `NGS Data Analysis Course at Harvard Medical School <https://wiki.harvard.edu/confluence/display/hbctraining/NGS+Data+Analysis+Course+Application%2C+Spring+2015>`_
  (19 May 2015): `slides <https://github.com/chapmanb/bcbb/blob/master/talks/ngscourse2015_teaching/variant_ngscourse.pdf>`_

- `NGS Glasgow <http://biotexcel.com/event/ngs-2015-glasgow/>`_ (23 April 2015):
  `slides <https://dl.dropboxusercontent.com/u/407047/Work/Presentations/20150420%20NGS%20Glasgow.pdf>`_

- `Boston Computational Biology and Bioinformatics meetup
  <http://www.meetup.com/Boston-Computational-Biology-and-Bioinformatics-Meetup/events/220328870/>`_
  (1 April 2015): `slides <https://github.com/chapmanb/bcbb/blob/master/talks/bcbb2015_bcbio/chapman_bcbio.pdf>`_

- `Program in Genetic Epidemiology and Statistical Genetics seminar series
  <http://www.hsph.harvard.edu/program-molecular-genetic-epidemiology/journal-club/>`_ at
  Harvard Chan School (6 February 2015): `slides <https://github.com/chapmanb/bcbb/raw/master/talks/pgsg2015_bcbio/chapman_bcbio.pdf>`_

- Talk at `Good Start Genetics <https://www.goodstartgenetics.com/>`_ (23
  January 2015): `slides <https://github.com/chapmanb/bcbb/raw/master/talks/gsg2015_bcbio_nextgen/chapman_bcbio.pdf>`_

- Boston area `Bioinformatics Interest Group <http://openwetware.org/wiki/BioMicroCenter:BIG_meeting>`_ (15 October 2014):
  `slides <https://github.com/chapmanb/bcbb/raw/master/talks/big2014_bcbio_val/chapman_bcbio.pdf>`_

- University of Georgia `Institute of Bioinformatics
  <http://iob.uga.edu/event/bioinformatics-seminar-12/>`_ (12 September 2014):
  `slides <https://github.com/chapmanb/bcbb/raw/master/talks/uga2014_bcbio_open/chapman_bcbio.pdf>`_

- Intel Life Sciences discussion (7 August 2014): `slides <https://github.com/chapmanb/bcbb/raw/master/talks/intel2014_bcbio/chapman_bcbio.pdf>`_

- Bioinformatics Open Source Conference (BOSC) 2014: `slides
  <https://github.com/chapmanb/bcbb/raw/master/talks/bosc2014_bcbio/chapman_bcbio.pdf>`_,
  `conference website <http://www.open-bio.org/wiki/BOSC_2014>`_

- Galaxy Community Conference 2014: `slides
  <https://github.com/chapmanb/bcbb/raw/master/talks/gcc2014_bcbio/chapman_bcbio.pdf>`_,
  `conference website <https://wiki.galaxyproject.org/Events/GCC2014>`_

- `bcbio hackathon at Biogen`_ (3 June 2014)

- `Harvard ABCD group slides`_ (17 April 2014)

- `BIG meeting`_ (February 2014)

- `Novartis slides`_ (21 January 2014)

- Mt Sinai: Strategies for accelerating the genomic sequencing pipeline: `Mt Sinai workshop slides`_,
  `Mt Sinai workshop website`_

- Genome Informatics 2013: `GI 2013 Presentation slides`_

- Bioinformatics Open Source Conference 2013: `BOSC 2013 Slides`_, `BOSC 2013
  Video`_, `BOSC 2013 Conference website`_

- Arvados Summit 2013: `Arvados Summit Slides`_, `Arvados Summit website`_

- Scientific Python 2013: `SciPy 2013 Video`_, `SciPy 2013 Conference website`_

Feel free to reuse any images or text from these talks. The `slides are on GitHub`_.

Abstract
~~~~~~~~

**Community Development of Validated Variant Calling Pipelines**

*Brad Chapman, Rory Kirchner, Oliver Hofmann and Winston Hide; Harvard
School of Public Health, Bioinformatics Core, Boston, MA, 02115*

Translational research relies on accurate identification of genomic
variants. However, rapidly changing best practice approaches in
alignment and variant calling, coupled with large data sizes, make it a
challenge to create reliable and reproducible variant calls. Coordinated
community development can help overcome these challenges by sharing
testing and updates across multiple groups. We describe bcbio-nextgen, a
distributed multi-architecture pipeline that automates variant calling,
validation and organization of results for query and visualization. It
creates an easily installable, reliable infrastructure from
best-practice open source tools with the following goals:

-  **Quantifiable:** Validates variant calls against known reference
   materials developed by the `Genome in a Bottle`_ consortium. The
   `bcbio.variation`_ toolkit automates scoring and assessment of calls
   to identify regressions in variant identification as calling
   pipelines evolve. Incorporation of multiple variant calling
   approaches from `Broad's GATK best practices`_ and the `Marth lab's
   gkno software`_ enables informed comparisons between current and
   future algorithms.

-  **Scalable:** bcbio-nextgen handles large population studies with
   hundreds of whole genome samples by parallelizing on a wide variety
   of schedulers and multicore machines, setting up different ad hoc
   cluster configurations for each workflow step. Work in progress
   includes integration with virtual environments, including `Amazon Web
   Services`_ and `OpenStack`_.

-  **Accessible:** Results automatically feed into tools for query and
   investigation of variants. The `GEMINI framework`_ provides a
   queryable database associating variants with a wide variety of genome
   annotations. The `o8`_ web-based tool visualizes the work of variant
   prioritization and assessment.

-  **Community developed:** bcbio-nextgen is widely used in multiple
   sequencing centers and research laboratories. We actively encourage
   contributors to the code base and make it easy to get started with a
   fully automated installer and updater that prepares all third party
   software and reference genomes.

Links from the presentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

-  `HugeSeq`_
-  `Genome Comparison & Analytic Testing`_ at Bioplanet
-  `Peter Block’s “Community” book`_
-  `CloudBioLinux`_ and `Homebrew Science`_ as installation frameworks;
   `Conda`_ as Python environment
-  bcbio `documentation`_ at ReadTheDocs
-  `Arvados framework`_ for metadata tracking, NGS processing and data
   provenance
-  Notes on `improved scaling for NGS workflows`_
-  Genomic Reference Materials from `Genome in a Bottle`_
-  Comparison of `aligners and callers`_ using NIST reference materials
-  Callers and `minimal BAM preparation workflows`_
-  `Coverage assessment`_

.. _BOSC 2013 Slides: http://chapmanb.github.io/bcbb/talks/bosc2013_bcbio_nextgen/chapmanb_bosc2013_bcbio.html#/
.. _BOSC 2013 Video: http://www.youtube.com/watch?v=dT5UEU0xF1Q
.. _BOSC 2013 Conference website: http://www.open-bio.org/wiki/BOSC_2013
.. _Arvados Summit Slides: https://github.com/chapmanb/bcbb/raw/master/talks/arvados2013_bcbio_nextgen/chapman_arvadossum_bcbio.pdf
.. _Arvados Summit website: https://arvados.org/projects/arvados/wiki/Arvados_Summit_-_Fall_2013
.. _SciPy 2013 Video: https://www.youtube.com/watch?v=qNMPh0pIpBE
.. _SciPy 2013 Conference website: https://conference.scipy.org/scipy2013/
.. _GI 2013 Presentation slides: https://dl.dropboxusercontent.com/u/407047/Work/Presentations/20131102%20CSHL%20Genome%20Informatics/20131101%20CSHL%20GI2013%20bcbio.pdf
.. _Genome in a Bottle: http://www.genomeinabottle.org/
.. _bcbio.variation: https://github.com/chapmanb/bcbio.variation
.. _Broad's GATK best practices: http://gatkforums.broadinstitute.org/discussion/1186/best-practice-variant-detection-with-the-gatk-v4-for-release-2-0
.. _Marth lab's gkno software: http://gkno.me/
.. _Amazon Web Services: https://aws.amazon.com/
.. _OpenStack: http://www.openstack.org/
.. _GEMINI framework: https://github.com/arq5x/gemini#readme
.. _o8: https://github.com/chapmanb/o8#readme
.. _HugeSeq: http://github.com/StanfordBioinformatics/HugeSeq
.. _Genome Comparison & Analytic Testing: http://www.bioplanet.com/gcat
.. _Peter Block’s “Community” book: http://www.amazon.com/Community-Structure-Belonging-Peter-Block/dp/1605092770
.. _CloudBioLinux: http://cloudbiolinux.org/
.. _Homebrew Science: https://github.com/Homebrew/homebrew-science
.. _Conda: http://www.continuum.io/blog/conda
.. _documentation: https://bcbio-nextgen.readthedocs.org
.. _Arvados framework: https://arvados.org/
.. _improved scaling for NGS workflows: http://bcb.io/2013/05/22/scaling-variant-detection-pipelines-for-whole-genome-sequencing-analysis/
.. _aligners and callers: http://bcb.io/2013/05/06/framework-for-evaluating-variant-detection-methods-comparison-of-aligners-and-callers/
.. _minimal BAM preparation workflows: http://bcb.io/2013/10/21/updated-comparison-of-variant-detection-methods-ensemble-freebayes-and-minimal-bam-preparation-pipelines/
.. _Coverage assessment: https://github.com/chapmanb/bcbio.coverage
.. _Mt Sinai workshop website: http://www.hpcwire.com/event/strategies-accelerating-genomic-sequencing-pipeline/
.. _Mt Sinai workshop slides: https://github.com/chapmanb/bcbb/raw/master/talks/mtsinai2013_bcbio_nextgen/chapman_mtsinai_bcbio.pdf
.. _Novartis slides: https://github.com/chapmanb/bcbb/raw/master/talks/novartis2014_bcbio_nextgen/chapman_bcbio.pdf
.. _BIG meeting: https://github.com/roryk/spliced-blog/blob/master/talks/BIG-meeting-feb-2014.pdf
.. _Harvard ABCD group slides: https://github.com/chapmanb/bcbb/raw/master/talks/abcd2014_bcbio_nextgen/chapman_bcbio.pdf
.. _bcbio hackathon at Biogen: https://github.com/chapmanb/bcbb/raw/master/talks/biogen2014_bcbio_nextgen/chapman_bcbio.pdf
.. _slides are on GitHub: https://github.com/chapmanb/bcbb/tree/master/talks
