From June 13th to 15th, NCBI will assist the University of California, Davis in hosting a biomedical data science hackathon at the School of Veterinary Medicine in Davis, CA, focusing on advanced bioinformatics analysis of next-generation sequencing data and metadata. This event is for students, postdocs, investigators and other researchers already engaged in the use of pipelines for genomic analyses from next-generation sequencing data or metadata.*
*Some projects are available to other non-scientific developers, mathematicians or librarians.
Researchers and data scientists from the West Coast of the United States are especially encouraged to apply, but the event is open to anyone who is selected for the hackathon and able to travel to Davis. Participants will be organized into five or six teams of 5-6 individuals each. These teams will build pipelines and tools to analyze large datasets within a cloud infrastructure. The potential subjects for this iteration are:
- Medical informatics
- Cancer immunogenicity
- Workflow languages
- Sequencing contamination
- Closing bacterial genomes
Please see the application for specific team projects.
After a brief organizational session, teams will spend three days analyzing a challenging set of scientific problems related to a group of datasets. Participants will analyze and combine datasets in order to work on these problems.
Datasets will come from the public repositories housed at the NCBI. During the event, participants will have an opportunity to include other datasets and tools for analysis. Please note that if you use your own data during the event, we ask that you submit it to a public database within six months of the end of the event.
All pipelines and other scripts, software and programs generated during this hackathon will be added to a public GitHub repository designed for that purpose. A manuscript outlining the design and usage of the software tools constructed by each team will be submitted to an appropriate journal.
To apply, complete this form (approximately 10 minutes to complete). Applications are due May 5, 2016 by 5PM Eastern. Participants will be selected from the pool of applicants based on the experience and motivation they describe on the form. Prior participants and applicants are especially encouraged to reapply.
Accepted applicants will be notified on May 9, 2016 by 2PM Eastern, and will have until May 12, 2016 at 9AM Eastern to confirm their participation. Please confirm only if it is highly likely that you can attend, as confirming and then not attending prevents other data scientists from taking part in this event. Please include a monitored email address in case there are follow-up questions.
Note: Participants will need to bring their own laptops to this event. A working knowledge of scripting (e.g., Shell, Python) is necessary to be successful in this event. Experience with higher-level scripting or programming languages may also be useful. Applicants must be willing to commit to all three days of the event. No financial support for travel, lodging or meals is available for this event. Also note that the event may extend into the evening hours on Monday and/or Tuesday. Please make any necessary arrangements to accommodate this possibility.