Staying Current in Bioinformatics & Genomics: 2017 Edition
Getting Genetics Done
by Stephen Turner
4y ago
A while back I wrote this post about how I stay current in bioinformatics & genomics. That was nearly five years ago. A lot has changed since then. A few links are dead. Some of the blogs or Twitter accounts I mentioned have shifted focus or haven’t been updated in years (guilty as charged). The way we consume media has evolved: Google thought they could kill off RSS (long live RSS!), there are many new literature alert services, preprints have really taken off in this field, and many more scientists are engaging via social media than before. People still frequently ask me how I stay cur...
RStudio Conference 2017 Recap
Getting Genetics Done
by Stephen Turner
4y ago
The first ever RStudio conference was held January 11-14, 2017 in Orlando, FL. For anyone else like me who spends hours each working day staring into an RStudio session, the conference was truly excellent. The speaker lineup was diverse and covered lots of areas related to development in R, including the tidyverse, the RStudio IDE, Shiny, htmlwidgets, and authoring with RMarkdown. This is not a complete list by any means; with split sessions I could only go to half the talks at most. Here are some noncomprehensive notes and links to slides and resources for some of the awesome things are do...
Primers in computational biology
Getting Genetics Done
by Stephen Turner
4y ago
I recently stumbled across this collection of computational biology primers in Nature Biotechnology. Many of these are old, but they're still great resources to get a fundamental understanding of the topic. Here they are in no particular order. ...

How does multiple testing correction work? http://www.nature.com/nbt/journal/v27/n12/full/nbt1209-1135.html
What is principal component analysis? http://www.nature.com/nbt/journal/v26/n3/full/nbt0308-303.html
SNP imputation in association studies http://www.nature.com/nbt/journal/v27/n4/full/nbt0409-349.html
How does gene expression clustering...
Syntax Highlight Code in Keynote or Powerpoint
Getting Genetics Done
by Stephen Turner
4y ago
I came across this awesome gist explaining how to syntax highlight code in Keynote. The same trick works for Powerpoint. Mac only. Install homebrew if you don’t have it already, then run brew install highlight. Run highlight -O rtf myfile.ext | pbcopy to convert the code to syntax-highlighted RTF and copy the result to the system clipboard. Paste into Keynote or Powerpoint. If I’ve got some code in a file called eset_pca.R, I can simply run highlight -O rtf eset_pca.R | pbcopy and then paste it right into Keynote or Powerpoint.
Covcalc: Shiny App for Calculating Coverage Depth or Read Counts for Sequencing Experiments
Getting Genetics Done
by Stephen Turner
4y ago
How many reads do I need? What's my sequencing depth? These are questions I get all the time. How much sequence data you need to hit a target depth of coverage and, inversely, what coverage depth you get from a set amount of sequencing are both easy to answer with some basic algebra. Given one or the other, plus the genome size and read length/configuration, you can calculate either. This was inspired by a similar calculator written by James Hadfield, and was an opportunity for me to create my first Shiny app. Check out the app here: http://apps.bioconnector.virginia.edu/c...
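The basic algebra mentioned above can be sketched in a few lines. This is a minimal Python illustration, not the Covcalc app's own code: it assumes the usual relationship depth = (number of reads × bases per read) / genome size, and the function names and the paired flag are hypothetical. With paired=True, n_reads counts read pairs, so each one contributes two reads' worth of bases.

```python
import math


def coverage_depth(n_reads, read_length, genome_size, paired=False):
    """Average depth of coverage for a given amount of sequencing.

    n_reads counts single reads, or read pairs when paired=True
    (an assumption made for this sketch).
    """
    bases_per_read = read_length * (2 if paired else 1)
    return n_reads * bases_per_read / genome_size


def reads_needed(target_depth, read_length, genome_size, paired=False):
    """Inverse calculation: reads (or read pairs) needed to reach target_depth."""
    bases_per_read = read_length * (2 if paired else 1)
    # Round up, since a fraction of a read cannot be sequenced.
    return math.ceil(target_depth * genome_size / bases_per_read)


# Example: 100 million 2x150 read pairs on a 3 Gb genome.
depth = coverage_depth(100_000_000, 150, 3_000_000_000, paired=True)

# Example: read pairs needed for 30x on the same genome.
pairs = reads_needed(30, 150, 3_000_000_000, paired=True)
```

The two functions are inverses of each other, which is why knowing either the read count or the target depth, plus genome size and read configuration, is enough to get the other.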
Shiny Developer Conference 2016 Recap
Getting Genetics Done
by Stephen Turner
4y ago
This is a guest post from VP Nagraj, a data scientist embedded within UVA’s Health Sciences Library, who runs our Data Analysis Support Hub (DASH) service. Last weekend I was fortunate enough to be able to participate in the first ever Shiny Developer Conference hosted by RStudio at Stanford University. I’ve built a handful of apps, and have taught an introductory workshop on Shiny. In spite of that, almost all of the presentations de-mystified at least one aspect of the how, why or so what of the framework. Here’s a recap of what resonated with me, as well as some code and links out to my...
Repel overlapping text labels in ggplot2
Getting Genetics Done
by Stephen Turner
4y ago
A while back I showed you how to make volcano plots in base R for visualizing gene expression results. This is just one of many genome-scale plots where you might want to show all individual results but highlight or call out important results by labeling them, for example, with a gene name. But if you want to annotate lots of points, the annotations usually get so crowded that they overlap one another and become illegible. There are ways around this, such as reducing the font size or adjusting the position or angle of the text, but these usually don’t completely solve the problem, and can even mak...
GRUPO: Shiny App For Benchmarking Pubmed Publication Output
Getting Genetics Done
by Stephen Turner
4y ago
This is a guest post from VP Nagraj, a data scientist embedded within UVA’s Health Sciences Library, who runs our Data Analysis Support Hub (DASH) service.

The What

GRUPO (Gauging Research University Publication Output) is a Shiny app that provides side-by-side benchmarking of American research university publication activity.

The How

The code behind the app is written in R, and leverages the NCBI Eutils API via the rentrez package interface. The methodology is fairly simple:

Build the search query in Pubmed syntax based on user input parameters.
Extract total number of articles from results...
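The first two steps of that methodology can be roughly sketched as follows. This is not GRUPO's code: the app is written in R and uses rentrez, whereas this is a Python sketch that builds a request for the NCBI Eutils ESearch endpoint directly. The function names, the choice of the [Affiliation] and [Date - Publication] field tags, and the sample response are assumptions made for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# NCBI Eutils ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(affiliation, year):
    """Build a Pubmed-syntax query from user inputs (hypothetical helper)."""
    return f'"{affiliation}"[Affiliation] AND {year}[Date - Publication]'


def esearch_count_url(term):
    """URL asking ESearch for only the number of matching articles."""
    params = {"db": "pubmed", "term": term, "rettype": "count"}
    return ESEARCH + "?" + urlencode(params)


def parse_count(xml_text):
    """Extract the total article count from an ESearch XML response."""
    return int(ET.fromstring(xml_text).findtext("Count"))


term = build_query("University of Virginia", 2015)
url = esearch_count_url(term)

# A made-up response body, only to show the parsing step without a network call.
sample = "<eSearchResult><Count>42</Count></eSearchResult>"
total = parse_count(sample)
```

Fetching the URL (for example with urllib.request.urlopen) and passing the response text to parse_count would complete the round trip; that step is left out here so the sketch runs offline.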
Tutorial: RNA-seq differential expression & pathway analysis with Sailfish, DESeq2, GAGE, and Pathview
Getting Genetics Done
by Stephen Turner
4y ago
Background

This tutorial shows an example of RNA-seq data analysis with DESeq2, followed by KEGG pathway analysis using GAGE. It uses data from GSE37704, with processed data available on Figshare (DOI: 10.6084/m9.figshare.1601975). This dataset has six samples from GSE37704, where expression was quantified by either: (A) mapping to GRCh38 using STAR then counting reads mapped to genes with featureCounts under the union-intersection model, or (B) alignment-free quantification using Sailfish, summarized at the gene level using the GRCh38 GTF file. Both datasets are restricted to protein-coding g...
Annotables: R data package for annotating/converting Gene IDs
Getting Genetics Done
by Stephen Turner
4y ago
I work with gene lists on a nearly daily basis. Lists of genes near ChIP-seq peaks, lists of genes closest to a GWAS hit, lists of differentially expressed genes or transcripts from an RNA-seq experiment, lists of genes involved in certain pathways, etc. And lots of times I’ll need to convert these gene IDs from one identifier to another. There’s no shortage of tools to do this. I use Ensembl Biomart. But I do this so often that I got tired of hammering Ensembl’s servers whenever I wanted to convert from Ensembl to Entrez gene IDs for pathway mapping, get the chromosomal location for some BED...