Table of Contents
- Open science
- Data Analysis
- Making figures nice
- Artificial vs. Biological neural nets
- Biophysical modeling
- Lit search
- Writing papers
- Giving talks
- Grant Writing
- Meta Neuro Papers
- Survival Guides
- Science Blogs
Pillars of open science
- Version control: Git and GitHub
- Programming: Python and R
- Data analysis: Jupyter Lab
- Documentation: Sphinx/Doxygen
- Software testing: PyTest
- Continuous integration: Travis, CircleCI
- Reproducible containers: Docker and Binder
Here is a Berkeley course which provides many tutorials for each of these pillars. https://berkeley-stat159-f17.github.io/stat159-f17/
For better or worse, you probably want a website at this stage of human evolution, where you can link to free PDFs of your manuscripts, reference code, publish datasets, point people to X project, etc. The fastest (<1 hr), simplest (5 steps), and most elegant way I have come across is GitHub Pages with Jekyll themes. The following steps walk you through hosting your new website on your GitHub account, which you will create in step 1 if you don't already have one. N.B. You do not need to install anything locally on your machine (and it is likely preferable not to), regardless of whether you are using Linux, OSX or Windows. The following steps are sufficient.
- Sign up for github if you do not have an account.
- Fork a jekyll repository to get an academic template on your account (e.g. fork this repo and your site will look like this).
- Rename the repository you just forked to [username].github.io (go to Settings in the upper right), where [username] is your username from step 1.
- Edit the data in the pages directory and the _config.yml file to suit your needs.
- Go to your website (which will live at https://[username].github.io).
It helps to start any new project with a standard project structure for others and your future self to follow along. Here is an example you might find useful, which can be set up with just a couple of commands. If you do not yet have Python installed, see the programming section.
pip install cookiecutter
cookiecutter https://github.com/drivendata/cookiecutter-data-science
Need a paper but behind a paywall? Try sci-hub
Download Atom. It is a very powerful (and free!) editor that integrates nicely with GitHub. Use it for writing text, markup, code, scripts, etc.
Make an "autopilot" script for your analyses, so that figures can be updated in real time. Then write a cron job to execute the analysis script so that newly collected data is automatically integrated perhaps with an email summarizing the results sent to you or your advisor. Example here.
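A minimal sketch of what such an autopilot script could look like. It assumes, purely for illustration, that each session is saved as a CSV file with a numeric `value` column inside a `data` folder; the folder name, column name, function names, and cron schedule are all hypothetical and are not taken from the linked example.

```python
"""Hypothetical autopilot script: re-run the analysis over all data collected so far.

Example crontab entry (an assumption; adjust the paths) to run it nightly at 02:00:
    0 2 * * * /usr/bin/python3 /path/to/autopilot.py
"""
import csv
import statistics
from pathlib import Path


def summarize_sessions(data_dir):
    """Return {file name: (n, mean)} for every CSV file in data_dir."""
    summary = {}
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with open(csv_path, newline="") as handle:
            values = [float(row["value"]) for row in csv.DictReader(handle)]
        if values:  # skip files that contain only a header
            summary[csv_path.name] = (len(values), statistics.mean(values))
    return summary


def write_report(summary, report_path):
    """Write a plain-text summary that could be emailed or saved next to the figures."""
    lines = [f"{name}: n={n}, mean={mean:.3f}" for name, (n, mean) in summary.items()]
    Path(report_path).write_text("\n".join(lines) + "\n")


# Only run when the (assumed) data folder exists next to the script.
if __name__ == "__main__" and Path("data").is_dir():
    write_report(summarize_sessions("data"), "summary.txt")
```

Because the script recomputes everything from the raw files on each run, newly collected sessions are picked up without any manual step. Figure generation and the email step would be added in the same place as `write_report`.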
Make a startup file for your jupyter notebooks that preloads modules like numpy and scipy and figure specifications so they are consistent and pub ready. The config file can specify font sizes, legends, color themes etc.
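One way to do this, sketched under a few assumptions: IPython runs every script placed in `~/.ipython/profile_default/startup/` when a kernel starts, so a file there can preload modules and set matplotlib defaults. The file name and the specific style values below are illustrative choices, not a recommended standard.

```python
# Hypothetical file: ~/.ipython/profile_default/startup/00-lab-defaults.py
# IPython (and therefore Jupyter's Python kernel) runs the scripts in that folder at startup.

# Figure settings chosen for illustration only; adjust them to your journal's requirements.
PUB_STYLE = {
    "font.size": 10,
    "axes.labelsize": 10,
    "axes.titlesize": 11,
    "legend.fontsize": 9,
    "legend.frameon": False,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "savefig.format": "svg",  # vector output by default
    "svg.fonttype": "none",   # keep text as editable text in the saved SVG
}

try:
    import numpy as np
    import scipy
    import matplotlib as mpl
    import matplotlib.pyplot as plt

    mpl.rcParams.update(PUB_STYLE)
except ImportError as err:
    # Keep the notebook usable even if part of the scientific stack is missing.
    print(f"Startup defaults not applied: {err}")
```

Keeping the settings in one dictionary makes it easy to share the same style across lab members and projects.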
Learning data science? Here is an extremely well curated series of quick references for data science in python (numpy, scipy, pandas), ML algorithms, probability and more
Diving into deep learning? Here is a similar reference for machine learning (ML) and deep learning.
Start using GitHub. It is excellent for version control and for sharing. Consider how many times you have written a script called analysis_v5_final_reallyfinal_thistime_final.py. With GitHub you will just have analysis.py, and other researchers can replicate exactly what you did. This will ultimately save you time, for example when someone emails you asking for your code.
If you write software for the use of the greater scientific community, it will be a lot easier for others to port your code and collaborate if you follow a standard set of guidelines when packaging your project for release (e.g. on github). Here is a template to follow written by Ariel Rokem.
A lot of open software that is developed for neuroscience runs on either Linux or OSX but not Windows. So consider installing Linux. Ubuntu is a popular distribution that has extensive support if you get stuck.
After installing Linux, learn the art of the command line
Do you use Matlab? It is worth considering a switch to Python. Python offers simpler syntax, enables system-wide interfacing, and is free and open source; for these reasons more and more scientists are using it. Replication is far easier with Python than Matlab.
Want to learn Python now?
- Start by installing Anaconda which is a scientific distribution of python that enables high performance computing and analysis.
- Everyone in our lab learned the syntax with Learn Python the Hard Way. 52 exercises spanning installing Python to building a web app.
- Here is a Python Bootcamp notebook that provides excellent advice on learning Python, written by Tom Donoghue.
- Read the style guide to write "pythonic" code.
- Package your python project with this amazing guide by Vicki Boykis
- Learn numpy (a package for scientific computing) with these 100 exercises written by Nicolas Rougier.
- Become a python data ninja. Thomas Wiecki provides a great introduction to data science in python.
Not sure how to code something? It may have an answer on Stack Overflow. Even professional programmers use Stack Overflow.
Learn how to simulate data to ensure that your analysis works the way you think it does.
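A small sketch of the idea: generate data from a model whose parameters you chose yourself, run your analysis on it, and check that the known parameters come back. The straight-line model, the function names, and the parameter values are assumptions made for this example.

```python
import random


def simulate_line(n, slope, intercept, noise_sd, seed=0):
    """Draw n points from y = intercept + slope * x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares estimate of (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


true_slope, true_intercept = 2.0, -1.0
xs, ys = simulate_line(1000, true_slope, true_intercept, noise_sd=1.0)
est_slope, est_intercept = fit_line(xs, ys)
print(f"true slope {true_slope}, estimated {est_slope:.3f}")
print(f"true intercept {true_intercept}, estimated {est_intercept:.3f}")
```

If the estimates do not land close to the true values, the problem is in the analysis code rather than in the data, which is much easier to debug before real recordings are involved.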
A basic understanding of data structures is useful for optimizing larger scale projects.
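As one illustration (with made-up cell IDs), checking membership in a list scans the list each time, while a set lookup takes roughly constant time on average. On large datasets this choice alone can change run time by orders of magnitude.

```python
def shared_ids_list(recorded, responsive):
    """Quadratic in the worst case: each 'in' scans the whole responsive list."""
    return [cell for cell in recorded if cell in responsive]


def shared_ids_set(recorded, responsive):
    """Roughly linear: set membership checks are constant time on average."""
    responsive_set = set(responsive)
    return [cell for cell in recorded if cell in responsive_set]


# Hypothetical IDs: every third recorded cell is marked as responsive.
recorded = list(range(0, 2000))
responsive = list(range(0, 2000, 3))

# Both versions return the same answer; only the cost differs.
assert shared_ids_list(recorded, responsive) == shared_ids_set(recorded, responsive)
```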
Need to sync files across your various lab computers/clusters and the laptop you use at home, and don't want to use Dropbox? Use rsync instead. For example:
rsync -zavr -e ssh --delete --include '*/' --include='*include_these_files.[ext]' --exclude='*' [local_dir] [remote_server]:[remote_dir]
Generating publication quality figures
Use your plotting software of choice (e.g. seaborn) to get your figure as close to final as possible. Avoid having to make post-edits in illustrator/inkscape which can be a huge time sink as a graduate student.
Carefully consider the colors and colormaps of your figures. How would color blind readers interpret your figures?
If you use Matlab, try out the gramm toolbox, inspired by R's ggplot2.
Have a look at the tutorials on flowingdata for excellent data visualization.
Save your figures in a vector format (SVG or EPS), not PNG.
Learning statistics or want to brush up? Here are three textbooks (available online) that you can choose from depending on the depth you want to explore and mathematical background.
- Introduction to Statistics -- covers the fundamentals, requires little mathematical background. Another great introduction is Statistics Without Tears by Rowntree.
- All of Statistics -- more detailed than above, requires calculus and linear algebra
- Advanced data analysis -- if you dream about distributions, requires substantial statistical background
If you are teaching statistics, here are excellent visualizations of core concepts.
See this tutorial on machine learning concepts.
Looking for a Bayesian analysis package? Try JASP.
Beware of p-values and null hypothesis significance testing (NHST), the de facto standard in neurobiology, cognitive neuroscience and much of biomedical research.
Do you have multi-level data? E.g. do you have some cells from one animal and some other cells from a different animal? Are you pooling the data because they have similar distributions/variance? Instead you might want to consider hierarchical aka mixed effect models. Here is a really beautiful demonstration of this concept.
As soon as possible, understand:
Do not let your test data into your training data (i.e. double dipping)
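A minimal sketch of keeping the two sets apart, assuming samples are addressed by integer index; the function name and the 80/20 split are illustrative choices.

```python
import random


def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices once and cut them into disjoint train and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split(100)

# Any step that looks at the data (feature or voxel selection, normalization
# constants, hyperparameter tuning) should use train_idx only.
assert set(train_idx).isdisjoint(test_idx)
print(len(train_idx), "training samples,", len(test_idx), "held-out samples")
```

The split itself is the easy part. Double dipping usually creeps in through preprocessing or selection steps that were run on the full dataset before the split was made.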
Rob Kass, @CMU statistics, has written the extremely useful Ten Simple Rules for Effective Statistical Practice.
Understand the bias/variance trade off
Do you know what the Chris Rock effect is in statistics? Expand your statistical lexicon here.
Are you still not using hierarchical Bayes? Thomas Wiecki will show you the way. Everything is a trivial case of hierarchical Bayesian inference.
Frequent datatau for interesting news on data analysis.
Your stats question may have an answer over here on cross-validated.
Wagenmakers' new blog: Bayesian Spectacles
Artificial vs. Biological Neural Networks
- If you are a CS person you may want to know the main theoretical and practical differences between biological and artificial neural networks. Here is an incredibly well put together summary from the Erlich lab.
Omit needless words, suppress the encyclopedic impulse, don't try to sound smart, and other sound advice from a mathematician.
Andrew Gelman provides some more general advice for academic writing here.
Publish your paper to one of the arXivs. If your PI doesn't support that, convince them.
If you are frustrated with writing, read this
Share your work with your friends as well as your enemies, the latter might give you even better criticism.
Steven Pinker has some interesting thoughts on how to make academic writing better
If you are struggling to write scientific papers in Word, e.g. embedding equations, consider using LaTeX (pronounced "Lay-Tech"). LaTeX allows you to focus on writing rather than formatting.
Check out 'The David Attenborough style of scientific presentations'. Give a talk by treating your work as a cool story that people will naturally be curious to hear.
You need to choose some medium of presenting your slides. It would be nice to always have access to them, to be able to share them with others who might not have your software (e.g. powerpoint) and to be easily viewable on mobile. Here is a cross-platform tool that meets those needs.
It is very challenging to give high-quality talks and everyone struggles with it. A lot of academics do not receive training on how to give talks and do not know the most effective ways of presenting information, but this has been studied. Here are some incredibly useful notes on how to prepare the actual content of the slides, and here are some notes on the speaking portion.
Here is one talk that might be a design inspiration. How Github uses Github to build Github
Know your neuroanatomy. Julian Caspers, a neuroradiologist, provided a great set of guidelines at the 2017 Organization for Human Brain Mapping conference. You also may find this interactive brain explorer useful.
Best practices for reporting fMRI studies.
Neuroimaging is easy to do wrong and still get a result. Here are common pitfalls to avoid when running your analysis.
It is absolutely critical to know what kind of power you have and what you can conclude from the kind of analysis that you are doing. Here is a useful guide.
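Power can also be estimated by simulation, which makes the concept concrete. The sketch below is a toy case, not an fMRI power analysis: it assumes a one-sample, two-sided z-test with a known standard deviation of 1, and the effect size and sample sizes are arbitrary illustrative values.

```python
import random
from statistics import NormalDist


def simulated_power(effect_size, n, alpha=0.05, n_sims=2000, seed=0):
    """Fraction of simulated experiments in which a two-sided z-test rejects the null.

    Each experiment draws n samples from Normal(effect_size, 1). The standard
    deviation is treated as known (1.0), which keeps the test a simple z-test.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        sample_mean = sum(rng.gauss(effect_size, 1.0) for _ in range(n)) / n
        z = sample_mean * n ** 0.5  # the standard error of the mean is 1 / sqrt(n)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


# Power grows with sample size for a fixed effect size.
for n in (10, 20, 40, 80):
    print(n, simulated_power(effect_size=0.5, n=n))
```

The same recipe (simulate the experiment many times, count how often the analysis detects the effect) extends to more realistic designs by swapping in your own data generator and test.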
Standardize your imaging data set using the BIDS format - this will make your data more accessible to both your collaborators and the field at large.
As a benchmark, you should be able to write down the general linear model you are using from scratch and solve it in closed form.
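For reference, a sketch of the closed-form solution under the simplest assumptions: a design matrix $X$ ($n$ time points by $p$ regressors) of full column rank and independent Gaussian noise. The symbols are notation chosen here, not taken from any particular package.

```latex
\begin{aligned}
y &= X\beta + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I) \\
\hat{\beta} &= \arg\min_{\beta} \lVert y - X\beta \rVert^2 = (X^\top X)^{-1} X^\top y \\
\hat{\sigma}^2 &= \frac{\lVert y - X\hat{\beta} \rVert^2}{n - p}, \qquad
t = \frac{c^\top \hat{\beta}}{\sqrt{\hat{\sigma}^2 \, c^\top (X^\top X)^{-1} c}}
\end{aligned}
```

Here $y$ is the time series of a single voxel and $c$ is a contrast vector. Real fMRI noise is temporally autocorrelated, so analysis packages typically prewhiten the data or otherwise model the noise covariance rather than using this form directly.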
Understand the difference between univariate and multivariate approaches to fMRI
You will need a visualization tool. A lot of labs have success with MRIcroGL or the Connectome Workbench. Recently James Gao has written an incredibly powerful new tool called PyCortex, which uses WebGL to render flat maps and fiducial surfaces in your browser; you can even project movies onto the surface.
Improve your understanding of anatomy with the web based user interface for exploring the human brain called Cortical Explorer.
Before you get really deep in your design, check out NeuroSynth (written by Tal Yarkoni) to run a meta analysis on your covariates of interest to see what has been done before.
- MNE-python is the go-to for source localization and sensor space data processing
- For localizing surface electrodes see this python toolbox
Analyzing ephys data
- Most ephys labs use in-house analysis routines in (sometimes) relatively closed source and (oftentimes) expensive applications. Pavan Ramkumar @KordingLab has written an excellent open source package for spike data analysis and visualization in Python.
Biophysical and molecular modeling
Check out CellBlender for visualization and simulation of realistic 3D cellular models.
Keep a digital lab notebook with Benchling, free for academics.
Try ApE for creating plasmid maps/visualization of restriction sites and planning experiments.
Recreate expensive hardware on the cheap with labrigger
Fiji is a free and easy to use image processor.
You will need a citation manager early on. PaperPile is a good one that is well integrated with Pubmed.
- Mendeley is a free alternative
Find articles on arXiv before they are officially published.
You can search the literature with Pubmed & Google-scholar. Now is a good time to make your own google scholar account if you don't have one. Also, stay on top of your favorite authors' publications with Google-scholar's alerts.
Before you start down some major project that you will be committed to for years, understand the current literature in your topic. Understand very clearly why you are going to do what you are going to do.
Be skeptical of authors' use of the word prediction. Often what they really mean is in-sample linear correlation, not what prediction actually means: out-of-sample generalization of a model. Here Tal Yarkoni provides some insights.
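A toy demonstration of the gap between in-sample fit and out-of-sample prediction. It uses a nearest-neighbor model and mean squared error rather than linear correlation, because a model that memorizes its training data makes the point with very little code; the data are pure noise generated for this example.

```python
import random


def nearest_neighbor_predict(train_x, train_y, x):
    """Predict with the y value of the closest training x (a model that memorizes)."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]


def mean_squared_error(train_x, train_y, xs, ys):
    """Mean squared error of the nearest-neighbor model on the points (xs, ys)."""
    errors = [(nearest_neighbor_predict(train_x, train_y, x) - y) ** 2
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)
# x and y are independent noise, so there is nothing real to predict.
train_x = [rng.random() for _ in range(200)]
train_y = [rng.gauss(0, 1) for _ in range(200)]
test_x = [rng.random() for _ in range(200)]
test_y = [rng.gauss(0, 1) for _ in range(200)]

in_sample = mean_squared_error(train_x, train_y, train_x, train_y)
out_of_sample = mean_squared_error(train_x, train_y, test_x, test_y)
print(f"in-sample MSE: {in_sample:.3f}, out-of-sample MSE: {out_of_sample:.3f}")
```

The model fits the training data perfectly even though there is no relationship between x and y, and it fails on new data. Only the held-out error says anything about prediction.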
- Be aware of what grants have been funded in your field, by searching nih reporter. This will tell you what was funded, the program officer, the PI, etc.
Twitter is a great resource for identifying new papers, events, tips etc.
- @KordingLab neural data science, computational modeling
- @StatModeling statistics, hierarchical bayesian modeling, R
- @thefreemanlab Director of computational biology at CZI
- @NKriegeskorte fMRI, RSA, deep learning applied to neuroimaging
- @jakevdp python, data analysis, astrophysics
- @fonnesbeck statistical analysis in python
- @bioRxiv Neuroscience the Rxiv for neuro
- @Neuro_Skeptic neuroscience in general
- @BLAMlab stroke recovery, motor control
- @diedrichsenlab computational neuroimaging
- @talyarkoni imaging, meta
- @flowingdata data visualization
- @tdverstynen cognitive neuroscience, theoretical neuroscience, imaging (DSI/fMRI)
- Neuroscience Needs Behavior
- Could a Neuroscientist Understand a Microprocessor? A careful evaluation of the tools that neuroscientists use to attempt to understand the brain. Also check out the talk based on this paper, covering unknown unknowns in neuroscience.
- Never show up empty handed to meetings with your PI.
- Have a clear objective to all meetings that everyone else knows as well.
- Be able to show some evidence of your productivity.
- You will have some days or weeks where nothing worked. I found that in those cases it is productive to have a "rainy day" folder containing interesting analyses/figures you have not yet shown.
- The Three Golden Rules for Successful Scientific Research
- How to pick a graduate advisor
- Learn how to learn with this coursera course
- Have a long look at this Survival Guide for Ph.D. students by Andrej Karpathy, CS Ph.D. and former director of AI at Tesla.
- Here is a course called personal finance for engineers from Stanford.
- Ronald Azuma's retrospective on graduate school
- Randy Pausch on time management
- Know when and when not to say no
- Find (neuro)hackathons in your area and go to them. You get to meet all kinds of people and produce something at an incredibly fast rate, thanks to the symbiosis of working on a team. Can be especially refreshing if you are spending years on your projects.
- Slides on making mistakes and being wrong.
- Guide on reading math.
Pillars of open science
- The three pillars of open science are open data, open code, and open papers.
- Mark Humphries will blow up your world at the Spike
Facts about Brains
Thanks to contributions from Ran Liu, Annie Homan, Rory Flemming, Daniel Borek, and Matt Boring for making this page more useful.