A couple of weeks ago, as I was preparing to teach our incoming graduate students about data analysis, I ran across the following assertion in my notes: neuroscientists spend more time on data analysis than any other research activity.
I had zero proof for this, but it felt truthy.
So I did a little experiment. I asked my colleagues in the Center for Cognitive Neuroscience and the Department of Neurobiology to fill out a short, one-question survey:
By the way, that link is still up. If you haven’t taken the survey but work in neuroscience, go take it now.
Indeed, data analysis is the single most time-consuming activity in the research process.
This is based on about 95 responses.
As we see in the plot, the median time spent on data analysis, at 26%, just edges out data collection at 24%. These are also the two most variable allocations.
But there’s also a lot of variability overall. My intuition is that subfields like neuroimaging require less time to gather data (but more to analyze it) than, say, molecular neuroscience.
Even though I didn’t ask respondents to report their subfields, I was curious whether the data were multimodal, which would suggest clusters of responses, but the violin plot didn’t bear that out. However, it’s still interesting to ask how correlated the allocations were with one another:
| | Experimental Design | Piloting | Data Collection | Data Analysis | Writing Results |
|---|---|---|---|---|---|
| Piloting | 0.07 | | | | |
| Data Collection | -0.38 | -0.15 | | | |
| Data Analysis | -0.38 | -0.10 | -0.23 | | |
| Writing Results | -0.05 | -0.36 | -0.44 | -0.19 | |
| Review Process | 0.12 | -0.24 | -0.38 | -0.38 | 0.36 |
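Shares of a fixed total are negatively correlated even when nothing interesting is going on, so a table like this needs a baseline. Here is a minimal sketch of one, assuming every respondent's allocation is drawn from a uniform Dirichlet distribution over the six activities. The data are simulated, not the survey responses, and the function name and sample size are made up for illustration.

```python
import numpy as np

# The six activities from the survey, in table order.
ACTIVITIES = [
    "Experimental Design", "Piloting", "Data Collection",
    "Data Analysis", "Writing Results", "Review Process",
]

def simulated_allocation_correlations(n_respondents=50_000, seed=0):
    """Correlation matrix of time shares drawn from a uniform Dirichlet.

    Hypothetical helper: each simulated respondent splits their time
    across the activities with no preference for any of them.
    """
    rng = np.random.default_rng(seed)
    k = len(ACTIVITIES)
    # Each row is one simulated respondent; the entries of a row sum to 1.
    allocations = rng.dirichlet(np.ones(k), size=n_respondents)
    # rowvar=False treats columns (activities) as the variables.
    return np.corrcoef(allocations, rowvar=False)

corr = simulated_allocation_correlations()
k = len(ACTIVITIES)

# Average over all off-diagonal entries of the simulated matrix.
off_diag = corr[~np.eye(k, dtype=bool)]
print(f"simulated mean off-diagonal correlation: {off_diag.mean():.3f}")

# For exchangeable shares that sum to a constant, the pairwise
# Pearson correlation is exactly -1/(K-1).
print(f"analytic value -1/(K-1): {-1 / (k - 1):.3f}")
```

Entries in the observed table that sit well above or below this baseline are the ones worth a second look, with the caveat that real respondents are unlikely to be uniform over the six activities.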
So, even though these allocations are required to sum to 1, which by itself induces a negative correlation between them (roughly -0.20, i.e. -1/(K-1), under a uniform Dirichlet prior with K=6), we still notice a couple of interesting features: