Apache Spark is replacing MATLAB in the domain of computational neuroscience. MATLAB running on a single machine can’t keep up with the demands of neuroscience, a field that produces huge collections of images and time-series data sets.
Jeremy Freeman is a computational neuroscientist who is adopting Apache Spark to analyze these giant data sets, which do not fit on a single machine. But Apache Spark was not designed with neuroscience in mind. For this reason, Jeremy has helped to build several libraries on top of Spark. Thunder is a library for standard, distributed representation of data. Lightning is an API for reproducible web visualizations. These abstractions sit on top of Spark and add a layer of usability. As it turns out, solving these problems for neuroscience has produced tools that are useful in a variety of other domains. In our discussion with Jeremy Freeman, we talk about Apache Spark, neuroscience, and the technological and cultural problems faced by traditional academic research.