About the DeLINEATE Toolbox   Download   Documentation   Contact/Contribute 

 

 

NEWS FLASH: New toolbox version (0.3b) released on 29 Aug 2018! Get it on the Download page.

 

Hi.

Welcome, new Internet friend. We're glad that you have found your way to the homepage of our little ol' toolbox. We call it the "Deep Learning In Neuroimaging: Exploration, Analysis, Tools, and Education" package, or DeLINEATE for short. The intent is to provide a set of tools that make it easier to use "deep" neural networks for data analysis in research -- although we also have support for PyMVPA, in order to facilitate more conventional multivariate pattern analyses (MVPA) and make it easier to compare "deep learning" approaches to conventional MVPA. And the primary intended use case is for analysis of neuroimaging datasets (e.g., fMRI, EEG, MEG), although there's nothing to stop people from using it for all kinds of other classification tasks and data types.

 

Snakes. Why did it have to be snakes?

This is a Python toolbox, so we hope you all like sneks. This decision is mainly based on the fact that our primary backend for deep learning, Keras, is a Python toolbox. Python is also the language of choice for PyMVPA and a bunch of other neat open-source data analysis stuff out there. So it makes sense, even though some of us (e.g. the guy writing this paragraph) have not fully drunk the Python Kool-Aid yet and maybe still believe in their heart of hearts that C is the One True Language and everyone else should just suck it up and learn that.

But I digress.

Anyway, we did agree that if we want to stay up to date with developments in the deep learning field, Python is probably the place to be. So to get started, you're going to need some kind of Python environment. This is easy on macOS and Linux, where it is generally pre-installed, but might take a little more fiddling around on Windows (KK note: Just install Anaconda, whatever your OS. Linux people would yell at you for doing work with the system Python anyway). You won't necessarily need to know how to write Python code to use this thing -- it does not require any programming per se to configure and run an analysis job -- but you will need to be able to run stuff in a Python environment.

For more info, read on to the "Design Philosophy" and "Getting Started" sections below, and/or see the documentation. At time of writing we are still fleshing out the documentation on the site, so you might want to download the release version and check out the README.md file there. Or if you don't want to commit to an actual download, just click through to the Bitbucket repository from the Download page and the contents of the README.md file should be what you see on the Overview page.

 

A quick note to our pals

So, full disclosure. We're pushing up against a deadline to get this toolbox into public release. While the toolbox itself is in pretty decent shape (though there are lots of tasks yet to do in order to make it super-great and not merely functional), it's likely going to take a bit longer than we have to get the rest of this website and documentation written. Thus, right now this website is kind of a promise of things-to-be-soon, and some of the content *might* be more placeholder than actual useful information.

If you happen to stumble upon this and want to try it out before we update all of this text with more useful content, just head to the Download and/or Contact/Contribute pages linked at the top, where you can find a release version to download and a link to the code repository on Bitbucket. Actually using the toolbox might take a little help from the dev team until the documentation is more fleshed out, so please feel free to get in touch and we can give you a hand. Your questions will also help inform which parts of the docs and website need fleshing out most.

 

Why bother with all this deep learning nonsense?

Good question. There are many different answers of varying quality and sincerity. Probably the best reason is that most people don't have any idea what it is but they know all the big cool tech companies are doing it, so you can get suckers to mentally file you next to Google. This makes them more likely to throw money at your project. It also gives you a license to say lots of extremely stupid and/or intentionally misleading things about artificial intelligence, if you're into that. We think this should be enough to satisfy most people.

However, we are not most people.

So if you are one of those poor souls with a sincere desire to extract signal from the sad and disgusting hairball of neuroimaging data, we might also be able to help you out a little. Detailed explanation of the approach's virtues relative to the forms of null-hypothesis significance testing commonly encountered in neuroscience would require us to first provide you with a correct understanding of null-hypothesis significance testing, which sounds tedious and mostly pointless. The super short version is that consistently classifying your cross-validated test set data with better than chance accuracy is a substantially more powerful and less error-prone way to answer questions that are morally identical to those addressed by your t-tests and ANOVAs. If that's what you're doing now, deep learning with cross-validation represents a strict upgrade for you.
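To make that concrete without dragging in any particular backend, here's a toy, self-contained sketch of the cross-validated-classification logic. The synthetic data and bare-bones nearest-centroid classifier below are stand-ins for illustration only -- in the real toolbox, the classifiers come from Keras or PyMVPA:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data: 100 "trials" x 20 "features",
# with a small mean shift between the classes.
X = rng.standard_normal((100, 20))
y = np.repeat([0, 1], 50)
X[y == 1] += 0.8  # class separation

def crossval_accuracy(X, y, n_folds=5):
    """Nearest-centroid classifier evaluated with k-fold cross-validation."""
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    accs = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        # One centroid per class, computed from training trials only
        centroids = np.stack(
            [X[train][y[train] == c].mean(axis=0) for c in (0, 1)]
        )
        # Assign each held-out trial to the nearest class centroid
        d = np.linalg.norm(X[fold][:, None, :] - centroids[None, :, :], axis=2)
        accs.append(np.mean(d.argmin(axis=1) == y[fold]))
    return float(np.mean(accs))

acc = crossval_accuracy(X, y)
print(f"cross-validated accuracy: {acc:.2f} (chance = 0.50)")
```

The point is the structure, not the classifier: every accuracy estimate comes from trials the model never saw during training, which is what makes "better than chance" a meaningful claim.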

If you are doing something substantially more sensible/Bayesian, your case is rare enough not to be covered by our landing page, and you should instead develop your own ideas and/or chat with us about how the techniques might be useful to you. The answer is likely to be exploratory in nature.

If you don't fall into any of these camps or you are still not convinced, we would direct you to any one of several recent opinion pieces by more prominent neuroscientists talking about how promising and important the techniques are. Unfortunately many of those papers are in the Journal of Neuroscience, famous for having the least functional website in all of academia (MJ note: wow, KK, that's a bold claim -- the competition is pretty stiff out there), so you're going to have to find them on your own.

Addendum: If the above aren't good enough for you, how about this for the pragmatists? If you have $1000 or so to shell out for a PC with a semi-decent NVIDIA GPU, and you are currently doing some kind of CPU-bound MVPA that you wish could run faster, and you don't particularly care about which specific classification algorithm you use... you could probably get at least as good performance out of a deep-learning classifier and run it anywhere from maybe 20-100x faster, depending on specific hardware and implementation details. We have heard the sages say that time is money, so if you want to spend a little bit of money and save a good chunk of time -- step right this way.

 

Design Philosophy

There are several principles that have guided our work thus far on the DeLINEATE toolbox. You may not agree with our choices in all cases, but in those cases, you are probably wrong.

Simplicity. We have tried to keep our code as simple as possible. This doesn't just mean fewer lines of code, although that's part of it. We have also tried to write code that is straightforward to read and figure out (with minimal syntactic weirdness). This sometimes means we end up writing MORE lines of code than if each line were a snarl of crazy tricks, but we would rather write something that someone with only medium familiarity with the Python language can easily understand. (It helps that most of us also have only medium familiarity with the Python language, so oftentimes we don't know enough l33t h4x0r tricks to write anything truly horrifying-looking.)

We also strive for simplicity in terms of the level of abstraction we target. We have opted to provide a thin layer of abstraction over our Keras and PyMVPA backends. Basically, we wanted SOMETHING to make it easy to batch up analyses without having to re-write a bunch of lines of very similar Python code each time, but we didn't want to create ANOTHER weird, sprawling hierarchy of objects and classes that people would have to learn -- what we mainly wanted to do was try to CONTAIN the weird, sprawling hierarchies of our backends into a manageable format. So, the result is that we have generalized the things we can generalize -- for example, the possible schemes for splitting your data into training, validation, and/or test subsets are pretty similar regardless of what kind of analysis you're doing, so we provide one-size-fits-all functions for that stuff. But for the implementation details of a particular analysis -- like a support vector machine (SVM) in PyMVPA or a convolutional neural network (CNN) in Keras -- we mostly just take in the parameters you'd normally give to PyMVPA or Keras and pass them straight along. The main difference is that you don't have to write any actual code to use our stuff, and sometimes we figure out some of the reasonable default parameters for you to streamline the process.
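As an illustration of the kind of one-size-fits-all splitting logic we mean, here's a minimal sketch; the function name and signature are made up for the example, not the toolbox's actual API:

```python
import numpy as np

def train_val_test_split(n_samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle sample indices and carve them into disjoint
    train/validation/test index arrays.

    Illustrative only -- the real toolbox exposes this kind of logic
    through its job-file options rather than a function you call directly.
    """
    idx = np.random.default_rng(seed).permutation(n_samples)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(100)
# The three subsets are disjoint and jointly cover every sample,
# regardless of whether the downstream model is an SVM or a CNN.
```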

Simplicity, to us, also means minimizing dependencies. It is very fashionable these days to just import code packages from all over the place so that you (the hypothetical programmer) have to write as little code as possible to get the job done. The problem with this approach is that it makes it very hard to write code that is generalizable and maintainable. Maybe some of the packages you used don't run well on another operating system that you never bothered to test. Maybe the user has a version of one of your dependencies that is too old or too new. There is a tendency in open-source software to pass the buck when something breaks -- if it's not MY code that is breaking on your system, but a bug in one of the packages I rely on, then it's not MY fault, is it? Except it is, partially, because I chose to write code that relied on something that turned out to be unreliable.

Now, you have to have dependencies somewhere. You can't exactly invent your own custom operating system and programming language for every project (believe us, we've looked into it). But you can minimize the NUMBER of dependencies and write code that assumes as little as possible about what exact version of things the end user has.

Transparency, ease-of-use. We group these together because sometimes they go along with each other and other times they conflict. When they conflict, transparency tends to win around here. Meaning that: When you design an analysis with DeLINEATE, we want to be sure that you are doing what you think you're doing. We want our file formats to be easy for humans to read, and if you know enough Python to look through our code, we want our code to be easy to take in at a glance.

With that said -- a lot of the analyses you might do with this toolbox are complex. We don't want to make things TOO easy on you by oversimplifying the situation, because then folks might assume something works one way and be very sad when they find out that it actually works another way. So we don't tend to hide a lot of parameters or provide super-lazy default options in an attempt to sweep all the complexity under the rug -- we'd rather put all our cards on the table. But we will try our best to make the writing on those cards nice and big and easy to read, so at least we aren't ADDING any more complexity or confusion than we absolutely have to.

Flexibility. This is a virtue unto itself, but it partly arises from our emphasis on simplicity. Specifically: The simplicity of our architecture makes it pretty straightforward to use this toolbox in two ways. Beginners can use it by configuring some relatively easy-to-read, text-based job files that don't require any Python coding; however, folks with more specialized needs can just use the toolbox as a code library and write their own Python to do all kinds of crazy things that are not currently possible with the basic job file format.
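To give a flavor of what we mean by a text-based job file, here's a hypothetical example. The field names and layout below are invented for illustration -- check the README/documentation for the actual job-file schema:

```json
{
  "data": {
    "file": "my_study.hdf5",
    "samples": "betas",
    "labels": "conditions"
  },
  "split": {
    "scheme": "k_fold",
    "n_folds": 5,
    "val_frac": 0.1
  },
  "model": {
    "backend": "keras",
    "layers": [
      {"type": "Dense", "units": 64, "activation": "relu"},
      {"type": "Dense", "units": 2, "activation": "softmax"}
    ]
  }
}
```

The idea is that everything a reviewer (or future-you) needs to know about the analysis sits in one small, human-readable file, with the model-specific parameters passed straight through to the backend.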

Throughout the toolbox, we have tried to keep the code pretty modular so that it is easy to write your own plugin-style functions for things like alternate data formats or different cross-validation schemes. Those more advanced abilities do require you to write your own Python code, but not very much of it. And, of course, we would be happy to accept any custom functions people write into the permanent code base, if you write anything that you think would be helpful to the more general community; check out the Contact/Contribute page for details.
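For instance, a plugin-style loader for an alternate data format usually boils down to a single function that returns samples and labels. The sketch below is hypothetical -- the real hook names live in the toolbox source -- but it shows roughly how little code is involved:

```python
import numpy as np

def load_csv_dataset(path):
    """Hypothetical plugin-style loader for a simple CSV data format.

    Assumes a CSV with a header row whose last column is the class label
    and whose remaining columns are features. Returns (samples, labels)
    as NumPy arrays, which is the general shape of what a custom reader
    needs to hand back.
    """
    raw = np.loadtxt(path, delimiter=",", skiprows=1)
    return raw[:, :-1], raw[:, -1].astype(int)
```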

 

Getting Started

For full details: See the README.md file in the code repository and the Documentation page on this site. (Note: These things are still currently works in progress but they are coming along rapidly. If you want to get started and there isn't enough documentation to get you where you need to be, get in touch with us -- we'll be happy to help out, as long as you promise to pay it forward by contributing what you can to the documentation effort.)

Minimum requirements: For bare minimum operation, all you should need is a computer that has some semi-recent version of Python on it (Python 2.7+ or any flavor of Python 3), plus the administrative privileges to install toolboxes and such as needed. Any major OS (Linux, Windows, Mac) should work. If all you are doing is using the PyMVPA backend to make PyMVPA analyses easier, read no further. However, if you actually want to do deep learning analyses in any kind of reasonable time frame:

Minimum requirements for deep learning if you are sane: Add to the above: Some kind of CUDA-capable NVIDIA-brand GPU (including the many manufacturers of NVIDIA-designed GPUs -- MSI, Gigabyte, EVGA, etc., are all fine). Most recent models should work; for good performance, you probably want something along the lines of a GeForce GTX 1060 or better. All analyses can run in CPU-based mode, but deep-learning analyses will be substantially sped up (on the order of 20x or more, depending on many small details) if you run them on a GPU. Unfortunately this rules out most recent Macs. Windows and Linux should both work. Of course we recommend Linux because *nix OSes are better in general for almost all types of scientific computing -- but if you MUST use Windows, we can make that work.
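If you're not sure whether your machine has a usable NVIDIA GPU, a quick sanity check (independent of the toolbox itself) is to ask nvidia-smi, the utility that ships with the NVIDIA drivers:

```python
import shutil
import subprocess

def nvidia_gpu_name():
    """Return the name of the first NVIDIA GPU reported by nvidia-smi,
    or None if nvidia-smi isn't available (e.g., no NVIDIA driver
    installed, or a non-NVIDIA machine)."""
    if shutil.which("nvidia-smi") is None:
        return None
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    names = out.stdout.strip().splitlines()
    return names[0] if names else None

print(nvidia_gpu_name() or "No NVIDIA GPU detected; expect CPU-only speeds.")
```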

Next steps: If you have your hardware, and you have Python on your computer, the next steps will basically be to:

  • Get current NVIDIA drivers, including drivers for CUDA operations
  • Also get NVIDIA's cuDNN library (requires joining their developer program, which is free)
  • Install the Python packages for PyMVPA and/or Keras (depending on whether you want to run traditional MVPA, deep learning, or both)
  • If using Keras / deep learning, edit a few configuration files
  • Download and install DeLINEATE (see the Download page for that)
  • Try running some of our sample analyses to test things out
  • Go to town on your own data/analyses!
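The steps above can be sketched as shell commands, roughly like this. Package names and the sample-analysis filename here are illustrative, so see the README for the exact commands the dev team recommends:

```shell
# 1-2. NVIDIA driver + CUDA + cuDNN: follow NVIDIA's installers/guides
#      for your OS (cuDNN requires the free developer-program signup).

# 3. Python packages (PyMVPA's pip package is pymvpa2; Keras runs on
#    top of a backend such as TensorFlow):
pip install pymvpa2
pip install tensorflow keras

# 5. DeLINEATE itself: grab the release archive from the Download page
#    (or clone the Bitbucket repository) and unpack it somewhere handy.

# 6. Smoke-test with one of the bundled sample analyses
#    (script and job-file names hypothetical):
python run_job.py sample_job.json
```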

Fair warning: Each of the steps above can be a little complicated / annoying, as anyone who has ever had to configure any kind of scientific computing environment can probably surmise. We have tried to make our toolbox the LEAST annoying step but there is still a lot of nonsense that goes along with this stuff. It's not too bad once you figure it all out the first time, but that first time can be a bit of a doozy. We are happy to help (again, with the request that in return, you contribute back some info to our documentation to help it grow and improve). So, see how far you can get with our documentation, and get in touch with us (via the Contact page) for more details.

 

Where We're Going...

We don't need... roads. Boy, Back to the Future was a great movie, huh?

The serious answer. There are NUMEROUS features we are working on adding to the toolbox in the near future. Probably the highest priority for now is improving this website and documentation. But there are a bunch more, from little stuff (e.g., functions for loading alternate data types, different cross-validation schemes) to slightly more ambitious stuff (e.g., adding a simple GUI) to big architectural changes (e.g., adding entirely new backends to supplement the current Keras/PyMVPA options). If you want to see our current wish list, check out the Issues page of our code repository (which is linked from the Download page). Feel free to make your own suggestions (either on the repository, or via contacting us) -- we are certainly open to user feedback with regard to which issues get tackled first!