Getting Started¶

What is DeepForge?¶

Deep learning is a promising, yet complex, area of machine learning. This complexity can both create a barrier to entry for those wanting to get involved in deep learning as well as slow the development of those already comfortable in deep learning.

DeepForge is a development environment for deep learning focused on alleviating these problems. Leveraging principles from Model-Driven Engineering, DeepForge is able to reduce the complexity of using deep learning while providing an opportunity for integrating with other domain specific modeling environments created with WebGME.

Design Goals¶

As mentioned above, DeepForge focuses on two main goals:

Improving the efficiency of experienced data scientists/researchers in deep learning
Lowering the barrier to entry for newcomers to deep learning

It is important to highlight that although one of the goals is focused on lowering the barrier to entry, DeepForge is intended to be more than simply an educational tool; that is, it is important not to compromise on flexibility and effectiveness as a research/industry tool in order to provide an easier experience for beginners (that’s what forks are for!).

Overview and Features¶

DeepForge provides a collaborative, distributed development environment for deep learning. The development environment is a hybrid visual and textual programming environment. Higher levels of abstraction, such as creating architectures, use visual environments to capture the overall structure of the task while lower levels of abstraction, such as defining custom training functions, utilize text environments. DeepForge contains both a pipelining language and editor for defining the high level operations to perform while training or evaluating your models as well as a language for defining neural networks (through installing a DeepForge extension such as DeepForge-Keras).

Concepts and Terminology¶

Operation - essentially a function written in Python (such as training a model, visualizing results, etc)
Pipeline - directed acyclic graph composed of operations - e.g., a training pipeline may retrieve and normalize data, train an architecture and return the trained model
Execution - when a pipeline is run, an “execution” is created and reports the status of each operation as it is run (distributed over a number of worker machines)
Artifact - an artifact represents some data (either user uploaded or created during an execution)
Resource - a domain specific model (provided by a DeepForge extension) to be used by a pipeline such as a neural network architecture