In this article you will learn about three ways to create deep neural network models using Keras and TensorFlow 2.0. You will learn about their advantages and limitations, and see examples of how each method is used.
Keras and TensorFlow 2.0 provide three ways to build deep neural network architectures:
- Sequential API
- Functional API
- Subclassing API
The Sequential API is the simplest way to create deep neural networks. As the name suggests, it lets us build neural networks composed of a single stack of layers connected sequentially.
Below you can see how to create a simple neural network architecture using the Sequential API for the well-known Fashion MNIST problem:
from tensorflow import keras

model = keras.models.Sequential()  # layers are then added to this model one by one
With the Sequential API, we add layers sequentially, step by step. The first layer is a Flatten layer that converts each input image into a 1D array. This layer has no parameters; its only purpose is simple preprocessing. Because Flatten is the first layer, we need to specify the input_shape, which for Fashion MNIST is input_shape=[28, 28]. The next two layers are Dense hidden layers with 300 and 200 neurons, respectively, both using the ReLU activation function. Finally, we add a Dense output layer with 10 neurons (one per class), this time with the softmax activation function because the classes are exclusive (the ten classes in the Fashion MNIST dataset are different types of clothing).
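The layer stack described in the previous paragraph can be sketched as follows. This is a minimal reconstruction from the description, assuming TensorFlow 2.x with its bundled Keras imported as `from tensorflow import keras`; only the layer types, sizes and activations named in the text are used.

```python
from tensorflow import keras

model = keras.models.Sequential()
# Flatten turns each 28x28 image into a 784-element vector (no trainable parameters).
model.add(keras.layers.Flatten(input_shape=[28, 28]))
# Two hidden layers with ReLU activation.
model.add(keras.layers.Dense(300, activation="relu"))
model.add(keras.layers.Dense(200, activation="relu"))
# Output layer: one neuron per Fashion MNIST class, softmax for exclusive classes.
model.add(keras.layers.Dense(10, activation="softmax"))

model.summary()
```

Calling `model.summary()` lists the layers in order together with their parameter counts, which is a quick way to check that the stack matches the description.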
However, while sequential models are very common, it is sometimes useful or even necessary to build neural networks with more complex topologies, or with multiple inputs or outputs. For this purpose, Keras provides a Functional API.
The Functional API makes the following possible:
- Easily sharing layers inside the architecture
- Creating more complex models
- Designing directed acyclic graphs
- Having multiple inputs and multiple outputs
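As a rough illustration of the points in this list, the sketch below builds a small graph with two inputs, one shared layer and two outputs. All names and sizes here (`left_input`, `right_input`, `shared_dense`, 8 features, 16 units) are hypothetical and chosen only for the example; it assumes TensorFlow 2.x.

```python
from tensorflow import keras

# Two hypothetical inputs with the same number of features.
input_left = keras.layers.Input(shape=[8], name="left_input")
input_right = keras.layers.Input(shape=[8], name="right_input")

# A single Dense layer instance called twice: both branches share its weights.
shared_dense = keras.layers.Dense(16, activation="relu", name="shared_dense")
encoded_left = shared_dense(input_left)
encoded_right = shared_dense(input_right)

# The two branches merge, forming a directed acyclic graph rather than a stack.
concat = keras.layers.concatenate([encoded_left, encoded_right])
main_output = keras.layers.Dense(1, name="main_output")(concat)
# A second output taken from the left branch only.
aux_output = keras.layers.Dense(1, name="aux_output")(encoded_left)

model = keras.Model(inputs=[input_left, input_right],
                    outputs=[main_output, aux_output])
```

Because `shared_dense` is one layer object, its weights are counted once even though it appears on both branches.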
It is worth emphasizing that any sequential model can also be created using the Functional API.
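For instance, the Fashion MNIST stack from the Sequential section (Flatten, two Dense hidden layers with 300 and 200 neurons, and a 10-neuron softmax output) could be written with the Functional API roughly like this. The variable names are illustrative assumptions; it assumes TensorFlow 2.x.

```python
from tensorflow import keras

# An explicit Input object replaces the input_shape argument of the first layer.
inputs = keras.layers.Input(shape=[28, 28])
x = keras.layers.Flatten()(inputs)
x = keras.layers.Dense(300, activation="relu")(x)
x = keras.layers.Dense(200, activation="relu")(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)

functional_model = keras.Model(inputs=inputs, outputs=outputs)
```

The resulting model has the same layers and the same number of parameters as the Sequential version; only the way the graph is declared differs.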
A Wide and Deep neural network is one example of a non-sequential architecture. It can learn deep patterns through the deep path and simple rules through the short (wide) path. You may also want to send two subsets of the features, possibly overlapping, through different paths: one through the short path and the other through the deep path. As you can see in the picture below, the features reach the output through both a short path and a deep path. In contrast, a regular sequential MLP forces all the data to flow through the full stack of layers.
Suppose you want to send five features through the wide path and six features through the deep path. Below is a possible architecture for the example shown in the picture, built with the Functional API:
input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
model = keras.Model(inputs=[input_A, input_B], outputs=[output])
The code is mostly self-explanatory, except for one detail that shows why this API is called functional. Look at the first Dense hidden layer: as soon as it is created, we call it like a function, passing it the input. The same happens with the second hidden layer and with the output layer, to which we pass the concatenation of input A and the output of the second hidden layer.
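To make the two-input pattern concrete, here is a sketch that builds the wide and deep model and feeds it random data. The array names, the batch size of 4 and the random inputs are assumptions made purely for illustration; the input shapes follow the five wide and six deep features mentioned in the text. It assumes TensorFlow 2.x and NumPy.

```python
import numpy as np
from tensorflow import keras

input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
model = keras.Model(inputs=[input_A, input_B], outputs=[output])

# Hypothetical batch of 4 samples: 5 wide features and 6 deep features each.
rng = np.random.default_rng(0)
X_wide = rng.random((4, 5)).astype("float32")
X_deep = rng.random((4, 6)).astype("float32")

# Inputs are passed in the same order as in the `inputs` list of the model.
y_pred = model.predict([X_wide, X_deep], verbose=0)
print(np.shape(y_pred))
```

The model is untrained, so the predicted values themselves are meaningless; the point is only that each input array is routed to its own path.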
The last way to implement a model architecture using Keras and TensorFlow 2.0 is model subclassing. The Model class is the root class Keras uses to define a model architecture. Since Keras is object-oriented, we can subclass the Model class and write our own architecture definition.
Simply subclass the Model class, create all the layers you need in the constructor, and then use them for the computations you want in the call() method. The code below gives us the same model as the one we just built with the Functional API:
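class WideAndDeepModel(keras.Model):
    def __init__(self, units=30, activation="relu", **kwargs):
        super().__init__(**kwargs)  # handles standard arguments such as name
        self.hidden1 = keras.layers.Dense(units, activation=activation)
        self.hidden2 = keras.layers.Dense(units, activation=activation)
        # Not named self.output: Keras models already define an `output` property.
        self.main_output = keras.layers.Dense(1)

    def call(self, inputs):
        input_A, input_B = inputs
        hidden1 = self.hidden1(input_B)
        hidden2 = self.hidden2(hidden1)
        concat = keras.layers.concatenate([input_A, hidden2])
        output = self.main_output(concat)
        return output

model = WideAndDeepModel()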
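The Subclassing API is the most flexible way to build neural networks, but unfortunately that flexibility comes at a cost. Model subclassing is much harder to use than the other two approaches. Because the architecture is hidden inside the call() method, Keras cannot inspect it: the summary() method gives you only a list of layers, with no information about how they are connected. The model is also much harder to debug, and Keras cannot save or clone it as easily as a Sequential or Functional model, although the weights can still be saved with save_weights(). So unless you really need that extra flexibility, you should probably stick to the Sequential API or the Functional API.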
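One common way to work around the saving limitation is to store only the weights and rebuild the architecture from the class definition. The sketch below is an illustration under assumptions not stated in the article: the random input arrays, the temporary directory and the file name `wide_deep.weights.h5` are all hypothetical, and the `.weights.h5` suffix is chosen because newer Keras releases expect it. It assumes TensorFlow 2.x and NumPy.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras


class WideAndDeepModel(keras.Model):
    def __init__(self, units=30, activation="relu", **kwargs):
        super().__init__(**kwargs)
        self.hidden1 = keras.layers.Dense(units, activation=activation)
        self.hidden2 = keras.layers.Dense(units, activation=activation)
        self.main_output = keras.layers.Dense(1)

    def call(self, inputs):
        input_A, input_B = inputs
        hidden2 = self.hidden2(self.hidden1(input_B))
        concat = keras.layers.concatenate([input_A, hidden2])
        return self.main_output(concat)


# Hypothetical data: 4 samples with 5 wide and 6 deep features.
rng = np.random.default_rng(0)
X_wide = rng.random((4, 5)).astype("float32")
X_deep = rng.random((4, 6)).astype("float32")

model = WideAndDeepModel()
# The weights are created on the first call, not in the constructor.
y_before = np.asarray(model([X_wide, X_deep]))
# Lists the layers, but not how call() connects them.
model.summary()

weights_path = os.path.join(tempfile.mkdtemp(), "wide_deep.weights.h5")
model.save_weights(weights_path)

# Restoring requires the class definition: rebuild, call once, then load.
restored = WideAndDeepModel()
restored([X_wide, X_deep])
restored.load_weights(weights_path)
y_after = np.asarray(restored([X_wide, X_deep]))
```

After loading, `restored` should produce the same predictions as `model`. Note that the class source code has to be available wherever the weights are loaded, which is part of the cost described above.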
Why is it used at all if it has so many disadvantages?
Exotic architectures, especially those created by researchers, are very challenging or even impossible to implement using the Sequential and Functional APIs. Researchers want control over everything they create: every nuance of the network and of the training process. This is usually not the case in companies, where common and sometimes simple architectures are the norm.