(Header photo: Traveling towards the Gobi desert, Mongolia 2011.)

Exporting and importing trained models with different machine-learning frameworks.

A word of background

One problem frequently encountered when developing a machine-learning pipeline is the need to split the work between training a model and using the trained model to make new predictions. Training requires much heavier computation, but once the weights have been found, using the model for prediction is a relatively simple task. Naturally, it is useful to export the model at that point, and import it later (not necessarily on the same machine) to make predictions.

In this post, we demonstrate how models built with Scikit-Learn and Tensorflow (and, towards the end, Keras) can be exported after training, using a simple one-dimensional linear regression problem as an example. This example, a sort of classical “hello world” of machine learning, lets us highlight the differences between the frameworks. Once a model is trained and exported, we will import it in a different script and give it new examples.

Scikit-Learn

Simple model definition

First, let's have a look at our core model.

import numpy as np
from sklearn import linear_model

# Our training set (perfect linear function).
x = np.reshape(np.array([0, 1, 2]), (-1, 1))
y = np.reshape(np.array([1, 3, 5]), (-1, 1))
TrainingSet = {'x':x, 'y':y}

# Definition of model and training.
model = linear_model.LinearRegression()
model.fit(TrainingSet['x'], TrainingSet['y'])
print ("That's it!")
print ("Model: y = {}*x + {}".format(model.coef_, model.intercept_))

This is a very simple model and it takes hardly any time to train.
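Before exporting anything, we can sanity-check the fit by asking the model for a couple of predictions; since the training data follow y = 2x + 1, that is what we expect to get back (continuing with the model object from the snippet above):

# Quick sanity check: the training data follow y = 2x + 1.
X_check = np.reshape(np.array([3, 4]), (-1, 1))
print (model.predict(X_check))  # Expected: roughly [[7.], [9.]]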

Exporting

If we wish to save it now, we may do it at the level of the entire model, simply by serializing the fitted object.

import pickle
filename = "sklearn_model.pkl"
with open(filename, 'wb') as f:
    pickle.dump(model, f)
print ("Model saved under: {}".format(filename))

Importing

Importing it is even easier.

import numpy as np
import pickle

# Importing.
filename = "sklearn_model.pkl"
print ("Loading model form: {}".format(filename))
model = pickle.load(open(filename, 'rb'))
print ("Model: y = {}*x + {}".format(model.coef_, model.intercept_))

# New data
X = np.reshape(np.array([-1, 2, 4, 6]), (-1, 1))
print (model.predict(X)) 

In fact, since we serialized the whole fitted object, it is not necessary to explicitly import the sklearn module in this script; pickle pulls in whatever the object needs behind the scenes (scikit-learn still has to be installed, of course). The object "remembers" everything about itself.

Still, pickling should be used with caution, especially when working with different versions of either Python or the libraries... or when unpickling somebody else's files (unpickling untrusted data can execute arbitrary code).
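For scikit-learn models in particular, a popular alternative to plain pickle is joblib, which the scikit-learn documentation suggests for objects carrying large NumPy arrays. A minimal sketch of the same export/import round-trip (the file name is just an example):

import joblib  # older scikit-learn versions ship this as sklearn.externals.joblib

joblib.dump(model, "sklearn_model.joblib")
model = joblib.load("sklearn_model.joblib")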

Tensorflow

The logic behind it

The logic of Tensorflow is quite different, and it may appear more difficult at first. Tensorflow (in its 1.x API, which we use throughout this post) defines all operations in terms of graphs and sessions. The first concept, the graph, can be compared to a "skeleton" of the model: given the building blocks, such as the unknowns, the mathematical operations, the neural-network architecture and so on, it maps out the whole computation for us. Executing operations on that graph, such as training the model, updating the weights, or reading and setting values, is then done by running a session. A session can simply be imagined as one particular state of the graph's variables and operations. Although this sounds more complicated, it lets Tensorflow optimize the computation, since the whole graph is declared up front before anything is executed.
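To make the distinction concrete, here is a toy snippet (unrelated to our regression problem) showing that defining operations only builds the graph, and that nothing is computed until a session runs it:

import tensorflow as tf

a = tf.constant(2.0)
b = tf.constant(3.0)
c = a * b  # Only adds a node to the graph; nothing is computed yet.

sess = tf.Session()
print (sess.run(c))  # The graph is actually executed here: prints 6.0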

If we wish to reuse our model, we need to find a way to store and restore both of these. First, we need to save the graph and then we need to save the corresponding session, which holds trained weights.

Model definition

Let’s start again with building our linear regression model.

import tensorflow as tf
import numpy as np
import os

# Our training set (perfect linear function).
x = np.reshape(np.array([0, 1, 2]), (-1, 1))
y = np.reshape(np.array([1, 3, 5]), (-1, 1))

# Definition of the model and training (GRAPH).
W = tf.Variable([0.0], dtype=tf.float32)
B = tf.Variable([0.0], dtype=tf.float32)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
model = W*X + B

# How you train is also a part of GRAPH.
cost = tf.reduce_sum(tf.square(model - Y))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(cost)
init = tf.global_variables_initializer()

# Execute training through SESSION.
sess = tf.Session()
sess.run(init)
for epoch in range(1000):
    sess.run(train, {X:x, Y:y})

os.system('clear') # Flush the screen.
# Prediction
Weight = sess.run(W)
Bias = sess.run(B)
print ("Weight: {}".format(Weight))
print ("Bias: {}".format(Bias))

print ("Making prediction.")
X_example = np.array([0.5, 1.0, 1.5, 2.0])
X_example = np.reshape(X_example, (-1, 1))

print ("X_example:")
print ("{}".format(X_example))
print ("Prediction:")
print (sess.run(model, {X:X_example}))

As you can see, the snippet contains declarations of variables and of so-called placeholders (Tensorflow’s way of reserving a place for “something to come”), together with a few operations to be executed on them. From sess = tf.Session() onward, we tell Tensorflow what to do with this “situation” (the graph) through the session. Note that we use sess.run(W) and sess.run(B) to evaluate the trained parameters. Referring to W and B directly would only give us the graph objects, not their current values within the session.
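For instance, continuing the session above, printing W directly only shows the variable object, while sess.run(W) returns its current value:

print (W)            # <tf.Variable ...> -- the graph node itself
print (sess.run(W))  # roughly [ 2.] -- the trained value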

Exporting the weights

Now, when we wish to store the variables, we need to use the tf.train.Saver() object.

# Adding Saving possibility.
toSave = {"Weights":W, "Biases":B}
saver = tf.train.Saver(toSave)
sess = tf.Session()
sess.run(init)
for epoch in range(1000):
    sess.run(train, {X:x, Y:y})

os.system('clear') # Flush the screen.
# Now it is time to save.
spath = saver.save(sess, './model.ckpt')
print ("Model saved under: {}".format(spath))

Here, we choose to pass the toSave dictionary to the Saver() object. We could skip this argument and let Saver() export all of our variables by default, but passing it lets us indicate exactly which parts of the model we want to store, and under which names.

Importing the weights

Basic importing of the weights can be done the following way:

import tensorflow as tf
import os

# Definition of the model and training (GRAPH).
W = tf.Variable([0.0], dtype=tf.float32, name='weights')
B = tf.Variable([0.0], dtype=tf.float32, name='bias')
toLoad = {"Weights":W, "Biases":B}

saver = tf.train.Saver(toLoad)
sess = tf.Session()

os.system('clear') # Flush the screen.

# Let's load the variables.
spath = './model.ckpt'
saver.restore(sess, spath)
print ("Model loaded from: {}".format(spath))

print ("Weight: {}".format(sess.run(W)))
print ("Bias: {}".format(sess.run(B)))

Again, we use a dictionary to indicate the variables of interest. Note that the keys in toSave and toLoad must be consistent. The other thing to notice, apart from the Saver() object itself, is that we have to declare W and B again. Although it barely matters what we set as their initial values, the Saver() needs to be “aware” of their existence; otherwise it will raise an error, ironically complaining that it has nothing to save.
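As a side note, the Saver also writes a .meta file next to the checkpoint, which stores the graph itself. If we prefer not to re-declare the variables by hand, we can rebuild the graph from that file with tf.train.import_meta_graph and then look the tensors up by name. A rough sketch, assuming the variables were given explicit names (name='weights', name='bias') in the graph that produced the checkpoint:

import tensorflow as tf

sess = tf.Session()
# Recreate the graph from the .meta file, then load the stored values.
saver = tf.train.import_meta_graph('./model.ckpt.meta')
saver.restore(sess, './model.ckpt')

graph = tf.get_default_graph()
W = graph.get_tensor_by_name('weights:0')
B = graph.get_tensor_by_name('bias:0')
print (sess.run(W), sess.run(B))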

Keras

As you can see, doing something that seems relatively simple requires quite a bit more work in Tensorflow. You may ask yourself: what is the point of going the "hard way" at all? Well, Tensorflow was designed to be computationally efficient, which is a real plus when working with complex models. Yet, whenever you start a new project, one of the hardest challenges is to balance program efficiency against programmer efficiency. In other words, what if you can build a really good model, but the time it will take to get there is long or, what's worse, hard to even estimate? And what if you choose the opposite trade-off?

Fortunately, there is a middle path here: Keras, which lets us use the power of Tensorflow (or Theano) as a backend, but provides a convenient API that makes the work really easy. Let's revisit the problem once again and see how exporting and importing a model works with this library.

Exporting

First of all, Keras defines every model using the concept of layers, starting with an input layer that describes the feature vector coming into the model. From there, solving a task can usually be divided into four steps:

  • Defining the model architecture.
  • Compiling the model with a specific optimizer.
  • Training it by providing input-output pairs (examples).
  • Finally, using it to predict new data, which can be used to evaluate the model.

import numpy as np
from keras.optimizers import SGD # SGD -> Stochastic Gradient Descent
from keras.layers import Input, Dense
from keras.models import Model

# Again, using the same training data:
x = np.array([0, 1, 2]).reshape(-1, 1)
y = np.array([1, 3, 5]).reshape(-1, 1)

# Definition of the model architecture:
inputs = Input(shape=(1,))
outputs = Dense(1, activation='linear')(inputs)
model = Model(inputs=inputs, outputs=outputs)

# Compiling
model.compile(optimizer=SGD(), loss='mse', metrics=['mse'])

# Training
model.fit(x, y, batch_size=1, epochs=250, shuffle=False)

Now, the object model is our trained model. There are a number of operations we can perform on it. For example, we may convert its architecture to JSON by calling model.to_json() and extract its weights using model.get_weights(). We can also save the object "as is" and use it later.

model.save('simple_model.h5')

That's it: 'simple_model.h5' will contain everything we need, i.e. the architecture, the weights and the training configuration.
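If we ever want to keep the architecture and the weights as separate files (for example, to version the architecture as plain text), the two calls mentioned above have matching counterparts. A minimal sketch of that route (file names are just examples):

from keras.models import model_from_json

# Export the architecture and the weights separately.
with open('simple_model.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('simple_model_weights.h5')

# ...and later, rebuild the model from both pieces.
with open('simple_model.json') as f:
    model = model_from_json(f.read())
model.load_weights('simple_model_weights.h5')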

Importing

To reuse our trained model, all we need to do is:

import numpy as np
from keras.models import load_model

model = load_model('simple_model.h5')
# New data:
X = np.array([-1, 2, 4, 6]).reshape(-1, 1)
Y = model.predict(X)

Y now contains the predictions for the new data points.

Summary

In this post, we have seen examples of how exporting and importing models can be done with three of the most popular machine-learning libraries. Even with a very simple example, we could see how the complexity of the task differs from the user's perspective. Comparing the three, Keras is probably the best way to start with machine learning, as it combines simplicity of use with the power coming from Tensorflow.