Porting trained models in different machine-learning frameworks.
A word of background
One problem frequently encountered in machine-learning pipeline development
is the need to separate training a model from using the trained model to
make new predictions. While training demands heavy computational power, once
the weights have been found, using the model for prediction is a relatively
cheap task. It is therefore useful to export the model at that point, and
import it later (not necessarily on the same machine) to make predictions.
In this post, we are going to demonstrate how models in Scikit-Learn
and Tensorflow can be exported after training, using a simple
one-dimensional linear regression problem as an example. This example,
a sort of classical “hello world” of machine learning, will also let us
show the differences between the frameworks. Once our model is trained and
exported, we will import it in a different script and feed it new examples.
Scikit-Learn
Simple model definition
First, let's have a look at our core model.
import numpy as np
from sklearn import linear_model

# Our training set (perfect linear function).
x = np.reshape(np.array([0, 1, 2]), (-1, 1))
y = np.reshape(np.array([1, 3, 5]), (-1, 1))
TrainingSet = {'x': x, 'y': y}

# Definition of model and training.
model = linear_model.LinearRegression()
model.fit(TrainingSet['x'], TrainingSet['y'])

print("That's it!")
print("Model: y = {}*x + {}".format(model.coef_, model.intercept_))
This is a very simple model and it takes hardly any time to train.
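Before exporting, we can run a quick sanity check on inputs the model has not seen. This is a small sketch continuing from the snippet above (the values are just illustrative, not part of the original script):
# Quick sanity check on unseen inputs.
x_new = np.reshape(np.array([3, 4]), (-1, 1))
print(model.predict(x_new))  # Should be close to [[7.], [9.]] since y = 2*x + 1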
Exporting
If we wish to save the model now, we may do it at the level of the entire object, simply by serializing it with pickle.
import pickle

filename = "sklearn_model.pkl"
with open(filename, 'wb') as f:
    pickle.dump(model, f)
print("Model saved under: {}".format(filename))
Importing
Importing it is even easier.
import numpy as np
import pickle

# Importing.
filename = "sklearn_model.pkl"
print("Loading model from: {}".format(filename))
with open(filename, 'rb') as f:
    model = pickle.load(f)
print("Model: y = {}*x + {}".format(model.coef_, model.intercept_))

# New data
X = np.reshape(np.array([-1, 2, 4, 6]), (-1, 1))
print(model.predict(X))
In fact, since we serialized the whole object, it is not necessary to
import the sklearn module explicitly; pickle loads the required classes
behind the scenes (sklearn still has to be installed on the machine,
though). The object "remembers" everything else.
Still, pickling should be used with caution, especially when working with different versions of either Python or the libraries... or when unpickling somebody else's files.
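One simple precaution (a minimal sketch with a hypothetical file name, not part of the scripts above) is to store the library version next to the model and compare it when loading:
import pickle
import sklearn

# Save the sklearn version together with the model.
payload = {"sklearn_version": sklearn.__version__, "model": model}
with open("sklearn_model_versioned.pkl", "wb") as f:
    pickle.dump(payload, f)

# When loading, warn if the installed version differs.
with open("sklearn_model_versioned.pkl", "rb") as f:
    payload = pickle.load(f)
if payload["sklearn_version"] != sklearn.__version__:
    print("Warning: pickled with sklearn {}".format(payload["sklearn_version"]))
model = payload["model"]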
Tensorflow
The logic behind it
The logic of Tensorflow is quite different, and it may appear more difficult at first. Tensorflow defines all operations in terms of graphs and sessions. The first concept, the graph, can be compared to building a "skeleton" for the model: given the constraints, such as the unknowns, the mathematical operations, the neural-network architecture, and so on, it maps out the problem for us. Once the graph is defined, the actual operations, such as training the model, updating the weights, reading a value or setting a new one, are executed by running a session. A session can simply be imagined as a particular state of the variables and operations of a graph. Although this does sound more complicated, it is in fact more computationally efficient, since the whole graph is declared up front and can be optimized before anything is executed.
If we wish to reuse our model, we need a way to store and restore both of these: first the graph, and then the corresponding session state, which holds the trained weights.
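To illustrate the split between graph and session with something even smaller than our regression model, here is a minimal sketch (Tensorflow 1.x API, like the rest of this post):
import tensorflow as tf

# GRAPH: nothing is computed yet, we only declare the operations.
a = tf.constant(2.0)
b = tf.constant(3.0)
total = a + b

# SESSION: only now is the graph actually executed.
with tf.Session() as sess:
    print(sess.run(total))  # 5.0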
Model definition
Let’s start again with building our linear regression model.
import tensorflow as tf
import numpy as np
import os

# Our training set (perfect linear function).
x = np.reshape(np.array([0, 1, 2]), (-1, 1))
y = np.reshape(np.array([1, 3, 5]), (-1, 1))

# Definition of the model and training (GRAPH).
W = tf.Variable([0.0], dtype=tf.float32)
B = tf.Variable([0.0], dtype=tf.float32)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
model = W*X + B

# How you train is also a part of GRAPH.
cost = tf.reduce_sum(tf.square(model - Y))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(cost)
init = tf.global_variables_initializer()

# Execute training through SESSION.
sess = tf.Session()
sess.run(init)
for epoch in range(1000):
    sess.run(train, {X: x, Y: y})
os.system('clear')  # Flush the screen.

# Prediction
Weight = sess.run(W)
Bias = sess.run(B)
print("Weight: {}".format(Weight))
print("Bias: {}".format(Bias))

print("Making prediction.")
X_example = np.array([0.5, 1.0, 1.5, 2.0])
X_example = np.reshape(X_example, (-1, 1))
print("X_example:")
print("{}".format(X_example))
print("Prediction:")
print(sess.run(model, {X: X_example}))
As you can see, the snippet contains declarations of variables and
so-called placeholders (Tensorflow’s way of reserving a place for
“something to come”), together with a few operations to be executed on
them. From sess = tf.Session() onward, we tell Tensorflow what to do with
this “situation” (the graph) through the session. Note that we use
sess.run(W) and sess.run(B) to evaluate the trained parameters. Referring
to W and B directly would return the graph objects themselves, not their
evaluated values.
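A quick way to see the difference, continuing from the training script above (a small sketch, not part of the original listing):
print(W)            # A tf.Variable object that belongs to the graph.
print(sess.run(W))  # The evaluated value of the trained weight.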
Exporting the weights
Now, when we wish to store the variables, we need to use the tf.train.Saver()
object.
# Adding saving possibility.
toSave = {"Weights": W, "Biases": B}
saver = tf.train.Saver(toSave)

sess = tf.Session()
sess.run(init)
for epoch in range(1000):
    sess.run(train, {X: x, Y: y})
os.system('clear')  # Flush the screen.

# Now it is time to save.
spath = saver.save(sess, './model.ckpt')
print("Model saved under: {}".format(spath))
Here, we choose to pass the toSave dictionary to the Saver() object. We
could skip this argument and let Saver() export all of our variables by
default, but passing it explicitly lets us indicate exactly what we want
to store from the model.
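For comparison, a minimal sketch of the default behaviour, continuing the export snippet above, where every variable is stored under its own graph name:
# Without an explicit dictionary, Saver() stores all global variables.
saver_all = tf.train.Saver()
spath_all = saver_all.save(sess, './model_all.ckpt')
print("All variables saved under: {}".format(spath_all))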
Importing the weights
Basic importing of the weights can be done the following way:
import tensorflow as tf
import os

# Definition of the model and training (GRAPH).
W = tf.Variable([0.0], dtype=tf.float32, name='weights')
B = tf.Variable([0.0], dtype=tf.float32, name='bias')

toLoad = {"Weights": W, "Biases": B}
saver = tf.train.Saver(toLoad)

sess = tf.Session()
os.system('clear')  # Flush the screen.

# Let's load the variables.
spath = './model.ckpt'
saver.restore(sess, spath)
print("Model loaded from: {}".format(spath))
print("Weight: {}".format(sess.run(W)))
print("Bias: {}".format(sess.run(B)))
Again, we use a dictionary to indicate the variables of interest. Note
that toSave and toLoad must be consistent. One more thing, in addition to
the Saver() object, is the necessity of declaring W and B again. Although
it barely matters what we set as their initial values, the Saver() needs
to be “aware” of their existence; otherwise it will raise an error,
ironically complaining that it has nothing to save.
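If redeclaring the variables by hand feels cumbersome, Tensorflow can also rebuild the graph structure itself from the .meta file that saver.save() writes next to the checkpoint. A minimal sketch of that route, assuming the checkpoint produced by the export snippet above:
import tensorflow as tf

sess = tf.Session()
# Rebuild the graph from the .meta file, then restore the trained values.
saver = tf.train.import_meta_graph('./model.ckpt.meta')
saver.restore(sess, './model.ckpt')
# All restored variables can now be inspected without redeclaring them.
print(sess.run(tf.global_variables()))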
Keras
As you can see, doing something that seems relatively simple requires quite a bit more work in Tensorflow. You may ask yourself, what is the point of going the "hard way" at all? Well, Tensorflow was designed to be computationally efficient, which is a real plus when working with complex models. Yet, whenever working on a new project, one of the hardest challenges is to balance program efficiency against programmer efficiency. In other words, what if you can build a really good model, but the time it will take to get there is long or, what's worse, hard to even estimate? And what if you choose the opposite trade-off?
Fortunately, there is a middle path here: Keras, which lets us use the power of Tensorflow (or Theano) but provides convenient APIs that make the work really easy. Let's revisit the problem once again and see how we can go about exporting and importing the model using this library.
Exporting
First of all, Keras defines every model using the concept of layers, starting with an input layer that defines the feature vector coming into the model. Solving a task can then usually be divided into four steps:
- Defining the model architecture.
- Compiling the model with a specific optimizer.
- Training it by providing input-output pairs (examples).
- Finally, using it to predict on new data, which can also serve to evaluate the model.
import numpy as np
from keras.optimizers import SGD  # SGD -> Stochastic Gradient Descent
from keras.layers import Input, Dense
from keras.models import Model

# Again, using the same set of data:
x = np.array([0, 1, 2]).reshape(-1, 1)
y = np.array([1, 3, 5]).reshape(-1, 1)

# Definition of the model architecture:
inputs = Input(shape=(1, ))
outputs = Dense(1, activation='linear')(inputs)
model = Model(inputs=inputs, outputs=outputs)

# Compiling
model.compile(optimizer=SGD(), loss='mse', metrics=['mse'])

# Training
model.fit(x, y, batch_size=1, epochs=250, shuffle=False)
Now, the object model is our trained model. There are a number of
operations we can perform on it. For example, we may convert its
architecture to JSON by calling model.to_json() and extract its weights
using model.get_weights(). We can also save the object "as is" and use it
later.
model.save('simple_model.h5')
That’s it. 'simple_model.h5'
will contain everything.
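If we prefer to keep the architecture and the weights in separate files, a minimal sketch of the to_json()/weights route (with hypothetical file names) could look like this:
from keras.models import model_from_json

# Architecture as a JSON string, weights in their own HDF5 file.
with open('simple_model.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('simple_model_weights.h5')

# Later: rebuild the architecture, then load the weights into it.
with open('simple_model.json') as f:
    model = model_from_json(f.read())
model.load_weights('simple_model_weights.h5')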
Importing
To reuse our trained model, all we need to do is:
import numpy as np
from keras.models import load_model
model = load_model('simple_model.h5')
# New data:
X = np.array([-1, 2, 4, 6]).reshape(-1, 1)
Y = model.predict(X)
Y
is now going to contain new prediction points.
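Since the training data follows y = 2x + 1 exactly, a quick optional check (a small sketch, not part of the original listing) is to compare the prediction against that line; the values should be very close:
expected = 2 * X + 1
print("Prediction: {}".format(Y.ravel()))
print("Expected:   {}".format(expected.ravel()))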