Step 9: Making Functions More Useful

The code implemented in the previous step

[1]:
import numpy as np


class Variable:
    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)

            if x.creator is not None:
                funcs.append(x.creator)


class Function:
    def __call__(self, input):
        x = input.data
        y = self.forward(x)
        output = Variable(y)
        output.set_creator(self)
        self.input = input
        self.output = output
        return output

    def forward(self, x):
        raise NotImplementedError()

    def backward(self, gy):
        raise NotImplementedError()


class Square(Function):
    def forward(self, x):
        y = x ** 2
        return y

    def backward(self, gy):
        x = self.input.data
        gx = 2 * x * gy
        return gx


class Exp(Function):
    def forward(self, x):
        y = np.exp(x)
        return y

    def backward(self, gy):
        x = self.input.data
        gx = np.exp(x) * gy
        return gx

DeZero can now compute derivatives with backpropagation. It also has a capability called Define-by-Run: the connections between computations are built at runtime, as the code executes. In this step we make three improvements to DeZero’s functions so that DeZero becomes easier to use.

9.1 Use as a Python function

So far, we have implemented the functions used in DeZero as Python classes. To perform a calculation with the Square class, for example, we had to write the following code.

[2]:
x = Variable(np.array(0.5))
f = Square()
y = f(x)

As shown above, computing a square takes two steps: create an instance of the Square class, then call that instance. From the user’s point of view, this two-step process is a bit of a hassle (you can also write y = Square()(x), but that is not very readable either). It would be preferable to use it as an ordinary Python function, so we add the following implementation.

[3]:
def square(x):
    f = Square()
    return f(x)

def exp(x):
    f = Exp()
    return f(x)

We have implemented two functions, square and exp, as described above. You can now use DeZero functions as Python functions. Incidentally, the above code can also be written in one line as follows.

[4]:
def square(x):
    return Square()(x)  # one line.

def exp(x):
    return Exp()(x)

Here Square()(x) is called directly, without first binding the instance to a variable name such as f in f = Square(). Let’s use the two functions we’ve implemented.

[5]:
x = Variable(np.array(0.5))
a = square(x)
b = exp(a)
y = square(b)

y.grad = np.array(1.0)
y.backward()
print(x.grad)
3.297442541400256

As you can see, once the initial np.array(0.5) is wrapped in a Variable, you can write the rest of the code as if it were an ordinary numerical calculation with NumPy. The functions can also be applied in succession, in which case the code looks like this.

[6]:
x = Variable(np.array(0.5))
y = square(exp(square(x)))  # Apply consecutively
y.grad = np.array(1.0)
y.backward()
print(x.grad)
3.297442541400256

Now you can do the calculations with more natural code. This is the first improvement.
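As a sanity check on the printed gradient, the composition square(exp(square(x))) equals exp(2x^2), whose derivative is 4x·exp(2x^2). The sketch below compares that closed form with a central-difference estimate using plain NumPy. The names f, x0, eps, analytic and numerical are illustrative and not part of DeZero.

```python
import numpy as np


def f(x):
    # Same composition as square(exp(square(x))), written with plain NumPy
    return np.exp(x ** 2) ** 2


x0 = 0.5

# Closed form: d/dx exp(2x^2) = 4x * exp(2x^2)
analytic = 4 * x0 * np.exp(2 * x0 ** 2)

# Central-difference approximation of the same derivative
eps = 1e-4
numerical = (f(x0 + eps) - f(x0 - eps)) / (2 * eps)

print(analytic)
print(numerical)
```

Both values should agree with the x.grad printed above to several decimal places. The numerical estimate carries a small truncation error that depends on eps.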

9.2 Simplify the backward method

The second improvement reduces the work the user has to do for backpropagation. Concretely, we want to omit y.grad = np.array(1.0) from the code written earlier, because otherwise it has to be written every time backpropagation is run. To make this line unnecessary, add the following two lines to the backward method of Variable.

[7]:
class Variable:
    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:                    # Added code
            self.grad = np.ones_like(self.data)  # Added code

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)

            if x.creator is not None:
                funcs.append(x.creator)

As shown above, if grad is None, the gradient is generated automatically. Here, np.ones_like(self.data) creates an ndarray instance with the same shape and data type as self.data, with every element set to 1. If self.data is a scalar (a 0-dimensional ndarray), then self.grad is a scalar as well.

NOTE

Previously, we used np.array(1.0) as the gradient of the output, but the code above uses np.ones_like() instead. The reason is to give data and grad in Variable the same data type. For example, if data is a 32-bit floating-point number, then grad is also a 32-bit floating-point number. By contrast, np.array(1.0) always produces a 64-bit floating-point number.
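The dtype behavior described in this note can be checked directly in NumPy. The following is a small sketch; the variable names data32, grad_like, grad_plain and batch are illustrative only.

```python
import numpy as np

# A 0-dimensional float32 array, standing in for Variable.data
data32 = np.array(1.0, dtype=np.float32)

grad_like = np.ones_like(data32)  # follows the dtype and shape of data32
grad_plain = np.array(1.0)        # defaults to float64

print(grad_like.dtype)   # float32
print(grad_plain.dtype)  # float64

# The shape is preserved as well
batch = np.zeros((2, 3), dtype=np.float32)
print(np.ones_like(batch).shape)  # (2, 3)
```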

Now, once a calculation is done, all you have to do is call the backward method on the final output variable to get the derivative. Trying it out gives the following result.

[8]:
x = Variable(np.array(0.5))
y = square(exp(square(x)))
y.backward()
print(x.grad)
3.297442541400256

9.3 Handling only ndarray

DeZero’s Variable is designed to hold only ndarray instances as data. However, a user may well pass a float or int by accident, as in Variable(1.0) or Variable(3). To guard against this, we make Variable a “box” that accepts only ndarray instances. Concretely, putting anything other than an ndarray instance into a Variable raises an error immediately (None is still allowed). This lets problems be detected early. Add the following code to the initializer of the Variable class.

[9]:
class Variable:
    def __init__(self, data):
        if data is not None:                                               # Added code
            if not isinstance(data, np.ndarray):                           # Added code
                raise TypeError('{} is not supported'.format(type(data)))  # Added code

        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:
            self.grad = np.ones_like(self.data)

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)

            if x.creator is not None:
                funcs.append(x.creator)

As shown above, if the data passed as an argument is neither None nor an ndarray instance, a TypeError is raised. The error message includes the offending type. Variable now behaves as follows.

[10]:
x = Variable(np.array(1.0))  # OK
x = Variable(None)  # OK

x = Variable(1.0)  # NG
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-3d59bd5dea78> in <module>()
      2 x = Variable(None)  # OK
      3
----> 4 x = Variable(1.0)  # NG

<ipython-input-9-30bb065c8700> in __init__(self, data)
      3         if data is not None:                                               # Added code
      4             if not isinstance(data, np.ndarray):                           # Added code
----> 5                 raise TypeError('{} is not supported'.format(type(data)))  # Added code
      6
      7         self.data = data

TypeError: <class 'float'> is not supported

As shown above, you can create a Variable from an ndarray or None without any problem. Any other data type (float, in this example) raises an exception, so you find out immediately that you are using the wrong data type.

Now, with this change, there is one more thing to keep in mind. This is due to NumPy’s unique way of doing things. To illustrate this, let’s first look at the following NumPy code.

[ ]:
x = np.array([1.0])
y = x ** 2
print(type(x), x.ndim)
print(type(y))

Here, x is a one-dimensional ndarray. In this case, the data type of y resulting from x ** 2 (squared calculation) is ndarray. This is the expected result. The case in question is the following.

[ ]:
x = np.array(1.0)
y = x ** 2
print(type(x), x.ndim)
print(type(y))

Here, x is a 0-dimensional ndarray, and the result of x ** 2 is np.float64. This is how NumPy is specified: a computation on a 0-dimensional ndarray instance can return a NumPy scalar such as numpy.float64 or numpy.float32 rather than an ndarray instance. As a result, the output of a DeZero function may be of type numpy.float64 or numpy.float32, even though Variable is specified to hold only ndarray instances. To deal with this, we first prepare the following utility function.

[ ]:
def as_array(x):
    if np.isscalar(x):
        return np.array(x)
    return x

The np.isscalar function determines whether a value is a scalar type such as numpy.float64 (it also recognizes Python’s int and float). Using np.isscalar, we get the following.

[ ]:
print(np.isscalar(np.float64(1.0)))
print(np.isscalar(2.0))
print(np.isscalar(np.array(1.0)))
print(np.isscalar(np.array([1, 2, 3])))
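To see what these checks mean for as_array, the sketch below applies as_array to a NumPy scalar, a Python float, and an existing ndarray. The as_array definition is repeated so that the block runs on its own; the names s, a, b and v are illustrative.

```python
import numpy as np


def as_array(x):
    # Wrap scalars in a 0-dimensional ndarray; pass everything else through
    if np.isscalar(x):
        return np.array(x)
    return x


s = np.float64(1.0)       # NumPy scalar, not an ndarray
a = as_array(s)
print(type(a), a.ndim)    # <class 'numpy.ndarray'> 0

b = as_array(2.0)         # a Python float is converted too
print(type(b), b.dtype)   # <class 'numpy.ndarray'> float64

v = np.array([1.0, 2.0])  # already an ndarray
print(as_array(v) is v)   # True: returned unchanged, no copy made
```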

Thus, np.isscalar(x) tells us whether x is a scalar, and it returns False for ndarray instances, including 0-dimensional ones. The as_array function takes advantage of this: it converts a scalar to an ndarray instance and returns anything else unchanged. Now that we have the convenient as_array function, we can add the following code to the Function class.

[ ]:
class Function:
    def __call__(self, input):
        x = input.data
        y = self.forward(x)
        output = Variable(as_array(y))  # Added code
        output.set_creator(self)
        self.input = input
        self.output = output
        return output

    def forward(self, x):
        raise NotImplementedError()

    def backward(self, gy):
        raise NotImplementedError()

As shown above, we apply as_array(y) when wrapping y, the result of forward propagation, in a Variable. That way, the data of output is guaranteed to be an ndarray instance. Now all the data is an ndarray instance, even when the calculation uses a 0-dimensional ndarray instance.
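The following sketch illustrates why the as_array call matters once Variable rejects non-ndarray data. It uses a stripped-down Variable that keeps only the type check from this step (no grad, creator, or backward), so it is an illustration under that assumption rather than the full DeZero class. The names y_raw and raised are illustrative.

```python
import numpy as np


def as_array(x):
    if np.isscalar(x):
        return np.array(x)
    return x


class Variable:
    # Reduced version for illustration: only the ndarray check is kept
    def __init__(self, data):
        if data is not None and not isinstance(data, np.ndarray):
            raise TypeError('{} is not supported'.format(type(data)))
        self.data = data


x = Variable(np.array(2.0))
y_raw = x.data ** 2  # 0-dimensional input: the result is a numpy.float64

# Wrapping the raw result directly is rejected by the type check
try:
    Variable(y_raw)
    raised = False
except TypeError:
    raised = True
print(raised)

# Converting first, as Function.__call__ now does, succeeds
y = Variable(as_array(y_raw))
print(type(y.data), y.data)
```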

This concludes the work in this step. In the next step, we’re going to talk about testing DeZero.