The code implemented in the previous step
[1]:
import numpy as np


class Variable:
    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)
            if x.creator is not None:
                funcs.append(x.creator)


class Function:
    def __call__(self, input):
        x = input.data
        y = self.forward(x)
        output = Variable(y)
        output.set_creator(self)
        self.input = input
        self.output = output
        return output

    def forward(self, x):
        raise NotImplementedError()

    def backward(self, gy):
        raise NotImplementedError()


class Square(Function):
    def forward(self, x):
        y = x ** 2
        return y

    def backward(self, gy):
        x = self.input.data
        gx = 2 * x * gy
        return gx


class Exp(Function):
    def forward(self, x):
        y = np.exp(x)
        return y

    def backward(self, gy):
        x = self.input.data
        gx = np.exp(x) * gy
        return gx
DeZero can now compute derivatives with backpropagation. It also has a property called Define-by-Run: the connections between computations are created at runtime, as the code executes. In this step we make three improvements to DeZero's functions that make it easier to use.
So far, we have implemented the functions we use in DeZero as Python classes. So, for example, to perform a calculation using the Square class, we had to write the following code:
[2]:
x = Variable(np.array(0.5))
f = Square()
y = f(x)
As shown above, computing a square takes two steps: create an instance of the Square class, then call that instance. From the user's point of view, this two-step process is a bit of a hassle (you can also write y = Square()(x), but that is not attractive either). It would be preferable to have it available as an ordinary Python function. So, we add the following implementation.
[3]:
def square(x):
    f = Square()
    return f(x)


def exp(x):
    f = Exp()
    return f(x)
We have implemented two functions, square and exp, as described above. You can now use DeZero functions as Python functions. Incidentally, the above code can also be written in one line as follows.
[4]:
def square(x):
    return Square()(x)  # one line


def exp(x):
    return Exp()(x)
Here, Square()(x) is called directly, without first binding the instance to a variable name as in f = Square(). Let's use the two functions we've implemented.
[5]:
x = Variable(np.array(0.5))
a = square(x)
b = exp(a)
y = square(b)
y.grad = np.array(1.0)
y.backward()
print(x.grad)
3.297442541400256
As you can see, once you wrap the initial np.array(0.5) in a Variable, you can write the rest as if you were doing an ordinary numerical calculation with NumPy. Note that the functions can also be applied in succession. In that case, you can write the following:
[6]:
x = Variable(np.array(0.5))
y = square(exp(square(x))) # Apply consecutively
y.grad = np.array(1.0)
y.backward()
print(x.grad)
3.297442541400256
Now you can do the calculations with more natural code. This is the first improvement.
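As a sanity check on the printed value, the same composition can be differentiated by hand with the chain rule: for y = (exp(x^2))^2 = exp(2x^2), the derivative is 4x * exp(2x^2). The sketch below uses plain NumPy rather than DeZero; the helper names f and numerical_diff and the step size eps=1e-4 are assumptions chosen for illustration.

```python
import numpy as np


def f(x):
    # Same composition as square(exp(square(x))), written with plain NumPy
    return np.exp(x ** 2) ** 2


def numerical_diff(f, x, eps=1e-4):
    # Central difference approximation of df/dx (illustrative helper)
    return (f(x + eps) - f(x - eps)) / (2 * eps)


x = np.array(0.5)
analytic = 4 * x * np.exp(2 * x ** 2)  # chain rule worked out by hand
numeric = numerical_diff(f, x)

print(analytic)  # approximately 3.2974, matching x.grad above
print(numeric)   # close to the analytic value, up to approximation error
```

Both values should agree with the x.grad printed by DeZero to several decimal places, which suggests the function wrappers did not change the computation.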
The second improvement reduces the user's effort in backpropagation. Concretely, we remove the need to write y.grad = np.array(1.0), as in the code above. Until now, you had to write y.grad = np.array(1.0) every time you ran backpropagation. To skip that task, add the following two lines to the backward method of Variable.
[7]:
class Variable:
    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:  # Added code
            self.grad = np.ones_like(self.data)  # Added code

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)
            if x.creator is not None:
                funcs.append(x.creator)
As shown above, if the variable's grad is None, backward automatically creates the initial gradient. Here, np.ones_like(self.data) creates an ndarray instance with the same shape and data type as self.data, with every element set to 1. If self.data is a scalar, then self.grad is also a scalar.
NOTE
Previously, we used np.array(1.0) as the gradient of the output, but in the above code we use np.ones_like(). The reason is to make data and grad in Variable have the same data type. For example, if data is a 32-bit floating-point number, then grad is also a 32-bit floating-point number. Incidentally, if you write np.array(1.0), its data type is a 64-bit floating-point number.
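The data type difference described in this note can be observed directly in NumPy. The following is a minimal sketch; the variable names data, g1, and g2 are chosen here for illustration.

```python
import numpy as np

# A 0-dimensional float32 array standing in for Variable.data
data = np.array(0.5, dtype=np.float32)

g1 = np.ones_like(data)  # inherits shape and dtype from data
g2 = np.array(1.0)       # built from a Python float, so float64

print(g1.dtype, g1.shape)  # float32 ()
print(g2.dtype)            # float64
```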
Now, once you've done some calculations, all you have to do is call the backward method on the final output variable to get the derivative. If you try it out, here is what you will see:
[8]:
x = Variable(np.array(0.5))
y = square(exp(square(x)))
y.backward()
print(x.grad)
3.297442541400256
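Because the added lines only run when grad is None, a gradient set by hand is still respected. The sketch below illustrates this with a condensed copy of the classes from this step (only Square is included, and forward/backward are shortened); the values 3.0 and 2.0 are arbitrary choices for illustration.

```python
import numpy as np


class Variable:
    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:
            # Default seed: dy/dy = 1, with the same dtype as data
            self.grad = np.ones_like(self.data)

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)
            if x.creator is not None:
                funcs.append(x.creator)


class Function:
    def __call__(self, input):
        output = Variable(self.forward(input.data))
        output.set_creator(self)
        self.input = input
        self.output = output
        return output


class Square(Function):
    def forward(self, x):
        return x ** 2

    def backward(self, gy):
        return 2 * self.input.data * gy


# Case 1: no seed given, so backward() fills in a gradient of 1
x = Variable(np.array(3.0))
y = Square()(x)
y.backward()
default_grad = x.grad  # 2 * 3.0 * 1.0

# Case 2: a hand-set seed of 2.0 is kept and scales the result
x2 = Variable(np.array(3.0))
y2 = Square()(x2)
y2.grad = np.array(2.0)
y2.backward()
seeded_grad = x2.grad  # 2 * 3.0 * 2.0

print(default_grad, seeded_grad)
```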
The third improvement concerns the data that Variable accepts. DeZero's Variable is designed to hold only ndarray instances as data. However, it is quite possible that some users will accidentally pass data types such as float or int, as in Variable(1.0) or Variable(3). In anticipation of this, we make Variable a "box" for ndarray instances only. Concretely, if you put anything other than an ndarray instance into a Variable, it raises an error immediately (None is still allowed). This way, problems are detected early. Now, add the following code to the initialization part of the Variable class:
[9]:
class Variable:
    def __init__(self, data):
        if data is not None:  # Added code
            if not isinstance(data, np.ndarray):  # Added code
                raise TypeError('{} is not supported'.format(type(data)))  # Added code

        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:
            self.grad = np.ones_like(self.data)

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)
            if x.creator is not None:
                funcs.append(x.creator)
As shown above, if the data given as an argument is not None and is not an instance of ndarray, the constructor raises a TypeError. The error message includes the type of the offending data, as shown in the code. Now we can use Variable as follows.
[10]:
x = Variable(np.array(1.0)) # OK
x = Variable(None) # OK
x = Variable(1.0) # NG
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-10-3d59bd5dea78> in <module>()
2 x = Variable(None) # OK
3
----> 4 x = Variable(1.0) # NG
<ipython-input-9-30bb065c8700> in __init__(self, data)
3 if data is not None: # Added code
4 if not isinstance(data, np.ndarray): # Added code
----> 5 raise TypeError('{} is not supported'.format(type(data))) # Added code
6
7 self.data = data
TypeError: <class 'float'> is not supported
As shown above, you can create a Variable from an ndarray or None without any problem. Any other data type, such as the float in the example above, raises an exception. This tells you immediately that you are using the wrong data type.
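To see which inputs pass the check without stopping at the first error, the exception can be caught in a loop. The sketch below uses a stripped-down copy of Variable that keeps only the constructor check; the candidates list and the results list are illustrative names, not part of DeZero.

```python
import numpy as np


class Variable:
    # Stripped-down copy: only the constructor check from this step
    def __init__(self, data):
        if data is not None:
            if not isinstance(data, np.ndarray):
                raise TypeError('{} is not supported'.format(type(data)))
        self.data = data


# Illustrative inputs: the first two are accepted, the rest are rejected
candidates = [np.array(1.0), None, 1.0, 3, [1.0, 2.0]]

results = []
for c in candidates:
    try:
        Variable(c)
        results.append('ok')
    except TypeError as e:
        results.append(str(e))

for r in results:
    print(r)
```

Note that a Python list is rejected as well, so data should be converted with np.array before it is wrapped.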
With this change, there is one more thing to keep in mind, and it stems from a quirk of NumPy. To illustrate it, let's first look at the following NumPy code.
[ ]:
x = np.array([1.0])
y = x ** 2
print(type(x), x.ndim)
print(type(y))
Here, x is a one-dimensional ndarray. In this case, the type of y resulting from x ** 2 (the squared calculation) is ndarray. This is the expected result. The case in question is the following.
[ ]:
x = np.array(1.0)
y = x ** 2
print(type(x), x.ndim)
print(type(y))
Here, x is a 0-dimensional ndarray. In this case, the result of x ** 2 is np.float64. This is NumPy's specified behavior: when you compute with a 0-dimensional ndarray instance, the result can be a NumPy scalar type such as numpy.float64 or numpy.float32 rather than an ndarray instance. This means that the output of a DeZero function may be of type numpy.float64 or numpy.float32. However, Variable is specified to hold only ndarray instances as data. To deal with this, we first prepare the following utility function.
[ ]:
def as_array(x):
    if np.isscalar(x):
        return np.array(x)
    return x
The np.isscalar function determines whether a value is a scalar type, such as numpy.float64 (it also recognizes Python's int and float). Applying np.isscalar to a few values looks like this.
[ ]:
print(np.isscalar(np.float64(1.0)))
print(np.isscalar(2.0))
print(np.isscalar(np.array(1.0)))
print(np.isscalar(np.array([1, 2, 3])))
Thus, np.isscalar(x) lets us distinguish scalar values such as numpy.float64 from ndarray instances: the first two calls return True and the last two return False. The as_array function takes advantage of this and converts its argument to an ndarray instance if it is a scalar; otherwise it returns the argument unchanged. Now that we have this convenient function, we add the following code to the Function class.
[ ]:
class Function:
    def __call__(self, input):
        x = input.data
        y = self.forward(x)
        output = Variable(as_array(y))  # Added code
        output.set_creator(self)
        self.input = input
        self.output = output
        return output

    def forward(self, x):
        raise NotImplementedError()

    def backward(self, gy):
        raise NotImplementedError()
As shown above, we apply as_array(y) when wrapping the result of forward propagation, y, in a Variable. That way, the data held by output is guaranteed to be an ndarray instance. Now all the data is an ndarray instance, even if the calculation uses a 0-dimensional ndarray instance.
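The effect of as_array on the problematic 0-dimensional case can be checked in isolation. The following is a minimal sketch using only NumPy and the as_array function from this step; the variable names are chosen for illustration.

```python
import numpy as np


def as_array(x):
    # Convert NumPy or Python scalars to ndarray; pass everything else through
    if np.isscalar(x):
        return np.array(x)
    return x


x = np.array(1.0)   # 0-dimensional ndarray
y = x ** 2          # NumPy returns a scalar (numpy.float64) here

print(type(y))            # <class 'numpy.float64'>
print(type(as_array(y)))  # <class 'numpy.ndarray'>

# An array that is already an ndarray is returned unchanged
z = np.array([1.0, 2.0])
print(as_array(z) is z)   # True
```

Since np.array keeps the data type of a NumPy scalar, a numpy.float32 result would become a float32 ndarray rather than being widened to float64.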
This concludes the work in this step. In the next step, we’re going to talk about testing DeZero.