Wednesday, December 25, 2019

Pitfalls of using Jupyter Notebook

Jupyter Notebook no doubt provides a convenient interactive development environment for coding in Python, and in theory its deployment is machine-independent. In practice, however, when it is deployed on servers, especially servers hosted in an IDC, several pitfalls deserve attention.

Since a server may enforce stricter firewall rules, not all ports are available; usually port 22 is the one that is always open. When setting up local tunnelling, remember that the host and the port must be explicitly specified, otherwise it will fail.
The following is an example using PuTTY:

"C:\Program Files\PuTTY\putty.exe" -ssh username@server_url  -L localhost:local_port:localhost:remote_port

After that, the notebook can be launched on the remote machine. It is handy to remember a few options to the command to facilitate usage, such as the working directory and the port number. The following is an example:

jupyter notebook --notebook-dir=working_directory --no-browser --port=remote_port
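
With the tunnel in place, the notebook is then reachable from the local browser at http://localhost:local_port. On startup Jupyter prints a URL containing an access token; paste that token (or the whole URL, with the port adjusted to the local one) into the local browser to log in.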

Wednesday, March 27, 2019

Another way to use GitHub

I usually just clone repos from GitHub, but today a friend showed me another way; I am logging it here for later reference.

The first step is to make a folder to host the repo, for example:
mkdir $HOME/scalable_agent

Then enter the folder, initialize the repo, and link it to the remote repo on GitHub:
git init
git remote add origin https://github.com/deepmind/scalable_agent.git
git pull -v origin master
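
For this simple case, the three commands above do roughly what a plain clone does in one step:

git clone https://github.com/deepmind/scalable_agent.git

The init/remote/pull route is still handy when the folder already exists, or when you want to choose the remote name and what to fetch yourself.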

The next step is to create a branch to host your modifications:
git checkout -b py3encondingIssue
vim py_process.py

After that, commit the change and push to the remote repo:
git add py_process.py
git commit -m "fixing the encoding issue of the name for class property"
git push -u origin py3encondingIssue

Happy coding!

Thursday, September 6, 2018

Solving the C++ library incompatibility problem when using MATLAB

I tend to mix Python code and MATLAB code together. The most convenient way is to expose MATLAB as a computation engine for Python. However, MATLAB ships with its own specific version of the standard C++ library, which is probably incompatible with the system's one. The following setting can overcome this conflict:

export LD_PRELOAD=/usr/local/matlab2016b/sys/os/glnxa64/libstdc++.so.6.0.20
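
Note the library path is specific to the MATLAB version. As a quick sanity check that the engine then works, here is a minimal session with the MATLAB Engine API for Python (assuming it has been installed from MATLAB's extern/engines/python directory):

import matlab.engine

# Start a MATLAB session and call a built-in function from Python.
eng = matlab.engine.start_matlab()
print(eng.sqrt(4.0))  # prints 2.0
eng.quit()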

Wednesday, July 18, 2018

A wrapper around batch_normalization

Usually I am using Sonnet, but recently, having overlooked the documentation inlined with the source code, I thought there was a potential bug in its implementation. When I turned to the implementation provided by TensorFlow, it was no better off: lots of pitfalls here and there.

The following is a wrapper of mine that demonstrates a use case of the routine; I hope it will be useful. And I believe you know how to save and restore the variables, yes? (If not, see the Saver sketch after the code.)

Enjoy coding, no matter how frustrating it gets.

import numpy as np
import tensorflow as tf

from tensorflow.python.layers import normalization


class MyBatchNorm(object):
    """Wraps the tf.layers BatchNormalization class and registers its
    moving statistics in the MOVING_AVERAGE_VARIABLES collection,
    similar to what snt.BatchNorm does."""

    def __init__(self):
        # axis = 1 treats dimension 1 as the channel dimension (NCHW);
        # for NHWC inputs the channel axis would be 3.
        self._bn = normalization.BatchNormalization(axis = 1,
            epsilon = np.finfo(np.float32).eps, momentum = 0.9)

    def __call__(self, inputs, is_training = True, test_local_stats = False):
        # test_local_stats is kept for signature compatibility with
        # snt.BatchNorm; it is unused here.
        outputs = self._bn(inputs, training = is_training)

        self._add_variable(self._bn.moving_mean)
        self._add_variable(self._bn.moving_variance)

        return outputs

    def _add_variable(self, var):
        # Avoid registering the same variable twice across calls.
        if var not in tf.get_collection(tf.GraphKeys.MOVING_AVERAGE_VARIABLES):
            tf.add_to_collection(tf.GraphKeys.MOVING_AVERAGE_VARIABLES, var)

# A dummy 4-D input batch.
t = tf.truncated_normal([2, 4, 4, 2])


bn = MyBatchNorm()
bn2 = MyBatchNorm()

n = bn(t)
n2 = bn2(t)

# The updates of the moving statistics live in UPDATE_OPS; make them run
# whenever the normalized output is evaluated.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    n = tf.identity(n)


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    n_v, n2_v = sess.run([n, n2])

    # gamma/beta show up as trainable variables, the moving statistics
    # as moving-average variables.
    print(tf.trainable_variables())
    print(tf.moving_average_variables())
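
As for saving and restoring: the moving statistics are not trainable, so they must be listed explicitly if you build the Saver from the trainable variables. A minimal sketch (the checkpoint path is made up):

saver = tf.train.Saver(var_list = tf.trainable_variables() + tf.moving_average_variables())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "/tmp/bn_demo.ckpt")
    saver.restore(sess, "/tmp/bn_demo.ckpt")

Alternatively, the default tf.train.Saver() covers all global variables, moving statistics included.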

Friday, July 6, 2018

A tricky error regarding multiple-GPU training

An unavoidable step in training a model on multiple GPUs is averaging the gradients computed by the different GPUs. A typical error occurs when a gradient is only partially available, or when stop_gradient has been inserted into the graph. The error message looks like this:

ValueError: Tried to convert 'input' to a tensor and failed. Error: None values not supported.

If it happens, try explicitly disabling the trainable property of the affected variables.
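
Alternatively, the averaging routine itself can tolerate missing gradients. A minimal TF1-style sketch (the function name and structure are mine, not from any library):

import tensorflow as tf

def average_gradients(tower_grads):
    # tower_grads: one list of (grad, var) pairs per GPU, as returned
    # by optimizer.compute_gradients() on each tower.
    averaged = []
    for grads_and_vars in zip(*tower_grads):
        grads = [g for g, _ in grads_and_vars if g is not None]
        var = grads_and_vars[0][1]
        if not grads:
            # No tower produced a gradient for this variable (e.g. it sits
            # behind stop_gradient); skip it rather than passing None on.
            continue
        averaged.append((tf.reduce_mean(tf.stack(grads), axis = 0), var))
    return averaged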

Monday, July 2, 2018

Dynamic Programming code in TensorFlow

The following is a TensorFlow implementation of dynamic programming (iterative policy evaluation) for example 4.1 from the great book Reinforcement Learning: An Introduction. As promised, in the end all the pseudo-code from the book will be implemented in TensorFlow.

Enjoy it, and further discussion is welcome.

import tensorflow as tf

num_iters = 1000
num_states = 16

# One scalar variable per state of the 4x4 gridworld; states 0 and 15
# are terminal.
V = [tf.get_variable("V%d" % i, [], tf.float64, initializer = tf.zeros_initializer()) for i in range(num_states)]

# Bellman expectation backups for the equiprobable random policy: every
# move earns reward -1, moves off the grid leave the state unchanged,
# and the terminal states 0 and 15 keep their value.
V0 = V[0]
V1 = -0.25 * (1 - V[0] + 1 - V[1] + 1 - V[2] + 1 - V[5])
V2 = -0.25 * (1 - V[1] + 1 - V[2] + 1 - V[3] + 1 - V[6])
V3 = -0.25 * (1 - V[2] + 1 - V[3] + 1 - V[3] + 1 - V[7])
V4 = -0.25 * (1 - V[4] + 1 - V[0] + 1 - V[5] + 1 - V[8])
V5 = -0.25 * (1 - V[4] + 1 - V[1] + 1 - V[6] + 1 - V[9])
V6 = -0.25 * (1 - V[5] + 1 - V[2] + 1 - V[7] + 1 - V[10])
V7 = -0.25 * (1 - V[6] + 1 - V[3] + 1 - V[7] + 1 - V[11])
V8 = -0.25 * (1 - V[8] + 1 - V[4] + 1 - V[9] + 1 - V[12])
V9 = -0.25 * (1 - V[8] + 1 - V[5] + 1 - V[10] + 1 - V[13])
V10 = -0.25 * (1 - V[9] + 1 - V[6] + 1 - V[11] + 1 - V[14])
V11 = -0.25 * (1 - V[10] + 1 - V[7] + 1 - V[11] + 1 - V[15])
V12 = -0.25 * (1 - V[12] + 1 - V[8] + 1 - V[13] + 1 - V[12])
V13 = -0.25 * (1 - V[12] + 1 - V[9] + 1 - V[14] + 1 - V[13])
V14 = -0.25 * (1 - V[13] + 1 - V[10] + 1 - V[15] + 1 - V[14])
V15 = V[15]


# Collect the updated value expressions so they can be indexed directly
# instead of going through eval().
V_new = [V0, V1, V2, V3, V4, V5, V6, V7,
         V8, V9, V10, V11, V12, V13, V14, V15]

delta_lst = []
for i in range(num_states):
    verbose_op = tf.Print(V[i], [tf.round(V[i])], message = "value of V(%d) = " % i)
    delta_lst.append(tf.abs(V[i] - V_new[i]))
    with tf.control_dependencies([verbose_op]):
        V[i] = tf.assign(V[i], V_new[i])

delta = tf.reduce_max(delta_lst)

# Stop once the largest per-state change falls below the threshold.
stop_op = tf.less(delta, 0.0001)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for i in range(num_iters):
        if sess.run(stop_op):
            print("\ncurrent iteration {}".format(i))
            break

        for j in range(num_states):
            # Running V[j] now executes the assign op (and its Print).
            sess.run(V[j])
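
For comparison, a plain NumPy version of the same backup (done as a synchronous sweep here, my own sketch rather than the book's) makes a handy cross-check; both converge to the same values:

import numpy as np

V = np.zeros(16)
while True:
    V_old = V.copy()
    for s in range(1, 15):  # states 0 and 15 are terminal
        r, c = divmod(s, 4)
        neighbors = [(max(r - 1, 0), c), (min(r + 1, 3), c),
                     (r, max(c - 1, 0)), (r, min(c + 1, 3))]
        # Equiprobable random policy, reward -1 per move.
        V[s] = 0.25 * sum(-1.0 + V_old[4 * nr + nc] for nr, nc in neighbors)
    if np.max(np.abs(V - V_old)) < 1e-4:
        break

print(V.reshape(4, 4).round())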

Sunday, June 3, 2018

Steps to invoke PDB to debug python scripts

Actually, this is a replica of https://stackoverflow.com/questions/35496298/pdb-automatically-append-to-sys-path

Probably it is tedious, but at least it works (a made-up example session follows the steps):

1. switch from python [my-script] to python -m pdb [my-script].
2. import sys
3. sys.path.append([full path to subdirectory where [module-XY] lies])
4. b [module-XY]:[line]
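
For concreteness, a hypothetical session (the script name, path, and module are made up):

$ python -m pdb my_script.py
(Pdb) import sys
(Pdb) sys.path.append("/home/user/project/lib")
(Pdb) b helper_module:42
Breakpoint 1 at /home/user/project/lib/helper_module.py:42
(Pdb) c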