Introduction to TensorFlow – Part 1


Hello friends, in this and upcoming tutorials, we will try to understand TensorFlow and how we can use it to build some interesting neural network architectures.

TensorFlow is an open-source software library for high-performance numerical computation, used mainly through its Python API. It was created by Google and released under the Apache 2.0 open-source license. It is a foundation library: you can create deep learning models with it directly, or through wrapper libraries such as Keras that are built on top of TensorFlow and simplify the process. It can run on single-CPU systems, GPUs, mobile devices, and large-scale distributed systems of hundreds of machines. As a result, TensorFlow is widely used not only in research but also in production software. This has made the library so popular that it has more than 100,000 stars on GitHub.

TensorFlow can be considered as an interface for expressing machine learning algorithms, and an implementation for executing such algorithms.

To achieve this, TensorFlow uses a computational graph in which the flow of tensors is defined. Before moving on to actual programming, we will cover some basic terminology.

  • Tensor:

Just as a vector can be thought of as an array (or list) of scalars (numbers such as 1, 2, or pi), and a matrix can be thought of as an array of vectors, a tensor can be thought of as an array of matrices. More generally, a tensor is just an n-dimensional array, which means it can be used to represent n-dimensional datasets. This representation helps a lot when working with machine learning.

  • Computational graphs (or flow of tensors):

The flow is how tensors are passed around the network. Each node of the graph holds one mathematical operation, which is applied to the tensors that pass through it; the resulting tensor may have different values and a different shape from the inputs. Always remember that, unlike a general graph data structure in computer science, a computational graph in TensorFlow is never cyclic.
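To make these two terms concrete, here is a minimal sketch in plain Python, not TensorFlow. It writes tensors as nested lists and builds a tiny acyclic graph whose nodes each hold one operation. The names shape, Node and constant are made up for this illustration and are not part of any library.

```python
# Tensors of increasing rank, written as plain nested Python lists.
scalar = 3.0                        # rank 0: a single number
vector = [1.0, 2.0, 3.0]            # rank 1: an array of scalars
matrix = [[1.0, 2.0], [3.0, 4.0]]   # rank 2: an array of vectors
cube = [matrix, matrix]             # rank 3: an array of matrices


def shape(tensor):
    """Return the shape of a regularly nested list as a tuple."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


class Node:
    """One graph node: a single operation applied to its input nodes."""

    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = inputs

    def run(self):
        # Evaluate the input nodes first, then apply this node's operation.
        return self.op(*(node.run() for node in self.inputs))


def constant(value):
    """A node with no inputs that always produces the same tensor."""
    return Node(lambda: value)


# Graph: two constant vectors -> element-wise add -> sum of all elements.
u = constant([1.0, 2.0, 3.0])
v = constant([10.0, 20.0, 30.0])
added = Node(lambda x, y: [p + q for p, q in zip(x, y)], (u, v))
total = Node(sum, (added,))

print(shape(cube))              # (2, 2, 2)
print(added.run(), total.run())
```

Nothing is computed until run() is called on a node, which loosely mirrors the TensorFlow 1.x idea of defining a graph first and executing it afterwards. Note also how the shape changes along the way: the add node outputs a vector of shape (3,), while the sum node outputs a scalar.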

Installing TensorFlow is quite easy, and the instructions can be found on the official website. I won’t be discussing installation, but if you face any problem, please feel free to ask me in the comment section.

Now let’s go to the most exciting part of this tutorial: actual programming. We will follow the programming guide provided by Aymeric Damien in the GitHub repository. That guide is divided into Introduction, Basic Machine Learning Models, Supervised Neural Networks, Unsupervised Neural Networks, and some regular Utilities, Data Management techniques and Multi-GPU usage.

 Introduction to TensorFlow Programming:

As mentioned above, a TensorFlow program is a flow of tensors through a graph. This flow is controlled by a session. So the first step is to define the computational graph, and the second is to create a session that runs parts of the graph across a set of local and remote devices.

Let us start with a simple Hello World program in TensorFlow. The problem statement is: print Hello World using TensorFlow. Hello World is a simple constant string, so the computational graph is a single node holding “Hello World” as a constant. Such a node is created with the constant tensor API, i.e. tf.constant. Then, using tf.Session, we just run the tensor and print the result. You can find my code here.
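The steps above can be sketched as follows. This is a minimal sketch, not necessarily identical to the linked code. It assumes a TensorFlow version that ships the tensorflow.compat.v1 module (TensorFlow 2.x does), so that the 1.x-style tf.Session is available. The variable names hello and message are only illustrative.

```python
import tensorflow.compat.v1 as tf

# Assumption: on TensorFlow 2.x we switch back to 1.x-style graph mode.
tf.disable_eager_execution()

# Step 1: define the graph. Here it is a single constant node.
hello = tf.constant("Hello World")

# Step 2: create a session and run the node to get its value.
with tf.Session() as sess:
    message = sess.run(hello)

# sess.run returns the string as bytes, so decode it before printing.
print(message.decode("utf-8"))
```

Printing hello directly would only show a description of the tensor. The string itself becomes available once the session has run the node.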

Now let’s see how to perform some arithmetic operations on scalars in TensorFlow. Two tensors are created using tf.constant. TensorFlow has separate functions for the simple arithmetic operations: tf.add, tf.subtract, tf.multiply and tf.divide. You can also use the standard operator symbols (+, -, *, /). You can find my code here.
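A hedged sketch of such a program is shown below. It again assumes the tensorflow.compat.v1 module is available. The constant values 6.0 and 3.0 and the variable names are chosen only for illustration and may differ from the linked code.

```python
import tensorflow.compat.v1 as tf

# Assumption: on TensorFlow 2.x we switch back to 1.x-style graph mode.
tf.disable_eager_execution()

# Two scalar constant tensors (illustrative values).
a = tf.constant(6.0)
b = tf.constant(3.0)

# Arithmetic through the dedicated functions...
total = tf.add(a, b)
difference = tf.subtract(a, b)
product = tf.multiply(a, b)
quotient = tf.divide(a, b)

# ...and through the standard operator symbols, which build the same kind of nodes.
total_sym = a + b
product_sym = a * b

# Nothing has been computed yet; the session evaluates the requested nodes.
with tf.Session() as sess:
    results = sess.run(
        [total, difference, product, quotient, total_sym, product_sym]
    )

print(results)
```

Passing a list to sess.run evaluates all the listed nodes in one call and returns their values in the same order.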

So far, we have seen basic TensorFlow operations. But we don’t need TensorFlow to print a statement or to perform some basic arithmetic. So next we will try to build a linear regression model from scratch. You can follow my code here and the output can be seen here.

Linear regression is nothing but the simple equation of a line in the form Y = W * X + b, where W is the weight and b is the bias. In general, X can be a matrix and Y a vector. In our scenario, we will consider both X and Y as vectors. So the equation can be summarised as

 vector Y is the output obtained when vector X is multiplied by weight W and bias b is added

and the task is to estimate the values of W and b for which this equation is satisfied. To generate the train and test sets, consider W = 3.14 (closely related to pi) and a bias value of 5.6. You can freely choose any other numeric values. Since the generated data will be fed into TensorFlow’s session, we need to create placeholders for it, i.e. X and Y. While the session runs, these placeholders are replaced by the assigned data. Similarly, since the values of W and b are modified during the session, they are declared as variables. Then we create the prediction equation and the cost function, i.e. the mean squared difference between the predicted and the original Y. This cost is passed to the gradient descent optimizer provided by TensorFlow for minimization. We will go through gradient descent and other optimization techniques in later posts. In short, gradient descent adjusts the weight and bias so that the predicted values of Y come as close as possible to the original values. After running the session, the learned values are W = 3.4420657 and b = 5.401249, which are reasonably close to our original W and b. The graph of test data for original and predicted values is at the end of this page.
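The procedure described above can be sketched as follows. This is not the linked code, so the learned values will differ from the ones quoted. It assumes NumPy and the tensorflow.compat.v1 module are available. The number of points, the noise level, the learning rate of 0.1 and the 2000 training steps are all illustrative choices.

```python
import numpy as np
import tensorflow.compat.v1 as tf

# Assumption: on TensorFlow 2.x we switch back to 1.x-style graph mode.
tf.disable_eager_execution()

# Synthetic training data from Y = 3.14 * X + 5.6 plus a little noise.
rng = np.random.RandomState(0)
train_x = np.linspace(0.0, 1.0, 50).astype(np.float32)
noise = rng.normal(0.0, 0.05, size=50)
train_y = (3.14 * train_x + 5.6 + noise).astype(np.float32)

# Placeholders are filled with data when the session runs.
X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])

# Variables hold the values that training modifies.
W = tf.Variable(0.0)
b = tf.Variable(0.0)

# Prediction equation and cost (mean squared difference).
pred = W * X + b
cost = tf.reduce_mean(tf.square(pred - Y))

# One gradient descent step that reduces the cost.
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(2000):
        sess.run(train_op, feed_dict={X: train_x, Y: train_y})
    learned_w, learned_b = sess.run([W, b])

print("W =", learned_w, "b =", learned_b)
```

With noisy data the learned W and b will not match 3.14 and 5.6 exactly. How close they get depends on the noise, the learning rate and the number of steps.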

In the upcoming post, we will try to implement some other machine learning algorithms using TensorFlow.

About the author: sagarjain2030
