In this tutorial, we will cover some basics of linear algebra. We will use ND4S, the Scala bindings for ND4J, a scientific computing library for the JVM.
Linear Algebra building blocks
We begin by discussing the building blocks of linear algebra: scalars, vectors, matrices and tensors:
- Scalar: a single number
- Vector: an ordered array of numbers
- Matrix: a two-dimensional array of numbers, arranged in rows and columns
- Tensor: a multidimensional array of numbers
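As a rough illustration, the four building blocks can be mirrored with plain Scala types. This sketch uses only the standard library. The object and value names are hypothetical, and nested arrays are a stand-in for illustration, not how ND4J stores its data.

```scala
// Plain-Scala stand-ins for the four building blocks (illustration only).
object BuildingBlocksSketch {
  // Scalar: a single number.
  val scalar: Double = 3.0

  // Vector: an ordered array of numbers.
  val vector: Array[Double] = Array(3.0, 5.0)

  // Matrix: 2 rows x 3 columns.
  val matrix: Array[Array[Double]] =
    Array(Array(1.0, 2.0, 3.0), Array(4.0, 5.0, 6.0))

  // Tensor: here a 2 x 2 x 2 array.
  val tensor: Array[Array[Array[Double]]] =
    Array(
      Array(Array(1.0, 2.0), Array(3.0, 4.0)),
      Array(Array(5.0, 6.0), Array(7.0, 8.0))
    )

  def main(args: Array[String]): Unit = {
    println(s"vector has ${vector.length} elements")
    println(s"matrix is ${matrix.length} x ${matrix(0).length}")
    println(s"tensor is ${tensor.length} x ${tensor(0).length} x ${tensor(0)(0).length}")
  }
}
```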
Why do we need Linear Algebra for Machine Learning?
Machine learning is all about data, and data can be represented as a vector, matrix or tensor. To be effective, machine learning often requires large amounts of data. Computations on large matrices can be performed very efficiently using highly optimized libraries for matrix operations, such as ND4J.
ND4J
The core data structure in ND4J is the NDArray, which is a multi-dimensional array of numbers: a vector, matrix or tensor.
Internally, it may store single precision or double precision floating point values for each entry.
Let’s add the dependencies to our sbt build.
    val nd4jVersion = "0.9.1"
    libraryDependencies += "org.nd4j" % "nd4j-native-platform" % nd4jVersion
    libraryDependencies += "org.nd4j" %% "nd4s" % nd4jVersion
Add the following import statements.
    import org.nd4j.linalg.factory.Nd4j
    import org.nd4s.Implicits._
NDArray operations
We will use the Nd4j class, which exposes many static methods for creating and manipulating NDArrays.
Creating Vectors
We can create an NDArray that represents a vector from an Array.
    scala> val vec = Nd4j.create(Array(3d, 5d))
    vec: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00]
Drawn in the plane, this 2-dimensional vector is an arrow from the origin to the point (3, 5).
To create a vector of all zeros we can use the zeros function.
    scala> val vec = Nd4j.zeros(10)
    vec: org.nd4j.linalg.api.ndarray.INDArray = [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]
Adding two vectors
Two vectors of the same size can be added by adding the corresponding elements.
    scala> val one = Nd4j.create(Array(1d,2d,3d))
    one: org.nd4j.linalg.api.ndarray.INDArray = [1.00, 2.00, 3.00]
    scala> val two = Nd4j.create(Array(3d,5d,6d))
    two: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]
    scala> one + two
    res7: org.nd4j.linalg.api.ndarray.INDArray = [4.00, 7.00, 9.00]
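To make the element-wise rule concrete, here is a minimal plain-Scala sketch of vector addition. It uses only the standard library; the object and function names are hypothetical, and it illustrates the definition rather than how ND4J computes the sum internally.

```scala
// Plain-Scala sketch of vector addition (not the ND4J implementation).
object VectorAddSketch {
  // Adds two vectors of the same size element by element.
  def add(a: Array[Double], b: Array[Double]): Array[Double] = {
    require(a.length == b.length, "vectors must have the same size")
    a.zip(b).map { case (x, y) => x + y }
  }

  def main(args: Array[String]): Unit = {
    val sum = add(Array(1.0, 2.0, 3.0), Array(3.0, 5.0, 6.0))
    println(sum.mkString("[", ", ", "]")) // prints [4.0, 7.0, 9.0]
  }
}
```

The `require` call rejects vectors of different sizes, mirroring the condition that only vectors of the same size can be added.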
Scalar-vector multiplication
Scalar-vector multiplication is an operation in which every element of the vector is multiplied by a scalar.
    scala> val vec = Nd4j.create(Array(3d,5d,6d))
    vec: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]

    scala> vec * 2
    res8: org.nd4j.linalg.api.ndarray.INDArray = [6.00, 10.00, 12.00]
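Written out as a formula, with \(c\) standing for the scalar and \(v_1, \dots, v_n\) for the vector's elements (symbols chosen here for illustration), the operation and the example above look like this:

$$
c \cdot (v_1, v_2, \dots, v_n) = (c\,v_1, c\,v_2, \dots, c\,v_n), \qquad 2 \cdot (3, 5, 6) = (6, 10, 12)
$$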
Vector transpose
The transpose of a vector changes a column vector to a row vector or vice versa.
    scala> val vec = Nd4j.create(Array(3d,5d,6d))
    vec: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]

    scala> vec.shape
    res8: Array[Int] = Array(1, 3)

    scala> vec.T.shape
    res9: Array[Int] = Array(3, 1)
We used the shape method to check the size of each dimension. The shape has changed from 1×3 (1 row, 3 columns) to 3×1 (3 rows, 1 column).
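The same shape change can be sketched in plain Scala by treating a row vector as a matrix with a single row. This uses only the standard library's `transpose` on nested arrays; the object and function names are hypothetical, and the sketch is an illustration rather than what ND4J does internally.

```scala
// Plain-Scala sketch of transposing a row vector (illustration only).
object TransposeSketch {
  // Swaps rows and columns of a matrix stored as nested arrays.
  def transpose(m: Array[Array[Double]]): Array[Array[Double]] =
    m.transpose

  def main(args: Array[String]): Unit = {
    val row = Array(Array(3.0, 5.0, 6.0)) // 1 x 3 row vector
    val col = transpose(row)              // 3 x 1 column vector
    println(s"${row.length} x ${row(0).length}") // prints 1 x 3
    println(s"${col.length} x ${col(0).length}") // prints 3 x 1
  }
}
```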
Vector dot product
The vector dot product is one of the most important operations in all of machine learning.
It is defined as the sum of the products of corresponding elements of two vectors of the same size. We can think of the dot product as a measure of similarity between two vectors.
    scala> val vec1 = Nd4j.create(Array(3d,5d,6d))
    vec1: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]

    scala> val vec2 = Nd4j.create(Array(1d,2d,3d))
    vec2: org.nd4j.linalg.api.ndarray.INDArray = [1.00, 2.00, 3.00]

    scala> vec1.dot(vec2.T)
    res12: org.nd4j.linalg.api.ndarray.INDArray = 31.00
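To see where the result 31 comes from, here is a minimal plain-Scala sketch of the dot product: multiply corresponding elements, then sum the products. It uses only the standard library; the object and function names are hypothetical, and it is not the ND4J implementation.

```scala
// Plain-Scala sketch of the dot product (not the ND4J implementation).
object DotProductSketch {
  // Multiplies corresponding elements and sums the products.
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same size")
    a.zip(b).map { case (x, y) => x * y }.sum
  }

  def main(args: Array[String]): Unit = {
    // 3*1 + 5*2 + 6*3 = 3 + 10 + 18 = 31
    println(dot(Array(3.0, 5.0, 6.0), Array(1.0, 2.0, 3.0))) // prints 31.0
  }
}
```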
Creating matrices
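A matrix is a two-dimensional array of numbers, arranged in rows and columns. We can create one from an Array of Arrays.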
    scala> val matrix = Nd4j.create(Array(Array(1d,2d, 3d), Array(4d, 5d, 6d)))
    matrix: org.nd4j.linalg.api.ndarray.INDArray =
    [[1.00, 2.00, 3.00],
     [4.00, 5.00, 6.00]]
We can now check the shape of our newly created matrix.
    scala> matrix.shape
    res0: Array[Int] = Array(2, 3)
Accessing and setting matrix elements
NDArray supports multidimensional indexing. To access or set a particular element of a matrix, we specify its row and column indices, both of which are zero-based.
    scala> val matrix = Nd4j.create(Array(Array(1d,2d, 3d), Array(4d, 5d, 6d)))
    matrix: org.nd4j.linalg.api.ndarray.INDArray =
    [[1.00, 2.00, 3.00],
     [4.00, 5.00, 6.00]]

    scala> matrix(1,2)
    res1: Double = 6.0

    scala> matrix(0,0) = 100
    res2: org.nd4j.linalg.api.ndarray.INDArray =
    [[100.00, 2.00, 3.00],
     [4.00, 5.00, 6.00]]
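The row-then-column indexing can be mimicked with plain nested arrays, which may help if the zero-based offsets are unfamiliar. This sketch uses only the standard library; the object and its `get` and `set` helpers are hypothetical names chosen for illustration and are not part of ND4J or ND4S.

```scala
// Plain-Scala sketch of zero-based matrix indexing (illustration only).
object IndexingSketch {
  // Reads the element at the given zero-based row and column.
  def get(m: Array[Array[Double]], row: Int, col: Int): Double =
    m(row)(col)

  // Overwrites the element at the given zero-based row and column in place.
  def set(m: Array[Array[Double]], row: Int, col: Int, value: Double): Unit =
    m(row)(col) = value

  def main(args: Array[String]): Unit = {
    val matrix = Array(Array(1.0, 2.0, 3.0), Array(4.0, 5.0, 6.0))
    println(get(matrix, 1, 2)) // second row, third column: prints 6.0
    set(matrix, 0, 0, 100.0)
    println(matrix(0).mkString("[", ", ", "]")) // prints [100.0, 2.0, 3.0]
  }
}
```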
Matrix addition and subtraction
Two matrices can be added or subtracted if, and only if, they have the same dimensions. To add (or subtract) two matrices of the same dimensions, just add (or subtract) the corresponding entries.
    scala> val matrix1 = Nd4j.create(Array(Array(1d,2d, 3d), Array(4d, 5d, 6d)))
    matrix1: org.nd4j.linalg.api.ndarray.INDArray =
    [[1.00, 2.00, 3.00],
     [4.00, 5.00, 6.00]]

    scala> val matrix2 = Nd4j.create(Array(Array(1d,2d, 3d), Array(4d, 5d, 6d)))
    matrix2: org.nd4j.linalg.api.ndarray.INDArray =
    [[1.00, 2.00, 3.00],
     [4.00, 5.00, 6.00]]

    scala> matrix1 + matrix2
    res0: org.nd4j.linalg.api.ndarray.INDArray =
    [[2.00, 4.00, 6.00],
     [8.00, 10.00, 12.00]]

    scala> matrix1 - matrix2
    res1: org.nd4j.linalg.api.ndarray.INDArray =
    [[0.00, 0.00, 0.00],
     [0.00, 0.00, 0.00]]
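For two \(2 \times 3\) matrices, the entry-by-entry rule can be written out as follows, with \(a_{ij}\) and \(b_{ij}\) denoting the entries of the two matrices (notation chosen here for illustration):

$$
\begin{pmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23}
\end{pmatrix} \pm \begin{pmatrix}
b_{11} & b_{12} & b_{13} \\
b_{21} & b_{22} & b_{23}
\end{pmatrix} =
\begin{pmatrix}
a_{11} \pm b_{11} & a_{12} \pm b_{12} & a_{13} \pm b_{13} \\
a_{21} \pm b_{21} & a_{22} \pm b_{22} & a_{23} \pm b_{23}
\end{pmatrix}
$$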
Matrix product
The matrix dot product is an operation that produces a matrix from two matrices. The number of columns of the first matrix must equal the number of rows of the second, and the result has as many rows as the first matrix and as many columns as the second. This is how we define the dot product of two matrices, A \((2 \times 3)\) and B \((3 \times 2)\).
$$
\begin{pmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21}& a_{22} & a_{23} \\
\end{pmatrix} \cdot \begin{pmatrix}
b_{11} & b_{12} \\
b_{21}& b_{22} \\
b_{31}& b_{32} \\
\end{pmatrix} =
$$
$$
\begin{pmatrix}
a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31} & a_{11} b_{12} + a_{12} b_{22} + a_{13} b_{32} \\
a_{21} b_{11} + a_{22} b_{21} + a_{23} b_{31} & a_{21} b_{12} + a_{22} b_{22} + a_{23} b_{32} \\
\end{pmatrix}
$$
    scala> val matrix1 = Nd4j.create(Array(Array(1d,2d, 3d), Array(4d, 5d, 6d)))
    matrix1: org.nd4j.linalg.api.ndarray.INDArray =
    [[1.00, 2.00, 3.00],
     [4.00, 5.00, 6.00]]

    scala> val matrix2 = Nd4j.create(Array(Array(10d,20d), Array(30d, 40d), Array(50d, 60d)))
    matrix2: org.nd4j.linalg.api.ndarray.INDArray =
    [[10.00, 20.00],
     [30.00, 40.00],
     [50.00, 60.00]]

    scala> matrix1.dot(matrix2)
    res2: org.nd4j.linalg.api.ndarray.INDArray =
    [[220.00, 280.00],
     [490.00, 640.00]]
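The definition can be checked by hand with a small plain-Scala sketch: each entry of the result is the dot product of a row of the first matrix with a column of the second. It uses only the standard library; the object and function names are hypothetical, and this naive version illustrates the definition rather than the optimized routine ND4J uses.

```scala
// Plain-Scala sketch of the matrix product (not the ND4J implementation).
object MatMulSketch {
  // Entry (i, j) of the result is the dot product of row i of a and column j of b.
  def matmul(a: Array[Array[Double]], b: Array[Array[Double]]): Array[Array[Double]] = {
    require(a.nonEmpty && b.nonEmpty, "matrices must be non-empty")
    require(a(0).length == b.length, "columns of a must equal rows of b")
    val rows = a.length
    val cols = b(0).length
    val inner = b.length
    Array.tabulate(rows, cols) { (i, j) =>
      (0 until inner).map(p => a(i)(p) * b(p)(j)).sum
    }
  }

  def main(args: Array[String]): Unit = {
    val a = Array(Array(1.0, 2.0, 3.0), Array(4.0, 5.0, 6.0))
    val b = Array(Array(10.0, 20.0), Array(30.0, 40.0), Array(50.0, 60.0))
    // prints [220.0, 280.0] and then [490.0, 640.0]
    matmul(a, b).foreach(row => println(row.mkString("[", ", ", "]")))
  }
}
```

For example, the top-left entry is 1·10 + 2·30 + 3·50 = 220, matching the ND4J output above.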
Matrix element-wise multiplication
Element-wise matrix multiplication takes two matrices of the same dimensions and produces another matrix with elements that are a product of corresponding elements.
    scala> val matrix1 = Nd4j.create(Array(Array(1d,2d, 3d), Array(4d, 5d, 6d)))
    matrix1: org.nd4j.linalg.api.ndarray.INDArray =
    [[1.00, 2.00, 3.00],
     [4.00, 5.00, 6.00]]

    scala> val matrix2 = Nd4j.create(Array(Array(10d,20d, 30d), Array(40d, 50d, 60d)))
    matrix2: org.nd4j.linalg.api.ndarray.INDArray =
    [[10.00, 20.00, 30.00],
     [40.00, 50.00, 60.00]]

    scala> matrix1 * matrix2
    res8: org.nd4j.linalg.api.ndarray.INDArray =
    [[10.00, 40.00, 90.00],
     [160.00, 250.00, 360.00]]
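For comparison with the matrix product, the element-wise product of two \(2 \times 3\) matrices can be written out as follows. The symbol \(\odot\) is a notation chosen here for illustration; the tutorial's code simply uses `*`.

$$
\begin{pmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23}
\end{pmatrix} \odot \begin{pmatrix}
b_{11} & b_{12} & b_{13} \\
b_{21} & b_{22} & b_{23}
\end{pmatrix} =
\begin{pmatrix}
a_{11} b_{11} & a_{12} b_{12} & a_{13} b_{13} \\
a_{21} b_{21} & a_{22} b_{22} & a_{23} b_{23}
\end{pmatrix}
$$

Unlike the matrix product, the result has the same shape as the inputs, and each entry depends on only one entry from each matrix.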