Why choose IP?

This post is intended for class 10 students, but the ideas can also apply to anyone else who is interested in the field of computer science.

My introduction

My name is Udit Gulati. I am currently a final-year CS student at Indian Institute of Information Technology, Una. I did my 12th from DAV, Rohtak and took IP as an optional subject.

Target audience

This post is written for class 10 students who are deciding which optional subject to choose for their 11th and 12th classes, especially those who are interested in taking the non-medical stream and then pursuing Engineering after class 12. It details the pros and cons of taking IP as an optional subject.

Possibilities and their benefits

There are a few subjects that can be taken as the optional subject, such as IP, fine arts, music, psychology, physical education etc. Each one has its benefits, and some are primarily taken by students of particular streams. For example, psychology is primarily taken by medical students as it is relevant to the field of medical science.

The purpose of the optional subject is to give an idea of the field one wants to study in college. If you want to study Computer Science, studying IP will give you a perspective on it. If you want to study performing arts, studying music will give you that perspective.

How IP is beneficial before college

IP stands for Informatics Practices. It covers some basic concepts of computer programming (in Java or Python) and teaches you how to think in terms of programming. One gets to learn how to convert logic into a program, along with basic algorithms and data structures.

Taking IP as an optional subject helps a lot if one is a non-medical student, as most non-medical students end up opting for Engineering, and most of those want to go for Computer Science. Some get it, but those who don't end up taking some other specialization that does not interest them much and still learn computer science through self-study. So, anyone who has an interest in studying Computer Science, or plans to study it, should definitely take IP as an optional subject. It will not only give you an idea of the field but also get you comfortable with basic programming, which will give you a head start during the first year of college and help you understand those concepts even better.

Why learn to code?

If you want to pursue engineering, then computer programming is the most helpful skill to learn during college. It not only helps in securing a good job but also improves logical thinking. It is very important to be fluent in coding when the specialization is Computer Science. Even with a specialization other than Computer Science, be it Electronics, Electrical, Civil or Mechanical, most students end up working in tech companies where most of their work involves programming skills. So, if one wants to be successful as an engineer, learning to code is the most sensible thing to do.

For fields other than engineering, like biology and commerce, the skill of coding still helps a lot, since someone who knows how to code can easily write small scripts to automate much of the work. Both of these fields involve a lot of processing and analyzing of data, and such work can be done easily and quickly using simple Python scripts.

Final thought

It may be that you are not sure whether you want to do engineering, or whether you want to pursue Computer Science. It is best to at least give IP a try during 11th if there is a possibility of you ending up taking engineering. You'll also find out whether Computer Science is really one of your interests. If it is not, you can change to some other optional subject in 12th. Even if you do not end up taking engineering or working in a tech company, the skill of coding will still be helpful in other fields, as logical thinking applies to essentially every field.



GSoC 2019 | Work Product

The GSoC coding period has come to an end. It’s been quite a learning experience for me and has given me an opportunity to come out of my comfort zone and explore new things. I feel quite delighted knowing that I have contributed a significant amount to the project. This post summarizes my work on NumRuby during the GSoC period.

GSoC 2019 proposal - https://docs.google.com/document/d/1MR01QZeX_8h7a16nmkOYlyrVt–osB1Yg9Vo0xXYtSw/edit?usp=sharing

Pull requests

Blog Posts

Future work

  • As these features are core to NMatrix, they need to be tested more thoroughly and benchmarked. Also, since this is the first implementation of these features, some modifications could make them more reliable and faster.
  • More iterators are to be implemented in order to have a richer NMatrix API.
  • The issue with slicing needs to be fixed soon. The format used for slice range specification also needs to be standardized.
  • Broadcasting needs to be tested and benchmarked. This will let us improve its reliability and performance. A few APIs could be exposed which would let a user manually broadcast a matrix to a given shape.
  • NumRuby::Lapack needs proper exception handling for LAPACK routines. This will be done by reading the info value returned by the routine and raising an exception accordingly (a possible approach is sketched after this list).
  • NumRuby::Linalg can be extended to have even more methods.
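
As a rough illustration of the exception-handling item above, the following sketch (the helper name and error messages are my own assumptions, not NumRuby code) shows how the info value returned by a LAPACK routine could be turned into a Ruby exception inside the C extension:

static void check_lapack_info(const char* routine, int info) {
  //info == 0 means the routine completed successfully
  if(info < 0) {
    rb_raise(rb_eArgError, "%s: argument %d had an illegal value", routine, -info);
  }
  else if(info > 0) {
    rb_raise(rb_eRuntimeError, "%s: routine failed with info = %d", routine, info);
  }
}

A wrapper such as nm_geqrf could then call check_lapack_info("geqrf", info) right after the LAPACKE call.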

Acknowledgements

I want to thank the Ruby Science Foundation, all the mentors and the org admins for providing me this wonderful opportunity to enhance my knowledge and work on a project. I would also like to thank Google for organizing such a wonderful program, through which I got introduced to open source and the Ruby Science Foundation. I especially want to thank my mentor Prasun Anand for guiding me through this period, keeping me motivated and tolerating my procrastination.



GSoC 2019 | Week 9, 10

Introduction

During this period, I continued the work on the linear algebra functionality of NumRuby to write NumRuby::Linalg. I implemented most of the required wrapper functions, and these will now be used to implement the NumRuby::Linalg methods. The source code of numpy.linalg and Numo::Linalg acted as a great resource for getting to a good enough implementation of these methods.

Linear Algebra

TODO

NumRuby::Linalg

TODO

Corresponding pull request can be found here at https://github.com/SciRuby/numruby/pull/30.

After this, I’ll work on adding better support for sparse matrices, cleaning up the code, writing tests and fixing minor bugs. Feel free to share your thoughts in the comments section. Thanks for reading.



GSoC 2019 | Week 7, 8

Introduction

During these weeks, I started work on the linear algebra functionality of NumRuby. This is being done by implementing linear algebra methods under the module NumRuby::Linalg. These methods make use of LAPACK and BLAS routines, but those routines can’t be called directly from Ruby code, so there need to be wrapper functions in C which expose them to Ruby. I implemented most of the required wrapper functions and will later use them to implement the NumRuby::Linalg methods. I went through the source code of numpy.linalg and Numo::Linalg to get a sense of the ways the linear algebra capabilities could be implemented.

BLAS and LAPACK

LAPACK and BLAS are linear algebra libraries written in Fortran, and they underlie the linear algebra capabilities of nearly every numerical library in modern languages, be it NMatrix and Numo in Ruby or NumPy in Python, primarily because of their performance and efficiency.

Wikipedia defines BLAS as follows: “Basic Linear Algebra Subprograms (BLAS) is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matrix multiplication. They are the de facto standard low-level routines for linear algebra libraries; the routines have bindings for both C and Fortran. Although the BLAS specification is general, BLAS implementations are often optimized for speed on a particular machine, so using them can bring substantial performance benefits. BLAS implementations will take advantage of special floating point hardware such as vector registers or SIMD instructions.”

It defines LAPACK as follows: “LAPACK (Linear Algebra Package) is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It also includes routines to implement the associated matrix factorizations such as LU, QR, Cholesky and Schur decomposition. LAPACK was originally written in FORTRAN 77, but moved to Fortran 90 in version 3.2 (2008). The routines handle both real and complex matrices in both single and double precision.” One can explore BLAS and LAPACK at http://www.netlib.org/blas/ and http://www.netlib.org/lapack/.
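
To make the above concrete, here is a small, self-contained example (illustrative only, not NumRuby code) of calling a BLAS routine through CBLAS, the C interface to BLAS:

#include <stdio.h>
#include <cblas.h>

int main(void) {
  double x[3] = {1.0, 2.0, 3.0};
  double y[3] = {4.0, 5.0, 6.0};

  //cblas_ddot computes the dot product of two double-precision vectors
  double result = cblas_ddot(3, x, 1, y, 1);

  printf("dot product = %f\n", result); //prints 32.000000
  return 0;
}

How the example is compiled and linked depends on the BLAS implementation installed, for example something like cc example.c -lcblas with the reference implementation.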

NumRuby::Linalg

NumRuby::Linalg is the module of NumRuby which provides the linear algebra functionality. It provides methods for matrix decompositions, inverse, dot product, matrix norm and determinant, and solving linear equations. My initial approach to implementing these methods was to write the module completely in C, as I thought it would be optimal in terms of speed. But there were some problems with this approach, which I realized when I started implementing QR decomposition. The main problem was that it took too long to implement just QR decomposition, as many small things that are done in Ruby with a single expression or statement take multiple lines in C. For example, declaring and initializing an NMatrix object can be done in a single statement in Ruby, but to do the same thing in C one has to allocate memory for the shape and elements and initialize all the fields manually.

The way Linalg has been implemented in most libraries is that the methods are written in the high-level language (Ruby in our case) and the LAPACK routines are called from those methods. The problem here is that LAPACK routines can’t be called directly from Ruby, as there aren’t any LAPACK bindings for Ruby, but there are C libraries which let one invoke LAPACK routines from C code. This approach is much better in terms of code complexity, and there was hardly any performance difference. So, the task now was to write functions in C which would act as an interface between LAPACK and the Linalg methods in Ruby.
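
For reference, this is roughly what invoking a LAPACK routine directly from C looks like through LAPACKE, the C interface to LAPACK used below (a standalone illustration, not NumRuby code):

#include <stdio.h>
#include <lapacke.h>

int main(void) {
  //3 x 2 matrix stored in row-major order
  double a[6] = {1, 2, 3, 4, 5, 6};
  //scalar factors of the elementary reflectors, length min(m, n)
  double tau[2];

  //dgeqrf computes the QR factorization of a double-precision matrix in place
  int info = LAPACKE_dgeqrf(LAPACK_ROW_MAJOR, 3, 2, a, 2, tau);

  printf("info = %d\n", info); //0 means the routine succeeded
  return 0;
}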

Wrappers for LAPACK routines

To implement the wrapper functions in C, I listed out all the LAPACK routines to be used in the currently planned Linalg module. I then forward-declared wrapper functions for all these routines in ruby_nmatrix.c. I also implemented constructors for the nmatrix struct, namely nmatrix_new and nmatrix_copy. These constructors reduce code repetition in the wrapper functions, as any new nmatrix instance can now be declared and initialized in one line, and an nmatrix instance can be copied in one line.

nmatrix* nmatrix_new(
  nm_dtype dtype,
  nm_stype stype,
  size_t ndims,
  size_t count,
  size_t* shape,
  void* elements
  ) {
  nmatrix* matrix = ALLOC(nmatrix);
  matrix->dtype = dtype;
  matrix->stype = stype;
  matrix->ndims = ndims;
  matrix->count = count;

  matrix->shape = ALLOC_N(size_t, matrix->ndims);
  if(shape != NULL) {
    for(size_t i = 0; i < ndims; ++i) {
      matrix->shape[i] = shape[i];
    }
  }

  if(elements == NULL) {
    return matrix;
  }

  double* temp_elements = (double*)elements;
  double* matrix_elements = ALLOC_N(double, matrix->count);
  for(size_t i = 0; i < count; ++i) {
    matrix_elements[i] = temp_elements[i];
  }
  matrix->elements = matrix_elements;

  return matrix;
}

nmatrix* nmatrix_copy(nmatrix* original_matrix) {
  nmatrix* matrix = ALLOC(nmatrix);
  matrix->dtype = original_matrix->dtype;
  matrix->stype = original_matrix->stype;
  matrix->ndims = original_matrix->ndims;
  matrix->count = original_matrix->count;

  matrix->shape = ALLOC_N(size_t, matrix->ndims);
  if(original_matrix->shape != NULL) {
    for(size_t i = 0; i < original_matrix->ndims; ++i) {
      matrix->shape[i] = original_matrix->shape[i];
    }
  }

  if(original_matrix->elements == NULL) {
    return matrix;
  }

  double* temp_elements = (double*)original_matrix->elements;
  double* matrix_elements = ALLOC_N(double, matrix->count);
  for(size_t i = 0; i < original_matrix->count; ++i) {
    matrix_elements[i] = temp_elements[i];
  }
  matrix->elements = matrix_elements;

  return matrix;
}

Example invocations of these constructors are nmatrix* result = nmatrix_new(matrix_qr->dtype, matrix_qr->stype, 2, matrix_qr->count, matrix_qr->shape, NULL); and nmatrix* result = nmatrix_copy(matrix); respectively.

Each of the wrappers was written in lapack.c as follows:

VALUE nm_geqrf(int argc, VALUE* argv) {
  nmatrix* matrix;
  Data_Get_Struct(argv[0], nmatrix, matrix);

  int m = matrix->shape[0]; //no. of rows
  int n = matrix->shape[1]; //no. of cols
  int lda = n, info = -1;

  nmatrix* result_qr = nmatrix_new(matrix->dtype, matrix->stype, 2, matrix->count, matrix->shape, NULL);
  nmatrix* result_tau = nmatrix_new(matrix->dtype, matrix->stype, 1, min(m, n), NULL, NULL);
  result_tau->shape[0] = min(m, n);

  switch(matrix->dtype) {
    case nm_bool:
    {
      //not supported error
      break;
    }
    case nm_int:
    {
      //not supported error
      break;
    }
    case nm_float32:
    {
      float* tau_elements = ALLOC_N(float, min(m, n));
      float* elements = ALLOC_N(float, matrix->count);
      memcpy(elements, matrix->elements, sizeof(float)*matrix->count);
      info = LAPACKE_sgeqrf(LAPACK_ROW_MAJOR, m, n, elements, lda, tau_elements);

      result_qr->elements = elements;
      result_tau->elements = tau_elements;

      VALUE qr = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_qr);
      VALUE tau = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_tau);
      return rb_ary_new3(2, qr, tau);
      break;
    }
    case nm_float64:
    {
      double* tau_elements = ALLOC_N(double, min(m, n));
      double* elements = ALLOC_N(double, matrix->count);
      memcpy(elements, matrix->elements, sizeof(double)*matrix->count);
      info = LAPACKE_dgeqrf(LAPACK_ROW_MAJOR, m, n, elements, lda, tau_elements);

      result_qr->elements = elements;
      result_tau->elements = tau_elements;

      VALUE qr = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_qr);
      VALUE tau = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_tau);
      return rb_ary_new3(2, qr, tau);
      break;
    }
    case nm_complex32:
    {
      float complex* tau_elements = ALLOC_N(float complex, min(m, n));
      float complex* elements = ALLOC_N(float complex, matrix->count);
      memcpy(elements, matrix->elements, sizeof(float complex)*matrix->count);
      info = LAPACKE_cgeqrf(LAPACK_ROW_MAJOR, m, n, elements, lda, tau_elements);

      result_qr->elements = elements;
      result_tau->elements = tau_elements;

      VALUE qr = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_qr);
      VALUE tau = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_tau);
      return rb_ary_new3(2, qr, tau);
      break;
    }
    case nm_complex64:
    {
      double complex* tau_elements = ALLOC_N(double complex, min(m, n));
      double complex* elements = ALLOC_N(double complex, matrix->count);
      memcpy(elements, matrix->elements, sizeof(double complex)*matrix->count);
      info = LAPACKE_zgeqrf(LAPACK_ROW_MAJOR, m, n, elements, lda, tau_elements);

      result_qr->elements = elements;
      result_tau->elements = tau_elements;

      VALUE qr = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_qr);
      VALUE tau = Data_Wrap_Struct(NMatrix, NULL, nm_free, result_tau);
      return rb_ary_new3(2, qr, tau);
      break;
    }
  }
  return INT2NUM(-1);
}

The above function nm_geqrf is a wrapper for the LAPACK routines sgeqrf, dgeqrf, cgeqrf and zgeqrf. It takes the input matrix (or matrices) and the relevant arguments, parses them, declares and initializes the result matrices, uses a switch statement on the dtype to decide which LAPACK routine to call, passes the required arguments to that routine and finally returns the resulting matrix (or matrices) as Ruby objects. This wrapper can be called from Ruby code using NumRuby::Lapack.geqrf(matrix). The other wrapper functions have been implemented similarly.
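
For completeness, a wrapper like this also has to be registered with the Ruby VM from the extension's init code. The snippet below is only a rough sketch of how that could look; the init function name and module variable names are assumptions, not the actual NumRuby source:

void Init_lapack_wrappers(void) {
  VALUE mNumRuby = rb_define_module("NumRuby");
  VALUE mLapack  = rb_define_module_under(mNumRuby, "Lapack");

  //arity -1: the method receives its arguments as a C array (argc, argv)
  rb_define_module_function(mLapack, "geqrf", RUBY_METHOD_FUNC(nm_geqrf), -1);
}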

Corresponding pull request can be found here at https://github.com/SciRuby/numruby/pull/26.

After this, I’ll work on implementing linear algebra methods for NumRuby::Linalg using these wrappers for LAPACK routines. Feel free to share your thoughts in the comments section. Thanks for reading.



GSoC 2019 | Week 5, 6

Introduction

Another two weeks have passed working on NumRuby. I worked on adding broadcasting support and integrating broadcasting into the elementwise operations. I also worked on fixing compiler warnings and errors, which were mostly due to missed switch cases and unused variables. There were also some minor issues that I encountered while fixing the warnings, and I fixed some of those as well. For broadcasting, I went through NumPy blogs to understand how broadcasting has been implemented in NumPy. These blogs explain broadcasting quite nicely; I have added links to them at the end of this post, in case anyone wants to give them a read.

Broadcasting

What is broadcasting?

Broadcasting is the process of repeating a matrix along some dimensions to make it compatible for elementwise operations with another matrix. It is a means of doing elementwise arithmetic operations when the matrices have different shapes: the smaller matrix is “broadcast” across the larger matrix so that they have compatible shapes. In order to broadcast, the size of each trailing dimension of the two matrices must either be the same or one of them must be 1. The constraints for broadcasting to happen are:

  • If the input matrices do not have the same number of dimensions, a “1” is repeatedly prepended to the shape of the smaller matrix until both matrices have the same number of dimensions.
  • The lengths of the two matrices along a particular dimension should be the same, or one of them should be “1”. Broadcasting makes a matrix with a size of “1” along a particular dimension act as if it had the size of the larger matrix along that dimension; the values of its elements are assumed to be the same along that dimension.

For example:

A (3d matrix):       256 x 256 x 3
B (2d matrix):             256 x 3
Result (3d matrix):  256 x 256 x 3

A (4d matrix):      2 x 5 x 7 x 1
B (3d matrix):          5 x 1 x 8
Result (4d matrix): 2 x 5 x 7 x 8

The following examples are incompatible for broadcasting:

A (2d matrix): 3 x 4
B (2d matrix): 4 x 4

A (2d matrix):     2 x 1
B (3d matrix): 8 x 4 x 3

Rules of Broadcasting

Broadcasting in NumRuby follows a strict set of rules to determine the interaction between the two matrices:

  • Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
  • Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
  • Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.

Implementation of broadcasting

Broadcasting is only done for elementwise operators. When such an operator is applied to two matrices with different shapes, the #broadcast method is called; it first checks the shapes for broadcasting compatibility, which is done by comparing the lengths of each dimension starting from the last one. Memory is then allocated for the new broadcasted matrices, and the resulting broadcasted matrices are computed from the two inputs.

Below are the implementation snippets for broadcasting:

void broadcast_matrices(nmatrix* nmat1, nmatrix* nmat2) {
  size_t* broadcast_shape = NULL;
  size_t broadcast_dims = 0;

  //get_broadcast_shape checks the two shapes for broadcasting compatibility
  //and raises an error if they are incompatible

  get_broadcast_shape(nmat1, nmat2, &broadcast_shape, &broadcast_dims);

  broadcast_matrix(nmat1, broadcast_shape, broadcast_dims);
  broadcast_matrix(nmat2, broadcast_shape, broadcast_dims);
}

The broadcast_matrices function takes the matrices nmat1 and nmat2, first computes the shape the two will be broadcast to, and raises an error if they are incompatible for broadcasting. Currently the matrices themselves are updated in place, but the implementation will later be changed so that a broadcasted copy is created and the original matrix is left unchanged.

void get_broadcast_shape(nmatrix* nmat1, nmatrix* nmat2, size_t** broadcast_shape, size_t* broadcast_dims) {
  size_t* shape1 = nmat1->shape;
  size_t* shape2 = nmat2->shape;

  size_t ndims1 = nmat1->ndims;
  size_t ndims2 = nmat2->ndims;

  //the broadcast shape has as many dimensions as the larger of the two matrices
  size_t ndims = max(ndims1, ndims2);
  size_t* shape = ALLOC_N(size_t, ndims);

  if(ndims1 > ndims2) {
    for(size_t i = 0; i < ndims1; ++i) {
      shape[i] = shape1[i];
    }
    for(size_t i = 0; i < ndims2; ++i) {
      size_t res_index = (ndims1 - ndims2) + i;
      if(shape1[res_index] != shape2[i] && min(shape1[res_index], shape2[i]) > 1) {
        //raise broadcast compatibility error
      }
      shape[res_index] = max(shape1[res_index], shape2[i]);
    }
  }
  else {
    for(size_t i = 0; i < ndims2; ++i) {
      shape[i] = shape2[i];
    }
    for(size_t i = 0; i < ndims1; ++i) {
      size_t res_index = (ndims2 - ndims1) + i;
      if(shape1[i] != shape2[res_index] && min(shape1[i], shape2[res_index]) > 1) {
        //raise broadcast compatibility error
      }
      shape[res_index] = max(shape1[i], shape2[res_index]);
    }
  }

  *broadcast_dims = ndims;
  *broadcast_shape = shape;
}

get_broadcast_shape calculates the broadcast shape for the matrices nmat1 and nmat2 and stores it in broadcast_shape, with its corresponding number of dimensions in broadcast_dims. It raises an error if it finds that the matrices are incompatible for broadcasting.

void broadcast_matrix(nmatrix* nmat, size_t* new_shape, size_t new_ndims) {
  size_t prev_ndims = nmat->ndims;
  size_t* prev_shape = nmat->shape;

  nmat->ndims = new_ndims;
  nmat->shape = new_shape;

  size_t new_count = 1;
  for(size_t i = 0; i < new_ndims; ++i) {
    new_count *= new_shape[i];
  }
  nmat->count = new_count;

  VALUE* state_array = ALLOC_N(VALUE, new_ndims);
  for(size_t i = 0; i < new_ndims; ++i) {
    state_array[i] = SIZET2NUM(0);
  }

  double* nmat_elements = (double*)nmat->elements;
  double* new_elements = ALLOC_N(double, new_count);

  for(size_t i = 0; i < new_count; ++i){
    size_t nmat_index = get_index_for_broadcast_element(prev_shape, prev_ndims, state_array, new_ndims);
    new_elements[i] = nmat_elements[nmat_index];

    //advance the multi-dimensional index stored in state_array like an
    //odometer, starting from the last dimension
    size_t state_index = new_ndims - 1;
    while(true){
      size_t curr_index_value = NUM2SIZET(state_array[state_index]) + 1;

      if(curr_index_value == new_shape[state_index]){
        //this dimension has wrapped around, carry over to the previous one
        state_array[state_index] = SIZET2NUM(0);
      }
      else{
        state_array[state_index] = SIZET2NUM(curr_index_value);
        break;
      }

      if(state_index == 0){
        break; //all dimensions have been traversed
      }
      state_index--;
    }
  }

  nmat->elements = new_elements;
}

The above function broadcast_matrix actually performs the broadcasting of a matrix. It makes use of new_shape and new_ndims (calculated by get_broadcast_shape) and updates the nmat data members, namely nmat->count, nmat->shape, nmat->ndims and nmat->elements.

size_t get_index_for_broadcast_element(size_t* prev_shape, size_t prev_ndims, VALUE* state_array, size_t new_dims) {
  //map the index of the broadcasted matrix held in state_array to the index
  //of the corresponding element in the original matrix
  size_t* indices = ALLOC_N(size_t, prev_ndims);
  for(size_t i = (new_dims - prev_ndims), index = 0; i < new_dims; ++i, ++index) {
    //dimensions of length 1 are stretched, so their index is always 0
    indices[index] = (prev_shape[index] == 1) ? 0 : NUM2SIZET(state_array[i]);
  }

  //row-major strides of the original matrix
  size_t* stride = ALLOC_N(size_t, prev_ndims);
  size_t val = 1;
  for(size_t i = prev_ndims; i > 0; --i) {
    stride[i - 1] = val;
    val *= prev_shape[i - 1];
  }

  size_t new_index = 0;
  for(size_t i = 0; i < prev_ndims; ++i) {
    new_index += indices[i] * stride[i];
  }
  return new_index;
}

When the broadcast_matrix function fills in the elements of the broadcasted matrix, it needs, for each location, the corresponding element of the original matrix; get_index_for_broadcast_element computes the index of that element in the original matrix. For example, if a 1 x 3 matrix is broadcast to 2 x 3, the element at position [1, 2] of the result maps back to index 2 of the original.

The above functions are called internally before doing any elementwise operation. There are also some methods which can be called directly from the NumRuby Ruby module, namely NumRuby::broadcast_to and NumRuby::broadcast_arrays.

VALUE nm_broadcast_to(VALUE self, VALUE new_shape) {
  nmatrix* nmat;
  Data_Get_Struct(self, nmatrix, nmat);

  size_t new_ndims = (size_t)RARRAY_LEN(new_shape);

  //copy the Ruby array of dimension lengths into a C array
  size_t* shape = ALLOC_N(size_t, new_ndims);
  for (size_t index = 0; index < new_ndims; index++) {
    shape[index] = (size_t)FIX2LONG(RARRAY_AREF(new_shape, index));
  }

  broadcast_matrix(nmat, shape, new_ndims);

  return Data_Wrap_Struct(NMatrix, NULL, nm_free, nmat);
}

nm_broadcast_to can be called from Ruby as NumRuby::broadcast_to(nm, new_shape), where nm is an NMatrix object and new_shape is a one-dimensional array denoting the shape the matrix is to be broadcast to.

References

Corresponding pull requests can be found here at https://github.com/SciRuby/numruby/pull/23 and https://github.com/SciRuby/numruby/pull/21.

In the coming weeks, I’ll work on implementing wrapper functions in C for the LAPACK functionality. Feel free to share your thoughts in the comments section. Thanks for reading.
