How to build a machine learning project in Elixir

Machine learning is an ever-growing area of interest for developers, businesses, tech enthusiasts and the general public alike. From agile start-ups to trendsetting industry leaders, businesses know that successful implementation of the right machine learning product could give them a substantial competitive advantage. We have already seen businesses reap significant benefits of machine learning in production through automated chat bots and customised shopping experiences.
Given we recently demonstrated how to complete web scraping in Elixir, we thought we’d take it one step further and show you how to apply this in a machine learning project.

The classic algorithmic approach vs machine learning

The traditional approach has always been algorithm-centric: you design an efficient algorithm that handles the edge cases and meets your data manipulation needs. The more complicated your dataset, the harder it becomes to cover all the angles, and at some point an algorithm is no longer the best way to go. Luckily, machine learning offers an alternative.

When you’re building a machine learning-based system, the goal is to find dependencies in your data, so the system has to be trained on information that reflects the questions it is likely to be asked. That makes incoming data vital: without adequate training datasets, the system cannot succeed. So without further ado, here is an example tutorial for a machine learning project that shows how we approached it. Feel free to follow along.

The project description

For this project, we’re going to look at an ecommerce platform that offers real-time price comparisons and suggestions. The core functionality of any ecommerce machine learning project is to:

  1. Extract data from websites
  2. Process this data
  3. Provide intelligence and suggestions to the customer
  4. Variable step depending on actions and learnings
  5. Profit

One of the most common problems is the need to group data in a consistent manner. For example, let’s say we want to unify the categories of products from all men’s fashion brands (so we can render all products within a given category, across multiple data sources). Each site (and therefore each data source) will likely use different structures and names; these need to be unified and matched before we can run an accurate comparison.
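To make the unification idea concrete, here is a minimal sketch in Elixir. It is not part of the project code: the CategoryUnifier module, its unify/1 function and the alias table are all made up for illustration, and a real table would have to be built from the scraped data.

```elixir
defmodule CategoryUnifier do
  @moduledoc """
  Illustrative only: maps site-specific category names onto one shared key.
  The alias table below is hypothetical.
  """

  # Hypothetical aliases; a real table would be derived from the scraped data.
  @aliases %{
    "tvs" => "tvs",
    "televisions" => "tvs",
    "headphones" => "headphones",
    "earphones & headphones" => "headphones"
  }

  @doc "Returns {:ok, unified_key}, or :unknown for a name without an alias."
  def unify(name) when is_binary(name) do
    # Normalise case and whitespace before the lookup
    key =
      name
      |> String.trim()
      |> String.downcase()
      |> String.replace(~r/\s+/, " ")

    case Map.fetch(@aliases, key) do
      {:ok, unified} -> {:ok, unified}
      :error -> :unknown
    end
  end
end

IO.inspect(CategoryUnifier.unify("  Televisions "))
IO.inspect(CategoryUnifier.unify("Garden"))
```

Returning :unknown rather than guessing keeps unmapped categories visible, so they can be reviewed and added to the table by hand.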

For the purpose of this guide, we will build a project which:

  1. Extracts the data from a group of websites (in this case we will demonstrate how to extract data from the harveynorman.ie shop)
  2. Trains a neural network to recognise a product category from the product image
  3. Integrates the neural network into the Elixir code so it completes the image recognition and suggests products
  4. Builds a web app which glues everything together.

Extracting the data

As we mentioned at the beginning, data is the cornerstone of any successful machine learning system. The key to success at this step is to extract real-world data that is publicly available and then prepare it into training sets. For our example, we need to gather the basic information about the products (title, description, SKU, image URL, etc.). We will use the extracted images and their categories to perform the machine learning training.

The quality of the trained neural network model is directly related to the quality of the datasets you provide. So it’s important to make sure that the extracted data actually makes sense.

We’re going to use a library called Crawly to perform the data extraction.

Crawly is an application framework for crawling websites and extracting structured data which can be used for a wide range of useful applications, like data mining, information processing or historical archival. You can find out more about it on the documentation page. Or you can visit our guide on how to complete web scraping in Elixir.

Now that is explained, let’s get started! First of all, we will create a new Elixir project:

mix new products_advisor --sup

Now the project is created, modify the deps function of the mix.exs file, so it looks like this:

  # Run "mix help deps" to learn about dependencies.
  defp deps do
    [
      {:crawly, "~> 0.1"}
    ]
  end

Now fetch all the dependencies with mix deps.get, and we’re ready to go. For the next step, implement the module responsible for crawling the harveynorman.ie website. Save the following code as lib/products_advisor/spiders/harveynorman.ex

defmodule HarveynormanIe do
  @behaviour Crawly.Spider

  require Logger

  @impl Crawly.Spider
  def base_url(), do: "https://www.harveynorman.ie"

  @impl Crawly.Spider
  def init() do
    [
      start_urls: [
        "https://www.harveynorman.ie/tvs-headphones/"
      ]
    ]
  end

  @impl Crawly.Spider
  def parse_item(response) do
    # Extracting pagination urls
    pagination_urls =
      response.body |> Floki.find("ol.pager li a") |> Floki.attribute("href")

    # Extracting product urls
    product_urls =
      response.body |> Floki.find("a.product-img") |> Floki.attribute("href")

    all_urls = pagination_urls ++ product_urls

    # Converting URLs into Crawly requests
    requests =
      all_urls
      |> Enum.map(&build_absolute_url/1)
      |> Enum.map(&Crawly.Utils.request_from_url/1)

    # Extracting item fields
    title = response.body |> Floki.find("h1.product-title") |> Floki.text()
    id = response.body |> Floki.find(".product-id") |> Floki.text()

    category =
      response.body
      |> Floki.find(".nav-breadcrumbs :nth-child(3)")
      |> Floki.text()

    description =
      response.body |> Floki.find(".product-tab-wrapper") |> Floki.text()

    images =
      response.body
      |> Floki.find(" .pict")
      |> Floki.attribute("src")
      |> Enum.map(&build_image_url/1)

    %Crawly.ParsedItem{
      :items => [
        %{
          id: id,
          title: title,
          category: category,
          images: images,
          description: description
        }
      ],
      :requests => requests
    }
  end

  defp build_absolute_url(url), do: URI.merge(base_url(), url) |> to_string()

  defp build_image_url(url) do
    URI.merge("https://hniesfp.imgix.net", url) |> to_string()
  end
end

Here we’re defining a module called HarveynormanIe which implements the Crawly.Spider behaviour by defining its callbacks: init/0 (creates the initial requests the spider uses to fetch the first pages), base_url/0 (used to filter out unrelated URLs, e.g. URLs leading to other sites) and parse_item/1 (converts a downloaded response into items and new requests to follow).
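The URL handling can be tried on its own, without running a spider. The sketch below resolves href values against the site root in the same way as the build_absolute_url/1 helper, then keeps only the URLs that stay on the same host. The UrlSketch module and the example paths are hypothetical, and the host check only illustrates the idea behind base_url/0; it is not how Crawly implements its filtering.

```elixir
defmodule UrlSketch do
  @base "https://www.harveynorman.ie"

  # Same approach as the spider's build_absolute_url/1 helper:
  # relative hrefs are resolved against the base, absolute ones are kept.
  def absolute(href), do: @base |> URI.merge(href) |> to_string()

  # Keep only URLs whose host matches the host of the base URL.
  def on_site?(url), do: URI.parse(url).host == URI.parse(@base).host
end

# Made-up hrefs: one relative pagination link and one external link.
hrefs = ["/tvs-headphones/page-2/", "https://example.com/elsewhere"]

hrefs
|> Enum.map(&UrlSketch.absolute/1)
|> Enum.filter(&UrlSketch.on_site?/1)
|> IO.inspect()
```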

Now for the basic configuration. We will use the following settings to configure Crawly for our platform:

config :crawly,
  # Close spider if it extracts less than 10 items per minute
  closespider_timeout: 10,
  # Start 16 concurrent workers per domain
  concurrent_requests_per_domain: 16,
  follow_redirects: true,
  # Define item structure (required fields)
  item: [:title, :id, :category, :description],
  # Define item identifier (used to filter out duplicated items)
  item_id: :id,
  # Define item pipelines
  pipelines: [
    Crawly.Pipelines.Validate,
    Crawly.Pipelines.DuplicatesFilter,
    Crawly.Pipelines.JSONEncoder
  ]

That’s it. Our basic crawler is ready. The extracted data will be written in JL (JSON Lines) format to the file /tmp/HarveynormanIe.jl
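To give a feel for what the validation and duplicate-filtering settings mean for the extracted items, here is a small sketch in plain Elixir. It is not Crawly’s implementation: the PipelineSketch module is hypothetical, and it only mirrors the idea behind the item and item_id settings (drop items that miss a required field, then drop items whose id has already been seen).

```elixir
defmodule PipelineSketch do
  # These mirror the `item` and `item_id` settings from the configuration.
  @required [:title, :id, :category, :description]
  @id_field :id

  @doc "Keeps items that have every required field and an id not seen before."
  def run(items) do
    {kept, _seen} =
      Enum.reduce(items, {[], MapSet.new()}, fn item, {kept, seen} ->
        cond do
          not valid?(item) -> {kept, seen}
          MapSet.member?(seen, item[@id_field]) -> {kept, seen}
          true -> {[item | kept], MapSet.put(seen, item[@id_field])}
        end
      end)

    Enum.reverse(kept)
  end

  # A field counts as present when it is neither missing nor an empty string.
  defp valid?(item) do
    Enum.all?(@required, fn field -> Map.get(item, field) not in [nil, ""] end)
  end
end

# Made-up items: a valid one, its duplicate, and one with an empty title.
items = [
  %{id: "A1", title: "TV", category: "tvs", description: "A television"},
  %{id: "A1", title: "TV", category: "tvs", description: "A television"},
  %{id: "B2", title: "", category: "tvs", description: "No title"}
]

IO.inspect(PipelineSketch.run(items))
```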

Crawly supports a wide range of configuration options, such as base_store_path, which allows you to store items in a different location; see the related part of the documentation for details. A full review of Crawly’s capabilities is outside the scope of this blog post.

Use the following commands to open an interactive shell and, from it, start the spider:

iex -S mix
Crawly.Engine.start_spider(HarveynormanIe)

You will see the following entries amongst your logs:

6:34:48.639 [debug] Scraped "{\"title\":\"Sony MDR-E9LP In-Ear Headphones | Blue\",\"images\":[\"https://hniesfp.imgix.net/8/images/detailed/161/MDRE9LPL.AE.jpg?fit=fill&bg=0FFF&w=833&h=555&auto=format,compress\",\"https://hniesfp.imgix.net/8/images/feature_variant/48/sony_logo_v3.jpg?fit=fill&bg=0FFF&w=264&h=68&auto=format,compress\"],\"id\":\"MDRE9LPL.AE\",\"description\":\"Neodymium Magnet13.5mm driver unit reproduces powerful bass sound.Pair with a Music PlayerUse your headphones with a Walkman, "<> ...

The above entries indicate that a crawling process is successfully running, and we’re getting items stored in our file system.

TensorFlow model training

To simplify and speed up the model training process, we are going to use a pre-trained image classifier. We will take an image classifier trained on ImageNet and, using transfer learning, train a new classification layer on top of it. The new model will be based on MobileNet V2 with a depth multiplier of 0.5 and an input size of 224×224 pixels.

This part is based on the TensorFlow tutorial on how to retrain an image classifier for new categories. If you followed the previous steps, the training data set has already been downloaded (scraped) into the configured directory (/tmp/products_advisor by default). The images are arranged in directories according to their category:

/tmp/products_advisor  
├── building_&_hardware  
├── computer_accessories  
├── connected_home  
├── headphones  
├── hi-fi,_audio_&_speakers  
├── home_cinema  
├── lighting_&_electrical  
├── storage_&_home  
├── tools  
├── toughbuilt_24in_wall_organizer  
├── tv_&_audio_accessories  
└── tvs  

Before the model can be trained, let’s review the downloaded data set. You can see that some categories contain a very small number of scraped images. Categories with fewer than 200 images do not provide enough data to train the model accurately, so we can delete them. Note that the -depth 1 option used below is the BSD (macOS) find syntax; GNU find expects -mindepth 1 -maxdepth 1 instead.

find /tmp/products_advisor -depth 1 -type d \
        -exec bash -c "echo -ne '{}'; ls '{}' | wc -l" \; \
    | awk '$2<200 {print $1}' \
    | xargs -L1 rm -rf

This will leave us with just 5 categories that can be used for the new model:

/tmp/products_advisor  
├── headphones  
├── hi-fi,_audio_&_speakers  
├── tools  
├── tv_&_audio_accessories  
└── tvs  
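The same pruning step can also be sketched in Elixir, which avoids the differences between find implementations. This is an illustrative alternative rather than the command we ran; the DatasetPruner module is hypothetical, and it counts every directory entry as an image.

```elixir
defmodule DatasetPruner do
  @doc """
  Deletes every direct subdirectory of `root` that holds fewer than
  `min_images` entries and returns the sorted names of the removed ones.
  """
  def prune(root, min_images \\ 200) do
    root
    |> File.ls!()
    |> Enum.map(&Path.join(root, &1))
    |> Enum.filter(&File.dir?/1)
    |> Enum.filter(fn dir -> length(File.ls!(dir)) < min_images end)
    |> Enum.map(fn dir ->
      # Remove the whole category directory, including its images
      File.rm_rf!(dir)
      Path.basename(dir)
    end)
    |> Enum.sort()
  end
end

# Destructive, so it is left commented out here:
# DatasetPruner.prune("/tmp/products_advisor")
```

Because the function deletes data, it is worth running it against a copy of the data set first and checking the returned list of removed categories.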

Creating a model is as easy as running a Python script that was created by the TensorFlow authors and can be found in the official TensorFlow GitHub repository:

TFMODULE=https://tfhub.dev/google/imagenet/mobilenet_v2_050_224/classification/2

python bin/retrain.py \
    --tfhub_module=$TFMODULE \
    --bottleneck_dir=tf/bottlenecks \
    --how_many_training_steps=1000 \
    --model_dir=tf/models \
    --summaries_dir=tf/training_summaries \
    --output_graph=tf/retrained_graph.pb \
    --output_labels=tf/retrained_labels.txt \
    --image_dir=/tmp/products_advisor

On a 2018 MacBook Pro with a 2.2 GHz Intel Core i7, this process takes approximately 5 minutes. As a result, the retrained graph and the new label categories can be found in the configured locations (tf/retrained_graph.pb and tf/retrained_labels.txt in this example). These can be used for further image classification:

IMAGE_PATH="/tmp/products_advisor/hi-fi,_audio_&_speakers/0017c7f1-129f-4fa7-a62b-9766d2cb4486.jpeg"

python bin/label_image.py \
    --graph=tf/retrained_graph.pb \
    --labels=tf/retrained_labels.txt \
    --image="$IMAGE_PATH" \
    --input_layer=Placeholder \
    --output_layer=final_result \
    --input_height=224 \
    --input_width=224

hi-fi audio speakers 0.9721675
tools 0.01919974
tv audio accessories 0.008398962
headphones 0.00015944676
tvs 7.433378e-05

As you can see, the newly trained model classified an image from the training set as belonging to the “hi-fi audio speakers” category with a probability of 0.9721675.

Image classification using Elixir

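In Python, a tensor can be created from an image file using the following code:

import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.read_file)

def read_tensor_from_image_file(file_name, input_std=255):
    # Read the raw JPEG bytes from disk
    file_reader = tf.read_file(file_name, "file_reader")
    # Decode into a height x width x 3 uint8 tensor
    image_reader = tf.image.decode_jpeg(
        file_reader, channels=3, name="jpeg_reader")
    float_caster = tf.cast(image_reader, tf.float32)
    # Add the batch dimension: 1 x height x width x 3
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.image.resize_bilinear(dims_expander, [224, 224])
    # Scale pixel values from 0..255 down to 0..1
    normalized = tf.divide(resized, [input_std])
    sess = tf.Session()
    return sess.run(normalized)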

Now let’s classify the images from an Elixir application. TensorFlow provides APIs for the following languages: Python, C++, Java, Go and JavaScript. There is no native support for BEAM languages. We could have used the C++ bindings, but the C++ library is only designed to work with the bazel build tool. Let’s leave the mix integration with bazel as an exercise for the curious reader and instead take a look at the C API, which can be called from Elixir through native implemented functions (NIFs). Fortunately, there’s no need to write the bindings ourselves, as there’s a library that has almost everything we need: https://github.com/anshuman23/tensorflex

As we saw earlier, to supply an image as an input for a TensorFlow session, it has to be converted to an acceptable format: a 4-dimensional tensor that contains a decoded, normalised image that is 224×224 in size (as defined by the chosen MobileNet V2 model). The output is a 2-dimensional tensor that can hold a vector of values. For the newly trained model, the output is received as a 5x1 float32 tensor, where 5 is the number of classes in the model.
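To make the input format more tangible, the sketch below shows the normalisation step on raw pixel data in plain Elixir. It is only an illustration: the TensorSketch module is hypothetical, it produces a flat list of floats rather than a real TensorFlow tensor, and it assumes tightly packed 8-bit pixels. The reported shape follows the batch, height, width, channels order.

```elixir
defmodule TensorSketch do
  @doc """
  Turns raw 8-bit pixel data into floats between 0.0 and 1.0 and reports
  the 4-dimensional shape {batch, height, width, channels}.
  """
  def to_input(pixels, width, height, channels)
      when is_binary(pixels) and byte_size(pixels) == width * height * channels do
    # Divide every byte by 255, as the Python code does with input_std
    values = for <<byte <- pixels>>, do: byte / 255
    {:ok, values, {1, height, width, channels}}
  end

  def to_input(_pixels, _width, _height, _channels), do: {:error, :size_mismatch}
end

# A 2x1 RGB image: one black pixel followed by one white pixel.
IO.inspect(TensorSketch.to_input(<<0, 0, 0, 255, 255, 255>>, 2, 1, 3))
```

For the model used here, a 224×224 RGB image holds 224 * 224 * 3 = 150528 values.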

Image decoding

Let’s assume that the images are going to be provided encoded as JPEG. We could write a library to decode JPEG in Elixir; however, there are several open source C libraries that can be used from NIFs. The other option would be to search for an Elixir library that already provides this functionality. Hex.pm shows that there’s a library called imago that can decode images from different formats and perform some post-processing. It uses Rust and depends on other Rust libraries to perform its decoding, and almost all of its functionality is redundant in our case. To reduce the number of dependencies, and for educational purposes, let’s split the work into 2 simple Elixir libraries that will be responsible for JPEG decoding and image resizing.

JPEG decoding

This library uses a C JPEG API to decode the image and return its pixel data. The Elixir part of the library is only responsible for loading the NIF and documenting the API:

defmodule Jaypeg do
  @moduledoc """
  Simple library for JPEG processing.

  ## Decoding

      {:ok, <<104, 146, ...>>, [width: 2000, height: 1333, channels: 3]} =
        Jaypeg.decode(File.read!("file/image.jpg"))

  """

  @on_load :load_nifs

  @doc """
  Decode JPEG image and return information about the decoded image such
  as width, height and number of channels.

  ## Examples

      iex> Jaypeg.decode(File.read!("file/image.jpg"))
      {:ok, <<104, 146, ...>>, [width: 2000, height: 1333, channels: 3]}

  """
  def decode(_encoded_image) do
    :erlang.nif_error(:nif_not_loaded)
  end

  def load_nifs do
    :ok = :erlang.load_nif(Application.app_dir(:jaypeg, "priv/jaypeg"), 0)
  end
end

The NIF implementation is not much more complicated. It initialises the variables needed for decoding, passes the provided image content to the JPEG decoder as a stream, and eventually cleans up after itself:

static ERL_NIF_TERM decode(ErlNifEnv *env, int argc,
                           const ERL_NIF_TERM argv[]) {
  ERL_NIF_TERM jpeg_binary_term;
  jpeg_binary_term = argv[0];
  if (!enif_is_binary(env, jpeg_binary_term)) {
    return enif_make_badarg(env);
  }

  ErlNifBinary jpeg_binary;
  enif_inspect_binary(env, jpeg_binary_term, &jpeg_binary);

  struct jpeg_decompress_struct cinfo;
  struct jpeg_error_mgr jerr;
  cinfo.err = jpeg_std_error(&jerr);
  jpeg_create_decompress(&cinfo);

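  FILE * img_src = fmemopen(jpeg_binary.data, jpeg_binary.size, "rb");
  if (img_src == NULL) {
    /* Release the decoder state before returning the error */
    jpeg_destroy_decompress(&cinfo);
    return enif_make_tuple2(env, enif_make_atom(env, "error"),
                            enif_make_atom(env, "fmemopen"));
  }

  jpeg_stdio_src(&cinfo, img_src);

  int error_check;
  error_check = jpeg_read_header(&cinfo, TRUE);
  if (error_check != 1) {
    /* Release the decoder state and the stream before returning */
    jpeg_destroy_decompress(&cinfo);
    fclose(img_src);
    return enif_make_tuple2(env, enif_make_atom(env, "error"),
                            enif_make_atom(env, "bad_jpeg"));
  }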

  jpeg_start_decompress(&cinfo);

  int width, height, num_pixels, row_stride;
  width = cinfo.output_width;
  height = cinfo.output_height;
  num_pixels = cinfo.output_components;
  unsigned long output_size;
  output_size = width * height * num_pixels;
  row_stride = width * num_pixels;

  ErlNifBinary bmp_binary;
  enif_alloc_binary(output_size, &bmp_binary);

  while (cinfo.output_scanline < cinfo.output_height) {
    unsigned char *buf[1];
    buf[0] = bmp_binary.data + cinfo.output_scanline * row_stride;
    jpeg_read_scanlines(&cinfo, buf, 1);
  }

  jpeg_finish_decompress(&cinfo);
  jpeg_destroy_decompress(&cinfo);

  fclose(img_src);

  ERL_NIF_TERM bmp_term;
  bmp_term = enif_make_binary(env, &bmp_binary);
  ERL_NIF_TERM properties_term;
  properties_term = decode_properties(env, width, height, num_pixels);

  return enif_make_tuple3(
    env, enif_make_atom(env, "ok"), bmp_term, properties_term);
}

Now, all that’s left to do to make the tooling work is to declare the NIF functions and definitions. The full code is available on GitHub.
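Since the decoder returns the raw pixel data together with the image properties, a caller can cheaply check that the two agree before passing the data on. The following sketch assumes the {:ok, pixels, properties} result shape shown in the Jaypeg documentation; the DecodeCheck module itself is hypothetical and not part of the library.

```elixir
defmodule DecodeCheck do
  @doc """
  Confirms that a `{:ok, pixels, properties}` tuple is self-consistent:
  the binary must hold exactly width * height * channels bytes.
  """
  def consistent?({:ok, pixels, properties}) when is_binary(pixels) do
    expected =
      Keyword.fetch!(properties, :width) *
        Keyword.fetch!(properties, :height) *
        Keyword.fetch!(properties, :channels)

    byte_size(pixels) == expected
  end

  # Error tuples and anything else are never considered consistent.
  def consistent?(_other), do: false
end

# A made-up 2x1 RGB result in the same shape that Jaypeg.decode/1 documents.
IO.inspect(DecodeCheck.consistent?({:ok, <<1, 2, 3, 4, 5, 6>>, [width: 2, height: 1, channels: 3]}))
```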

Image resizing

Even though it is possible to reimplement the image resizing algorithm in Elixir, this is out of the scope of this exercise, so we decided to use the C/C++ stb library, which is released into the public domain and can be easily integrated as an Elixir NIF. The library is literally just a proxy for a C function that resizes an image; the Elixir part is dedicated to loading the NIF and to documentation:

static ERL_NIF_TERM resize(ErlNifEnv *env, int argc,
                           const ERL_NIF_TERM argv[]) {
  ErlNifBinary in_img_binary;
  enif_inspect_binary(env, argv[0], &in_img_binary);

  unsigned in_width, in_height, num_channels;
  enif_get_uint(env, argv[1], &in_width);
  enif_get_uint(env, argv[2], &in_height);
  enif_get_uint(env, argv[3], &num_channels);

  unsigned out_width, out_height;
  enif_get_uint(env, argv[4], &out_width);
  enif_get_uint(env, argv[5], &out_height);

  unsigned long output_size;
  output_size = out_width * out_height * num_channels;
  ErlNifBinary out_img_binary;
  enif_alloc_binary(output_size, &out_img_binary);

  if (stbir_resize_uint8(
        in_img_binary.data, in_width, in_height, 0,
        out_img_binary.data, out_width, out_height, 0, num_channels) != 1)
    return enif_make_tuple2(
      env,
      enif_make_atom(env, "error"),
      enif_make_atom(env, "resize"));

  ERL_NIF_TERM out_img_term;
  out_img_term = enif_make_binary(env, &out_img_binary);

  return enif_make_tuple2(env, enif_make_atom(env, "ok"), out_img_term);
}

The image resizing library is available on GitHub as well.
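For readers curious what a pure Elixir resize could look like, here is a minimal nearest-neighbour sketch. It is not the algorithm the stb library uses and it produces lower quality results; the ResizeSketch module is hypothetical, returns a bare binary instead of an {:ok, binary} tuple, and assumes tightly packed 8-bit pixels.

```elixir
defmodule ResizeSketch do
  @doc """
  Nearest-neighbour resize of tightly packed 8-bit pixel data.
  Each output pixel copies the source pixel at the scaled coordinates.
  """
  def resize(pixels, in_w, in_h, channels, out_w, out_h)
      when byte_size(pixels) == in_w * in_h * channels and out_w > 0 and out_h > 0 do
    for y <- 0..(out_h - 1), x <- 0..(out_w - 1), into: <<>> do
      # Map the output coordinates back onto the source image
      src_x = div(x * in_w, out_w)
      src_y = div(y * in_h, out_h)
      offset = (src_y * in_w + src_x) * channels
      binary_part(pixels, offset, channels)
    end
  end
end

# Scale a 2x2 greyscale image (one channel) up to 4x4.
IO.inspect(ResizeSketch.resize(<<10, 20, 30, 40>>, 2, 2, 1, 4, 4))
```

Walking a 224×224 image pixel by pixel like this is noticeably slower than a C implementation, which is one reason to prefer the NIF for real workloads.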

Creating a tensor from an image

Now it’s time to create a tensor from the processed image (after it has been decoded and resized). To be able to load a processed image as a tensor, the Tensorflex library has to be extended with 2 functions:

  1. Create a matrix from a provided binary
  2. Create a float32 tensor from a given matrix.

The implementation of these functions is very Tensorflex-specific and wouldn’t make much sense to a reader without an understanding of the context. The NIF implementation can be found on GitHub, under the functions binary_to_matrix and matrix_to_float32_tensor respectively.

Putting everything together

Once all necessary components are available, it’s time to put everything together. This part is similar to what can be seen at the beginning of the blog post, where the image was labelled using Python, but this time we are going to use Elixir to leverage all the libraries that we have modified:

def classify_image(image, graph, labels) do
  {:ok, decoded, properties} = Jaypeg.decode(image)
  in_width = properties[:width]
  in_height = properties[:height]
  channels = properties[:channels]
  height = width = 224

  {:ok, resized} =
    ImgUtils.resize(decoded, in_width, in_height, channels, width, height)

  {:ok, input_tensor} =
    Tensorflex.binary_to_matrix(resized, width, height * channels)
    |> Tensorflex.divide_matrix_by_scalar(255)
    |> Tensorflex.matrix_to_float32_tensor({1, width, height, channels})

  {:ok, output_tensor} =
    Tensorflex.create_matrix(1, 2, [[length(labels), 1]])
    |> Tensorflex.float32_tensor_alloc()

  Tensorflex.run_session(
    graph,
    input_tensor,
    output_tensor,
    "Placeholder",
    "final_result"
  )
end

The classify_image function returns a list of probabilities, one for each given label:

iex(1)> image = File.read!("/tmp/tv.jpeg")
<<255, 216, 255, 224, 0, 16, 74, 70, 73, 70, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 255,
  219, 0, 132, 0, 9, 6, 7, 19, 19, 18, 21, 18, 19, 17, 18, 22, 19, 21, 21, 21,
  22, 18, 23, 19, 23, 18, 19, 21, 23, ...>>
iex(2)> {:ok, graph} = Tensorflex.read_graph("/tmp/retrained_graph.pb")
{:ok,
 %Tensorflex.Graph{
   def: #Reference<0.2581978403.3326476294.49326>,
   name: "/Users/grigory/work/image_classifier/priv/retrained_graph.pb"
 }}
iex(3)> labels = ImageClassifier.read_labels("/tmp/retrained_labels.txt")
["headphones", "hi fi audio speakers", "tools", "tv audio accessories", "tvs"]
iex(4)> probes = ImageClassifier.classify_image(image, graph, labels)
[
  [1.605743818799965e-6, 2.0029481220262824e-6, 3.241990925744176e-4,
   3.040388401132077e-4, 0.9993681311607361]
]

retrained_graph.pb and retrained_labels.txt can be found in the tf directory of the products-advisor-model-trainer repository that was mentioned earlier on in the model training step. If the model was trained successfully, the tf directory should look similar to this tree:

/products-advisor-model-trainer/tf/  
├── bottlenecks  
├── retrained_graph.pb  
├── retrained_labels.txt  
└── training_summaries  

The most probable label can easily be found by the following line:

iex(6)> List.flatten(probes) |> Enum.zip(labels) |> Enum.max()
{0.9993681311607361, "tvs"}
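When the result is used to make suggestions, it can help to rank all labels and to refuse to answer when the model is not confident. The sketch below is one way to do that, not part of the project code; the Ranking module and the 0.5 threshold are assumptions made for this example.

```elixir
defmodule Ranking do
  @doc "Pairs labels with probabilities and sorts the pairs, best first."
  def rank(probabilities, labels) do
    probabilities
    |> List.flatten()
    |> Enum.zip(labels)
    |> Enum.sort_by(fn {probability, _label} -> probability end, &>=/2)
  end

  @doc "Returns the best label, or :unsure when it scores below `threshold`."
  def best(probabilities, labels, threshold \\ 0.5) do
    case rank(probabilities, labels) do
      [{probability, label} | _] when probability >= threshold ->
        {:ok, label, probability}

      _ ->
        :unsure
    end
  end
end

# The probabilities and labels from the iex session above.
probes = [
  [1.605743818799965e-6, 2.0029481220262824e-6, 3.241990925744176e-4,
   3.040388401132077e-4, 0.9993681311607361]
]

labels = ["headphones", "hi fi audio speakers", "tools", "tv audio accessories", "tvs"]

IO.inspect(Ranking.best(probes, labels))
```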

Learn more

So there you have it. This is a basic demonstration of how Elixir can be used to complete machine learning projects. The full code is available on GitHub. If you’d like to stay up to date with more projects like this, why not sign up to our newsletter? Or check out our detailed blog on how to complete web scraping in Elixir. Or, if you’re planning a machine learning project, why not talk to us? We’d be happy to help.
