From PoC to MvP

Jan 11, 2022

Using NLP to create an AI-driven book recommendation system

Project recap

The goal of the project and this series of articles is to walk the path of AI-based product creation, from the first baby steps to a progressively more complex solution. Each stage of the journey will build on the previous one:

In the first article of the series, we created a PoC to validate our approach, using NLP. You can find all the necessary code in my github repository.
In this second article of the series, we will build an app using streamlit as the front end and fastAPI as the back end.
In the third article, we will migrate the data to an ElasticSearch database.
In the last article, we will create a telegram bot to ask for suggestions.

So, let’s get back on track!

In the first article of the series, we introduced some product-development concepts and outlined the different stages of our MvP. Then, we created a proof-of-concept to test our idea and hypotheses and decide whether or not it was possible to build our book recommendation engine. Namely:

Hypothesis 1: we can use NLP to obtain product recommendations by similarity (books in our case).
Hypothesis 2: using a BERT-based model to compute the book description embedding and the query embedding is an adequate choice.
Hypothesis 3: regardless of the query (different languages, or broad vs. specific queries), the recommendations are acceptable.

We could not reject hypotheses 1 and 2. Hypothesis 3 is debatable: the recommendations were reasonable, but there was room for improvement. However, the main friction point in our product development, especially when it came to gathering user feedback and actually using the tool, was the way our PoC was built.

For that reason, we will now migrate the back end to fastAPI and the front end to streamlit. This implementation is strongly based on this project by Davide Fiocco.

Architecture choices

From our previous article, we are keeping the database for now. It is not ideal to have it in a jsonlines file, but this will soon change and it is manageable.

The main change we will make is to have the back end work as a broker: it takes the user queries from the front end, sends them to an additional service that computes the query embedding, and then compares that vector against the embeddings in the database.

This will allow us in the future to connect different databases, front-ends, and embedding providers using the broker to orchestrate all.
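To make the broker flow concrete, here is a minimal sketch in plain Python. It is not the code from the repository: the function names, the toy vectors, and the use of cosine similarity as the comparison metric are all assumptions made for illustration. The embedding provider is passed in as a function, which mirrors the idea that the broker can swap providers without changing its own logic.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(
    query: str,
    embed: Callable[[str], Vector],  # stand-in for the embedding service
    database: Dict[str, Vector],     # title -> description embedding
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Broker flow: embed the query, score every book, return the best top_k."""
    query_vector = embed(query)
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in database.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data: hand-written 3-dimensional vectors, not real model output.
toy_database = {
    "Space opera": [1.0, 0.0, 0.0],
    "Cookbook": [0.0, 1.0, 0.0],
    "Sci-fi thriller": [0.9, 0.1, 0.0],
}


def toy_embed(query: str) -> Vector:
    """Hypothetical embedder that ignores the text and returns a fixed vector."""
    return [1.0, 0.0, 0.0]


print(recommend("a story set in space", toy_embed, toy_database, top_k=2))
```

Because `embed` and `database` are plain arguments, replacing the jsonlines file or the embedding provider later only changes what is passed in, not the ranking logic.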

You can find all the required services for this stage in my github repository.

Back end

The broker currently has two different endpoints.

The first endpoint, get_recommendations, takes a set of book titles and/or ISBNs and composes a user library, then retrieves matches for that library by their similarity to each title description.
The second endpoint, get_recommendations_from_description, takes a query, and retrieves matches by similarity, relying on the embedding service.
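The library-based endpoint can be pictured with the following sketch. It is only an illustration under stated assumptions: the function name `recommend_for_library`, the choice of keeping each candidate's best score against any book in the library, and the cosine metric are mine, and the actual aggregation in the repository may differ.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend_for_library(
    library: Set[str],
    database: Dict[str, Sequence[float]],  # book id -> description embedding
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Score each book outside the library by its best match to a library book."""
    best: Dict[str, float] = {}
    for owned in library:
        if owned not in database:
            continue  # unknown titles or ISBNs are skipped in this sketch
        for candidate, vector in database.items():
            if candidate in library:
                continue  # never recommend a book the user already has
            score = _cosine(database[owned], vector)
            if score > best.get(candidate, float("-inf")):
                best[candidate] = score
    return sorted(best.items(), key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy 2-dimensional vectors, invented for the example.
toy_db = {
    "book_a": [1.0, 0.0],
    "book_b": [0.9, 0.1],
    "book_c": [0.0, 1.0],
    "book_d": [0.8, 0.2],
}

print(recommend_for_library({"book_a"}, toy_db))
```

The second endpoint follows the same ranking idea, except that the vector to compare comes from the embedding service rather than from a book already in the database.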

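To call the embedding service from the fastAPI broker, we rely on a docker-compose convention: inside the compose network, a service is reachable by its service name, which acts as the hostname.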

embedding_service = "http://embeddings:8502"

Here, embeddings is the service name defined in the docker-compose.yaml file, and 8502 is the port exposed for the embedding generation service.
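As an illustration of how the broker might use that URL, here is a sketch based only on the standard library. The route `/embed` and the JSON field names `query` and `embedding` are hypothetical placeholders, since the article does not spell out the service contract; check the repository for the real ones.

```python
import json
import urllib.request
from typing import List

embedding_service = "http://embeddings:8502"  # value used in the article

# Hypothetical route; the real embedding service may expose a different one.
EMBED_PATH = "/embed"


def build_embedding_request(
    query: str, base_url: str = embedding_service
) -> urllib.request.Request:
    """Prepare a JSON POST request carrying the user query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + EMBED_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding_response(raw: bytes) -> List[float]:
    """Extract the vector from a response shaped like {"embedding": [...]}."""
    payload = json.loads(raw.decode("utf-8"))
    return [float(value) for value in payload["embedding"]]


def fetch_embedding(query: str, timeout: float = 10.0) -> List[float]:
    """Call the service. The hostname 'embeddings' only resolves inside the
    docker-compose network, so this fails when run from the host machine."""
    request = build_embedding_request(query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embedding_response(response.read())
```

Splitting the request building and the response parsing from the network call keeps both pieces easy to check without a running container.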

Embedding generator

The embedding generator is a second API, also implemented with fastAPI, that takes a query and returns its embedding.

It uses the same model based on sentence transformers, paraphrase-multilingual-MiniLM-L12-v2 as in our previous article, and returns a JSON with the embedding vectors.
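The core of such a service can be sketched in a few lines with the sentence-transformers library. This is a minimal sketch, not the repository code: the helper names and the shape of the JSON response are assumptions, and only the model name comes from the article.

```python
import json
from functools import lru_cache
from typing import List

MODEL_NAME = "paraphrase-multilingual-MiniLM-L12-v2"  # model named in the article


@lru_cache(maxsize=1)
def load_model():
    """Load the model once and reuse it across requests.

    Requires `pip install sentence-transformers`; the import is done lazily so
    the rest of the module can be used without the library installed.
    """
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(MODEL_NAME)


def embed_query(query: str) -> List[float]:
    """Encode a single query string into a list of floats."""
    vector = load_model().encode(query)  # NumPy array for a single string
    return vector.tolist()


def to_json(query: str, embedding: List[float]) -> str:
    """Hypothetical response body: the query echoed back with its embedding."""
    return json.dumps({"query": query, "embedding": embedding})
```

The first call to `embed_query` downloads the model weights if they are not cached, so in a container it usually makes sense to load the model when the service starts rather than on the first request.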

Both APIs have their respective docker containers (fastapi and embeddings, respectively) and the base image is the 3.7 python version from tiangolo:

FROM tiangolo/uvicorn-gunicorn:python3.7
RUN mkdir /fastapi
COPY requirements.txt /fastapi
COPY embeddings.p /fastapi
WORKDIR /fastapi
RUN pip install -r requirements.txt
COPY . /fastapi
EXPOSE 8000
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]

The Swagger UI of each API can be accessed at http://localhost:port/docs, where port is the port mapped for that service.

Front end

I really like streamlit to prototype the front-end. It is fast, flexible, and allows us to iterate quickly. In my opinion, it is easier to set up compared to Dash, but it is a matter of personal taste and both are equally valid options for an MvP.

Our streamlit application will have two recommendation retrieval options:

Option 1: creating a library using a list of ISBNs and book titles.
Option 2: writing a query.

The front-end logic will decide which endpoint to call, based on whether the description field of the interface contains text. We will load the list of ISBNs and titles from a csv file. It is not ideal, but yet again, we are building an MvP.

FROM python:3.7-slim
RUN mkdir /streamlit
COPY requirements.txt /streamlit
COPY books_info.csv /streamlit
WORKDIR /streamlit
RUN pip install -r requirements.txt
COPY . /streamlit
EXPOSE 8501
CMD ["streamlit", "run", "ui.py"]
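The endpoint selection described above boils down to one small function. The sketch below assumes a hypothetical helper called `choose_endpoint` and hypothetical payload field names (`query`, `library`); only the two endpoint names come from the article.

```python
from typing import Any, Dict, List, Tuple


def choose_endpoint(description: str, library: List[str]) -> Tuple[str, Dict[str, Any]]:
    """Pick the broker endpoint from the state of the UI fields.

    If the description field has text, the query endpoint is used;
    otherwise the selected titles and ISBNs are sent as a library.
    """
    text = description.strip()
    if text:
        return "get_recommendations_from_description", {"query": text}
    return "get_recommendations", {"library": library}


# A whitespace-only description counts as empty, so the library is used.
print(choose_endpoint("   ", ["book_a", "book_b"]))
print(choose_endpoint("a slow-burn mystery in Lisbon", []))
```

Keeping this decision in a pure function makes it easy to test without starting streamlit.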

Connecting everything

We want to spin up the whole stack at once and have the services that rely on each other connected seamlessly. For this, we will use docker-compose, which also makes it easy to retrieve the logs of each container later, trace problems, and so on.

Using the docker-compose yaml, we can set up the different services, their options, port mapping, and dependencies. This is very useful because we can build a container on its own when we just want to test one service, or several together when they depend on each other. In our use case, all the containers will be in the same network. The fastAPI service depends on the embeddings service, and the streamlit app depends on both.

To create the stack and spin it up, simply run in a terminal:

sudo docker-compose build
sudo docker-compose up

And if we just want to test one of the services:

sudo docker-compose up myservice

Wrapping up

Now it is time to put on our product-owner hats once again. After this stage, we have a running app where the user can input a set of titles and ISBNs, or a description, to retrieve matching suggestions.

However, we have found that creating a library is a cumbersome process for two reasons: searching through a list of 1 million references is resource-consuming, and users are not familiar with the ISBNs of the books they read.

This functionality might make sense for a bookshop or for testing, but a more coherent behavior would be to simply pass a query describing what the user wants to read. Moreover, keeping the data in a file is not robust, even for an MvP, as all the data needs to be loaded in memory.

In our next article, we will migrate the database to ElasticSearch and modify the rest of the services accordingly.