Dimensionality reduction approaches for sentence embeddings

Capturing the context of a word or sentence provides informative features for downstream tasks such as classification or named entity recognition. These representations, called embeddings, are ubiquitous in current NLP approaches.

With transfer learning, we can take models pre-trained on a large corpus of data and use their output as the input to smaller, nimbler models. Examples include the Universal Sentence Encoder and other models available in spaCy or HuggingFace.

These models are very good at capturing context, but their output has a fixed number of dimensions. …


Deploying Machine Learning Models at RavenPack

At RavenPack, our team of developers integrates all the NLP and Machine Learning innovations we create to help our customers capture the insights they need to extract value in (close to) real time.

As Machine Learning engineers, however, we sometimes want to test our models in the real world, with real data, and receive real feedback from the market without releasing a new product version. We therefore need an efficient, cost-effective and scalable way to deploy them so they can be tested by the other teams at the company and by our clients.

Here enters…


This article was originally published in Spanish on cienciadedatos.net

Overview

When I was outlining Wyno?, I wanted to combine two of my passions, wine and Data Science, and use semantic search to recommend new wines to taste based on the similarity between expert critics’ opinions.

The data is publicly available here and contains more than 120,000 tasting notes from Wine Enthusiast Magazine, covering wines from both the Old and the New World.

I used transfer learning to convert the tasting notes into vectors and then find the recommendations using cosine similarity.
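The ranking step described above can be illustrated with a minimal sketch. It assumes the tasting notes have already been converted into vectors; the wine names, the three-dimensional toy vectors and the helper names `cosine_similarity` and `recommend` are hypothetical and only serve the illustration, they are not taken from the original project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def recommend(query_vec, note_vecs, top_k=3):
    """Return the top_k (name, score) pairs most similar to query_vec."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in note_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors (hypothetical); real sentence embeddings
# typically have hundreds of dimensions.
toy_notes = {
    "wine_a": [0.9, 0.1, 0.0],
    "wine_b": [0.8, 0.2, 0.1],
    "wine_c": [0.0, 0.1, 0.9],
}

best = recommend([1.0, 0.0, 0.0], toy_notes, top_k=2)
for name, score in best:
    print(name, round(score, 3))
```

Cosine similarity compares only the direction of two vectors and ignores their length, which is why it is a common choice for comparing text embeddings.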

Approach

On many occasions, we find surprising wines out of what…


This article first appeared on cienciadedatos.net; you can find the original version in Spanish here.

In it, we will explore how reinforcement learning, and more precisely the multi-armed bandit, lets us reduce the time needed to assess whether a new website version is more effective at increasing the number of clients who respond to a call to action.
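To make the idea concrete, here is a minimal sketch of a multi-armed bandit choosing between website versions. It assumes Thompson sampling with Beta posteriors as the bandit strategy, which is one common option and not necessarily the one used in the article; the function name `thompson_sampling` and the simulated conversion rates are hypothetical.

```python
import random


def thompson_sampling(true_rates, n_visitors, seed=0):
    """Simulate routing visitors to website versions with Thompson sampling.

    true_rates holds the (normally unknown) conversion rate of each
    version; it is used here only to simulate visitor behaviour.
    Returns per-version counts of conversions and non-conversions.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_visitors):
        # Draw a plausible conversion rate for each version from its
        # Beta(1 + successes, 1 + failures) posterior.
        samples = [
            rng.betavariate(1 + s, 1 + f)
            for s, f in zip(successes, failures)
        ]
        # Show the visitor the version with the highest sampled rate.
        arm = samples.index(max(samples))
        # Simulate whether this visitor converts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Hypothetical rates: version 0 converts 5% of visitors, version 1 converts 50%.
wins, losses = thompson_sampling([0.05, 0.5], n_visitors=2000)
visits = [w + l for w, l in zip(wins, losses)]
print("visits per version:", visits)
```

Unlike a fixed 50/50 A/B split, the bandit shifts traffic toward the better-performing version as evidence accumulates, which is where the reduction in assessment time comes from.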

Business context

One of the key goals in marketing is converting visits to our website into a CTA (call to action) by potential clients arriving at our main page.

This is one of the reasons why an attractive website that favors this CTA


How to leverage the capabilities of HuggingFace for named entity recognition (NER) tasks, using a custom dataset of financially relevant entities to fine-tune a pre-trained model.

Using NER at RavenPack

One of the key values of RavenPack for our customers is the ability of our products to deliver relevant information in real time for their decision-making.

NER plays an important role in how RavenPack identifies these relevant aspects within a news story. Having prior knowledge of the relevant commodity, company or sector, to name a few, is paramount in providing high-quality information in a timely manner. To accomplish this, we maintain a growing database of around 300,000 predefined entities across 16 distinct types.

What happens when new companies, like Stripe, or currencies, like Bitcoin, start appearing in the…

Francisco ESPIGA

Machine Learning Engineer @Ravenpack. Teacher @ESIC. Data Science enthusiast building a more interesting world 1 epoch at a time.
