Beer Recommender

Circa 2020

Describe your ideal beer and the NLP engine will find you a match from the top 250 beers on BeerAdvocate.

The Story

Built in 2020, this project was my exploration into what AI could do with natural language — years before ChatGPT made it mainstream. At the time, using word embeddings and semantic similarity to match free-text descriptions to real-world items felt genuinely groundbreaking. The idea was simple: describe your ideal beer in plain English, and let the model find you one. No dropdowns, no filters — just natural language.

Looking back, this was a precursor to what was to come. The same intuition, that computers could “understand” language well enough to act on it, would explode into the LLM era that followed. I'm preserving this project as-is: a snapshot of what NLP built on static word vectors looked like in practice, before LLMs took over.

How It Works

1. Data Collection

A web scraper crawled BeerAdvocate for the top 250 beers, collecting names, styles, ABV, ratings, and up to 40 user reviews each.
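As a rough sketch of what one scraped record could look like: the class and field names below are illustrative assumptions, not the project's actual schema. Only the listed fields and the 40-review cap come from the description above.

```python
from dataclasses import dataclass, field
from typing import List

MAX_REVIEWS = 40  # cap taken from the description above


@dataclass
class BeerRecord:
    """One scraped beer; field names are illustrative, not the project's schema."""
    name: str
    style: str
    abv: float      # alcohol by volume, in percent
    rating: float
    reviews: List[str] = field(default_factory=list)

    def add_review(self, text: str) -> bool:
        """Store a review unless it is blank or the cap has been reached."""
        text = text.strip()
        if not text or len(self.reviews) >= MAX_REVIEWS:
            return False
        self.reviews.append(text)
        return True


# Made-up example values, purely for illustration.
beer = BeerRecord(name="Example Stout", style="Imperial Stout", abv=11.0, rating=4.6)
accepted = [beer.add_review(f"review number {i}") for i in range(45)]
```

Capping at the record level keeps the scraper loop simple: it can keep feeding reviews and let the record decide when it is full.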

2. Text Processing

Reviews are cleaned, tokenized, and stripped of stopwords using NLTK. Each beer's reviews are condensed into a single corpus of descriptive words.
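A minimal, standard-library-only sketch of this cleaning step follows. The tiny stopword set is a stand-in for NLTK's English stopword list, and the function names are hypothetical.

```python
import re
from typing import Iterable, List

# Tiny stand-in stopword set; the project used NLTK's list instead.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "of", "with", "this", "to", "in", "very"}


def clean_tokens(text: str) -> List[str]:
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_corpus(reviews: Iterable[str]) -> str:
    """Condense all of one beer's reviews into a single string of words."""
    words: List[str] = []
    for review in reviews:
        words.extend(clean_tokens(review))
    return " ".join(words)


corpus = build_corpus(["This is a very hoppy IPA!", "Notes of citrus and pine."])
```

Here `corpus` ends up as a flat string of the descriptive words that survive the filter, which is the form the vectorization step consumes.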

3. spaCy Word Vectors

Using spaCy's en_core_web_lg model (300-dimensional GloVe vectors), each beer's text corpus is converted into a single vector that captures its semantic meaning.
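To illustrate the idea without loading the model: spaCy's default document vector is an average of its token vectors, and the sketch below mimics that with a toy 3-dimensional table. The vector values are made up, and skipping unknown words is a simplification of mine rather than spaCy's exact behavior.

```python
from typing import Dict, List

# Toy 3-dimensional vectors (made-up values); the real model uses 300 dimensions.
TOY_VECTORS: Dict[str, List[float]] = {
    "hoppy": [1.0, 0.0, 0.0],
    "citrus": [0.0, 1.0, 0.0],
    "roasty": [0.0, 0.0, 1.0],
}
DIM = 3


def document_vector(corpus: str) -> List[float]:
    """Average the vectors of known words; unknown words are skipped."""
    known = [TOY_VECTORS[w] for w in corpus.split() if w in TOY_VECTORS]
    if not known:
        return [0.0] * DIM
    # zip(*known) walks the vectors dimension by dimension.
    return [sum(column) / len(known) for column in zip(*known)]


vec = document_vector("hoppy citrus hoppy unknownword")
```

Because "hoppy" appears twice, the resulting vector leans toward the first dimension: repeated descriptors in the reviews pull the beer's vector in their direction.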

4. Sentiment Analysis

SpacyTextBlob analyzes the polarity of each beer's reviews, adding a sentiment dimension to the recommendation scoring.
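The sketch below shows the general shape of a per-beer polarity score. It does not call SpacyTextBlob: the mini-lexicon and both function names are hypothetical stand-ins, kept only to show how review-level polarity on a -1 to 1 scale (the range TextBlob reports) could be averaged per beer.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical mini-lexicon with polarity scores in [-1, 1].
LEXICON: Dict[str, float] = {"great": 0.8, "smooth": 0.4, "bland": -0.5, "awful": -1.0}


def review_polarity(review: str) -> float:
    """Mean polarity of the lexicon words in one review; 0.0 if none match."""
    scores = [LEXICON[w] for w in review.lower().split() if w in LEXICON]
    return mean(scores) if scores else 0.0


def beer_polarity(reviews: List[str]) -> float:
    """Mean of the per-review polarities; 0.0 for a beer with no reviews."""
    return mean(review_polarity(r) for r in reviews) if reviews else 0.0


score = beer_polarity(["great and smooth", "a bit bland"])
```

Averaging per review first, then per beer, keeps one long review from dominating the score.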

5. Similarity Matching

Your description is vectorized with the same model, then compared against every beer's vector using cosine similarity, the same measure widely used in modern embedding search.
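Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures direction rather than magnitude. A plain-Python sketch (the function name and the zero-vector handling are my own choices, not the project's code):

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between a and b; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing the same way score close to 1 regardless of length, and unrelated (orthogonal) vectors score 0, which is why a short description can still match a beer with a long review corpus.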

6. Ranked Results

Beers are ranked by similarity score and returned in a sortable table with links back to BeerAdvocate for each recommendation.
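The ranking itself can be as simple as a descending sort. In this sketch the function name, the `top_n` parameter, and the beer names are placeholders of mine.

```python
from typing import Dict, List, Tuple


def rank_beers(scores: Dict[str, float], top_n: int = 5) -> List[Tuple[str, float]]:
    """Sort (beer, similarity) pairs from best to worst and keep the top_n."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Placeholder names and scores, purely for illustration.
results = rank_beers({"Beer A": 0.71, "Beer B": 0.93, "Beer C": 0.64}, top_n=2)
```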

Tech Stack

Python, FastAPI, spaCy, NLTK, GloVe Embeddings, Sentiment Analysis, BeautifulSoup, Next.js, Docker