Lucas Allen, Charlotte, North Carolina Data Scientist

Lucas Allen

I am a data scientist, lover of mathematics, lifelong learner, Nintendo aficionado, and LOST apologist.

Keep on Movin’ On

By Lucas on October 8, 2017

This blog has gone pretty quiet over the last six months or so, which usually signals I’m up to something new. This time, it’s a cross-country move for a new job. The new opportunity takes me to a part of the country that is a big change for a guy who’s never lived outside of rural Central Illinois: I’ve headed south and am now located in the Queen City of Charlotte, NC.


Retro Game Retrieval Engine Design

By Lucas on July 19, 2017

I’ve built a new Shiny web app and embedded it on another site where I’m doing some experimental things, and I wanted to talk generally about how I created it. The web app, which can be found at the following link, lets the user search interactively for similar classic home-console games, from what is generally known as the third generation (NES, Sega Master System) through the sixth generation (Wii, PS2, Xbox).


Back2School with Vectors, Cosine Similarity, and Word2Vec

By Lucas on May 12, 2017

Tomorrow, I’ll be making a return visit to the high school where I spent a decade in the mathematics department as a teacher. I’ve got the chance to speak to ten classes over the course of six class periods and tell them a little bit about what I do as a data scientist. Since many of the students will be familiar with concepts like vectors and trigonometry, I’ve decided to do an activity involving the Python gensim package and Word2Vec.
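For readers who want to see how vectors and trigonometry connect here, below is a minimal sketch of cosine similarity in plain Python. It is not the classroom activity itself and does not use gensim. The three-dimensional "word vectors" are made-up toy values, chosen only for illustration; real Word2Vec vectors are learned from text and have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors (assumed values, not output from any trained model).
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "banana": [0.1, 0.0, 0.9],
}

sim_royal = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_fruit = cosine_similarity(toy_vectors["king"], toy_vectors["banana"])
print("king vs queen: ", round(sim_royal, 3))
print("king vs banana:", round(sim_fruit, 3))
```

Vectors that point in nearly the same direction score close to 1, while perpendicular vectors score 0, which is why the measure is a natural fit for students who already know the dot product and the cosine function.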


A New Introduction to Spark 2.1 Dataframes with Python and MLlib

By Lucas on May 3, 2017

A couple of years ago, when I was in the midst of my rookie year as a data scientist, I wrote a blog post and tutorial about using the Python Spark API to build a simple model from housing data with Spark dataframes. Despite the simple nature of the model (a straight train-test split with multiple linear regression), it was one of the more challenging tutorials I’ve ever written for this blog.
