tellUsWho is a social network survey tool that creates profiles for students, rich in data on their background, school, and work information, as well as their favorite interests and activities.

The Problems

  1. How can we calculate contextual rarity?
  2. How can we show people potential non-romantic matches they may be interested in meeting?

Prior research conducted by Dr. Julia Mayer introduced the novel concept of contextual rarity: "the rarer a shared user attribute is in the current context, the more interested the user is in meeting another person who shares this contextually rare attribute" (Mayer, J.M. et al.).

Figure: tellUsWho architecture

Application Development

tellUsWho Survey

  • Used the Anorm Scala library to interface with our PostgreSQL database
    • Google login via OAuth2 using the Silhouette library, allowing students to log in with their NJIT Webmail accounts
  • Front end implemented using:
    • Twirl templating language (Scala/Play)
    • Twitter Bootstrap
    • Materialize CSS for Material Design components
  • Deployed using Docker containers for the API and DB on an Ubuntu VPS

Calculating Contextual Rarity

Our survey gathered data from participants on fields such as their favorite TV, movie, and sports programs, their favorite musical artists, what they were willing to teach or coach, whether they wanted to learn something, and a number of other questions. For our lead researcher (Dr. Julia Mayer) to conduct the statistical analysis that would determine which factors were indeed contextually rare, I had to devise an algorithm to compute contextual rarity.

I used a Python script that interfaced with our database. The script combined several helper functions, and its main features were:

  1. eliminating stopwords from the data to help reduce noise
  2. reducing elements to their common root through lemmatization, and correcting misspelt words with Peter Norvig's well-known spell checker
  3. using NLTK's frequency distributions to calculate the contextual rarity of every unique interest and activity entered by participants
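
The counting step can be sketched roughly as follows. This is not the original script, which was written in Python with NLTK. It is a minimal illustration in plain Scala (the language of the rest of the project) that skips lemmatization and spell correction. The stopword list is a tiny hypothetical stand-in, and the rarity formula (1 minus relative frequency) is an assumption, since the write-up does not state the exact formula used.

```scala
object RaritySketch {
  // Hypothetical, tiny stopword list; the original script used a fuller one.
  val stopwords: Set[String] = Set("the", "a", "an", "and", "of", "to")

  // Lowercase the entry and drop stopword tokens, so "The Beatles" and "beatles" collapse together.
  def normalize(entry: String): String =
    entry.toLowerCase
      .split("\\s+")
      .filter(token => token.nonEmpty && !stopwords.contains(token))
      .mkString(" ")

  // Count how often each normalized entry occurs (the role a frequency distribution plays).
  def frequencies(entries: Seq[String]): Map[String, Int] =
    entries
      .map(normalize)
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (entry, occurrences) => entry -> occurrences.size }

  // ASSUMPTION: rarity = 1 - (count / total). The source does not give the formula.
  def rarity(entries: Seq[String]): Map[String, Double] = {
    val freq = frequencies(entries)
    val total = freq.values.sum.toDouble
    freq.map { case (entry, count) => entry -> (1.0 - count / total) }
  }
}
```

Under these assumptions, `RaritySketch.rarity(Seq("The Beatles", "beatles", "Chess", "beatles"))` scores "beatles" at 0.25 and "chess" at 0.75, so the less common interest comes out as the rarer one.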

The Matching Algorithm

Our application needed to show a user on an Android phone the people they could potentially meet if the system were deployed as a real app. We followed the research-through-design methodology. To present users with "matches", I developed a matching algorithm that generated them from the interest and activity data each user entered in the survey. The goal was to generate 150 matches per user.

Figure: The presentation of a potential match to a test user

The survey is broken up into the interests section, the school and work section, and the background section. To populate the rest of the fields in the card displayed above, I had to use a randomizer function. I pulled the names from the RandomUser.me API (https://randomuser.me), using a model class defined in Scala. This class held the URL of the API, and a request to that URL returned a Future[JsValue].

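// Fetches one random user from the RandomUser.me API and returns the response body as JSON.
// Assumes `ws` is an injected Play WSClient and an implicit ExecutionContext is in scope.
def getDummyUser(): Future[JsValue] =
  ws.url(RandomUser.url).get().map(response => response.json)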

This function is defined in my MatchDataController.scala file.

The rest of the MatchData is defined in my [MatchData.scala model file](https://gitlab.com/coo-e/tellUsWho-Scala-Server/blob/prodPreDecember/app/models/MatchData.scala). I stored the data to randomize in another model class that held the elements in Scala's Vector collection type. I chose Vector over a plain List because Vector offers effectively constant-time access to any index, whereas List access time grows linearly with the index. When I used List, I got very odd results that mixed responses from different randomized results; switching to Vector eliminated this problem.
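
As a rough illustration of that kind of randomized lookup, here is a minimal sketch. The object name, the sample values, and the `pick` helper are all hypothetical and are not taken from the project's model file.

```scala
import scala.util.Random

object RandomPick {
  // Hypothetical sample data; the real model file holds the project's own value lists.
  val majors: Vector[String] = Vector("Computer Science", "Biology", "Architecture")

  // Pick one element uniformly at random from a non-empty Vector.
  // Indexing into a Vector is effectively constant time, which suits repeated random lookups.
  def pick[A](choices: Vector[A], rng: Random): A =
    choices(rng.nextInt(choices.length))
}
```

Passing the `Random` instance in as a parameter, rather than using a global one, makes it possible to seed the generator and reproduce a run while debugging.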

The Android client application would show the user 30 matches per day over the course of five days. I generated the matches from each choice a user entered. For instance, if they had entered a nationality of Peruvian, I would generate a random user whose nationality was Peruvian, and the other fields for this generated dummy user would be randomized. I created helper functions that [generated school-work info matches](https://gitlab.com/coo-e/tellUsWho-Scala-Server/blob/prodPreDecember/app/models/MatchData.scala#L219), as well as demographic matches and [interest matches](https://gitlab.com/coo-e/tellUsWho-Scala-Server/blob/prodPreDecember/app/models/MatchData.scala#L121). My final [match generation function](https://gitlab.com/coo-e/tellUsWho-Scala-Server/blob/prodPreDecember/app/controllers/MatchDataController.scala#L64), defined in the MatchDataController.scala class, called each of these helper functions and kept generating more matches until at least 150 had been inserted into our PostgreSQL database.
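
The overall loop can be sketched as below. This is a simplified illustration, not the code in the linked files: `DummyMatch`, `fillers`, `generate`, and `generateAll` are hypothetical names, the record has far fewer fields than a real match, and the results are returned in memory instead of being inserted into the database.

```scala
import scala.annotation.tailrec
import scala.util.Random

object MatchSketch {
  // Hypothetical, simplified match record; a real match carries many more fields.
  final case class DummyMatch(sharedField: String, sharedValue: String, filler: String)

  // Hypothetical pool of values used to randomize the non-shared field.
  val fillers: Vector[String] = Vector("hiking", "chess", "jazz", "cooking")

  // Pin the one attribute the user entered and randomize the rest.
  def generate(field: String, value: String, rng: Random): DummyMatch =
    DummyMatch(field, value, fillers(rng.nextInt(fillers.length)))

  // Run one full pass over the user's answers at a time until at least `target` matches exist.
  def generateAll(answers: Vector[(String, String)], target: Int, rng: Random): Vector[DummyMatch] = {
    @tailrec
    def loop(acc: Vector[DummyMatch]): Vector[DummyMatch] =
      if (acc.size >= target) acc
      else loop(acc ++ answers.map { case (field, value) => generate(field, value, rng) })

    // With no answers there is nothing to pin, so return early instead of looping forever.
    if (answers.isEmpty) Vector.empty else loop(Vector.empty)
  }
}
```

Because each pass covers every answer, the result can overshoot the target slightly: four answers and a target of 150 yield 152 matches, which is consistent with an "at least 150" requirement.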

The Challenges

The most difficult part of developing this application was the learning curve of Scala. I had come across some components of functional programming through Node.js/JavaScript, but the documentation there was much clearer and more thorough than Scala's. In spite of this, I gained enough mastery to implement Futures and Promises, and I attempted to use async/await methods. I learned a lot about collection classes and about using functions such as flatten, map, and flatMap to transform data. I also reused logic that I had used in Python in the past, relying on dictionaries (or Maps, as they are called in Scala) that contained lambda functions as values.

Unfortunately, I was not able to master the Actor Model in this short time frame, so I fell back on imperative strategies for dealing with state, relying on iteration and global counters to generate the required 150 or more matches. This was also the first time I learned and used Docker, the Linux container platform. In spite of these challenges, I delivered the critical components in time to support our lead researcher.

Aside from doing the backend development to generate matches, I was faced with one final problem. Our front-end developer was unable to build the complicated interactions and designs created by our design team with React.js in the short time frame he was given. As an emergency backup, I worked around the clock for three days and nights, learned the Scala Twirl templating language, and used Twitter Bootstrap and Google's Material Design CSS styling to make the survey as enjoyable as I could in the span of a few days.

Figure: Tags input via the Twirl templating language in Scala/Play!