Beanify is a TensorFlow-powered CNN that identifies Capital One’s promotional jelly beans, built at Hack Notts 2019.
Capital One sponsor many MLH hackathons in the UK, and they always hand out small branded containers of unusually flavoured jelly beans. Whilst many of them are nice, some of them are really, really not. We decided to solve this problem using machine learning by automatically identifying the flavour of each bean.
To train the CNN we took videos of each type of bean, and used alternate frames for training and testing. Using OpenCV we identified the bean in each frame and cropped to a bounding box.
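As a rough illustration of the cropping step, here is a minimal sketch in plain Python. It is not our OpenCV code: it assumes the bean has already been detected and is represented as a binary mask (a hypothetical stand-in for whatever the detector outputs), and the helper names `bounding_box` and `crop` are made up for this example.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) around the truthy cells of a 2D mask.

    Returns None when nothing was detected, so the frame can be skipped.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    # bottom and right are exclusive, so they can be used directly in slices
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image, box):
    """Cut the rectangle described by box out of a 2D image (list of rows)."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical 4x4 detection mask: 1 marks pixels that belong to the bean
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
box = bounding_box(mask)
cropped = crop(mask, box)
```

Returning `None` for an empty mask matters in practice: frames where no bean is found have to be dropped rather than cropped.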
You’ll notice that these frames are all relatively similar. Unfortunately, the format of a hackathon isn’t particularly good for collecting proper training data. To try to improve this we filmed two more videos for each type (at 4AM), but these didn’t work well with our feature detection system: in the second tangerine video, the bean was detected in just 13 out of 333 frames.
We trained our model with Keras on TensorFlow, running on Google Colab (our notebook is on GitHub). Initially our model reported over 99% accuracy on the testing set. Whilst the testing set was technically different to the training set, taking alternate frames means every test frame sits right next to a near-identical training frame, so in reality the sets are very similar. We also tried using the first half for training and the second half for testing, but this also reported suspiciously high accuracy.
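To make the difference between the two splits concrete, here is a small sketch that uses frame indices in place of real frames. The function names are hypothetical, and the ten-frame "video" is just for illustration.

```python
def alternate_split(frames):
    """Even-indexed frames go to training, odd-indexed frames to testing."""
    return frames[0::2], frames[1::2]


def halves_split(frames):
    """First half of the video goes to training, second half to testing."""
    mid = len(frames) // 2
    return frames[:mid], frames[mid:]


def nearest_training_gap(test_frame, train):
    """How many frames away the closest training frame is."""
    return min(abs(test_frame - t) for t in train)


frames = list(range(10))  # stand-in for ten consecutive video frames

train_alt, test_alt = alternate_split(frames)
train_half, test_half = halves_split(frames)

alt_gaps = [nearest_training_gap(f, train_alt) for f in test_alt]
half_gaps = [nearest_training_gap(f, train_half) for f in test_half]
```

With alternate frames every test frame is exactly one frame away from a training frame. Splitting by halves spreads the gaps out, but both sets still come from the same video of the same bean under the same lighting, which is probably why the accuracy stayed suspiciously high.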
The frontend is built in Vue.js and uses TensorFlow.js to classify the images. Initially we struggled with the model size (many hundreds of megabytes), but eventually we were able to reduce it to a more easily distributed 40MB. This still isn’t fantastic, and combined with the sizeable OpenCV.js library (used for client-side boundary detection), the page comes in at a hefty 56MB, although ‘only’ 37MB when compressed with Brotli.
So, did it work? Well, no, not really. We had very limited success in some lighting conditions, but it’s likely this was just a coincidence.
So why is this? One obvious issue is the training data. In a hackathon environment it’s very difficult to get enough training data to achieve good results (if we were sensible, we would’ve chosen an area with pre-existing datasets, but where’s the fun in that?).
Despite this, we still made it to the finals and then won MLH’s Best use of Google Cloud prize. However, the day after Hack Notts we discovered an issue that could explain our problems: we were training our model on RGB values between 0 and 1, but classifying on the web with values between 0 and 255. Essentially, we were asking it to classify a totally white image each time. Fixing this didn’t really seem to do much, so it’s likely that poor training data was still the issue.
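The mismatch is easy to show in a few lines. This is a sketch in Python rather than the TensorFlow.js code the site actually runs, and the colour and helper names are made up. The clamping function is only an illustration of where unscaled values end up relative to the training range, not a claim about what the model does internally.

```python
def scale_to_unit(pixel):
    """Map one 8-bit RGB pixel (0-255 per channel) into the 0-1 training range."""
    return tuple(channel / 255.0 for channel in pixel)


def clamp_to_unit(pixel):
    """Illustration only: force an unscaled pixel into the 0-1 range."""
    return tuple(min(max(float(channel), 0.0), 1.0) for channel in pixel)


orange = (255, 102, 51)  # hypothetical bean colour as 8-bit RGB

scaled = scale_to_unit(orange)    # what the model was trained to expect
unscaled = clamp_to_unit(orange)  # roughly what it was given in the browser
```

Here `scaled` keeps the colour information as `(1.0, 0.4, 0.2)`, while `unscaled` collapses to `(1.0, 1.0, 1.0)`. Any channel value of 1 or more is already at or beyond "fully on" as far as the training data is concerned, so nearly every pixel looks white.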
We mainly chose this project as a way to learn TensorFlow, so whether it was the best tool for the job wasn’t really a consideration. The problem of identifying jelly beans is mostly about recognising colours, and this can be done in a much simpler way. Our friend Marcy wrote a tool which just finds the average colour of a circle and matches it to the closest jelly bean. This works almost perfectly in some cases. However, some flavours are the same colour as another but with a pattern on the surface, and an average colour alone may struggle to tell those apart.
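The average-colour idea can be sketched like this. It is not Marcy’s actual code: the reference colours are invented, the function names are hypothetical, and we assume the pixels inside the circle have already been extracted into a list.

```python
def average_colour(pixels):
    """Mean RGB value over a list of (r, g, b) pixels."""
    count = len(pixels)
    return tuple(sum(p[i] for p in pixels) / count for i in range(3))


def closest_flavour(colour, reference):
    """Name of the reference colour with the smallest squared RGB distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(reference, key=lambda name: squared_distance(colour, reference[name]))


# Made-up reference colours; real values would be measured from the beans
REFERENCE = {
    "tangerine": (240, 130, 30),
    "lime": (120, 200, 60),
    "blueberry": (40, 60, 160),
}

# Hypothetical pixels sampled from inside the detected circle
sample = [(230, 120, 40), (250, 140, 20)]
mean = average_colour(sample)
guess = closest_flavour(mean, REFERENCE)
```

Because everything is reduced to a single mean colour, two flavours that differ only by a surface pattern would end up with nearly the same average and the same guess.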