Wednesday, September 9, 2015

The Internship Experiment - Top 5 Challenging Scala and MongoDB concepts

Overview

In the recent internship that I had offered, I wanted the interns to experience new and emerging technologies that they were less likely encounter in their school's curriculum, and hence would add to the overall challenge. Scala, Play Framework, and MongoDB were chosen for this reason. To counter the steep learning curve, I kept the internship project goal simple. The interns were to develop a web-based survey creation and collection application, using the technologies mentioned earlier. In order to help the interns with learning these technologies, I shared the source code of another web-based application I created using these same technologies. We went over a code walk through so as to familiarize them with core concepts of Scala, Play, and MongoDB. As with learning any new technology, there were challenges. This article will list, and briefly describe, the top 5 challenges the interns ran into with learning and using these technologies for their project.

5. MongoDB: Joins and Aggregations

Schools teach RDBMS concepts, and they involve JOINing. Lots of it. Which is why NoSQL databases can be very confusing to understand at first. The interns were no different. The concept of rich documents excited them, and they seemed to believe that they understood the concepts well. They were also successful in creating CRUD code for simple documents. However, when they started work on slightly involved use cases, they began to miss their INNER JOINs and COUNTs from their Advance Databases Class. After a couple hours of revisiting document modeling in light of the use cases that triggered their SQL withdrawals, they became slightly more familiar and comfortable with using Rich Documents.

4. Play/Reactive MongoDB Driver: Writing code for asynchronous execution

Coding involves solving a problem, and so, we ought to code like we think. This involves anticipating and planning ahead of time, for possible decisions that we might encounter. Many of us relied on debugging and stepping through code, early on, as a learning tool. This, however, has an adverse consequence, as instead of teaching us to code like we think, it teaches us to code as we'd like the code to execute. This results in our code being written for sequential real-time execution. This makes it much more difficult to learn asynchronous programming, which involves coding for what might happen in the future, rather than in real time.

Things got even more complicated for the interns, when they had to write code involving multiple Futures, and tying their responses together, in a single code block. We spent a few hours revisiting this issue on more than a couple occasions, where we emphasized the need to "code like we think" rather than "code like we debug".

3. Scala: Implicit Conversions

Implicit type conversions is a very handy feature of the Scala language. It allows you to avoid writing intermediate type conversion code every time you have object of type A, but instead need object of type B. Writing an implicit conversion from type A to B gives you the ability to use object of type B anywhere you'd use object of type A. In my example code to the interns, I had provided them with an implicit conversion of a model case class to, and from, a BSONDocument type, so that I could pass the model case class directly to the MongoDB collection functions, instead of creating BSONDocuments for them every time. The interns soon forgot the concept and assumed that the MongoDB driver worked directly on the model classes. Many of their MongoDB driver related errors were, in fact, a result of copying-pasting code for implicit conversion from one model class to another.

2. MongoDB: Accessing the _id for an inserted document

MongoDB automatically creates _id key, and sets it to a new ObjectID, whenever you insert a document to a collection without an _id key. In the example code for a model class, I had also provided code for implicit conversion for the model case class to and from BSONDocument type. The implicit conversion to BSONDocument accounted for a missing id member of the model class. In that case, a new BSONObjectID was generated and assigned to the id member. This was part of the implicit conversion.

However, since the interns had forgotten all about the purpose of the implicit conversions, they were confused when they encountered a use case, where they had to use the id of a freshly inserted document, as a key in another document, in a different collection. They wanted to know how they could retrieve the id of a newly inserted document. The solution was simple. Supply the id yourself before inserting the document the first time. Since you control the generation of id in code, you could use it however you like.

1. Scala: Type Inference

Type inference is a very powerful feature of the Scala language and compiler. If a type can be inferred, chances are, you can get by without specifying the type of the object you are creating. This can, however, result in some unexpected behavior, and compile time errors, that could be confusing to interpret if not used carefully.

The most common complaint and frustration I heard from the interns had to do with this feature. The easiest way to troubleshoot type inference related issues happens to be to not let the compiler infer type of the problem code block. Explicitly specifying the type of the object created, or the return type of the code block, helps isolate the issue to the specific code block that might be causing the problem behavior. Once the problem is fixed, you can then omit the type information.

Conclusion

The above challenges were in no way a reflection on the interns' aptitude or ability to solve problems. I remember encountering similar challenges when I was first introduced to Scala, Play, and MongoDB driver for Scala. My academic training had not prepared me sufficiently for these challenges either. That was the primary reason why I chose these technologies for the internship. The interns admitted their frustration on my decision, but in the end were grateful for the same, as it pushed them out of their comfort zones, and gave them a safe platform to explore these technologies and concepts.