After several months of building on Apache Spark, here are some lessons we learned about the benefits of DataFrames over RDDs, along with several situations in which the RDD API may still be preferable.