
In this post, we will analyze the Uber dataset in Apache Spark using Scala. The Uber dataset consists of four columns: dispatching_base_number, date, active_vehicles, and trips. You can download the dataset from the following link: https://drive.google.com/open?id=0ByJLBTmJojjzS2c2UktqLW5uRG8 Problem Statement: Find the days on which each basement...

There is an increasing demand for NoSQL database experts, especially those highly trained in MongoDB. This is mainly because NoSQL databases are replacing traditional RDBMSs, as NoSQL databases like MongoDB offer the flexibility to organize data in a way that makes the data usable. This along...

In the previous blog, we discussed using histograms to check the central tendency measures of a continuous variable. We also plotted the frequencies of different ordinal variables to see the distribution of observations in each category. In this blog, we will discuss other types of data visualizations,...