Shark is an interactive SQL system for Hadoop that claims to provide very fast (even real-time) performance comparable to that of MPP databases. It is a highly scalable system that runs on top of Spark and includes features for data co-partitioning, fault tolerance, and integration with machine learning. Shark supports many Hive data formats and can read data from HDFS, HBase, and Amazon S3.
Documentation for Shark is available on GitHub. According to the project website, it takes around 5 minutes to set up Shark on a single node for a quick trial, and about 20 minutes on an Amazon EC2 cluster.