Knowledge check

1.

Which user requirements is HDInsight Interactive Query best suited for?

When you want to use MapReduce on unstructured data with role-based access controls.

When you want to use SQL-like queries on structured data with row and column level controls.

When you want to use SQL-like queries on high-concurrency data for long-running computations.

2.

Which file formats does Interactive Query support?

.xml, .doc, .log

.json, .csv, .txt

.pdf, .dbk, .md

3.

Which scenario is best for HDInsight Interactive Query?

Batch processing

Streaming data

Ad hoc queries

4.

Why is the Hive Warehouse Connector needed?

Hive and Spark are different cluster types.

Hive and Spark have two different metastores. They require a connector to bridge between the two.

Hive is for static data and Spark is for streaming data.

5.

Why is using the Hive Warehouse Connector more efficient and scalable than using a standard JDBC connection from Spark to Hive?

Because the library loads data from the HiveServer into the Spark driver in parallel.

Because the Hive Warehouse Connector is optimized for streaming data.

Because the library loads data from LLAP daemons into Spark executors in parallel.
