Problem Scenario 32: You have been given data in the format below.

FirstName,LastName,EMPID,LoggedInDate,JoiningDate,DeptId
Ajit,Singh,101,20131206,20131207,hadoopexamITDEPT
Arun,Kumar,102,20131206,20110607,hadoopexamPRODDEPT
Ajit,Singh,101,20131209,20131207,hadoopexamITDEPT
Ajit,Singh,101,201312011,20131207,hadoopexamITDEPT
Ajit,Singh,101,201312012,20131207,hadoopexamITDEPT
Ajit,Singh,101,201312013,20131207,hadoopexamITDEPT
Ajit,Singh,101,20131216,20131207,hadoopexamITDEPT
Ajit,Singh,101,20131217,20131207,hadoopexamITDEPT
Arun,Kumar,102,20131206,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131209,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131210,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131211,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131212,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131213,20110607,hadoopexamMARKETDEPT
Arun,Kumar,102,20131214,20110607,hadoopexamMARKETDEPT

Remove duplicate records from this file, ignoring LoggedInDate.
The output may contain any LoggedInDate for a given record; it does not matter which one. Store the final result in a Hive table.
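
The deduplication rule above can be sketched in plain Python. This is only an illustration of the keying logic, not the scenario's official solution: it treats two records as duplicates when every field except LoggedInDate matches, and keeps the first LoggedInDate seen for each record. The names SAMPLE, LOGGED_IN_DATE_COL and dedupe_ignoring_logged_in_date are hypothetical, and loading the result into a Hive table is not shown. In Spark the same idea would amount to keying each record on the five remaining fields before reducing.

```python
import csv
import io

# Sample rows copied from the problem statement (header line included).
SAMPLE = """\
FirstName,LastName,EMPID,LoggedInDate,JoiningDate,DeptId
Ajit,Singh,101,20131206,20131207,hadoopexamITDEPT
Arun,Kumar,102,20131206,20110607,hadoopexamPRODDEPT
Ajit,Singh,101,20131209,20131207,hadoopexamITDEPT
Ajit,Singh,101,201312011,20131207,hadoopexamITDEPT
Ajit,Singh,101,201312012,20131207,hadoopexamITDEPT
Ajit,Singh,101,201312013,20131207,hadoopexamITDEPT
Ajit,Singh,101,20131216,20131207,hadoopexamITDEPT
Ajit,Singh,101,20131217,20131207,hadoopexamITDEPT
Arun,Kumar,102,20131206,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131209,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131210,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131211,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131212,20110607,hadoopexamPRODDEPT
Arun,Kumar,102,20131213,20110607,hadoopexamMARKETDEPT
Arun,Kumar,102,20131214,20110607,hadoopexamMARKETDEPT
"""

# Zero-based position of LoggedInDate in each row (hypothetical constant name).
LOGGED_IN_DATE_COL = 3


def dedupe_ignoring_logged_in_date(text):
    """Return (header, rows) with one row per record, ignoring LoggedInDate."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    first_seen = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        # The key is every field except LoggedInDate.
        key = tuple(v for i, v in enumerate(row) if i != LOGGED_IN_DATE_COL)
        # Keep the first row seen for each key; any LoggedInDate is acceptable.
        first_seen.setdefault(key, row)
    return header, list(first_seen.values())


header, unique_rows = dedupe_ignoring_logged_in_date(SAMPLE)
for row in unique_rows:
    print(",".join(row))
```

On the sample data this leaves three records: one for Ajit Singh in hadoopexamITDEPT and two for Arun Kumar, because his DeptId changes from hadoopexamPRODDEPT to hadoopexamMARKETDEPT.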
 
