Hadoop Interview Questions with Answers 2021
These Hadoop interview questions are divided across HDFS, MapReduce, Hive, HBase, Sqoop, Flume, ZooKeeper, Pig, and YARN. Typical questions include: What does the 'jps' command do? How do you restart the NameNode? Which are the three modes in which Hadoop can be run? The questions and answers below cover the latest Hadoop interview topics for both freshers and experienced candidates.
Q1.Which of the following are the Goals of HDFS?
A. Fault detection and recovery
B. Huge datasets
C. Hardware at data
D. All of the above
Option D – All of the above
Q2.__ NameNode is used when the Primary NameNode goes down.
D. Both A and B
Option C – Secondary
Q3.The minimum amount of data that HDFS can read or write is called a _.
D. None of the above
Option C – Block
Q4.The default block size is __.
Option B – 64MB (the default in Hadoop 1.x; Hadoop 2.x and later default to 128MB)
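The block-size answer above can be made concrete with a little arithmetic. The sketch below (plain Python, no Hadoop required; the 200 MB file size is an illustrative assumption) shows how many HDFS blocks a file occupies under the 64 MB and 128 MB defaults:

```python
import math

def hdfs_block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Number of HDFS blocks a file occupies (the last block may be partial)."""
    return math.ceil(file_size_bytes / block_size_bytes)

MB = 1024 * 1024
file_size = 200 * MB  # illustrative 200 MB file

print(hdfs_block_count(file_size, 64 * MB))   # 4 blocks under the Hadoop 1.x default
print(hdfs_block_count(file_size, 128 * MB))  # 2 blocks under the Hadoop 2.x default
```

Note that a final partial block only consumes as much disk space as the data it holds, not the full configured block size.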
Q5.For every node (Commodity hardware/System) in a cluster, there will be a _.
D. None of the above
Option A – Datanode
Q6.Which of the following is not a feature of HDFS?
A. It is suitable for distributed storage and processing.
B. Streaming access to file system data.
C. HDFS provides file permissions and authentication.
D. Hadoop does not provide a command interface to interact with HDFS.
Option D – Hadoop does not provide a command interface to interact with HDFS. (HDFS does in fact ship with a command-line interface, hdfs dfs, so this statement is false and hence not a feature.)
Q7.HDFS is implemented in _ language.
Option C – Java
Q8.During startup, the _ loads the file system state from the fsimage and the edits log file.
Option B – Namenode
Q9.What is full form of HDFS?
A. Hadoop File System
B. Hadoop Field System
C. Hadoop File Search
D. Hadoop Field search
Option A – Hadoop File System (strictly, HDFS stands for Hadoop Distributed File System, as in Q14; A is the closest of the options offered)
Q10.HDFS works in a __ fashion.
A. worker-master fashion
B. master-slave fashion
C. master-worker fashion
D. slave-master fashion
Option B – master-slave fashion
Q11.HDFS block size is larger than the disk block size so that
A. Only HDFS files can be stored in the disk used.
B. The seek time is maximized.
C. Transfer of large files made of multiple disk blocks is not possible.
D. A single file larger than the disk size can be stored across many disks in the cluster.
Option D – A single file larger than the disk size can be stored across many disks in the cluster.
Q12.Which of the following is true about HDFS?
A. HDFS filesystem can be mounted on a local client’s filesystem using NFS.
B. HDFS filesystem can never be mounted on a local client’s filesystem.
C. You can edit an existing record in an HDFS file which is already mounted using NFS.
D. You cannot append to an HDFS file which is mounted using NFS.
Option A – HDFS filesystem can be mounted on a local client’s filesystem using NFS.
Q13.Under-replication in HDFS means-
A. No replication is happening in the data nodes.
B. The replication process is very slow in the data nodes.
C. The frequency of replication in data nodes is very low.
D. The number of replicated copies is less than specified by the replication factor.
Option D – The number of replicated copies is less than specified by the replication factor.
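As a toy illustration of the answer above (plain Python; the block IDs and replica counts are made-up examples, not real NameNode state), a block is under-replicated when its live replica count falls below the configured replication factor:

```python
REPLICATION_FACTOR = 3  # illustrative; the HDFS default for dfs.replication is 3

# Hypothetical map of block ID -> number of live replicas reported by datanodes.
live_replicas = {
    "blk_1001": 3,
    "blk_1002": 2,  # one replica lost, e.g. a datanode went down
    "blk_1003": 1,
}

under_replicated = [
    blk for blk, count in live_replicas.items()
    if count < REPLICATION_FACTOR
]

print(under_replicated)  # ['blk_1002', 'blk_1003']
```

In a real cluster the NameNode detects this condition and schedules re-replication automatically; running hdfs fsck / reports any under-replicated blocks.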
Q14.HDFS stands for
A. Highly distributed file system.
B. Hadoop directed file system
C. Highly distributed file shell
D. Hadoop distributed file system.
Option D – Hadoop distributed file system.
Q15.When a file in HDFS is deleted by a user
A. it is lost forever
B. It goes to trash if configured.
C. It becomes hidden from the user but stays in the file system
D. Files in HDFS cannot be deleted.
Option B – It goes to trash if configured.
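The trash behaviour referred to in the answer is controlled by the fs.trash.interval property in core-site.xml. A minimal sketch (the 1440-minute value is an illustrative choice, not a required one):

```xml
<!-- core-site.xml: enable HDFS trash; deleted files are kept for 1440 minutes (24 h) -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
```

With the default value of 0, trash is disabled and deletes are immediate and permanent.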
Q16.The archive file created in Hadoop always has the extension of
Option B – .har
Q17.Which of the following is not a scheduling option available in YARN?
A. Balanced scheduler
B. Fair scheduler
C. Capacity scheduler
D. FIFO scheduler
Option A – Balanced scheduler
Q18.What is AVRO?
A. Avro is a Java serialization library.
B. Avro is a Java compression library.
C. Avro is a Java library that creates splittable files.
D. None of these answers are correct.
Option A – Avro is a Java serialization library.
Q19.You have loads of data that can be processed by your MapReduce jobs. However, you need the data to be available to analysts and data scientists in your organisation. What is the best format to represent the input?
A. Sequence File.
Option B – Avro
Q20.For the frequently accessed HDFS files the blocks are cached in
A. the memory of the datanode
B. in the memory of the namenode
C. Both A and B
D. In the memory of the client application which requested the access to these files.
Option A – the memory of the datanode