After installing the Hortonworks sandbox, you can visit http://localhost:50070 to view information about the HDFS cluster. The YARN ResourceManager UI can be accessed via http://localhost:8088.

You can use any of the following ways to communicate with HDFS:

  • Ambari
  • CLI
  • HTTP/HDFS proxies
  • Java interface
  • Network File System (NFS)
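As a minimal sketch of the HTTP route, the NameNode exposes the WebHDFS REST API on the same port 50070 (assuming WebHDFS is enabled, as it is in the default sandbox); listing the HDFS root directory looks like this:

```shell
# WebHDFS endpoint on the NameNode (default sandbox port; adjust if
# your port forwarding differs)
NAMENODE="http://localhost:50070"

# List the HDFS root directory; on success this returns a JSON
# FileStatuses document. '|| true' keeps the snippet from failing
# when the sandbox is not running.
curl -s "${NAMENODE}/webhdfs/v1/?op=LISTSTATUS" || true
```

The same API supports other operations (OPEN, CREATE, MKDIRS, ...) selected via the op query parameter.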

Log in to the Hortonworks sandbox:

ssh root@localhost -p 2222

This is how you can copy files to the VM-based sandbox using SCP:

scp -P 2222 mytest.txt root@localhost:

To copy a file from the sandbox to the local machine:

scp -P 2222 root@localhost:m1.txt m1.txt

Run the following command to list the files in the root directory of HDFS:

hdfs dfs -ls /
# or
hadoop fs -ls /

Create a new directory (relative paths are resolved against your HDFS home directory, e.g. /user/root):

hdfs dfs -mkdir data

To upload the local myfiles directory into the data directory on HDFS (-put and -copyFromLocal behave the same for local sources):

hdfs dfs -put myfiles data
# or
hadoop fs -copyFromLocal myfiles data

To list the contents of your HDFS home directory:

hdfs dfs -ls

The listing should now include the data directory.

To delete the data directory permanently (-skipTrash bypasses the HDFS trash, so the files are removed immediately instead of being moved to the user's .Trash directory):

hdfs dfs -rm -r -skipTrash data
# or
hadoop fs -rm -r -skipTrash data

You can use the following command to list the running MapReduce jobs:

hadoop job -list
# or, on newer Hadoop releases where 'hadoop job' is deprecated
mapred job -list

Kill a job by its job ID:

hadoop job -kill job_12...
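Putting the two steps together, here is a hedged sketch that kills the first job found in the list; the awk parsing assumes the list output prints one line per job beginning with an ID of the form job_<timestamp>_<sequence>:

```shell
# Grab the first job ID from the list output. stderr is silenced so
# the snippet degrades gracefully when no cluster is reachable.
JOB_ID=$(hadoop job -list 2>/dev/null | awk '/^job_/ {print $1; exit}')

# Kill it only if a job was actually found
if [ -n "$JOB_ID" ]; then
  hadoop job -kill "$JOB_ID"
fi
```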