HDFS Basics
After installing the Hortonworks sandbox, you can visit http://localhost:50070 to see information about the HDFS cluster. The YARN ResourceManager UI is available at http://localhost:8088.
You can communicate with HDFS in any of the following ways:
- Ambari
- CLI
- HTTP/HDFS proxies
- Java interface
- Network File System (NFS)
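The HTTP route listed above can be exercised directly through WebHDFS, the NameNode's REST API. A minimal sketch, assuming WebHDFS is enabled on the sandbox's NameNode (it is by default):

```shell
# List the HDFS root directory over HTTP (WebHDFS REST API):
curl -s "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS"

# The response is JSON; the entry names can be pulled out with grep:
curl -s "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS" |
  grep -o '"pathSuffix":"[^"]*"'
```

The same API supports reads, writes, and deletes via other op= values, so any HTTP client can act as an HDFS client without Hadoop installed locally.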
Log in to the Hortonworks sandbox:
ssh root@localhost -p 2222
This is how you can copy files to the VM-based sandbox using SCP:
scp -P 2222 mytest.txt root@localhost:
To copy a file from the sandbox to the local machine:
scp -P 2222 root@localhost:m1.txt m1.txt
Run the following command to list the files in the root directory of HDFS:
hdfs dfs -ls /
# or
hadoop fs -ls /
Create a new directory:
hdfs dfs -mkdir data
To add the local myfiles directory to HDFS under data:
hdfs dfs -put myfiles data
# or
hadoop fs -copyFromLocal myfiles data
To list your HDFS home directory (the default working directory):
hdfs dfs -ls
The output resembles ls -l: a permission string, replication factor, owner, group, size, modification timestamp, and path for each entry.
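Each line of that listing starts with the permission string and ends with the path, which makes it easy to post-process. A minimal sketch, using a hypothetical sample line rather than live cluster output:

```shell
# One hypothetical line of 'hdfs dfs -ls' output:
line='drwxr-xr-x   - root hdfs          0 2016-01-01 12:00 data'

# awk field 1 is the permission string; the last field is the path.
perms=$(printf '%s\n' "$line" | awk '{print $1}')
path=$(printf '%s\n' "$line" | awk '{print $NF}')
echo "$perms $path"   # prints: drwxr-xr-x data
```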
To delete the data directory (the -skipTrash flag bypasses the per-user trash directory and deletes immediately):
hdfs dfs -rm -r -skipTrash data
# or
hadoop fs -rm -r -skipTrash data
You can use the following command to list the running Hadoop jobs:
hadoop job -list
To kill a job by its job ID:
hadoop job -kill job_12...
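On YARN-based distributions (the ResourceManager UI at http://localhost:8088 mentioned earlier), the same bookkeeping can be done with the yarn CLI; hadoop job is deprecated in recent Hadoop releases in favor of mapred job. A sketch, with a placeholder application ID:

```shell
# List applications known to YARN (MapReduce jobs appear here too):
yarn application -list

# Kill one by its application ID; the ID below is a placeholder.
yarn application -kill application_1234567890123_0001
```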