Author Archive: Linoxide

These articles are published through the combined efforts of Linoxide team members.

viddl - Command Line tool to Edit (cut, crop and resize) Video Clips

Viddl is a Ruby command-line utility for downloading videos from YouTube. With Viddl you can easily download, cut, crop, and resize video clips. In this tutorial, we will see how to install Viddl on CentOS 7. Then we will see how to download, cut, crop, and resize YouTube video using […]

May 11, 2017 | OPEN SOURCE TOOLS

How to Import Data from MySQL to HDFS Using Sqoop

Apache Sqoop is a tool in the Hadoop ecosystem that is used to import and export data between an RDBMS and HDFS. This data is structured and has a schema. There are many cases where you want to analyze data in your RDBMS, but because of the huge size of the data your RDBMS is not capable enough […]

January 2, 2017

How to Run Hadoop MapReduce Program on Ubuntu 16.04

In this blog, I will show you how to run a MapReduce program. MapReduce is one of the core parts of Apache Hadoop; it is Hadoop's processing layer. So before I show you how to run a MapReduce program, let me briefly explain MapReduce. MapReduce is a system for parallel processing […]

December 20, 2016

How to Setup Single Node Hadoop Cluster Using Docker

In this article, I will show you how to set up a single-node Hadoop cluster using Docker. Before I start with the setup, let me briefly remind you what Docker and Hadoop are. Docker is a software containerization platform where you package your application with all of its libraries, dependencies, and environments in a container. This container […]

December 2, 2016 | TRENDING

How to Deploy Spark Application on Yarn and Integrate with Hive

In this article, I will explain how Spark works with YARN and Hive. Before I begin, let me briefly tell you what Apache Spark, YARN, and Apache Hive are. Apache Spark is an in-memory distributed processing framework. Spark is used for real-time processing. Apache Spark can run programs up to 100x faster than Hadoop MapReduce […]

December 1, 2016

Awesome! Hadoop HDFS Commands Cheat Sheet

HDFS is now an Apache Hadoop subproject. An HDFS instance consists of a vast number of servers, each storing part of the file system. A typical file in HDFS is gigabytes or terabytes in size, so applications will have large data sets. A file once created need not be changed, i.e. it […]

November 21, 2016

How to Install Apache Sqoop on Ubuntu 16.04

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases, for example MySQL, Oracle, and Microsoft SQL Server. You can import and export data between relational databases and Hadoop. You can also import from and export to semi-structured data sources, for example HBase and […]

November 16, 2016 | UBUNTU HOWTO | 1 Reply

30 Most Frequently Used Hadoop HDFS Shell Commands

In this tutorial, we will walk you through the Hadoop Distributed File System (HDFS) commands you will need to manage files on HDFS. HDFS commands are used most of the time when working with the Hadoop file system. They include various shell-like commands that interact directly with HDFS as well as other file […]

November 11, 2016 | UBUNTU HOWTO

How To Set Up Jenkins for Continuous Development Integration on Ubuntu 16

In this blog, we will set up and configure Jenkins on Ubuntu 16 for continuous development and continuous integration. Jenkins is a continuous integration server. Basically, continuous integration is the practice of automatically running your tests on a non-developer machine every time someone pushes new code into the source repository. Before we set up Jenkins, you need to […]

October 26, 2016 | UBUNTU HOWTO

How to Setup Hadoop Multi-Node Cluster on Ubuntu

In this tutorial, we will learn how to set up a multi-node Hadoop cluster on Ubuntu 16.04. A Hadoop cluster with more than one datanode is a multi-node Hadoop cluster; hence, the goal of this tutorial is to get two datanodes up and running. 1) Prerequisites: Ubuntu 16.04, Hadoop-2.7.3, Java 7, SSH. For this tutorial, […]

October 19, 2016 | TRENDING | 13 Replies

30 Expected DevOps Interview Questions and Answers

If you are searching for a DevOps job, you are making a good career decision, since it is a well-paid and heavily in-demand job nowadays. In this article, we will go through the DevOps interview questions that you might expect from your interviewer. The questions are many and we couldn't possibly list all of them, but after […]

August 18, 2016 | DEVOPS | 1 Reply

Learn Ansible Basics to Install, Create Roles, PlayBook in Linux

DevOps has become an increasingly important aspect of daily life for many systems administrators. The demand to automate as much as possible, combined with the need for flexibility and scalability, can give the most seasoned veteran a headache. Ansible will help ease a lot of those pains. Ansible is an open source tool used for […]

July 8, 2016 | DEVOPS