Linux system administrators are users with special privileges and duties who must keep the system stable. The manual pages are a very useful resource and should always be consulted when you are not familiar with a command. If you do not know which command you need, the apropos, info or man commands can help, but it is more powerful and interesting to understand the basic concepts well enough to explain them clearly to other people. This article covers questions and answers that you will probably face in your Linux sysadmin job.
1) What is the difference between CTRL-C and CTRL-Z?
When a process is running in the foreground and holding your prompt, there are signals (orders) that you can send to it to indicate what you need:
Control+C sends SIGINT, which interrupts the application. This usually causes it to abort, but a process can intercept the signal and do whatever it likes. For instance, try hitting Ctrl-C at your Bash prompt: Bash just cancels whatever you have typed and gives you a blank prompt (as opposed to quitting).
Control+Z sends SIGTSTP to the foreground application, which suspends it and returns you to the shell. The suspended process does not run until you resume it. This is very useful when you need to do another job in the current shell: run bg if you want the application to keep working in the background, and when you have finished, bring it back to the foreground by running fg (or %x, where x is the job number shown by jobs).
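The interception described above can be sketched in a few lines of shell. This is only an illustration: the variable name caught is made up, and the script sends SIGINT to itself with kill instead of waiting for someone to press Ctrl-C.

```shell
#!/bin/sh
# Minimal sketch: a script that intercepts SIGINT instead of aborting.
# "caught" is a hypothetical variable name used only for this illustration.
caught=no

# Install a handler: when SIGINT arrives, record it and keep running.
trap 'caught=yes' INT

# Send SIGINT to this very shell, which is the signal Ctrl-C delivers
# to a foreground process.
kill -INT "$$"

echo "SIGINT intercepted: $caught"

# Restore the default behaviour (SIGINT terminates the script again).
trap - INT
```

The suspend and resume side is easiest to try interactively: start a long-running command such as sleep 300, press Ctrl-Z, run jobs, then bg or fg.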
2) I want to troubleshoot my network, but how exactly does the traceroute command work?
Traceroute is a program that shows the route taken by packets through a network, from source to destination. It is commonly used when your network is not working well and you want to find out where the problem lies. By default, traceroute on Linux sends UDP packets to the destination and relies on the ICMP messages that come back. ICMP has two types of messages: error-reporting messages and query messages. Query messages are generally used to diagnose network problems (the ping tool uses ICMP query messages). Error-reporting messages, as the name suggests, report errors concerning an IP packet; traceroute relies on two of them, Destination unreachable and Time exceeded. It works in these steps:
- Traceroute creates a UDP packet from the source to the destination with a TTL (Time-to-live) of 1.
- The UDP packet reaches the first router, which decrements the TTL by 1. The TTL is now 0, so the packet gets dropped.
- Having dropped the packet, the router sends an ICMP message (Time exceeded) back to the source.
- Traceroute makes a note of the router's address and the time taken for the round trip.
- It sends two more packets in the same way, which gives three round-trip time measurements for the hop. Usually, the first round trip takes longer than the other two because of the delay while ARP finds the physical address; the address stays in the ARP cache for the second and third packets, so the process speeds up.
- These steps are repeated until the destination has been reached. The only change is that the TTL is incremented by 1 each time the UDP packet is to reach the next router or host.
- Once the destination is reached, a Time exceeded ICMP message is NOT sent back, because the packet has arrived.
- However, the UDP packet used by traceroute specifies a destination port number that is not normally used for UDP. When the destination computer examines the headers of the UDP packet, it drops the packet because nothing listens on that port, and an ICMP message (this time Destination unreachable) is sent back to the source.
- When traceroute receives this message, it understands that the destination has been reached. The destination is also probed 3 times to measure the round-trip time.
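The TTL loop in the steps above can be sketched as a toy simulation. No packet is actually sent: the hop list is hypothetical (addresses from the 192.0.2.0/24 documentation range, with the last one standing in for the destination), and the function names probe and trace are invented for the example.

```shell
#!/bin/sh
# Toy simulation of traceroute's TTL logic. Nothing goes on the wire.
# Hypothetical path: three routers, then the destination.
path="192.0.2.1 192.0.2.2 192.0.2.3 192.0.2.99"

# Number of hops on the path, and the address of hop number N.
hop_count() { set -- $path; echo $#; }
hop_at() { echo "$path" | cut -d' ' -f"$1"; }

# probe TTL: print who answers a probe sent with that TTL, and how.
probe() {
    if [ "$1" -lt "$(hop_count)" ]; then
        # The TTL reaches 0 at an intermediate router, which drops the packet.
        echo "$(hop_at "$1") time-exceeded"
    else
        # The probe reaches the destination, where the unused UDP port
        # triggers a Destination unreachable error.
        echo "$(hop_at "$(hop_count)") destination-unreachable"
    fi
}

# Raise the TTL by 1 per round until the destination answers.
trace() {
    ttl=1
    while [ "$ttl" -le 30 ]; do   # 30 is traceroute's usual default hop limit
        reply=$(probe "$ttl")
        echo "$ttl $reply"
        case $reply in
            *destination-unreachable) break ;;
        esac
        ttl=$((ttl + 1))
    done
}

trace
```

A real run would look like traceroute -n followed by the host name, where -n simply disables reverse DNS lookups on the hops.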
3) nscd sometimes dies by itself and DNS resolution then fails. How can we avoid using nscd for DNS, and is there a disadvantage to bypassing it?
nscd is a daemon that provides a cache for the most common name service requests. When resolving a user, group, host, service..., the process will first try to connect to the nscd socket (something like
If nscd has died, the connect will fail, and so nscd won't be used and that should not be a problem.
If it is in a hung state, the connect may hang or succeed. If it succeeds, the client sends its request (the IP address for www.google.com, passwd entries...). You can configure nscd to disable caching for any type of database (for instance, by having enable-cache hosts no in /etc/nscd.conf for the hosts database).
However, if nscd is in a hung state, it may not even be able to give that simple "won't do" answer, so this will not necessarily help. nscd is a caching daemon meant to improve performance, so disabling it could make those lookups slower. That is only true for some kinds of databases, though. For instance, if the user/service/group databases are only in small files (/etc/services), using nscd for them will probably bring little benefit, if any. nscd is mainly useful for the hosts database.
4) How can I redirect both stdout and stderr at once?
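command > file.log 2>&1 : Redirect stdout to file.log, then redirect stderr to "where stdout is currently going". In this case, that is a file opened in truncate mode (use >> instead of > to append). In other words, the &1 reuses the file descriptor which stdout currently uses.
command 2>&1 | tee -a file.txt : Send both streams through a pipe to tee, which prints them on the terminal and appends them to file.txt.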
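Because 2>&1 means "wherever stdout points right now", the order of the redirections matters. The sketch below shows this with a hypothetical helper, emit, that writes one line to each stream; the log file names are placeholders in a temporary directory.

```shell
#!/bin/sh
# Sketch: the order of redirections matters. "emit" is a hypothetical
# helper that writes one line to stdout and one to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

tmpdir=$(mktemp -d)

# stdout goes to the file first, then stderr is pointed at the same place:
# both lines land in both.log.
emit > "$tmpdir/both.log" 2>&1

# Reversed order: stderr is duplicated from the *current* stdout (the
# terminal or pipe), and only afterwards is stdout sent to the file.
emit 2>&1 > "$tmpdir/stdout-only.log"

echo "lines in both.log:        $(wc -l < "$tmpdir/both.log")"
echo "lines in stdout-only.log: $(wc -l < "$tmpdir/stdout-only.log")"
```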
5) What is the difference between /dev/random and /dev/urandom to generate random data?
The random number generator gathers environmental noise from device drivers and other sources into an entropy pool. It also keeps an estimate of the number of bits of noise in the entropy pool. Random numbers are generated from this entropy pool.
/dev/random traditionally returns random bytes only while the entropy estimate is high enough. If the entropy pool is considered empty, reads from /dev/random block until additional environmental noise is gathered. This was designed for uses that need very high-quality randomness, such as one-time pads or key generation.
/dev/urandom returns as many random bytes as requested. If the entropy estimate is low, it keeps generating data with the kernel's pseudorandom generator instead of waiting, so it never blocks. Because of this, the values are in theory vulnerable to a cryptographic attack, though no practical method is known.
A common recommendation is to use /dev/random for cryptographic purposes, accepting the possible wait as a tradeoff for security, and /dev/urandom when you need random data fast. In practice, this distinction matters much less than it seems.
Both /dev/urandom and /dev/random use the exact same CSPRNG (a cryptographically secure pseudorandom number generator). They differ in only a few ways that have nothing to do with "true" randomness, and /dev/urandom is the preferred source of cryptographic randomness on UNIX-like systems.
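As a small illustration, random bytes can be read from /dev/urandom with standard tools. The function name random_hex and the choice of 16 bytes are assumptions made for this sketch, not something the kernel interface prescribes.

```shell
#!/bin/sh
# Sketch: read N random bytes from /dev/urandom and print them as hex.
# random_hex and the byte count are illustrative choices.
random_hex() {
    # head -c N stops after N bytes; od -An -tx1 prints them as hex without
    # offsets; tr strips the spaces and newlines that od inserts.
    head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

token=$(random_hex 16)
echo "$token"
```

Each byte becomes two hexadecimal digits, so 16 bytes give a 32-character token.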
6) How to recover from a chmod -R 000 /bin?
If the chmod binary itself is set to 000, how would you fix it? You get a "Permission denied" error from /bin/chmod, so you can no longer change permissions. One way to recover is to reinstall coreutils:
# ls -ls /bin/chmod
60 -rwxr-xr-x 1 root root 58584 Nov 5 20:46 /bin/chmod
# chmod 000 /bin/chmod
# ls -ls /bin/chmod
60 ---------- 1 root root 58584 Nov 5 20:46 /bin/chmod
# uname -a
Linux centos-01 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
# rpm --query centos-release
centos-release-7-3.1611.el7.centos.x86_64
# ls -l
total 12
drwxr-xr-x 2 root root 4096 Apr 6 22:54 linox
drwxr-xr-x 2 root root 4096 Apr 6 22:54 pac
drwxr-xr-x 2 root root 4096 Apr 6 22:54 utils
# chmod 640 linox
-bash: /usr/bin/chmod: Permission denied
# wget http://mirror.centos.org/centos/7/os/x86_64/Packages/coreutils-8.22-18.el7.x86_64.rpm
--2017-04-06 23:23:44-- http://mirror.centos.org/centos/7/os/x86_64/Packages/coreutils-8.22-18.el7.x86_64.rpm
Resolving mirror.centos.org (mirror.centos.org)... 18.104.22.168, 2604:eb80:1:4::10
Connecting to mirror.centos.org (mirror.centos.org)|22.214.171.124|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3412144 (3.3M) [application/x-rpm]
Saving to: ‘coreutils-8.22-18.el7.x86_64.rpm’
100%[==============================>] 3,412,144 4.84MB/s in 0.7s
2017-04-06 23:23:45 (4.84 MB/s) - ‘coreutils-8.22-18.el7.x86_64.rpm’ saved [3412144/3412144]
# rpm -Uvh --force coreutils-8.22-18.el7.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:coreutils-8.22-18.el7 ################################# [100%]
# ls -ls /bin/chmod
60 -rwxr-xr-x 1 root root 58584 Nov 5 20:46 /bin/chmod
# chmod 640 linox
# ls -ls
total 3348
3336 -rw-r--r-- 1 root root 3412144 Nov 20 17:26 coreutils-8.22-18.el7.x86_64.rpm
4 drw-r----- 2 root root 4096 Apr 6 22:54 linox
4 drwxr-xr-x 2 root root 4096 Apr 6 22:54 pac
4 drwxr-xr-x 2 root root 4096 Apr 6 22:54 utils
Reinstalling coreutils also works on Apt based systems.
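To see the failure mode without touching the real binary, it can be reproduced on a private copy. This is only a sandbox sketch, not the recovery method itself: it assumes GNU stat (for the -c option) and works in a temporary directory, so nothing under /bin is modified.

```shell
#!/bin/sh
# Sketch: reproduce the problem safely on a private copy of chmod.
# Nothing under /bin is modified.
sandbox=$(mktemp -d)
cp "$(command -v chmod)" "$sandbox/chmod"

# Strip every permission bit from the copy, as in the scenario above.
chmod 000 "$sandbox/chmod"
mode_broken=$(stat -c %a "$sandbox/chmod")

# Without any execute bit the kernel refuses to run the file, even for root.
if "$sandbox/chmod" --version >/dev/null 2>&1; then
    run_broken=ok
else
    run_broken=denied
fi

# The real chmod still works here, so the copy can simply be repaired.
chmod 755 "$sandbox/chmod"
mode_fixed=$(stat -c %a "$sandbox/chmod")

echo "broken: mode=$mode_broken run=$run_broken; fixed: mode=$mode_fixed"
rm -rf "$sandbox"
```

In the real incident the last step is impossible because the only chmod on the system is the broken one, which is why reinstalling the package is needed.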
7) What is the difference between tar and zip?
Linux sysadmins sometimes need to store data safely, and compressing it is recommended for this. Linux offers several commands for archiving and compression, so a frequently asked question is why you should use one command instead of another, for example tar instead of zip. To answer this, you need to know the difference between the two.
tar is only an archiver, whereas zip is both an archiver and a compressor. tar relies on external tools such as gzip and bzip2 to achieve compression. With tar, we preserve the metadata of files and directories, such as the setuid, setgid and sticky bits, which is very important for critical data, while zip does not preserve all of this information by default. Another advantage of tar is that it assembles all the files into a single stream that is then compressed as a whole, while zip compresses file by file.
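The split between archiving and compression can be seen in a short sketch. It assumes GNU tar and GNU stat; the directory and file names are hypothetical and live in a temporary directory.

```shell
#!/bin/sh
# Sketch: tar archives files together with their metadata; compression is a
# separate step, done here by gzip through the -z option.
work=$(mktemp -d)
mkdir "$work/data"
echo "hello" > "$work/data/script.sh"
chmod 750 "$work/data/script.sh"

# -c create, -z filter through gzip, -f archive file, -C change directory.
tar -czf "$work/data.tar.gz" -C "$work" data

# -t lists the archive; -v shows the stored mode, owner and timestamp.
listing=$(tar -tzvf "$work/data.tar.gz")
echo "$listing"

# -x extracts; -p asks tar to restore the recorded permissions exactly.
mkdir "$work/restore"
tar -xzpf "$work/data.tar.gz" -C "$work/restore"
restored_mode=$(stat -c %a "$work/restore/data/script.sh")
echo "restored mode: $restored_mode"
```

Swapping -z for -j would send the same archive through bzip2 instead of gzip, without changing anything about what tar records.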
8) How to check open ports on a remote server without the netcat or nmap Linux commands?
As a sysadmin, you sometimes want to check open ports on a remote server. But what can you do on a machine where you cannot install nmap or any other tool that helps to check open ports?
You can check it with Bash, using /dev/tcp or /dev/udp to open a TCP or UDP connection to the associated socket. The general form of the command is:
$ echo > /dev/tcp/$host/$port
We can add a message to display when the port is open:
$ echo > /dev/tcp/126.96.36.199/53 && echo "OPEN PORT"
OPEN PORT
$ echo > /dev/tcp/188.8.131.52/80 && echo "GOOD" || echo "NOT OPEN"
-bash: connect: Connection timed out
-bash: /dev/tcp/184.108.40.206/80: Connection timed out
NOT OPEN
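Since a filtered port can make the connection attempt hang for a long time, it may help to wrap the check in a function with a time limit. This is a sketch under assumptions: the function name port_open and the 3-second limit are arbitrary, and it relies on the timeout command from coreutils being installed.

```shell
#!/usr/bin/env bash
# Sketch of a reusable check built on Bash's /dev/tcp. The name port_open
# and the 3-second limit are illustrative choices.
port_open() {
    local host=$1 port=$2
    # Run the connection attempt in a child bash so that "timeout" (from
    # coreutils) can stop it if the remote side never answers.
    timeout 3 bash -c 'echo > "/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

if port_open 127.0.0.1 22; then
    echo "port 22 on localhost: OPEN"
else
    echo "port 22 on localhost: NOT OPEN"
fi
```

The function returns 0 only when the connection succeeded, so it can be used directly in if statements or in a loop over several ports.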
9) systemd over the init system: what do you think?
Systemd is well designed. It was conceived from the top, not just to fix bugs, but to be a correct implementation of the base system services. The name systemd may refer to all the packages, utilities and libraries around the daemon. It was designed to overcome the shortcomings of init. It is itself a background process designed to start processes in parallel, thus reducing boot time and computational overhead. It has many more features than init, whereas SysVinit was never designed to cope with the dynamic, event-based architecture of the current Linux kernel. The only reason SysVinit is still in use today is the cost of a migration.
- Systemd ships a growing number of useful, unified command-line interfaces for system settings and control (timedatectl, bootctl, hostnamectl, loginctl, machinectl, kernel-install, localectl). In Debian, they use the existing configuration files without breaking compatibility.
- Systemd makes the boot process much simpler, entirely removing the need to specify dependencies in many cases thanks to D-Bus activation, socket activation, file/inotify activation and udev integration.
- Systemd supports SELinux integration while SysV init does not.
- Systemd can handle the boot process from head to toe, without needing to use any of the existing shell scripts. Systemd extends the logging features of the system in many ways with journald, and can remain integrated with the existing rsyslog daemon. Logs are in a structured format, attributed to filename, line of code, PID and service. They include the early boot (starting from initramfs). They can be quickly filtered and programmatically accessed through an efficient interface.
- Systemd unit files, unlike SysV scripts, can usually be shipped by upstream, or at least shared with other distributions (already more than 1000 existing unit files in Fedora) without any changes, the Debian specifics being handled by systemd itself.
- Systemd can boot very quickly (in as little as about 1 second on some setups). It was not designed with speed in mind, but doing things correctly avoids all the delays currently incurred by the boot process.
- The transition plan is easy, since existing init scripts are treated as first-class services: scripts can depend (using LSB headers) on units, units can depend on scripts. More than 99% of init scripts can be used without a modification.
It is not just init. It unifies, in fewer lines of code, everything that is related to starting services and managing session groups: user login, cron jobs, network services (inetd), virtual TTY management… Having a single system to handle all of that allows us to remove a lot of cruft, and to use less memory on the system.
10) What basic measures could you take to secure an SSH connection?
Linux sysadmins frequently access servers over SSH. But are we sure the connection is really well secured? There are some additional, very simple steps that can be taken to initially harden the SSH service, such as:
- Disabling root login, which reinforces the security of the server.
- Disabling password-based logins and allowing only key-based logins, which are more secure and can be hardened further by restricting their use to certain IP addresses.
- Changing the standard port to another one, which significantly decreases random brute-force attempts from the internet.
- Forcing the service to use only version 2 of the protocol, which brings both security and feature enhancements.
- Taking a whitelist approach, where only the users that belong to a certain list can log in to the server via SSH.
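Most of the measures above map to sshd_config directives. The sketch below writes an example fragment to a temporary file and reads the values back; port 2222 and the user names alice and bob are placeholders, and setting_of is a hypothetical helper. Protocol 2 is left out because current OpenSSH releases only speak version 2 anyway.

```shell
#!/bin/sh
# Sketch: a hardened sshd_config fragment, written to a temporary file.
# Port 2222 and the users alice and bob are placeholders.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Port 2222
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
AllowUsers alice bob
EOF

# setting_of FILE KEYWORD: print the value of the first matching directive
# (sshd itself uses the first value it finds for a keyword).
setting_of() {
    awk -v key="$2" 'tolower($1) == tolower(key) { $1 = ""; sub(/^ /, ""); print; exit }' "$1"
}

for key in Port PermitRootLogin PasswordAuthentication AllowUsers; do
    echo "$key = $(setting_of "$conf" "$key")"
done
```

On a real server the directives go into /etc/ssh/sshd_config, and running sshd -t before restarting the service checks the file for syntax errors. Keep an existing session open while testing so that a mistake does not lock you out.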
11) What is LVM and is it required on Linux servers?
LVM is a logical volume manager. It is what you need when file systems must be resized: a volume can be extended and reduced using the lvextend and lvreduce commands respectively. You can think of LVM as dynamic partitions, meaning that you can create, resize and delete LVM partitions from the command line while your Linux system is running, with no need to reboot to make the kernel aware of the newly created or resized partitions. LVM also provides the following:
- A logical volume can extend over more than one disk if you have several hard disks. Volumes are not limited by the size of a single disk, but by the total aggregate size.
- You can create a (read-only) snapshot of any LV (Logical Volume). You can revert the original LV to the snapshot at a later time, or delete the snapshot if you no longer need it. This is handy for server backups, for instance (you cannot stop all your applications from writing, so you create a snapshot and back up the snapshot LV), but it can also provide a "safety net" before a critical system upgrade (clone the root partition, upgrade, revert if something went wrong).
- You can also set up writable snapshots. A snapshot allows you to freeze an existing Logical Volume in time, at any moment, even while the system is running. You can continue to use the original volume normally, but the snapshot volume appears to be an image of the original, frozen at the moment you created it. You can use this to get a consistent file system image to back up without shutting down the system. You can also use it to save the state of the system, so that you can later return to that state if you mess things up. You can even mount the snapshot volume and make changes to it without affecting the original.
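The "safety net" workflow can be sketched as follows. Everything specific here is hypothetical: the volume group vg0, the logical volume data, the snapshot name and the sizes. Because these commands need root and a real LVM setup, the sketch only prints them unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch of resizing and snapshotting with LVM. vg0, data, data_snap and
# all sizes are hypothetical. By default the commands are only printed;
# set DRY_RUN=0 (as root, on a test system) to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Grow the logical volume by 5 GiB and resize its file system too (-r).
run lvextend -r -L +5G /dev/vg0/data

# Create a 1 GiB copy-on-write snapshot of the volume before an upgrade.
run lvcreate -s -n data_snap -L 1G /dev/vg0/data

# If the upgrade went wrong, merge the snapshot back into the origin.
run lvconvert --merge /dev/vg0/data_snap

# If everything went well, drop the snapshot instead:
# run lvremove /dev/vg0/data_snap
```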
12) What is umask and how can it be helpful on a Linux server?
When a user creates a file or directory under Linux or UNIX, it is created with a default set of permissions. In most cases, the system defaults may be open or relaxed for file-sharing purposes. The user file-creation mode mask (umask) determines the permissions of newly created files, so it can be used to control the default file permissions for new files.
It acts as a set of permissions that applications cannot set on files. It is a file mode creation mask for processes and cannot be set on directories themselves. Most applications do not create files with execute permissions set, so they use a default of 666, which is then modified by the umask.
For example, suppose you have set the umask to 644, which removes the read/write bits for the owner and the read bits for the group and others. A default such as 777 in an application would then result in file permissions of 133. This would mean that you (and others) could execute the file, and others would be able to write to it.
If you want files to be readable, writable and executable by no one but the owner, use a umask like 077 to turn off those permissions for the group and others.
The default umask on Ubuntu is 0022, which means that newly created files are readable by everyone, but only writable by the owner:
# umask
0022
# touch file
# ls -l
total 3340
-rw-r--r-- 1 root root 3412144 Nov 20 17:26 coreutils-8.22-18.el7.x86_64.rpm
-rw-r--r-- 1 root root 0 Apr 7 04:00 file
# umask 133
# umask
0133
# touch new-file
# ls -l
total 3336
-rw-r--r-- 1 root root 3412144 Nov 20 17:26 coreutils-8.22-18.el7.x86_64.rpm
-rw-r--r-- 1 root root 0 Apr 7 04:00 file
-rw-r--r-- 1 root root 0 Apr 7 04:00 new-file
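The arithmetic behind these results is a bitwise "requested mode AND NOT umask". The sketch below works it out with shell arithmetic; effective_mode is a hypothetical helper, both of its arguments are octal, and the last part assumes GNU stat.

```shell
#!/bin/sh
# Sketch: the resulting mode is "requested mode AND NOT umask".
# effective_mode is a hypothetical helper; both arguments are octal.
effective_mode() {
    printf '%03o\n' $(( 0$1 & ~0$2 & 0777 ))
}

effective_mode 666 022   # prints 644
effective_mode 666 133   # prints 644
effective_mode 777 644   # prints 133
effective_mode 666 077   # prints 600

# The same rule observed on a real file. The command substitution runs in a
# subshell, so the caller's umask is left untouched.
tmpdir=$(mktemp -d)
private_mode=$(umask 077; touch "$tmpdir/private"; stat -c %a "$tmpdir/private")
echo "file created under umask 077: $private_mode"
```

This also explains why new-file in the session above still shows rw-r--r--: touch asks for 666, and a umask of 133 leaves 644.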
13) There are two commands to schedule automated tasks. Why should I use cron instead of anacron? What is the difference between the two?
When we schedule automated tasks, we can use either cron or anacron. A frequently asked question is which one should be used and what the particularities of each are.
cron and anacron are daemons that schedule the execution of recurring tasks at points in time defined by the user. The main difference between cron and anacron is that the former assumes that the system is running continuously. If your system is off when a job is scheduled, the job never gets executed.
On the other hand, anacron is 'anachronistic' and is designed for systems that are not running 24x7. To work, anacron uses time-stamped files to find out when its commands were last executed. It also maintains a file, /etc/anacrontab, just like cron does. In addition, cron itself starts anacron periodically (every hour on some distributions). The other differences are:
- anacron can only run a job once a day, but cron can run it as often as every minute, so cron has a minimum granularity of minutes while anacron has a granularity of days.
- A cron job can be scheduled by any normal user, while anacron jobs can be scheduled only by the superuser (a special user account used for system administration).
- cron expects the system to be up and running, while anacron does not expect the system to be up all the time. With anacron, if a job is scheduled while the system is down, it is executed as soon as the system is up again, so cron is ideal for servers while anacron is ideal for desktops and laptops.
- cron should be used when you want a job to be executed at a particular hour and minute, while anacron should be used when the job can be executed irrespective of the hour and minute.
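The difference in granularity shows up directly in the two file formats. In this sketch the script path /usr/local/bin/backup.sh and the job name nightly-backup are hypothetical, and schedule_fields is a helper invented to count the scheduling fields of an entry.

```shell
#!/bin/sh
# Sketch: the same daily job written for cron and for anacron.
# The script path and the job name are hypothetical.

# crontab entry: minute hour day-of-month month day-of-week command
cron_entry='30 2 * * * /usr/local/bin/backup.sh'

# anacrontab entry: period-in-days delay-in-minutes job-identifier command
anacron_entry='1 10 nightly-backup /usr/local/bin/backup.sh'

# schedule_fields ENTRY: count the fields that come before the command.
schedule_fields() {
    # set -f keeps the "*" characters from being expanded as file names;
    # the parentheses confine that setting to a subshell.
    ( set -f; set -- $1; echo $(( $# - 1 )) )
}

echo "cron:    $cron_entry"
echo "anacron: $anacron_entry"
```

The cron entry pins the job to 02:30 every day. The anacron entry only says "once every 1 day, starting 10 minutes after anacron notices the job is due", which is why it still runs after the machine has been switched off overnight.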
14) What is an inode?
File systems, in general, have two parts: the metadata, or the "data about the data", and the data itself. More precisely, metadata includes information such as the Access Control List (ACL), the date the file was modified, the file owner (uid), file permissions, file size, device ID, etc. This type of information is key to a file system; otherwise we would just have a bunch of bits on the storage media that do not mean much. Inodes store this metadata, and typically they also store information about where the data is located on the storage media.
In a file system, inodes take up roughly 1% of the total disk space, whether it is a whole storage unit (hard disk, thumb drive, etc.) or a partition on a storage unit. The inode space is used to track the files stored on the disk. Inode entries only point to the data rather than storing it. Each entry is typically 128 bytes in size (some file systems use larger inodes). Space for inodes is allocated when the file system is created and does its initial structuring. This means that in such a file system the maximum number of inodes, and hence the maximum number of files, is fixed. This brings up another interesting fact. A file system can run out of space in two ways:
- No space for adding new data is left
- All the Inodes are consumed.
To list inode numbers, use the ls -i command.
# ls -li
total 3336
57741 -rw-r--r-- 1 root root 3412144 Nov 20 17:26 coreutils-8.22-18.el7.x86_64.rpm
57725 -rw-r--r-- 1 root root 0 Apr 7 04:00 file
57736 -rw-r--r-- 1 root root 0 Apr 7 04:00 new-file
# ls -li new-file
57736 -rw-r--r-- 1 root root 0 Apr 7 04:00 new-file
# find /root -inum 57736
/root/new-file
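One way to see that the inode, not the name, identifies a file is to give the same inode a second name with a hard link. The sketch assumes GNU stat; the file names are hypothetical and live in a temporary directory.

```shell
#!/bin/sh
# Sketch: a file name is only a pointer to an inode.
tmpdir=$(mktemp -d)
echo "some data" > "$tmpdir/original"

# A hard link adds a second name for the same inode; no data is copied.
ln "$tmpdir/original" "$tmpdir/alias"

# stat -c %i prints the inode number, %h the number of names (hard links).
inode_original=$(stat -c %i "$tmpdir/original")
inode_alias=$(stat -c %i "$tmpdir/alias")
link_count=$(stat -c %h "$tmpdir/original")
echo "original=$inode_original alias=$inode_alias links=$link_count"

# find -inum locates every name that points to a given inode.
find "$tmpdir" -inum "$inode_original"
```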
15) When you get a "filesystem is full" error, but 'df' shows there is free space, what is the problem?
It is possible to have free storage space and still be unable to add any new data to the file system because all the inodes are consumed. The df -i command will show that. This may happen when a file system contains a very large number of very small files. They consume all the inodes, so although there is free space from the hard disk's point of view, from the file system's point of view no inode is available to store a new file.
A storage unit can contain numerous small files. Each file takes up one 128-byte entry in the inode structure. If the inode structure fills up before the data storage of the disk, no more files can be copied to the disk. Once inodes are freed in the structure, files can be written to the storage unit again.
# touch test-file
touch: cannot touch 'test-file': No space left on device
# df -Th
Filesystem     Type      Size  Used  Avail  Use%  Mounted on
udev           devtmpfs  3.9G  0     3.9G   0%    /dev
tmpfs          tmpfs     788M  10M   778M   2%    /run
/dev/sda6      ext4      44G   24G   18G    59%   /
/dev/sda7      ext4      103G  74G   24.0G  71%   /home
/dev/sda2      vfat      95M   29M   67M    31%   /boot/efi
# df -i
Filesystem     Inodes   IUsed   IFree    IUse%  Mounted on
udev           1002898  650     1002248  1%     /dev
tmpfs          1008079  1128    1006951  1%     /run
/dev/sda6      2875392  617635  2257757  22%    /
/dev/sda7      6815744  80342   6735402  100%   /home
You can see that /dev/sda7 still has available space, but because its inodes are all used, we cannot create any file on it.
There are many questions that Linux sysadmins can face, and this guide helps to answer them. You may meet them during an interview or when troubleshooting. To answer them well, you need to understand the mechanisms explained in this article. Feel free to add a comment below if you know another question that is frequently asked.
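When this happens, the next step is usually to find out which directories hold the huge number of small files. The following is a rough sketch: the function names inode_usage and busiest_dirs are made up, and walking a large tree this way can be slow.

```shell
#!/bin/sh
# Sketch: report inode usage for a mount point and find the directories
# holding the most entries. Function names are illustrative.

# inode_usage PATH: print the IUse% column of "df -i" for PATH.
inode_usage() {
    df -iP "$1" | awk 'NR == 2 { print $5 }'
}

# busiest_dirs PATH: count the entries directly inside each directory below
# PATH (same file system only) and print the five largest counts.
busiest_dirs() {
    find "$1" -xdev -type d 2>/dev/null |
    while IFS= read -r dir; do
        printf '%s %s\n' "$(ls -A "$dir" 2>/dev/null | wc -l)" "$dir"
    done | sort -rn | head -n 5
}

echo "inode usage on /: $(inode_usage /)"
```

Calling busiest_dirs /home, for example, would list the five directories under /home with the most entries, which are the usual candidates for cleanup.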