Monday, December 28, 2015

Digging more into NoSQL- Explaining Redis

In this era of Big Data, the first thing we need is an in-memory data structure. In this post, I will elaborate a useful in-memory store, called Redis.
As they say If you can map a use case to Redis and discover you aren't at risk of running out of RAM by using Redis there is a good chance you should probably use Redis.

What is Redis?
A simplest answer can be, its only a data structure server. It can be differentiated from MongoDB, which is a disk-based document store. Although MongoDB can be used as a key-value store too.

Those two additions may seem pretty minor, but they are what make Redis pretty incredible.Persistence to disk means you can use Redis as a real database instead of just a volatile cache. The data won't disappear when you restart, like with memcached.

The additional data types are probably even more important. Key values can be simple strings, like one will find in memcached, but they can also be more complex types like Hashes, Lists (ordered collection, makes a great queue), Sets (unordered collection of non-repeating values), or Sorted Sets (ordered/ranked collection of non-repeating values).

The entire data set in Redis, is stored in-memory so it is extremely fast, often even faster than memcached. Redis had virtual memory, where rarely used values would be swapped out to disk, so only the keys had to fit into memory, but this has been deprecated. Going forward the use cases for Redis are those where its possible (and desirable) for the entire data set to fit into memory.

Redis becomes fantastic choice if you want a highly scalable data store shared by multiple processes, multiple applications, or multiple servers. As just an inter-process communication mechanism it is tough to beat. The fact that you can communicate cross-platform, cross-server, or cross-application just as easily makes it a pretty great choice for many many use cases. Its speed also makes it great as a caching layer.

For more info :

Wednesday, December 23, 2015

How a process is Deamon-ized in Linux

A Linux process works either in foreground or background.

A process running in foreground can interact with the user in front of the terminal. To run a.out in foreground we execute as shown below.
When a process runs in the foreground, it can interact with the users in front of the terminal. To run, we execute the command a.out.
 However, for a background process, it runs without any interaction of any user’s interaction. But obviously a user can check its current status, though he doesn’t know or might be doesn’t need to know what it is doing. 
The command is similar to the other, with some changes:
$ ./a.out  &  
[1] 8756
As shown above when we run a process with '&' at the end, then the process runs in background and returns the process id (8756 in above example).

Now back to the current topic.

What is actually a DAEMON Process?
A 'daemon' process is a process that runs in the background, begins execution at startup 
(not necessarily), runs forever, usually do not die or get restarted, waits for requests to arrive and respond to them and frequently spawn other processes to handle these requests.

So running a process in BACKGROUND with a while loop logic in code to loop forever makes a Daemon ? Yes and also No. But there are certain things to be considered when we create a daemon process. 

Let's follow a step-by-step procedure to create a daemon process.

1. Create a separate child process - fork() it.

fork() system call create a copy of our process(child), then let the parent process exit. Once the parent process exits the Orphaned child process will become the child of init process (this is the initial system process, in other words the parent of all processes). As a result our process will be completely detached from its parent and start operating in background.

pid=fork();    if (pid<0) exit(1); /* fork error */    if (pid>0) exit(0); /* parent exits */    /* child (daemon) continues */  

2. Make child process In-dependent - setsid()

Before we move to check how we are going to make a child process independent, let  talk a bit about Process group and Session ID.

A process group denotes a collection of one or more processes. Process groups are used to control the distribution of signals. A signal directed to a process group is delivered individually to all of the processes that are members of the group. 

Process groups are themselves grouped into sessions. Process groups are not permitted to migrate from one session to another, and a process may only create new process groups belonging to the same session as it itself belongs to. Processes are not permitted to join process groups that are not in the same session as they themselves are.

New process images created by a call to a function of the exec family and fork() inherit the process group membership and the session membership of the parent process image.

A process receives signals from the terminal that it is connected to, and each process inherits its parent's controlling tty. A daemon process should not receive signals from the process that started it, so it must detach itself from its controlling tty.

In Unix , processes operates within a process group, so that all processes within a group is treated as a single entity. Process group or session is also inherited. A daemon process should operate independently from other processes.

setsid() system call is used to create a new session containing a single (new) process group, with the current process as both the session leader and the process group leader of that single process group. 

3. Change current Running Directory - chdir()

A daemon process should run in a known directory. There are many advantages, in fact the opposite has many disadvantages: suppose that our daemon process is started in a user's home directory, it will not be able to find some input and output files. If the home directory is a mounted filesystem then it will even create many issues if the filesystem is accidentally un-mounted.

The root "/" directory may not be appropriate for every server, it should be chosen carefully depending on the type of the server.

4. Close Inherited Descriptors and Standard I/O Descriptors

A child process inherits default standard I/O descriptors and opened file descriptors from a parent process, this may cause the use of resources un-neccessarily. Unnecessary file descriptors should be closed before fork() system call (so that they are not inherited) or close all open descriptors as soon as the child process starts running as shown below.

for ( i=getdtablesize(); i>=0; --i)    
close(i); /* close all descriptors */  
There are three standard I/O descriptors: 
  1. standard input 'stdin' (0),
  2. standard output 'stdout' (1),
  3. standard error 'stderr' (2).

For safety, these descriptors should be opened and connected to a harmless I/O device (such as /dev/null).

int fd; 
fd = open("/dev/null",O_RDWR, 0); 
   if (fd != -1)     
 dup2 (fd, STDIN_FILENO); 
dup2 (fd, STDOUT_FILENO);    
dup2 (fd, STDERR_FILENO);       
 if (fd > 2)   
 close (fd); 

5. Reset File Creation Mask - umask()

Most Daemon processes runs as super-user, for security reasons they should protect files that they create. Setting user mask will prevent unsecure file priviliges that may occur on file creation.
view plainprint?
This will restrict file creation mode to 750 (complement of 027).