PHP IPC - Interprocess Communication in PHP


The purpose of this article is to familiarize PHP developers with the capabilities of interprocess communication in this language. Note does not intend to tell in every detail about each of the features, implementation details or show working code examples.

Since any programmer sooner or later has the task of parallelization, this note was conceived as a starting point from which you can begin your journey into the world of an exciting hemorrhoids process of building such systems.



Also, the topic of multithreading in PHP will be affected, moreover, precisely threads ( Thread ), since until recently, PHP allowed (conveniently) to implement any parallelization only thanks to Fork (we will not touch on perversions like curl_multi_ *, popen \ fsockopen, Apache MPM, etc.). The topic will be considered only in the context of IPC, I leave the search for details of the implementation of a particular approach to readers.

The narration will be conducted in the context of software running on a single computer, and IPC within a single computer. This is due to the fact that interprocess communication in distributed systems is a very, very extensive topic. Therefore, all kinds of message brokers, databases, Pub \ Sub, and other "intercomputer" approaches will not be considered, moreover, they are covered on other resources on the Web.

In view of all of the above, some preparation is required from the reader, since the terminology will be explained only at key points, however, the text is plentifully provided with links to necessary articles of PHP documentation, Wikipedia, etc., as well as the understanding that many things are on purpose simplified due to the format of this material.

What does IPC look like?


So what is interprocess communication?

Межпроцессное взаимодействие — (англ. Inter-Process Communication, IPC ) — набор способов обмена данными между множеством потоков в одном или более процессах. Процессы могут быть запущены на одном или более компьютерах, связанных между собой сетью. IPC-способы делятся на методы обмена сообщениями, синхронизации, разделяемой памяти и удаленных вызовов (RPC). Методы IPC зависят от пропускной способности и задержки взаимодействия между потоками и типа передаваемых данных.

Describe like IPC looks like!


The outline is as follows:
0. PCNTL
1. Sockets
2. Shared Memory
3. Semaphore, Shared Memory and IPC
4. pthreads

0. PCNTL


The extension implements the most basic functionality for working with processes, but we are interested in working with UNIX signals , and more specifically, the pcntl_signal function , which allows you to install a signal handler. This approach is the least functional of all considered, since it does not allow data transfer. Using this extension, you can, for example, organize the start / stop of workers, or read tasks from the buffer (file, database, memory, etc.), or signal one part of the system to another about an event.

The most easily implemented, there are many examples, and the possibilities in the application, it can often be more than enough for some kind of not very difficult tasks.

1. Sockets

Сокеты — (англ. socket — разъём) — название программного интерфейса для обеспечения обмена данными между процессами. Процессы при таком обмене могут исполняться как на одной ЭВМ, так и на различных ЭВМ, связанных между собой сетью. Сокет — абстрактный объект, представляющий конечную точку соединения.

Следует различать клиентские и серверные сокеты. Клиентские сокеты грубо можно сравнить с оконечными аппаратами телефонной сети, а серверные — с коммутаторами. Клиентское приложение (например, браузер) использует только клиентские сокеты, а серверное (например, веб-сервер, которому браузер посылает запросы) — как клиентские, так и серверные сокеты.


Perhaps this is the most obvious and best-known way to implement IPC, however, and the most time-consuming. The first option is to create a broker (socket server), and client threads to connect to it. Here you will find the fascinating world of debugging non-blocking input / output (and how would you like to write blocking code?), As well as the implementation of many trivial things like wrappers over extension functions. The second option is simpler, it can be used for simpler implementations: create_socket_pair , which creates a pair of connected sockets, an example is available by reference.

Using sockets to implement IPC requires a rather serious approach and lighting up manuals, but the pluses include the ability to distribute system elements to different servers in the future, without resorting to significant code changes. Also, the advantage of this technology is its versatility: for example, writing a client in PHP connecting to a C-sh server is not difficult.

Also, the minuses should be canceled: the non-blocking IO mentioned above. Since the data will be received in batches, you should think carefully about the mechanism for ensuring their integrity, buffering and processing, which would not reduce all the advantages of non-blocking input / output.

2. Shared Memory

This extension allows you to fully work with virtual memory . The advantages of the approach are that it is the fastest (if speed is put at the forefront of the application) and the least resource-intensive. Moreover, its implementation is not associated with as many pitfalls as in the previous solution, and the technology itself is not difficult to digest.

There are many options for use: both the common space and the allocation of blocks individually for each thread / process, data processing is also simplified due to the clear definition of the block size. The disadvantages include some complexity in the convenient implementation of such an interaction: you have to forward the addresses of blocks to child processes (in the form of parameters, at startup pcntl_fork , using marker files etc.)
This approach is perhaps the most common and preferred, since it does not require a lot of labor in the implementation, and is more universal.

3. Semaphore, Shared Memory and IPC

This extension includes the capabilities of the previous one, however, it adds such basic resources for synchronizing resources as semaphores, another way of interacting flows, known as messaging.

Semaphores can come in handy when threads will be forced to work with some kind of shared resource, say, you wrote a firewall that, with each request, gets into a file with the IP addresses of Roskomnadzor and does street magic with the incoming request. The file, of course, is updated by some other service flow, therefore, its reading (or change) is unacceptable while the update process is underway, by someone else. The theory of semaphore work is simple, and there are many examples of their implementation, therefore, for those who have not yet worked with this type of locks, I recommend that you familiarize yourself with this, it will help to better understand the processes of building interaction between threads.

Messaging is a more “high-level” and more convenient solution than shared memory, but this topic is poorly covered in the context of PHP. In addition, I know of cases where this technology showed some oddities, let's say, in its work, therefore, you should carefully check and double-check the results of the code.

4. Pthreads

And so we have reached segfault's pinnacle of evolution for both IPC and multithreading in PHP.

A cool guy named Joe Watkins wrote the pthread extension , which adds support for true multithreading in PHP. Just the other day (September 8, 2013) the first stable version (0.0.45) was released, however, the author in his post on Reddit very thoroughly disclosed the topic of beta \ stable releases, therefore, do not focus on this. I strongly recommend that you study all of his comments in the subject, there is a lot of useful information about pthread.

What are the advantages? In everything. Pthreads provides an extremely simple and convenient API for implementing any of your multi-threaded fantasies. Here you and synchronize both in Java, and events, and IPC with probros objects! However, things are not so smooth with them (see examples on the github), and the author writes that this problem is not his business, however, he managed to create a miracle with socket resources, and now, socket_accept results from the main thread you can stick it in the daughter - amazing! Enough to parse the examples to understand how simple and elegantly done.

I will not describe all the features and delights of this extension, everything is on the author’s github and in the documentation on php.net
Apparently, the author is quite intensively working on his project, therefore, in the future he may have many more interesting features, stay tuned.

To run the extension, you need to build PHP in Thread-safe mode, here is a small script that will do everything for you:

Finish with a file if necessary.
mkdir /opt/php-ts && \
cd /opt/php-ts && \
wget http://www.php.net/get/php-5.5.3.tar.bz2/from/ua1.php.net/mirror -O php-5.5.3.tar.bz2 && \
tar -xf php-5.5.3.tar.bz2 && \
cd php-5.5.3/ext && \
git clone https://github.com/krakjoe/pthreads.git && \
cd ../ && \
./buildconf --force && \
./configure --disable-all --enable-pthreads --enable-maintainer-zts && \
make && \
TEST_PHP_EXECUTABLE=sapi/cli/php sapi/cli/php run-tests.php ext/pthreads  && \
alias php-ts="/opt/php-ts/php-5.5.3/sapi/cli/php"

Does he look like a Pipe?


That’s probably all. Although the language is limited in its IPC capabilities, nevertheless, it allows you to write efficient applications using various approaches for implementing interprocess communication. For those of you who are now faced with the task of implementing such interaction, I recommend that you carefully study all the methods listed in the note, since they are not interchangeable, but they complement each other effectively.

P.S. Непосредственно к теме статьи это не относится, зато очень даже применимо к некоторым описанным здесь моментам, а именно, блокировании IO и несовершенстве событийной модели: рекомендую ознакомиться с расширениями Eio и Ev (автор обеих osmanov )