Learning deep packet inspection from RETN

News: the backbone provider RETN, despite being a backbone, filters traffic with DPI. Because RETN is a backbone operator and, in particular, carries foreign traffic, many providers get censorship on the way out, including those who wanted nothing to do with all sorts of "forbidden lists" but have RETN among their uplinks.

DPI is the collective name for technologies in which the equipment “digs” into the traffic, that is, it reacts not only to packet headers at the various layers but also to the content.

To rule out interference, the test was run from several cities and through several providers, which eliminated filtering by a local provider as a factor (a second, indirect check used TTL scanning, which always pointed to the RETN segment).

Let's see exactly how RETN implements DPI, zealously enforcing laws to protect drugs from suicidal pornographic children who violate copyright laws.

We take as a basis the well-known journal Stervozinka (widely known only for being blocked, blocked over some absurdity, blocked for a long time, and showing no sign of getting out of the ban).

The address of this post is on the list of those prohibited for viewing by citizens of the Russian Federation. For that reason I have irreversibly mangled the journal's domain name, so that there is no reliable and unambiguous algorithm for inverting the resulting hash function.

Let's look at the everyday symptoms of the problem:
In a neighboring console, run tcpdump host stervozzinka.dreamwidth.og

 17:17:14.376828 IP local.49510 > dreamwidth.og.http: Flags [P.], seq 1:136, ack 1, win 115, options [nop,nop,TS val 11199749 ecr 1627034663], length 135
 17:17:17.924801 IP local.49510 > dreamwidth.og.http: Flags [P.], seq 1:136, ack 1, win 115, options [nop,nop,TS val 11200636 ecr 1627034663], length 135
 17:17:18.068805 IP local.49509 > dreamwidth.og.http: Flags [P.], seq 1:136, ack 2, win 115, options [nop,nop,TS val 11200672 ecr 1627029045], length 135


The repeated seq is the sign of a retransmitted segment. We cannot yet tell where the blocking happens (on the response coming back or on the request going out). But we can see for sure that something is being blocked, because TCP segments are not retransmitted for no reason.
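Spotting repeats by eye gets tedious on a longer capture. Here is a minimal sketch of an automatic check, assuming tcpdump's default text output with one packet per line; the function name find_retransmits is made up for illustration and is not part of tcpdump:

```shell
# Count how often each (flow, seq range) pair occurs in tcpdump text output.
# A count above 1 means the same segment left the host more than once.
find_retransmits() {
  awk '
    {
      flow = ""; seq = ""
      for (i = 1; i <= NF; i++) {
        # "src > dst:" gives the flow; drop the trailing colon after dst
        if ($i == ">" && i > 1 && i < NF) {
          dst = $(i + 1); sub(/:$/, "", dst)
          flow = $(i - 1) " > " dst
        }
        # "seq 1:136," gives the byte range; drop the trailing comma
        if ($i == "seq" && i < NF) {
          seq = $(i + 1); sub(/,$/, "", seq)
        }
      }
      if (flow != "" && seq != "") count[flow " seq " seq]++
    }
    END {
      for (k in count) if (count[k] > 1) print count[k], k
    }
  '
}
```

It can be fed from a saved capture, for example tcpdump -n -r capture.pcap | find_retransmits, where capture.pcap is a hypothetical file name. Every line it prints is a segment that was sent more than once.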

Let's switch from the masterful wget to something simpler, so that we control precisely what is being sent:
This does not advance us in any way, but it gives us some freedom to experiment with the headers. This request is blocked as well.

But with some variations (which violate the RFC, yet are handled normally by the Varnish on dreamwidth's side), certain patterns emerge:

  • GET  /15580.html HTTP/1.1\nHost: stervozzinka.dreamwidth.og\n (two spaces after GET) - passes
  • GET /15580.html  HTTP/1.1\nHost: stervozzinka.dreamwidth.og\n (two spaces before HTTP/1.1) - does not pass
  • get /15580.html HTTP/1.1\nHost: stervozzinka.dreamwidth.og\n (get in lowercase) - passes
  • GET /15580.html HTTP/1.1\nIgnore:me\nHost: stervozzinka.dreamwidth.og\n (extra header between GET and Host) - does not pass
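For anyone who wants to repeat the series, the variations can be generated rather than typed by hand. This is only a sketch: build_request, probe and the variant labels are names invented here, the requests use bare \n exactly as in the list, and nc behaves differently between netcat variants once stdin is closed, so treat the verdicts with care:

```shell
HOST=stervozzinka.dreamwidth.og   # mangled name from the post, not a real host
PAGE=/15580.html

# Print one request variant by label. printf expands \n in its format string.
build_request() {
  case "$1" in
    baseline)     printf 'GET %s HTTP/1.1\nHost: %s\n\n' "$PAGE" "$HOST" ;;
    space_get)    printf 'GET  %s HTTP/1.1\nHost: %s\n\n' "$PAGE" "$HOST" ;;
    space_http)   printf 'GET %s  HTTP/1.1\nHost: %s\n\n' "$PAGE" "$HOST" ;;
    lower_get)    printf 'get %s HTTP/1.1\nHost: %s\n\n' "$PAGE" "$HOST" ;;
    extra_header) printf 'GET %s HTTP/1.1\nIgnore:me\nHost: %s\n\n' "$PAGE" "$HOST" ;;
    *)            return 1 ;;
  esac
}

# Send one variant and report whether any bytes came back within 10 seconds.
probe() {
  if build_request "$1" | nc -w 10 "$HOST" 80 | grep -q .; then
    echo "$1: passes"
  else
    echo "$1: no answer"
  fi
}
```

Running probe over the labels baseline, space_get, space_http, lower_get and extra_header in a loop, with tcpdump open alongside, reproduces the series above.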


The preliminary conclusion: dull, primitive exact matching. If so, how does it tell which part of the packet is relevant content and which is not?

So ...
echo -e "GET /15580.html\n\nHost: stervozzinka.dreamwidth.og\n" | nc stervozzinka.dreamwidth.og 80 - does not pass.

For those who missed it: I put two line feeds after GET, so Host now belongs to the body, not to the headers. I also removed HTTP/1.1, which makes this an old-style request without a version (HTTP/0.9, the Simple-Request of RFC 1945), and such a request has no Host header at all. In other words, we asked the server for /15580.html without specifying a hostname.

Note that a request without the hostname goes through:
In other words, we see that the DPI checks something completely out of place: the presence of Host in the BODY. As a result, requests that have nothing to do with the blocked site get dropped.

Let's complicate the experiment:
echo -e "POST /\n\nAnd you know that they ban on the content? For example:
Oh dear. We were barred from sending a POST with innocent content. That POST never reached the server. Impossible?

Let's check by sending the POST in a more civilized way:
Our assumption is that the filter needs both lines in one packet and does not check whether the request is valid:
Passes. This means the packet must contain both lines. (Yes: if we send the server a request in which Host: travels in a different packet, we may well be able to break through the censorship.)
Another check: do they look at the port number?
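The idea in the parentheses was not tested in the post, but it can be sketched. The helper below (split_request is an invented name) writes the request line and the Host line as two separate writes with a pause between them, which gives the kernel a chance to put them into different TCP segments. That is best effort only: nothing here guarantees how the data is segmented, so check with tcpdump what actually went out.

```shell
HOST=stervozzinka.dreamwidth.og   # mangled name from the post, not a real host
PAGE=/15580.html

# Emit the request in two writes separated by a pause (default 1 second).
# The pause is what makes separate TCP segments likely, not certain.
split_request() {
  printf 'GET %s HTTP/1.1\n' "$PAGE"
  sleep "${1:-1}"
  printf 'Host: %s\n\n' "$HOST"
}
```

Used as split_request | nc -w 10 "$HOST" 80. If the filter really looks at single packets only, neither segment contains both lines.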
(empty answer)
(timeout)

No. Traffic to port 443 is filtered just as successfully (ordinary requests pass, the “forbidden” one is dropped).

One more check: do they filter by IP? We find a neighboring open IP (from the same segment) that answers on port 80. It passes.

Summary



Conditions for a packet to be dropped:
  • On any TCP port (UDP was not checked)
  • With any flags
  • When the packet actually contains (in any order) the lines
    • GET /15580.html
    • Host: stervozzinka.dreamwidth.og
  • Src_IP matches an entry in the ban list


So this looks more like a packet filter that searches passing packets for signatures, without regexps, than like real DPI.
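The observed behaviour fits in a few lines. The sketch below is a model of what the experiments suggest, not RETN's actual implementation: would_drop is an invented name, it looks at one packet payload at a time, and it leaves out the IP condition from the summary.

```shell
SIG_PATH='GET /15580.html'
SIG_HOST='Host: stervozzinka.dreamwidth.og'

# Return 0 if a single packet payload would be dropped under this model:
# both fixed strings present anywhere in it, in any order, case-sensitive.
would_drop() {
  case "$1" in
    *"$SIG_PATH"*) ;;          # first signature found, keep checking
    *) return 1 ;;
  esac
  case "$1" in
    *"$SIG_HOST"*) return 0 ;; # both signatures found
    *) return 1 ;;
  esac
}
```

Under this model the request with two spaces after GET and the lowercase get pass, because the first signature no longer matches byte for byte, while a POST that merely quotes both lines in its body is dropped.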