DEFCON 22 conference. “Mass Scanning of the Internet through Open Ports”. Robert Graham, Paul McMillan, Dan Tentler

My name is Rob Graham, and I head Errata Security, an Internet consulting company. Today we’ll talk about how to scan the entire Internet and why you would want to. Until recently there were few tools for this task, so we built our own. The Internet is fairly small: it has only about 4 billion addresses.

Scanning the Internet is quite simple: you sit down at the computer, open a command-line console and enter the subnet address. Then you watch the screen fill with data as the lines scroll past. The result is a list of open ports on devices with different IP addresses.

Why scan the Internet from a defensive point of view? If you care about security, you need to scan in order to answer questions like these:

  • how many systems are affected by the Heartbleed vulnerability (a bug in cryptographic software that lets an attacker read client or server memory and obtain the server’s private encryption key)?
  • how many systems can be abused to amplify attacks through NTP servers?
  • how many systems are at risk because of the vulnerability in D-Link routers?
  • which SSL certificates are in use?

Existing tools for finding vulnerabilities in specific networks and equipment are rather slow, whereas mass scanning quickly gives you a vulnerability profile of more than 100,000 devices. One important problem is identifying the equipment that can be abused through NTP servers in DDoS attacks. A lot of home equipment is vulnerable because D-Link routers lack strong protection; just look at D-Link devices on the network to see how many botnets exploit their vulnerability. Scanning SSL certificates is also useful, because it reveals outdated certificates that are prone to errors and vulnerabilities. So scanning everything you can reach is an important task.

Internet scanning is also useful for prevention. It helps to map the Deepnet: the many pages that are invisible to search engines. These pages are generated at users' request and may carry malicious content.

Try scanning random ports with the mass scanner's "--banners" option, and within a few minutes you will find something that is easy to hack.

In fact, scanning the Internet is useful because:

  • it's fun;
  • it's informative (running a scan shows you how small the Internet is; there are only about 65 thousand ports);
  • it will make you famous:
    - pick a target, for example a Siemens control system;
    - scan the Internet for it;
    - give a talk about it at the BlackHat computer security conference;
    - enjoy your new status as an expert.

What do you need to know in order to scan the Internet? First, some theory about the physical infrastructure:

  • data packets have a minimum size:
    - Ethernet accounts for 44 bytes per packet;
    - a TCP SYN packet is 40 bytes.
  • maximum rates on 1 Gbit/s Ethernet:
    - 476 Mbit/s for real traffic;
    - 524 Mbit/s for the Ethernet layer;
    - 1,488,000 packets per second.
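
The three rates in the list above fit together if each probe occupies 40 + 44 = 84 bytes on the wire. The sketch below checks that arithmetic. Reading the 44 bytes as per-packet Ethernet overhead is an assumption made here, and the function name is illustrative only.

```python
LINK_BPS = 1_000_000_000   # 1 Gbit/s Ethernet
SYN_BYTES = 40             # TCP SYN packet size quoted above
ETHERNET_BYTES = 44        # per-packet Ethernet share (assumed to be overhead)


def max_packets_per_second(link_bps, bytes_on_wire):
    """Upper bound on packets per second for a given size on the wire."""
    return link_bps / (bytes_on_wire * 8)


pps = max_packets_per_second(LINK_BPS, SYN_BYTES + ETHERNET_BYTES)
traffic_mbps = pps * SYN_BYTES * 8 / 1e6        # share carrying the SYN packets
ethernet_mbps = pps * ETHERNET_BYTES * 8 / 1e6  # share taken by Ethernet

print(f"{pps:,.0f} packets per second")
print(f"{traffic_mbps:.0f} Mbit/s of real traffic")
print(f"{ethernet_mbps:.0f} Mbit/s for the Ethernet layer")
```

Under that assumption the two shares add up to the full gigabit, which is why a link saturated with minimum-size packets carries less than half of its nominal capacity as useful data.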

This means we overpay the provider, because it charges for guaranteed bandwidth rather than for the amount of data actually carried. The cause is per-packet overhead: if we transmit 22 or 33 bytes, they are still packed into a packet of 40 or 44 bytes. A user can almost never reach the full 1 Gbit/s capacity of the link, because in reality it carries no more than 524 Mbit/s. The remainder is padding that is never used for data, yet we pay for it. Even with a perfectly tuned switch we still cannot use the full bandwidth of the network, and I don’t know why this happens. Billing for Internet service is confusing as a result.

Here is how Internet providers bill for traffic:

  • some bill for the maximum connection speed of 1 Gbit/s;
  • some measure the bandwidth actually used, which comes to about 600 Mbit/s;
  • some do not see small packets, so they count only incoming and not outgoing traffic. For example, we send out a ton of data and pay only for the few megabytes we downloaded;
  • some do not measure traffic volume at all, and these are of particular interest to us!

For example, in Germany the CCC (Chaos Computer Club) gives users a 100 Gbit/s connection. I have not been able to test that network, but maybe this year I will bring my 10-gigabit Ethernet card and check whether it really delivers. The problem is that by sending very small packets we violate the existing agreements between peers.

Let us look further at the physical infrastructure of the network.

Virtual private networks (VPNs) can adapt to a load of small packets. Ethernet struggles with small packets, and rates above 500 kpps (thousand packets per second) are often hard to reach. Even if your switch can work at such a rate, that does not mean the other elements of the infrastructure can sustain it. In this case it can help to disable Flow Control, the mechanism by which the transmitter slows down when the receiver is not ready to accept data.

In some cases packets are lost: transmitting at 500 kpps does not guarantee that every packet reaches the Internet. Scanning lets you identify the ports on which packet loss occurs. Use only ports where the transmit and receive counts match: if you send 10 thousand packets, 10 thousand packets should be received. For this reason I mostly scan at rates of up to 150 kpps, and sometimes as low as 15 kpps, which lets me stop worrying about packet loss.
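
To get a feel for what these rates mean in practice, here is a rough estimate of how long one pass over the whole IPv4 address space takes. It is a sketch under stated assumptions: the rates quoted above are read as thousands of packets per second, one probe is sent per address and port, and exclusions and lost packets are ignored. The function name is illustrative.

```python
ADDRESS_SPACE = 2 ** 32  # all IPv4 addresses, the "about 4 billion" from the talk


def scan_hours(packets_per_second, ports=1):
    """Hours needed to send one probe to every address for each port."""
    return ADDRESS_SPACE * ports / packets_per_second / 3600


for rate in (15_000, 150_000, 1_488_000):
    print(f"{rate:>9,} packets/s: {scan_hours(rate):6.1f} hours per port")
```

At 150 thousand packets per second a single-port scan takes roughly eight hours; at the theoretical gigabit maximum it fits in under an hour.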

Abuse complaints are a big problem. The term means that someone has flagged you as a source of spam or other malicious activity. This often happens to companies when a recipient no longer wants their email but cannot unsubscribe, because the company gave no unsubscribe link. The recipient marks the mail as spam, and that damages the sender's overall reputation. The same can happen when you scan a network: you receive abuse complaints, and your ISP will be very upset about it. Or you violate a peering agreement and are barred from acting as a peer. However, there are much worse things:

  • a Heartbleed scan will still be generating abuse complaints weeks later, and your reputation takes the hit;
  • an HTTP scan can land you on a fail2ban ban list, that is, your IP address gets blocked;
  • tripping Snort threat-detection rules can also produce many abuse complaints.

The usual network monitoring methodology tracks incoming traffic. If you run a scan, your incoming traffic will be large and you will come under suspicion. The belief is that hackers can be tracked down this way, although it is like looking for lost keys under a street lamp rather than in the bushes where you dropped them, just because the light is better there.

What should you take seriously when dealing with Internet service providers? Some networks resort to blackholing (“routing to nowhere”, where packets are dropped as if there were “No route to host”) for an entire autonomous system (AS).

An exclusion list is necessary when scanning, because we do not want to scan other people's mailboxes or private network segments. You create the exclusion list with scan parameters on the command line:

exclude =
exclude-file = exclude.ips
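
What an exclusion list does can be sketched in a few lines. This is not masscan's implementation. The file format is an assumption (one address or CIDR range per line, with "#" comments), and the addresses are reserved documentation ranges rather than real targets.

```python
import ipaddress


def load_exclusions(lines):
    """Parse exclusion entries, skipping blank lines and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # A bare address becomes a one-address network (/32).
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_excluded(ip, networks):
    """True if the address falls inside any excluded range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


# Hypothetical contents of an exclusion file such as exclude.ips.
sample = [
    "# ranges whose owners asked not to be scanned",
    "192.0.2.0/24",
    "198.51.100.17   # a single host",
    "",
]
exclusions = load_exclusions(sample)
print(is_excluded("192.0.2.55", exclusions))
print(is_excluded("203.0.113.9", exclusions))
```

A scanner would run a check like this before sending each probe, so excluded ranges never receive a single packet.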

It is important to make the exclusion list public. We would really like to maintain a public list of security experts, but most of those who contact us about the program ask to be removed from it. They are afraid that someone will learn their IP or corporate network addresses and try to break in. Fortunately, BGP already makes all this information public, in a rather elegant format that anyone can access. People should understand that scanning the Internet only benefits them and does not touch any personal information they want to keep private. Unfortunately, we have to prove this to most of them, because they confuse scanning with hacking. In any case, people find it hard to believe that you can scan the entire Internet.

For example, a company has a network that we scan at their request, but they also have a subnet where important information is stored. When they see the scan in progress, they get scared and say: “You are even scanning the hidden networks, you can see the ranges of ports and addresses, so you can hack us!”

An interesting thing happened six months ago. I was scanning a network for a customer, and he was woken in the night by a call about an emergency meeting convened because of a network break-in. He called me, and I had to reassure him and explain that the scan had nothing to do with the attack; the hackers had simply found the vulnerabilities before we did. Customers often believe that as soon as they give us permission to scan, security holes open up and hackers immediately pour in.

Another case involved a man from Australia. He noticed that during a scan we had sent him a connection request in the form of a single SYN packet, and he called to ask who we were and on what grounds we were doing this. I explained everything, gave him the address of our site where all the rules are posted, and said that we do this entirely legally, on customers' orders. He would not listen and threatened to call the Internet police at once and have us all arrested. He was just a crank who did not understand that if we had been scanning illegally we would all have been caught within the hour, because we act completely in the open.

Complainants like these are often just stupid. They do not understand that the vast majority of what happens on the network, all the ports, routers, switches and sessions, is constantly open and not protected by any encryption. The Internet could not work at all if every action required permission. If a person is afraid that his bank card details might be stolen, he had better not use the Internet at all. Meanwhile, people cannot be bothered to configure their devices and close the existing holes. They leave them open to everyone and are then surprised to have become prey for hackers. I want to show you a letter we received: “The infrastructure of the Woori financial group is classified as ‘equipment of a Class A national security facility’, and unauthorized access to this equipment is prohibited by applicable laws and regulations.” This company is located in Korea, and we first learned of its existence from this letter, because they not only filed a complaint against us but also explained their position in writing. It was the first time I had met a whole organization that wants to be on the Internet and at the same time keep all its equipment secret. Why go on the Internet at all, when that cannot be done with every port closed?

An important part of our work is close cooperation with Internet providers. We have to be on good terms with them, otherwise we cannot scan effectively. We offer them free Internet security consulting, and they help us sort through the complaints received. That is, the provider understands who complained about us and why, and rejects unfounded accusations. Together with them we maintain SWIP (“who is who on the Internet”) records with a list of verified IP addresses, and we add to our “black list” those who insist that we be banned for scanning.

To avoid some of this trouble, you can instead use an anonymous virtual private server (VPS). It has the following advantages:

  • you can pay the VPS provider a small amount in bitcoins;
  • you can scan without worrying about complaints, because you simply close the account once the study is done; for example, a VPS at Linode lets you delete your account right after paying $50;
  • plenty of such providers are friendly to spammers and scammers operating from virtual servers.

What does masscan look like?

It is similar to the nmap utility, which scans IP networks of any size and determines the state of ports and the services behind them:

  • it parses all nmap options, and for those it does not implement it reports: “this nmap option is not supported”;
  • its output formats are close to nmap's, which is useful when working with other tools;
  • it supports many nmap features, such as SCTP scanning and UDP payloads in the nmap format.

But masscan is not like nmap:

  • it works port-at-a-time instead of host-at-a-time. Each port is reported as soon as it is detected, and the results are not grouped by host. The program does not send a request and then sit waiting for the response, and it does not need to keep a billion requests and a billion responses in memory, so it runs faster;
  • it works asynchronously: requests go out in one stream and responses are handled in another, independently of each other;
  • it scans 1000 times faster.
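
The difference between the two reporting modes can be illustrated with a toy example. This is only a sketch of the idea, not code from masscan or nmap: the responses are made-up values from documentation address ranges, and both function names are hypothetical.

```python
from collections import defaultdict

# Simulated responses in arrival order: (ip, port). Values are made up.
responses = [
    ("192.0.2.10", 80),
    ("198.51.100.7", 443),
    ("192.0.2.10", 22),
    ("203.0.113.5", 80),
]


def port_at_a_time(stream):
    """Report each open port the moment its response arrives; keep no state."""
    for ip, port in stream:
        yield f"open {port}/tcp on {ip}"


def host_at_a_time(stream):
    """Collect every response first, then report one line per host."""
    by_host = defaultdict(list)
    for ip, port in stream:
        by_host[ip].append(port)
    for ip, ports in by_host.items():
        yield f"{ip}: " + ",".join(str(p) for p in sorted(ports))


streamed = list(port_at_a_time(responses))
grouped = list(host_at_a_time(responses))
print("\n".join(streamed))
print("\n".join(grouped))
```

The first function needs no memory beyond the response it is currently looking at, while the second has to hold every result until the input ends. That is the trade-off the list above describes.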

Nmap is the best scanner: its NSE scripting engine is very flexible, and scanning a handful of hosts works without problems. Masscan is designed for large networks, because it is much faster and scales better.

Masscan has its own TCP/IP stack:

  • it works in parallel with the existing stack;
  • by default it uses the same address as the existing stack;
  • as a result, ARP replies and TCP RST packets get duplicated.

This is also how spoofing attacks work. Suppose we have host A, the attacker; host V, the victim; and host O, whose IP address the attacker wants to use for the attack.

Host A sends a SYN packet to host V, but as the return address it gives not its own IP address but that of host O. The attacked host V replies to host O with a SYN/ACK packet. Host O never sent anything to host V and should therefore break the connection with an RST packet. Suppose host O does not send such a packet because it is overloaded, switched off, or sits behind a firewall that blocks SYN/ACK packets.

If host O does not send the RST packet and so does not interrupt the attack, host A can interact with host V while posing as host O. Any authorization, captcha check and so on then becomes useless if the user’s firewall is configured incorrectly.

Thus RST packets protect hosts from connections they did not initiate: they answer an unexpected SYN/ACK with a refusal. With their help we can protect IP addresses from spoofing, or set up a security filter for a specific range of ports.

Now let's talk about the options that control masscan.

Multiple devices are scanned in such a way that security protocols are not violated:

  • --shard 1/50 is used when you need to scan from multiple computers;
  • --source-ip spreads the scan across several IP addresses on the same computer;
  • --source-ip should not be used at all! You will simply never get any results, and your computer will freeze.
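
One simple way to split a scan among machines is to interleave the probe indexes, so that shard 1 of 50 takes every fiftieth probe starting from the first one. The sketch below shows that idea. Treat the interleaving scheme as an assumption about how the sharding option divides the work; the function name is hypothetical.

```python
def shard_indexes(shard, total, count):
    """Probe indexes handled by shard number `shard` (1-based) out of `total`."""
    if not 1 <= shard <= total:
        raise ValueError("shard must be between 1 and total")
    return range(shard - 1, count, total)


# Ten probes split across three machines.
parts = [list(shard_indexes(n, 3, 10)) for n in (1, 2, 3)]
for number, part in enumerate(parts, start=1):
    print(f"shard {number}/3 -> {part}")
```

Each index belongs to exactly one shard, so the machines never probe the same target twice and together they cover the whole range.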

Sometimes, to avoid problems, you configure the TCP/IP connection manually:

  • --source-ip;
  • --source-port 4444;
  • --router-mac 00-11-22-33-44-55 together with --router-ip.

Here's what the banner-checking option does:

  • establishes a TCP connection;
  • performs heuristic protocol analysis, that is, on port 443 it also looks for SSH and HTTP, which people on the Internet run on that port.
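
Heuristic protocol analysis means looking at what a service actually says instead of trusting the port number. The sketch below classifies the first bytes received on a connection. It is not masscan's detection code: the function name is hypothetical and only three well-known signatures are checked.

```python
def guess_protocol(first_bytes: bytes) -> str:
    """Guess the protocol from the first bytes a service sends back."""
    if first_bytes.startswith(b"SSH-"):
        return "ssh"    # SSH identification string, e.g. "SSH-2.0-..."
    if first_bytes.startswith(b"HTTP/"):
        return "http"   # HTTP status line, e.g. "HTTP/1.1 200 OK"
    if len(first_bytes) >= 2 and first_bytes[0] == 0x16 and first_bytes[1] == 0x03:
        return "tls"    # TLS handshake record header
    return "unknown"


print(guess_protocol(b"SSH-2.0-OpenSSH_6.6\r\n"))
print(guess_protocol(b"HTTP/1.1 400 Bad Request\r\n"))
print(guess_protocol(b"\x16\x03\x01\x00\x31"))
```

An SSH server speaks first, while an HTTP or TLS server answers only after the client has sent something, so a real scanner also has to decide what to send before a check like this makes sense.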

I'm currently using something similar to NSE scripting, but I will soon switch to C-based programming.
You can also use masscan for load testing. This can “break through” firewalls, so it is relevant for testing how well they hold up. In this case the options --infinite, --banners and --source-ip <range> are useful for quickly scanning a large number of devices.

Hardly anyone uses this, but the program can write binary output files. For that, use the option:

-oB foo.scan instead of -oX foo.xml

The conversion is then performed with:

masscan --readscan foo.scan -oX foo.xml

This gives more compact scan output. In addition, if there are errors in the output data, they are easier to correct in binary format.

Another useful feature is spoofed scanning. IP spoofing means replacing the source IP address in the packet so that the response goes to a different address. Hackers use this technique to intercept traffic between hosts on Ethernet networks.

Spoofed scanning works as follows:

  • packets are received at one IP address, for example on a smartphone running Android;
  • receiving the packets needs little bandwidth;
  • packets are sent from a data center without egress filtering; the --source-ip option lets you scan while spoofing another IP address.

Here's what the scan results look like. In the first picture you see the window of our program, in the second - the result of its work.

The Heartbleed scan results show that as of April 10 the vulnerability was found in 600,000 systems, and in July 300,000 systems were still vulnerable. Most of them were hardware devices, that is, the computers, routers, webcams and servers themselves. You will not see that they are vulnerable if you check by DNS name; only scanning by IP address helps. We also scanned mainframes, large fault-tolerant servers, for example TN3270 Telnet-over-SSL on port 992. Take a look at @mainframed767 to see interesting things such as the login screen of an IBM mainframe.

The third picture shows the results of scanning banners. Now I will try to show you our program in action. To do this, I open the main window and set the address of the server that I want to scan using the command line. In some cases, the server does not respond.

Now Paul will try to log in with his login and demonstrate the scanning capabilities.

Paul says that if you have any questions about using the program, you can contact him directly and he will explain. As an example, Paul scans the Internet for VNC servers on port 5900, which takes 15 to 20 minutes.
The advantage of our program is that it produces a list of vulnerabilities without requiring authorization on the network or on each device. We test the system from the outside, not from the inside. Scaling out lets you check huge portions of the Internet, including from the cloud, at a cost of less than 16 cents per hour.

Right now I am setting up a slow scan of the DEFCON network on port 80 at 10 packets per second, and the results appear on the screen immediately.

Now we can see how many open, unprotected ports there are on the network, with the corresponding IP addresses. A hacker could use these IP addresses for a spoofing attack.

The procedure does not interfere with the network being scanned; users can keep running any application. The scan of the DEFCON network took a little over a minute, and we identified all the existing vulnerabilities simply by scanning port 80. By setting the packet rate you can speed up or slow down the scan. We have told you everything you need to know about masscan, and if you have questions, write to us by e-mail or on Twitter at @erratarob and @paulm.
