For years, most companies wrote off DoS attacks as an acceptable risk because the probability of becoming a victim was relatively low, as was the risk of damage to the business. Recently, however, this class of attack has increased in popularity, causing many organizations to rethink the relative risk. CEOs are concerned about lost revenue and bad press; IT frets over crashed applications and long work hours.
While you can't prevent all DDoS attacks, there are options to limit their effectiveness and help your organization recover faster. Most of the recent attacks have targeted Web applications -- they simply send more requests than the targeted Web application can handle, making the site difficult for legitimate visitors to use.
In such attacks, most attackers aren't concerned about whether the system or application actually crashes, though they would be happy if one did. Their main goal is to prevent the targeted company's services from responding to requests from legitimate users.
If you have the proper monitoring technology, then these attacks are easy to spot. Your network operations center (NOC) will be at status quo -- bandwidth, requests per second, and system resource usage will all trend normally. Then, either suddenly or over a small amount of time, all of these trends will shoot upward, thresholds will be reached, and alerts will be sent by the monitoring system.
In a typical organization, these events will trigger an escalation at the NOC, and the IT team will rush to get the right people involved. Management will receive notice that sites and applications are responding unusually, and all will be wondering why there is such a huge increase in requests in such a short period of time.
Unless your site was just mentioned on the front page of Slashdot, you're most likely experiencing the start of a DDoS. Congratulations! You're now part of the club of organizations that have been targeted for a DDoS attack -- usually because the attackers dislike your corporate policies, or because they were paid to attack you.
The first step is to analyze the logs for requests. You want to know what is being requested and where the requests are coming from. Compare the new requests to normal traffic to determine whether this is a legitimate overload, such as the Slashdot effect, or truly an attack.
If you're smart, then you have centralized logging, so you can work from a system that’s not being flooded directly. Unfortunately, if the logging server can’t keep up, then it might be overloaded, too -- it might drop logs and fail to respond to your searches. If the attack is bringing down your application, then logs might not be sent from the affected servers to the logging system. You’ll have to start your analysis with what was transmitted at the beginning of the attack, before it was overwhelmed.
If you need to get more current logs -- and if you have redundant systems hosting the targeted application -- bring one out of the pool temporarily, grab the logs, and place it back in the cluster. This could temporarily increase the attack's effects -- but without the logs, you will be chasing a ghost.
Once you have the logs, break out your favorite sorting tools and parsing tools. I like grep and sed. You’ll want to parse the logs and look for a few key things that you'll use to understand and respond to the attack.
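As a starting point, a few lines of standard Unix tooling (awk, sort, and uniq alongside grep and sed) can surface the top talkers and the most-requested URLs. This is a sketch against a made-up combined-format access log -- the file path and every IP address are hypothetical; point the same pipelines at your own centralized logs.

```shell
# Hypothetical sample of a combined-format access log. In a real incident,
# run these pipelines against your centralized log store instead.
cat > /tmp/access_sample.log <<'EOF'
203.0.113.45 - - [10/Oct/2023:13:55:36 +0000] "GET /search?q=x HTTP/1.1" 200 512
203.0.113.45 - - [10/Oct/2023:13:55:36 +0000] "GET /search?q=x HTTP/1.1" 200 512
203.0.113.45 - - [10/Oct/2023:13:55:37 +0000] "GET /search?q=x HTTP/1.1" 200 512
198.51.100.7 - - [10/Oct/2023:13:55:38 +0000] "GET / HTTP/1.1" 200 1024
192.0.2.9 - - [10/Oct/2023:13:55:39 +0000] "GET /about HTTP/1.1" 200 2048
EOF

# Top source IPs by request count -- the biggest offenders float to the top
awk '{print $1}' /tmp/access_sample.log | sort | uniq -c | sort -rn | head -5

# Top requested URLs -- the resource the attack is hammering
awk -F'"' '{print $2}' /tmp/access_sample.log | sort | uniq -c | sort -rn | head -5
```

The same two pipelines work on firewall logs as well; only the field positions in the awk expressions change.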
First, you need to identify how the attack is being conducted. Is it sending data to a port that is not open, thus consuming firewall resources? Is it a SYN flood that never completes the TCP handshake? Is it requesting a specific URL over and over? Or is it simply sending a basic "GET / HTTP/1.1" request to the Web server?
Through enterprise monitoring, it should be possible to identify where the load is being applied. If your firewalls are spiking but the Web servers are not, then that indicates the first type of attack. If the Web servers are falling behind on processing requests -- and the firewalls are handling the load, but possibly at a higher resource utilization level than normal -- then the Web application is being attacked directly.
Once the attack vector is known, you'll want to work with the logs to determine a few key factors.
By viewing the logs, you can determine what the attacking tool is doing. Regardless of whether the requests are to a Web application or being processed by a firewall, the data you are looking for is the same.
First, you want to identify the offending requests. Looking through the logs should make it very clear what’s part of the attack. There should be a high number of similar requests or a pattern of requests grouped together. For instance, there might be 10,000 requests attempting to access a URL, or there might be a port that is not valid.
In some cases, distributed tools might vary their requests. But, generally, you’ll see requests for the same resource, from the same source, all grouped together -- such as requests for a URL that doesn’t exist repeated over and over in the logs. Identify the request the attack is using. If it is the same across all attacking nodes, you will be able to fingerprint attackers and differentiate them from legitimate traffic.
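One way to build that fingerprint is to group log entries by the (request line, user agent) pair: an unsophisticated tool produces one identical pair repeated far more often than anything legitimate. This sketch uses an invented log file, URL, and bot name purely for illustration.

```shell
# Hypothetical attack traffic mixed with legitimate requests. The URL
# /old-promo and the agent string "AttackBot/1.0" are made up.
cat > /tmp/ddos_sample.log <<'EOF'
203.0.113.45 - - [10/Oct/2023:14:02:11 +0000] "GET /old-promo HTTP/1.1" 404 0 "-" "AttackBot/1.0"
198.51.100.7 - - [10/Oct/2023:14:02:11 +0000] "GET /old-promo HTTP/1.1" 404 0 "-" "AttackBot/1.0"
203.0.113.45 - - [10/Oct/2023:14:02:12 +0000] "GET /old-promo HTTP/1.1" 404 0 "-" "AttackBot/1.0"
192.0.2.9 - - [10/Oct/2023:14:02:13 +0000] "GET / HTTP/1.1" 200 1042 "-" "Mozilla/5.0"
192.0.2.20 - - [10/Oct/2023:14:02:14 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
EOF

# Split on double quotes: field 2 is the request line, field 6 the user agent.
# The attack signature dominates the top of the sorted output.
awk -F'"' '{print $2 " | " $6}' /tmp/ddos_sample.log | sort | uniq -c | sort -rn | head -3
```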
Once you understand the request pattern, you must identify the attackers. Find the attacking nodes sending the highest volume and fastest requests. These are the big offenders, and you’ll want to neutralize them as best you can to reduce the effectiveness of the attack. Once you know the most common request and its sources, you can begin to react.
In the heat of the attack, I have often heard the suggestion of renaming the affected resource -- such as a URL or hostname. This will only delay the attacker; they will simply refactor the attack once they figure out what you've done, or they will target a new resource.
Renaming or modifying the resource can work in cases where the attack calls a legitimate Web application URL that executes some resource-intensive job, such as a large database query. In these cases, modifying the application, implementing a confirmation screen, or implementing a redirect that the attacker's tool can't understand (such as a CAPTCHA or a Flash application with a user confirmation and redirect) can reduce the effectiveness of the attack. Unfortunately, in most cases, the attacker will just change his attack.
Once you've tried these initial steps, your next level of defense is source filtering and connection rate limiting. If you can block the largest offenders and slow the rest, you decrease the effectiveness of the attack. To succeed, the attacking nodes must be able to send more requests than your production cluster can handle in a given amount of time. If you can block some attacking nodes, you reduce the load on your systems, buying time to block more of them, notify the ISPs of the big offenders, move services, or wait out the attack.
To protect as much of your infrastructure as possible, you want to apply filters as close to your network edge as possible. If you can convince your ISP or data center that it’s in their best interest to help, applying filters on their equipment will provide the best results. Their equipment is before yours in the chain, it's faster, and it can handle the overhead of more filtering.
If you must apply filters on your equipment -- or if you need to start while you wait on a response from your upstream providers -- start at your edge device and work backward. Use everything you have to filter the requests and minimize impact on your systems. Routers, load balancers, IPS, Web application firewalls, and even the systems themselves can play a part in denying requests.
The first filter should drop all connections from the offenders sending the most requests. Apply this rule at the top of your access control list (ACL). Of course, your edge devices should not accept traffic addressed to their own interfaces, and they shouldn't respond to probes. If they do either, they will become additional targets for the attackers.
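On a Linux host or gateway, that first filter might look like the sketch below: turn the top-offender list from the log analysis into iptables drop rules inserted at the head of the INPUT chain. The IP addresses are hypothetical, and the rules are written to a review file rather than applied directly -- inspect the file, then run it as root to install the filters.

```shell
# Hypothetical worst offenders identified during log analysis.
# Rules are generated into a file for review, not applied blindly.
for ip in 203.0.113.45 198.51.100.7; do
  echo "iptables -I INPUT 1 -s $ip -j DROP"
done > /tmp/drop_rules.sh
cat /tmp/drop_rules.sh
```

On dedicated routers, the equivalent is a deny entry at the top of the inbound ACL; the principle -- match the attacker's source before any other processing -- is the same.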
Next, you'll want to rate-limit connections based on the source. This will drop any new connections from a host once it has reached the number of connections allowed in the time period. Review your logs and set this limit lower than the average number of requests per interval the attackers are sending.
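On Linux, per-source rate limiting can be done with iptables' hashlimit match. The sketch below drops new HTTP connections from any single source exceeding 30 per minute -- that threshold is an assumption for illustration; derive yours from the log analysis. As before, the rule goes to a review file first.

```shell
# Hypothetical per-source rate limit: track new HTTP connections per source
# IP and drop anything above 30/min. Review, then run as root to apply.
cat > /tmp/rate_rule.sh <<'EOF'
iptables -A INPUT -p tcp --dport 80 --syn \
  -m hashlimit --hashlimit-above 30/min \
  --hashlimit-mode srcip --hashlimit-name http_per_src -j DROP
EOF
cat /tmp/rate_rule.sh
```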
If you’re unsure which log entries are attackers and which are legitimate, use the request rate from your biggest offenders as a starting point. Since the big offenders might have been sending requests faster than the average, you might still be allowing too many requests. Review and tweak as needed.
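To get that per-source request rate, truncate each log timestamp to minute resolution and count requests per (IP, minute) pair. This sketch reuses the combined-log layout with invented data; the per-minute counts at the top of the output suggest where to set the limit.

```shell
# Hypothetical log sample; timestamps truncated to the minute for counting.
cat > /tmp/rate_sample.log <<'EOF'
203.0.113.45 - - [10/Oct/2023:13:55:01 +0000] "GET / HTTP/1.1" 200 512
203.0.113.45 - - [10/Oct/2023:13:55:02 +0000] "GET / HTTP/1.1" 200 512
203.0.113.45 - - [10/Oct/2023:13:56:01 +0000] "GET / HTTP/1.1" 200 512
192.0.2.9 - - [10/Oct/2023:13:55:30 +0000] "GET / HTTP/1.1" 200 512
EOF

# Field 4 is "[10/Oct/2023:13:55:01"; chars 2-18 give day-through-minute.
# Counting (IP, minute) pairs yields requests per minute per source.
awk '{ts = substr($4, 2, 17); print $1, ts}' /tmp/rate_sample.log \
  | sort | uniq -c | sort -rn
```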
In addition, you can tweak hosts and edge devices to expire idle sessions faster than normal to free resources. But be careful not to be too aggressive -- you might consume too many resources building and tearing down connections.
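On Linux hosts, two kernel tunables commonly adjusted for this are sketched below. The chosen values are assumptions, not recommendations -- test under load before relying on them, and note the trade-off described above: expiring sessions too aggressively wastes cycles rebuilding connections.

```shell
# Hypothetical Linux session-expiry tuning, staged to a file for review.
# Defaults noted in comments are typical kernel defaults.
cat > /tmp/session_tuning.sh <<'EOF'
sysctl -w net.ipv4.tcp_fin_timeout=15     # default 60: reclaim FIN-WAIT-2 sockets sooner
sysctl -w net.ipv4.tcp_synack_retries=2   # default 5: give up on half-open SYNs sooner
EOF
cat /tmp/session_tuning.sh
```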
With source blocking and rate limiting, you’ve done the big things you can to mitigate the attack. If the attack is still ongoing and there isn’t an end in sight, the team should move to protect the rest of the business where possible.
Typically, attackers will target a specific host, application, or network. Routing traffic destined for these resources to another location -- or dropping this traffic altogether -- will save load on back-end systems, but it could also ensure that your services go down, which was the attacker's goal.
Separating the service being attacked from services the attack hasn't touched is also a good idea. There is no reason your business should be completely down if you can isolate the affected services from unrelated ones.
If you’re offering services via the Internet, then you’re a potential DDoS target. The likelihood that such an attack will hit your organization depends on your business practices, the whim of attackers, and who is angry with your company. The best approach to mitigating attacks is to ensure you have adequate capacity, redundant sites, separation of business services, and a plan to deal with an attack when it occurs.