By Jessica Ellis | March 12, 2020
Recently we highlighted one of the most common evasion techniques threat actors employ to keep a phishing site online: geoblocking, or blocking by location. Many other techniques exist, however, some more subtle, that make it harder for unwanted visitors to view a site. One such method thwarts unintended parties – bots, analysts, hosting providers, and the like – whenever they are not using the expected device: blocking by user-agent.
What is a User-Agent?
Simply put, a user-agent is a string that a client shares with a server when requesting content. It is sent in the request headers: a bundle of data describing the client and the kind of content it expects in return. In practice, this means that if you are using a mobile device, many websites will display a mobile-friendly version of their content. On a tablet, a website may lay out its content to fit a tablet-sized screen, and in an older browser you may receive a warning that not all items on the page can be displayed. Likewise, a bot may use a specific string to tell a webpage that it has come to crawl it for information.
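To make the header bundle concrete, here is an illustrative set of request headers as a desktop browser might send them, written as a Python dictionary. The values are typical examples, not taken from any specific request:

```python
# Illustrative HTTP request headers; the User-Agent value shown is a
# typical desktop Chrome string from the era of this article.
headers = {
    "Host": "example.com",
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/80.0.3987.132 Safari/537.36"),
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}
```

The server reads the `User-Agent` entry from this bundle to decide what version of the page, if any, to return.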
For example, this is a user-agent that Google may use for their crawler:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
This string identifies the Google crawler bot to the webpage, letting it know that Google is collecting its contents. Since attackers do not usually want Google crawling, indexing, and displaying their phishing websites in search results, they can add instructions to their server to deny access to this user-agent. This can be done through standardized files such as robots.txt or .htaccess, or through custom server code. Many phish kits – prepackaged phishing websites – contain intricate .php files with hundreds of lines of specific user-agents to allow and disallow.
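The allow/deny logic in those kit files boils down to simple substring matching against the incoming user-agent. The sketch below shows the idea as a plain Python function rather than PHP; the blocklist entries and function name are illustrative, not taken from any real kit:

```python
# Hypothetical sketch of phish-kit user-agent filtering.
# Substrings that suggest a crawler or analyst tool rather than a victim:
BLOCKED_SUBSTRINGS = [
    "googlebot",        # search-engine crawlers
    "bingbot",
    "curl",             # command-line tools analysts often use
    "wget",
    "python-requests",
]

def is_blocked(user_agent: str) -> bool:
    """Return True if the request should be denied
    (redirected, shown benign content, or told the page is offline)."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_SUBSTRINGS)
```

For example, `is_blocked("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")` returns `True`, while a typical browser string passes through.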
Slowing Down Bots
PhishLabs’ proprietary crawler inspects more than a million data points each day. Ours is one of hundreds of crawlers that branch out through the web, seeking out malicious content for further analysis. This is one of the most efficient ways to collect intelligence, determine whether it is actionable, and then present it to analysts, enabling them to proactively mitigate attacks.
Threat actors are well aware of this fact, and many maintain growing lists of known crawlers that may hinder their attacks. A peek into one of these lists reveals lines and lines of – you guessed it – user-agents. If a bot operating under one of these comes across the malicious site, it will usually be redirected somewhere else, shown benign content, or be told that the page is offline.
As last year’s Phishing Trends and Intelligence report indicated, mobile phishing has increased year-over-year with no signs of slowing. This is due to the continued increase in mobile adoption across the globe. As previously mentioned, many legitimate websites have capitalized on this trend by presenting a mobile-friendly version of themselves to any request using a mobile user-agent. Additionally, websites often opt to display different versions of their mobile content to different kinds of mobile devices. Access a site from an iPhone and you may receive a prompt to go to the App Store for the official mobile app. Accessing it from an Android device may produce a similar notification, but for the Play Store instead.
Threat actors have seen an opportunity in this trend as well. If attackers know that each “Your account has been locked!” lure pointing to a phishing page will be sent through an SMS message, it is prudent for them to add user-agent blocking to the server to deny any request that comes from a PC. This simple technique ensures that far more of the phishing site’s views come from genuine targets, decreasing server expenses and thus increasing an attacker’s return on investment.
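Such a mobile-only gate can be sketched in a few lines. This is a hedged illustration of the logic, not code from an actual kit; the token list, function name, and response strings are all hypothetical:

```python
# Hypothetical mobile-only gate for an SMS-phishing (smishing) kit.
# Only user-agents that look like mobile devices receive the lure page.
MOBILE_TOKENS = ("iphone", "ipad", "android", "mobile")

def serve_page(user_agent: str) -> str:
    """Return the phishing page to mobile user-agents only;
    desktop visitors (analysts, crawlers) see a benign 404."""
    ua = user_agent.lower()
    if any(token in ua for token in MOBILE_TOKENS):
        return "phishing page"
    return "404 Not Found"
```

A desktop Chrome string falls through to the 404 branch, while an iPhone Safari string is served the lure.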
The Inception Bar
One of the more clever uses of user-agent blocking takes advantage of a convenience feature found within most mobile devices. Using what James Fisher dubbed the Inception Bar, attackers have undermined one of the easiest ways to determine the legitimacy of a webpage.
To date, the URL bar itself cannot be spoofed; the address it displays is the address being viewed. In many mobile browsers, however, the default configuration minimizes the URL bar as the user scrolls. This allows the attacker to cleverly replace the hidden URL bar with a replica that displays a fake URL for the legitimate site. While the victim cannot interact with this bar, it affords a deep feeling of legitimacy to casual onlookers. And, of course, this too is triggered by the victim’s user-agent string, which indicates exactly what kind of device is sending the content requests.
Because most organizations rely on SaaS and automation to collect intelligence, user-agent blocking can be an effective evasion technique. However, as with many HTTP request headers, a user-agent string can be manipulated with relatively little effort.
For example, most PC-based web browsers have extensions available that are capable of changing the browser’s user-agent to any number of presets. A brief search for user-agent through the Chrome Web Store reveals many such options. More advanced users can manually manipulate their user-agent during individual requests, enabling them to prod a server for different responses and gain intelligence about the kinds of blocking being used. Many pieces of command-line software, such as cURL or wget, make this possible by allowing a user to pass in the raw request headers as parameters.
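The analyst’s side of this manipulation needs nothing beyond a standard library. The sketch below builds a request with a spoofed iPhone user-agent using Python’s `urllib`; the request object is constructed but never sent, so the header substitution can be observed without network access. The target URL is a placeholder:

```python
# Spoofing a user-agent on an outgoing request with Python's stdlib.
from urllib.request import Request

# An iPhone Safari user-agent string, typical of the period:
MOBILE_UA = ("Mozilla/5.0 (iPhone; CPU iPhone OS 13_3 like Mac OS X) "
             "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.5 "
             "Mobile/15E148 Safari/604.1")

req = Request("https://example.com/", headers={"User-Agent": MOBILE_UA})

# urllib stores header names with only the first letter capitalized,
# so the stored key is "User-agent":
print(req.get_header("User-agent"))
```

The command-line equivalent with cURL is the `-A`/`--user-agent` flag, which sets the same header on the raw request.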
Additionally, an understandable misconception is that a user-agent is something static and predefined. On the contrary, while the format is standardized, it is possible to create custom user-agents, or even to omit the header entirely. For reference, public catalogs document millions of user-agents employed in the real world, with more being created by vendors every day. That said, creating a custom user-agent for everyday use can have many disadvantages, as legitimate websites rely on this data to present information to a device in the most convenient, accurate manner.
A Tool in the Kit
As we have emphasized in previous articles, cyber criminals view their attacks as revenue streams in a business. They will do whatever they can to keep those streams alive and flowing. Since blocking techniques that take the user-agent into account can be thwarted through some basic manipulation of request headers, and many tools exist that make this a one-click solution, attackers almost never use this technique on its own. Instead, they often use it as a tool within a larger toolkit of blocking techniques. One of the most common combinations of blocking techniques observed by PhishLabs involves using geoblocking in tandem with user-agent blocking. Encountering a site configured like this means that our analysts must determine not only the required user-agent, but also the geographical region being targeted by the attack.
By rapidly overcoming these techniques and spreading awareness to responsible organizations, PhishLabs is able to disrupt the attackers’ revenue streams and cause lasting damage to their infrastructure.