Michael Cobb, Application Security
For some reason the Web has never been able to rid itself of cross-site scripting (XSS) vulnerabilities. Here you will learn how to prevent XSS attacks and exploits within your own organization.
Sites continue to fall prey to XSS attacks because most need to be interactive, accepting and returning data from users. This means attackers, too, can interact directly with an application's processes, passing data designed to masquerade as legitimate application requests or commands through normal request channels such as scripts, URLs and form data. This communication at the application layer can exploit poorly written applications to bypass traditional perimeter security defences.
According to a 2008 WhiteHat Security Statistics Report, 90% of all websites have at least one vulnerability, and 70% of all vulnerabilities are XSS-related. In this article, the first in a series on application-layer attacks, I want to look at how and why XSS attacks work and what you can do to remove this vulnerability from your own Web applications.
Cross-site scripting explained: How XSS attacks work
Let's look at how simple an XSS attack can be. The XYZ football club's message board allows club members to post comments about the team and its performance. Comments are stored in an online database and displayed to other members without ever being validated or encoded. A malicious member can simply post a comment containing a script enclosed in <script> tags. The attacker then waits for other members to view the comment. Since the text inside a <script> tag is not generally displayed, other members may not even be aware that the script has executed; merely viewing the comment will execute the script. Because the script runs as part of the club's own page, the browser allows it to read the member's cookie information and pass it to the attacker. This type of XSS attack is known as persistent XSS because the malicious script is stored on the server and served to every member who views the comment.
XSS attacks work even if the site is viewed over an SSL connection, because the script is run in the context of the "secured" site, and browsers cannot distinguish between legitimate and malicious content served up by a Web application. But attackers don't have to rely on injecting their code into a site's comment page. They can try to trick a victim into clicking on a URL in a phishing email, which then injects code into the viewed page, giving the attacker full access to that page's content -- this is a non-persistent XSS attack. URL encoding is often used in such attacks to disguise the link and make users more likely to follow it. In the example below, the link is to a secure https URL on a trusted site:
Users see that the link is to www.userstrustedbank.com and is over an SSL connection; it looks genuine enough since links often have long, seemingly meaningless text at the end. The user clicks the link. However, the code between the <script> tags when translated by a browser reads:
This attack string renders an IFRAME -- an HTML document embedded inside another HTML document on a website -- in the context of userstrustedbank's actual site. The attacker's login.php page will be mocked up to look exactly like the userstrustedbank's login page, tricking the user into entering and sending his login username and password to the bad bank server, the source of the IFRAME, while all the time being on the real userstrustedbank.com website. This very attack has been used on banks' websites this year.
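How URL encoding disguises a script can be sketched in a few lines of Python. This is an illustration only: the payload is a harmless placeholder, www.example.com stands in for a real site, and percent_encode_all is a hypothetical helper, not part of any attack described above.

```python
from urllib.parse import unquote


def percent_encode_all(text):
    """Percent-encode every byte, including letters, so no markup is readable."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))


# Harmless placeholder payload, used only to show the transformation.
payload = "<script>alert(1)</script>"

encoded = percent_encode_all(payload)

# Hypothetical link: the query string now looks like meaningless text.
link = "https://www.example.com/search?q=" + encoded

# The server (or browser) decodes it straight back to the original markup.
decoded = unquote(encoded)
```

The point for defenders is that validation has to run on the decoded value, since the encoded form contains no angle brackets to match against.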
Essentially, the underlying problem and cause of XSS holes is that many dynamically created webpages display user input that is not validated or encoded. If you don't validate user-generated input and control how it is processed or published, you could fall victim to an XSS attack.
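The difference between displaying raw input and displaying encoded input can be shown with a minimal Python sketch. The function names and the comment text are assumptions made for illustration; html.escape is the standard library's HTML encoder.

```python
import html


def render_comment_unsafe(comment):
    """Vulnerable: the comment is placed into the page exactly as submitted."""
    return "<p>" + comment + "</p>"


def render_comment_safe(comment):
    """Encoded: <, >, & and quotes become entities, so the browser shows them as text."""
    return "<p>" + html.escape(comment, quote=True) + "</p>"


# Hypothetical comment posted by a malicious member.
comment = "Great match! <script>steal()</script>"

unsafe_page = render_comment_unsafe(comment)  # contains a live <script> element
safe_page = render_comment_safe(comment)      # contains &lt;script&gt; as plain text
```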
How to prevent XSS attacks
To reduce the chances of your site becoming a victim of an XSS attack, it's essential that any Web application is developed using some form of security development lifecycle (SDL). I will look at SDLs in more detail in a future article, but their aim is to reduce the number of security-related design and coding errors in an application, and reduce the severity of any errors that remain undetected. A critical rule you'll learn when developing secure applications is to assume that all data received by the application is from an untrusted source. This applies to any data received by the application -- data, cookies, emails, files or images -- even if the data is from users who have logged into their account and authenticated themselves.
Not trusting user input means validating it for type, length, format and range whenever data passes through a trust boundary, say from a Web form to an application script, and then encoding it prior to redisplay in a dynamic page. In practice, this means that you need to review every point on your site where user-supplied data is handled and processed and ensure that, before being passed back to the user, any values accepted from the client side are checked, filtered and encoded.
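As a rough sketch of checking type, length, format and range at a trust boundary, the following Python validates two hypothetical form fields on the server. The field names, the permitted character set and the numeric limits are all assumptions chosen for the example.

```python
import re

# Assumed policy: 3 to 20 characters, ASCII letters, digits and underscore only.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")
# Assumed policy: age arrives as one to three ASCII digits.
AGE_RE = re.compile(r"[0-9]{1,3}")


def validate_username(value):
    """Check type, length and format; return the value or raise ValueError."""
    if not isinstance(value, str):
        raise ValueError("username must be a string")
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("username must be 3-20 letters, digits or underscores")
    return value


def validate_age(value, minimum=13, maximum=120):
    """Check type and format, convert to int, then check the range."""
    if not isinstance(value, str) or not AGE_RE.fullmatch(value):
        raise ValueError("age must be a whole number")
    age = int(value)
    if not minimum <= age <= maximum:
        raise ValueError("age is outside the accepted range")
    return age
```

Rejecting bad input outright, rather than trying to repair it, keeps the check simple; encoding on output is still needed for whatever is accepted.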
Client-side validation cannot be relied upon, but user input can be forced down to a minimal alphanumeric set with server-side processing before being used by your Web application in any way. You can use regular expressions to search and replace user input to ensure it's non-malicious. This cleaning and validation should be performed on all data before passing it on to another process. For example, a phone number field shouldn't accept any punctuation other than parentheses and dashes. You also need to encode special characters like "<" and ">" before they are redisplayed if they are received from user input. For example, encoding the script tag ensures a browser will display <script> but not execute it. In conjunction with encoding, it is important that your webpages always define their character set so the browser won't interpret special character encodings from other character sets.
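The regular-expression approach described above might look like the following server-side Python sketch. The patterns are assumptions: the phone pattern allows digits, parentheses, dashes and spaces (spaces are an addition not mentioned above), and the length limits are arbitrary.

```python
import re

# Assumed phone policy: digits, parentheses, dashes and spaces, 7 to 20 characters.
PHONE_RE = re.compile(r"[0-9()\- ]{7,20}")
# Anything that is not an ASCII letter or digit.
NON_ALNUM_RE = re.compile(r"[^A-Za-z0-9]")


def clean_phone(value):
    """Accept the phone number only if every character is on the allow list."""
    if not PHONE_RE.fullmatch(value):
        raise ValueError("phone number contains characters that are not allowed")
    return value


def to_alphanumeric(value):
    """Force input down to a minimal alphanumeric set by deleting everything else."""
    return NON_ALNUM_RE.sub("", value)
```

For instance, to_alphanumeric applied to a script tag leaves only harmless letters and digits, while clean_phone rejects any value containing angle brackets.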
Given that browsers aren't meant to assume any default value for the page's charset, and some servers don't allow or aren't configured to allow a charset parameter to be sent, it's important that you don't omit the charset meta tag from your webpages. It will greatly reduce the number of possible forms a script injection can take. So if your Web application doesn't need to display characters outside the ISO-8859-1 character set, which is sufficient for English and most European languages, every single page should use the following meta tag to declare its character set:
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
Web applications that do not need to accept rich data can use escaping to completely eliminate the risk of XSS. There are, of course, times when an application will need to accept special HTML characters, such as "<" and ">", for example on social networking sites where font formatting is accepted functionality. Securely encoding such input can be tricky due to the flexibility and complexity of HTML. I'd recommend making use of a security encoding library. Microsoft's ASP.NET provides validation server controls that can validate user input. Web applications running on an Apache server or using the Perl programming language can also use the Apache::TaintRequest module or PerlTaintcheck to automate the process of handling external data. You should ensure that all your developers understand how to incorporate these additional safety features into their code.
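To give a feel for what an encoding or sanitizing library does when some formatting must be allowed, here is a deliberately small Python sketch built on the standard library's html.parser. The allowed tag list and the class and function names are assumptions for illustration; a maintained, well-tested library remains the better choice for production use.

```python
import html
from html.parser import HTMLParser

# Assumed policy: only simple font-formatting tags survive, with no attributes.
ALLOWED_TAGS = {"b", "i", "em", "strong"}
# Content inside these elements is discarded entirely.
DROP_CONTENT_TAGS = {"script", "style"}


class AllowListSanitizer(HTMLParser):
    """Rebuild the fragment from scratch, keeping only allow-listed tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT_TAGS:
            self._skip_depth += 1
        elif tag in ALLOWED_TAGS and not self._skip_depth:
            # Attributes are dropped, which removes event handlers such as onclick.
            self._out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self._skip_depth:
            self._out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            # All text is re-encoded before it goes back into the page.
            self._out.append(html.escape(data))

    def result(self):
        return "".join(self._out)


def sanitize(fragment):
    """Return the fragment with everything outside the allow list removed or encoded."""
    parser = AllowListSanitizer()
    parser.feed(fragment)
    parser.close()
    return parser.result()
```

The design choice here is to rebuild the output from an allow list rather than to search for known-bad patterns, since the flexibility of HTML makes a deny list easy to slip past.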
Another barrier to XSS attacks is a "crossing boundaries" policy whereby authenticated users have to re-enter their passwords before accessing certain services. For instance, even if a user has a cookie that will automatically log them into your site, they should be forced to enter their username and password when they attempt to access any sensitive account information. This extra boundary can limit the possibility of a session being hijacked by an XSS attack. Another simple yet effective technique is to immediately expire a session if machines at two separate IP addresses attempt to use the same session data. You can create session IDs using information specific to the user such as a timestamp and IP address. Although this technique can be overcome by IP spoofing, it does provide an extra layer of security against automated attacks.
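The idea of expiring a session when it is used from a second IP address can be sketched as follows. This is a simplified in-memory Python example; SessionStore and its methods are hypothetical names, and rather than deriving the session ID from the timestamp and IP address, the sketch keeps the ID random and records those values alongside it on the server.

```python
import secrets
import time


class SessionStore:
    """In-memory session table that ties each session to the IP that created it."""

    def __init__(self, max_age_seconds=1800):
        self._sessions = {}
        self._max_age = max_age_seconds

    def create(self, username, ip_address):
        """Issue an unpredictable session ID and remember who and where it came from."""
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {
            "user": username,
            "ip": ip_address,
            "created": time.time(),
        }
        return session_id

    def lookup(self, session_id, ip_address):
        """Return the username, or expire the session if it is stale or the IP changed."""
        record = self._sessions.get(session_id)
        if record is None:
            return None
        expired = time.time() - record["created"] > self._max_age
        if expired or record["ip"] != ip_address:
            # A second IP presenting the same ID is treated as a hijack attempt.
            del self._sessions[session_id]
            return None
        return record["user"]
```

Once a mismatch is seen the session is gone for everyone, so the legitimate user has to log in again; that inconvenience is the price of cutting off a stolen cookie.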
You should look at using automatic source code scanning tools and Web vulnerability scanners during the development of your applications. A good Web vulnerability scanner will spot common technical weaknesses, such as those that are vulnerable to cross-site scripting. If you use third-party packages like search engines on your site, you should always check for known vulnerabilities or configuration issues with the vendors. This should be followed with a thorough test of how they handle unwanted input. Never assume they are secure.
Prior to putting your Web application live, you should conduct a penetration test. By simulating an attack, you can evaluate whether your site still has any potential XSS vulnerabilities resulting from poor or improper system configuration, hardware or software flaws or weaknesses in the perimeter defences protecting the site. I would recommend that you read the Open Source Security Testing Methodology Manual, which provides a recognized methodology for performing security tests and measuring the results.
Most sites nowadays won't work without client-side scripting, so asking users to turn off scripting in their browser is not really a solution, particularly as most wouldn't know how to do this anyway. In a welcome effort to combat XSS attacks, Microsoft's IE 8 has an XSS Filter, which aims to automatically detect and block common XSS attacks when the injected script is replayed in the server's response. Users are not presented with questions they are unable to answer; IE simply blocks the malicious script from executing.
At the end of the day, though, it is down to Web developers to follow a secure development life cycle and Web administrators to scan their sites for vulnerabilities and protect them with a Web application firewall to prevent XSS attacks. After all, it's your customers and your reputation that XSS damages.