HaCRS Improves Mechanical Phish Bug Finding with Human Assistance

Overview


This post describes a system we developed recently to re-introduce humans to automated vulnerability discovery. While human experts can find bugs that automated bug finding cannot reach, we were curious whether untrained humans can help automated systems do better. We found that by integrating human labor with no prior experience in bug finding, otherwise automated systems can overcome some of their shortcomings and find more bugs than they could on their own. We were able to recruit 183 workers through Amazon Mechanical Turk who helped increase program coverage. In effect, this led to a 55% improvement in finding bugs for Cyber Grand Challenge (CGC) binaries. This blog post will discuss key insights and material that did not fit into our forthcoming CCS paper (pdf and bib) "Rise of the HaCRS". The paper was a collaboration between UC Santa Barbara, Arizona State University, and Northeastern University.

Update: additional materials (slides, video) available here.

Introduction


Mechanical Phish is an open source Cyber Reasoning System (CRS) that placed third in last year's CGC event. CGC was a fully automated hacking competition with no human interaction, the first computer vs. computer hacking contest. While this pushed forward automated reasoning, it also highlighted shortcomings in the state of the art of automated bug finding. In this project we enhance fully automated bug finding by adding human assistance to cover areas where human intuition beats computing power.

A shortcoming of fully automated analyses is that tools start without real input and have to explore programs on their own. Even without intuition these tools can fare well: for example, AFL can reconstruct the JPG file format on its own, which is impressive. But we were curious whether better input seeds help automated reasoning, and found through experimentation that they enhance results significantly. In particular, human intuition makes it possible to distinguish states that are logically different, e.g., winning a game as opposed to losing a game. While automated systems might be able to tell such states apart, the implications are not clear to them. Or more generally: semantic hints given by programs go unnoticed by a CRS.

We developed a prototype system which we tested on Amazon Mechanical Turk, evaluating against the CGC sample binary corpus. The results back our suspicion that new inputs can improve CRS findings significantly.

Mechanical Turk


Amazon offers access to human assistants where requesters can offer tasks to be solved for money. This service is often used to gather data where automation is infeasible or results must come from a human (e.g., surveys). While our system is not designed specifically for Mechanical Turk, we chose the platform due to its large pool of workers. In HaCRS, a "Tasklet" is a request for human work to solve an issue the CRS can't deal with on its own. We issue these in steps: for example, we first ask for coverage to be improved to a specific target, and once that is reached we aim higher.

We armed our system with Amazon credits and let it iteratively issue HITs (Human Intelligence Tasks), requesting labor to increase coverage so that Mechanical Phish can find more bugs. We had the system generally request coverage increases of 10%, and scaled the payout based on difficulty. For example, while a tasklet we thought of as easy would earn $1, a particularly hard one would be worth $2.50. Performance was measured in triggered program transitions, and we provided live feedback as the Turkers were exercising the programs (see screenshots below). We further issued bonus payments for performance beyond what was required, so Turkers would be encouraged to exercise programs further. In total we paid $1,100 in base payments and bonuses to 183 Turkers.

HaCRS User Interface: The Human-Automation Link (HAL)


As we were hoping to enroll large amounts of unskilled labor in our experiments, the UI had to be self-explanatory to scale. Issues with the UI would result in confused emails and lost time on both ends. We tried to fit in all information the Turkers could need, and to offer all options that could make them work faster.

Mechanical Turk does not allow Turkers to install software for tasks. This is for good reason, as requesters could exploit it to make workers install malware or other unwanted software. However, this also presented a challenge for us: our interface needed to be accessible to them while observing this restriction. We decided to build a Web UI for our system, adding a noVNC JavaScript window where we presented the interaction terminal. This choice also keeps us flexible for the future: we can reuse most of the UI while pointing noVNC to other targets.

HaCRS main screen

Above we see the HaCRS Human-Automation Link (HAL). Turkers can type in the terminal to interact with the program. To the left is the progress window. We see how many transitions have been triggered and how many more need to be triggered to receive a payout.

HaCRS input questions

Turkers see previous input / output sequences and can restore these states by clicking on the character in the interaction. All inputs are available to all Turkers, i.e., if any Turker manages to reach a previously unknown program state, anyone can pick up from there and explore further without manually repeating all steps. A click will spawn a new Docker container in the backend and replay the interaction, and the restored state becomes available to the Turker via noVNC. Note that such replay is only possible for systems where randomness is controlled; this is a general limitation and not specific to HaCRS.

We also offer programmatic input suggestions based on strings that might be encountered later, which Mechanical Phish otherwise lacks the program context to use directly. These strings can serve as inspiration for humans to exercise the program better.

Sample program: NRFIN_00005


We will demonstrate HaCRS capabilities using NRFIN_00005. This application is a game described as "Tic-Tac-Toe, with a few modifications". The player does not see the game board and has to keep track of state on their own. See screenshots above for gameplay and sample inputs. The game has a null pointer dereference bug, which can be triggered by typing "START OVER" after one round has been played. Other strings will not trigger the vulnerability.

Driller and AFL (the two main components of Mechanical Phish) were not able to play the game successfully, as they cannot reason about the state of the game. Our Turkers, however, were able to win the game easily, but afterwards typed strings such as "PLAY AGAIN", which do not trigger the bug. Next, Mechanical Phish picks up the Turker input and mutates it towards "START OVER", as it recognizes this as a special state, and crashes the program.

Takeaways


Our key takeaways from this project are as follows:

  • Input seeds can impact CRS results significantly, and should be used in conjunction with symbolic execution and fuzzing.
  • Even unskilled users' intuition can improve CRS results.
  • Mechanical Turk turned out to be a good platform for collecting diverse program interactions.
  • Semi-experts did not fare significantly better than non-expert users. However, this could be a limitation of our system.

Future Work


For HaCRS, we used humans to increase program coverage to reach states which Mechanical Phish could turn into crashes. However, we envision involving humans in other areas to enhance CRSs, for example by enrolling them more directly in exploit generation, or in testing patches to verify fixes. These tasks might be less suitable for unskilled labor, and will require more research. Furthermore, finding optimal incentive structures could increase performance of such systems.

Conclusion


We had a total of 183 Turkers work for us at a combined cost of $1,100. These Turkers managed to help Mechanical Phish find 55% more bugs than it could on its own. HaCRS presents a step towards augmenting traditional CRSs with human intuition where computers are still lacking. Such a combined approach should be further explored to overcome CRS obstacles. Our paper features case studies and implementation details about our system. The full paper is available here: pdf and bib, and will be presented at CCS in Dallas.

If you are interested in doing similar work, do get in touch at mw@ccs.neu.edu and yans@asu.edu.

These Chrome extensions spy on 8 million users

Overview


This post investigates the upalytics.com library for Chrome extensions performing real time tracking of users on all sites they visit. The code is bundled with plenty of "free" extensions, exfiltrating browsing history as a feature. Such software is commonly known as spyware. Within the top 7,000 extensions of the Chrome Web store, the library is used 42 times with over 8 million installs. The post also looks into the relationship of upalytics with similarweb.com. The compiled data is also available in this spreadsheet.

Update: We published a paper about a system to automatically find such extensions.

Intro


I came across a website that offered browsing insights for websites they have no clear relation to, similarweb. The data includes links clicked on a site, referrer statistics, the origin of users, and others. While this is interesting, it also raises a question - where is that data coming from? Based on their website they collect data from millions of devices, but the software they advertise was orders of magnitude away from that. Data had to come from somewhere else.

Bundling unwanted content with "free" software is an unfortunate reality which has been shown before. This quickly became my working theory. Tracking browsing behavior alone is nothing new, but I was surprised by how widespread this library turned out to be.

Methodology


I started with the similarweb Chrome Extension, which is where I first came across the upalytics library. By reading the code I noticed it was tracking browsing habits and reporting them in real time. Next I started looking for similarities between this extension and the 7,000 most popular ones offered in the Chrome Web store.

Step one was an educated grep - looking for the "upalytics" string, which led to the first hits. What these libraries had in common is the string "SIMPLE_LB_URL" when accessing the backend API. Searching for that led to more results, as not all libraries contain the "upalytics" string.
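The search itself is easy to reproduce. Below is a minimal sketch, assuming the extensions have been unpacked into one directory each under a common root. That layout and the function name `find_marked_extensions` are made up for illustration; only the two marker strings come from the investigation.

```python
import os
from typing import Dict, Iterable, List

# The two strings that identified the library in the extensions I looked at.
MARKERS = ("upalytics", "SIMPLE_LB_URL")


def find_marked_extensions(root: str,
                           markers: Iterable[str] = MARKERS) -> Dict[str, List[str]]:
    """Map each extension directory under root to the markers found in its .js files."""
    markers = tuple(markers)
    hits: Dict[str, List[str]] = {}
    for ext_dir in sorted(os.listdir(root)):
        ext_path = os.path.join(root, ext_dir)
        if not os.path.isdir(ext_path):
            continue
        found = set()
        for dirpath, _dirs, files in os.walk(ext_path):
            for name in files:
                if not name.endswith(".js"):
                    continue
                path = os.path.join(dirpath, name)
                # Minified bundles may contain odd bytes, so ignore decode errors.
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read()
                found.update(m for m in markers if m in text)
        if found:
            hits[ext_dir] = sorted(found)
    return hits
```

A hit only says the string is present. Whether the extension actually sends data still has to be confirmed by running it.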

To evaluate these extensions I wanted to know:

  • Does installing the extension exfiltrate data?
  • Does tracking happen out of the box, or does the user have to opt-in?
  • Is this mentioned in the terms of service?
  • If not, is there at least a link in the terms of service that explains what is happening?

I changed the endpoint address in each extension to point towards my server and evaluated each extension.

Results


I found 42 extensions which used the library totaling 8M installs. Note: "Facebook Video Downloader" (1,000 installs) required updating of the manifest to install.

Containing the code alone does not imply an extension exfiltrates data. But manual testing confirmed that every single one was tracking browsing behavior. With every requested site, the extensions send another POST request in the background to announce the action. What is particularly problematic is that some of these extensions pretend to be security relevant, including phishing protection or content filters.

Out of these 42 extensions, 23 did not mention data collection in their terms, and 12 of those 23 also have no URL where this would be explained. One URL that is used across 12 extensions to explain the privacy ramifications is http://addons-privacy.com. The only extension offering opt-in to tracking is "SpeakIt!". They had an issue opened here where someone pointed this out as spyware before introduction of the opt-in step.

All data is compiled into a spreadsheet, available here.

Noteworthy examples


Do it - a Shia LaBeouf motivator: In exchange for browsing history users can get motivated by Shia. The extension offers a button that will make him pop up and shout a motivational quote. 200 thousand users considered this a good deal, who am I to judge? :-)

Video AdBlock for Chrome - this extension is advertised as "ADWARE FREE We are not injecting any third-party ads!". Technically this might be correct. Are spyware and adware the same?

Taking a peek


To see what is transmitted I modified the phishing extension (and all others) to post data to my local server instead of theirs. This was fairly simple - I set up a Python Flask application that accepts POST requests to /related and GET requests to /settings. The POST data is base64 encoded - twice. Why twice? I don't know. Below is the data the server side sees while the client is browsing. Line breaks inserted to help readability.

# We go to bing, after previously visiting asdf.com:

s=714&md=21&pid=gvOq01lLa3ZBt6z&sess=475474837468937000&q=http://www.bing.com/
&prev=http://asdf.com/&link=0&sub=chrome&hreferer=&tmv=3015


# We send a query "this is a test":

s=714&md=21&pid=gvOq01lLa3ZBt6z&sess=475474837468937000&q=http://www.bing.com/search?
q=this+is+a+test&go=Submit&qs=n&form=QBLH&pq=this+is+a+test&sc=8-14&sp=-1&sk=&
cvid=456B43655F44452BB33CC9AE204294B3&prev=http://www.bing.com/&link=1&
sub=chrome&hreferer=http://www.bing.com/&tmv=3015


# We click a link on the bing results:

s=714&md=21&pid=gvOq01lLa3ZBt6z&sess=475474837468937000&q=https://en.wikipedia.org/wiki/This_Is_Not_a_Test!&
prev=http://www.bing.com/search?q=this+is+a+test&go=Submit&qs=n&form=QBLH&pq=this+is+a+test&sc=8-14
&sp=-1&sk=&cvid=456B43655F44452BB33CC9AE204294B3&link=1&sub=chrome&hreferer=http://www.bing.com/search?q=this+is+a+test
&go=Submit&qs=n&form=QBLH&pq=this+is+a+test&sc=8-14&sp=-1&sk=&cvid=456B43655F44452BB33CC9AE204294B3&tmv=3015

What data will be transmitted?

  • Every visited website
  • Search queries (Google, Bing, etc.)
  • Websites visited on internal networks

As far as I can tell this will not be transmitted:

  • POST data (e.g.: passwords, usually)
  • Keypresses
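For reference, undoing the double encoding on the receiving side takes only a few lines. This is a minimal sketch using the standard library instead of the Flask application, and it assumes the request body is just the doubly base64-encoded string (the exact framing of the POST body is not shown above). The function name `decode_payload` is made up.

```python
import base64
from urllib.parse import parse_qs


def decode_payload(body: bytes) -> dict:
    """Undo two layers of base64 and split the result into its fields."""
    inner = base64.b64decode(base64.b64decode(body))
    # keep_blank_values keeps empty fields such as "hreferer=".
    fields = parse_qs(inner.decode("utf-8"), keep_blank_values=True)
    return {key: values[0] for key, values in fields.items()}


# Re-create the first captured request from above and decode it again.
sample = (
    "s=714&md=21&pid=gvOq01lLa3ZBt6z&sess=475474837468937000"
    "&q=http://www.bing.com/&prev=http://asdf.com/"
    "&link=0&sub=chrome&hreferer=&tmv=3015"
)
body = base64.b64encode(base64.b64encode(sample.encode("utf-8")))
fields = decode_payload(body)
print(fields["q"], fields["prev"])
```

Note that the visited URLs are embedded without escaping, as the later captures show. A URL that carries its own "&" parameters will therefore be split by a generic query-string parser, so this sketch only handles the simple case cleanly.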

The network view


The endpoints that receive the data use a variety of domain names with multiple IPs. These 42 extensions use nine distinct domains: eight use the same subdomain pattern (lb.domain.com), and one is a subdomain of upalytics.com. I suspect this is an attempt to avoid the impression that all data flows to one company. The domain names include ones that are supposed to look benign: connectupdate.com, secureweb24.net, searchelper.com. The other domains involved are: crdui.com, datarating.com, similarsites.com, thetrafficstat.net, webovernet.com.

All these domains are registered with domainsbyproxy, a service used to obscure the ownership of domain names. This includes upalytics.com itself, which is used in one of the extensions (SpeakIt!). Also, the robots.txt file used in all cases is the same.

What's more interesting: All these IPs belong to the same hoster, XLHost.com. Eight out of nine of these hosts have all addresses in a /18 network, half of the IPs of the upalytics.com endpoint are in another xlhost network. For browsing convenience (or your firewall?) the list of IPs is available here. All IPs in use are unique, however, this involves consecutive IP addresses and other neighborhood relationships.

To examine this closer I compared the distance of IP addresses used by these extensions for tracking. For example, the IPs "1.1.1.1" and "1.1.1.3" have a distance of 2. In the graph below, the nodes are the nine domain names in use and the edges are labeled "amount x distance": the edge between "similarsites.com" and "thetrafficstat.net" reads "6x2", which means that the domains share 6 IP addresses with a distance of 2. By taking into account distances of up to four, we can link together all hostnames used in all 42 extensions. Before the graph, this is the relationship between lb.crdui.com and lb.datarating.com:

IP distance

Combining all hosts into one graph, we get this:

connection graph

What does this imply? Whether this is one large data kraken or pure coincidence, I will leave for the reader to decide.
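The distance measure is simple enough to sketch. The code below is an illustration, not the script used for the graphs. It assumes the "amount" in an edge label counts pairs of addresses at that distance, which is one reading of the description above. The addresses in the example come from a documentation range, not from the real endpoints.

```python
import ipaddress
from collections import Counter
from typing import Iterable, List


def ip_distance(a: str, b: str) -> int:
    """Absolute difference between two IPv4 addresses taken as integers."""
    return abs(int(ipaddress.IPv4Address(a)) - int(ipaddress.IPv4Address(b)))


def edge_labels(ips_a: Iterable[str], ips_b: Iterable[str],
                max_distance: int = 4) -> List[str]:
    """Build 'amount x distance' labels for address pairs that are close together."""
    ips_b = list(ips_b)
    counts: Counter = Counter()
    for a in ips_a:
        for b in ips_b:
            distance = ip_distance(a, b)
            if 0 < distance <= max_distance:
                counts[distance] += 1
    return [f"{amount}x{distance}" for distance, amount in sorted(counts.items())]


print(ip_distance("1.1.1.1", "1.1.1.3"))
print(edge_labels(["192.0.2.10", "192.0.2.20"],
                  ["192.0.2.11", "192.0.2.12", "192.0.2.22"]))
```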

Is this malware, an unwanted feature, or totally OK?


Some of these extensions have terms that mention privacy, here is an example:

We consider that the global measuring and ranking of the Internet in the current market is somewhat underdeveloped and obscure. For this reason, we have undertaken a large global project which bring a powerful improvement in the public’s perception of internet trends and expand the overall comprehension of the dynamics that are happening on the internet on daily basis. In order to make this goal a reality, we need anonymous data such as browsing patterns, statistics and information on how our features are being used. When installing one of our free products, you will expect to become a proud part of this project and make this change happen together with us. If you want more details on the interaction that will be going on between your browser and our servers, feel free to check out our Privacy Policy. By installing our product you adhere to the Terms and Conditions as well as Privacy Policy adhered on: http://crxmousetou.com/

Calling the data "anonymous" seems bold: an IP alone can often be used to uniquely identify users, let alone browsing history. Based on this text the majority of users might not be aware of the extent of monitoring. I was surprised myself by the boldness of the tracking. However, even if this were laid out clearly in the terms, common sense dictates that browser extensions have no business recording unrelated traffic.

That being said, this behavior could be in violation of the Extension Quality Guidelines, in particular the "single purpose" rule. Whether this is the case, I cannot judge.

Limitations


This post looks into usage of this one library in the Chrome Extensions in the Chrome Web store alone. The number of extensions I found should be considered a lower bound; there could well be more. For the extensions I examined I did not check other libraries that were loaded, nor did I check for behavior other than tracking browsing history. Upalytics also offers libraries for other platforms (Smartphones, Desktop, other browsers) - I did not take a look at these either.

Closing


This is just one library for one platform. Upalytics supports all major smartphones and browsers, but also Microsoft and Mac platforms. Also, there are more players in the game than this one.

I'm afraid to say that even if all these extensions get nuked from the store, there might be plenty of similar libraries in other extensions.

Updates


04/01/16: None of these extensions are accessible in Google Web store at this point.
03/31/16: I expanded on the explanation of the IP relationships.
10/05/17: We published a paper to detect such leaks automatically. See here for details.

Function-level JavaScript instrumentation with Closure Compiler

Overview

This post describes how to do function-level instrumentation of JavaScript programs using a Closure Compiler fork which is available here. The repository contains all code used here in the instrumentation-sample directory. Program points that can be hooked are function definition, invocation, and exit. Closure already supports instrumentation internally; this fork makes it more useful. Since Closure is already a popular part of JS build chains, it was an attractive target to add this feature to. I used this code as part of a project for JS hardening (ZigZag).

Update: The code has been merged into the official Closure repository.

“Hello World”

How to use the instrumentation feature:

$ java -jar compiler.jar --js file.js --instrumentation_template template.txt --formatting pretty_print

instrumentation_template FILE is the new option. The specified file contains the code that will be added to the JS file.

Code specified as `"init"` will be prepended to the program; this is where function definitions for the report call/exit/defined functions go. The other three types, report_call, report_exit and report_defined, specify the functions that should be invoked for those actions. These functions are where one wants to fill in the blanks with one's own code to see what's happening in a program. Here is a bare-bones instrumentation template:

init: "function instr_call(fun_id){}"
init: "function instr_exit(fun_id, return_value){return return_value;}"
init: "function instr_defined(fun_id){}"
report_call: "instr_call"
report_exit: "instr_exit"
report_defined: "instr_defined"

`fun_id` is a unique identifier of functions within the program. The report_exit function will be used in a return statement. It is important to keep in mind that the user-specified function has to return its `return_value` argument, otherwise the instrumented program will not work as intended. When compiling a program that consists of one function:

function a(e) {
   return e+1;
}

The output is:

function instr_call(b) {
}
function instr_exit(b, c) {
  return c;
}
function instr_defined(b) {
}
instr_defined(0);
function a(b) {
    instr_call(0);
    return instr_exit(0, b + 1);
}
;

Accessing arguments

A more interesting example is logging the arguments used in a function call. For that, the arguments variable can be used. Since this variable is not defined in the script otherwise, it needs to be defined as an extern. The externs file contains only one line: "arguments". To instrument the program, the command line has to be extended by: `--externs externs.txt --jscomp_off=externsValidation`

The updated code for the `"init"` section of the instrumentation template:

function instr_call(fun_id){
    for ( var i = 0; i < arguments.callee.caller.arguments.length; i++) {
        console.log('Argument ' + i + ': ' +
            arguments.callee.caller.arguments[i]
        );
    }
}
function instr_exit(fun_id, return_value){return return_value;}
function instr_defined(fun_id){}

Closing

This post explains how to use a modified version of Closure Compiler to instrument programs via templates. I found it a pity Closure doesn't allow for easier instrumentation out of the box, and hope this code can be useful to others working with JavaScript.

Boston Key Party 2015 - Kendall challenge (Superfish)

Overview


In this post I will provide some background information on the Kendall challenge of the Boston Key Party CTF. The focus is rather on how the challenge was designed than how to solve it. I'm sure others will cover that perspective in writeups.

This CTF is mostly run by BUILDS, but also includes some challenges from others, including Northeastern SecLab. The game board was organized by MBTA stations as a Google Maps overlay, courtesy of Jeff Crowell.

bkp_challenges

The challenge categories were organized by train lines. The blue line was crypto, orange was reversing, red line was pwning. Everything else ended up on the green line.

For the Kendall challenge (pwning, 300 pts) we wanted to combine multiple tasks that require different skills into a single more complicated challenge. We also wanted to create something around the recent Lenovo / Superfish news stories. However, creating a challenge titled "Super*" or "*fish" would have given away too much. We had to be more sneaky about this, while also avoiding giving away too little and leaving players to guess what to do.

We ended up with a combination of a remotely exploitable router that leads on to man-in-the-middling an SSL connection made by a client that has the Superfish certificate installed. Players were provided with the IP/port of the pwnable router and the binary that was running there.

A breakdown of the steps necessary to finish:

  • pwn the binary
    • Bypass authentication
    • Overwrite DNS entries with DNS controlled by team
    • Trigger DHCP renew
  • Intercept Browsing
    • Set up DNS server that responds with team's IP
    • Listen to the requests and make them succeed
    • Interpret the HTTP request
    • Set up SSL interception with Superfish CA

Part 1: The Router


The router software was remotely accessible. When connecting, users were greeted by the following screen:

#####################################################
# DHCP Management Console                           #
# Auditing Interface                                #
#####################################################
 h  show this help
 a  authenticate
 c  config menu
 d  dhcp lease menu
 e  exit
[m]#

The user can operate as guest, but anything important requires administrator privileges. Read: there is an easy buffer overflow in the "filter" menu option, which allows overwriting the admin flag. We included log files which hinted at the DHCP setting being important (it reads a static file). Players had to bypass authentication and then change the DNS to point to one of their machines. Next, they had to trigger "renew leases". What happens in the background: the program calls another program in the same directory, which pushes the DNS setting via sockets to the component that drives the browser. This process directly kicks off an artificial browser that issues two web requests. We separated the accounts of the binary and the browser to make finding shortcuts to the flag harder.

Note: much of the work with the router binary was done by Georg Merzdovnik.

Part 2: The Browser


We simulated a user browsing websites: first they visit an HTTP site, later they log into their bank account, where some sensitive information is revealed (the flag). Should any step in this process fail, the browser aborts operation. The "browser" was a Python script using urllib2. Parts that were important to get right were the DNS queries and certificate validation. The DNS lookups had to be performed only through the server the teams provided by owning the router. The SSL request verifies against the Superfish certificate only. By default urllib2 will not check the authenticity of certificates.
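To illustrate what "verifies against the Superfish certificate only" means on the client side, here is a minimal sketch. It is not the challenge script: that one used urllib2 on Python 2, while this uses Python 3's `ssl` and `urllib.request`. The function names and the file name `superfish_ca.pem` are made up for illustration.

```python
import ssl
import urllib.request
from typing import Optional


def build_pinned_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a client context that trusts only the CA stored in ca_file.

    PROTOCOL_TLS_CLIENT enables certificate and hostname checks, and no
    system CA store is loaded, so nothing is trusted until ca_file is added.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        # Hypothetical path, e.g. "superfish_ca.pem".
        context.load_verify_locations(cafile=ca_file)
    return context


def fetch(url: str, context: ssl.SSLContext) -> bytes:
    """Fetch url; raises if the server certificate does not verify."""
    with urllib.request.urlopen(url, context=context, timeout=10) as response:
        return response.read()
```

With a context like this, a request to https://my.bank only succeeds if the server presents a certificate for that hostname that chains to the one trusted CA.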

Once teams pushed their IP address as DNS server, they could see two incoming DNS queries: one for yandex.ru and a second one for a made-up hostname, "my.bank".

Next, players had to reply with an IP they control and have a running web server to intercept the requests. For my local testing I used minidns, a dependency-free python script that will resolve any hostname to a single IP address.
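A resolver of this kind fits in a few lines. The following is a minimal sketch of the idea, not minidns itself. It assumes well-formed queries with a single question and answers every one of them with the same A record. The names `build_response` and `serve` are made up.

```python
import socket
import struct


def build_response(query: bytes, ip: str) -> bytes:
    """Answer a single-question DNS query with one A record pointing at ip."""
    # Walk the QNAME labels to find where the question section ends.
    offset = 12  # the DNS header has a fixed size of 12 bytes
    while query[offset] != 0:
        offset += query[offset] + 1
    question = query[12:offset + 5]  # zero byte + QTYPE (2) + QCLASS (2)

    header = struct.pack(
        "!2sHHHHH",
        query[:2],   # echo the transaction ID
        0x8180,      # response, recursion desired and available, no error
        1, 1, 0, 0,  # one question, one answer, no authority/additional
    )
    answer = (
        b"\xc0\x0c"                          # pointer to the name at offset 12
        + struct.pack("!HHIH", 1, 1, 60, 4)  # type A, class IN, TTL 60 s, 4 bytes
        + socket.inet_aton(ip)
    )
    return header + question + answer


def serve(ip: str, bind: str = "0.0.0.0", port: int = 53) -> None:
    """Reply to every UDP query on bind:port with the same address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind, port))
    while True:
        query, client = sock.recvfrom(512)
        sock.sendto(build_response(query, ip), client)
```

Binding to port 53 usually requires elevated privileges. A malformed query will raise an exception in this sketch, which is acceptable for local testing but not for anything exposed.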

One thing I dislike while solving challenges is pointless guessing. So, before making an HTTPS request we issued an HTTP request to give a hint about what to do with the SSL connection. We added a new header, namely "X-Manufacturer" with the value "Lenovo". This is a completely made-up header which was supposed to be a hint towards Superfish without being blatantly obvious.

The second request was pointed at "https://my.bank". Teams had to make the browser establish a legitimate SSL connection, and we would then issue a request to: "https://my.bank/login/username={0}".format(self.FLAG)

Although we had no specific format for keys, we decided to prefix the key with "FLG-" to make it obvious once players got that far.

To get this right, teams could either run a web server with the Superfish private key, or MITM and point the request somewhere else.
A writeup using sslsplit for the latter option is available on Rob Graham's blog.

Closing


The source code of the challenges will be released as a tarball at some point in the near future; follow @BKPCTF (or me) for updates. I hope the challenge was fun and am looking forward to hearing in writeups how teams did it.

Content Security Policy - Trends and Challenges

In December 2012, I was curious who was using Content Security Policy, and how they were using it.

Content Security Policy (CSP) can help websites to get rid of most forms of content injection attacks. As it is standardized and supported by major browsers, we expected websites to implement it. To us, the benefits of CSP seemed obvious.

However, as we started to look into the CSP headers of websites, we noticed that few of them actually used it. To get a better overview we started crawling the Alexa Top 1M every week. What started out as a just-for-fun project escalated to a data collection of over 100GB of HTTP header information alone. We are publishing the results in our (Michael Weissbacher, Tobias Lauinger, William Robertson) paper Why is CSP Failing? Trends and Challenges in CSP Adoption at RAID 2014.

We investigated three aspects of CSP and its adoption. First, we looked into who is using CSP and how it is deployed. Second, we used report-only mode to devise rules for some of our websites. We tried to verify whether this is a viable approach. And third, we looked into generating content security policies for third-party websites through crawling, to find obstacles that prevent wider deployment.

CSP headers in comparison to other security relevant headers

We have found that CSP adoption significantly lags behind other web security mechanisms and that, even when it has been adopted by a site, it is often deployed in a way that negates its theoretical benefits for preventing content injection and data exfiltration attacks. While more popular websites are more likely to use it, only 1% of the 100 most popular websites use it on their front page.

To summarize our findings:

  • Out of the few websites using CSP, the policies in use did not add much protection, marginalizing the possible benefits.
  • The structure of sites, and in particular integration of ad networks, can make deployment of CSP harder.
  • CSP can cause unexpected behavior with Chrome extensions, as we detail below.
  • The project resulted in fixes to phpMyAdmin, Facebook and the GitHub blog.

In our paper, we suggest several avenues for enhancing CSP to ease its adoption. We also release an open source CSP parsing and manipulation library.

Below, we detail some topics that did not fit into the paper, including bugs that we reported to impacted vendors.

Chrome Extensions

Chrome enforces CSP sent by websites on its extensions. This seems well known, but comes with a couple of side effects. The implications are that CSP from websites can break functionality of extensions, intentionally or unintentionally. Other than that, it makes devising CSP rules based on report-only mode very cumbersome, due to lots of bogus reports. Enforcing rules on extensions seems surprising, especially since they can request permission to modify HTTP headers and whitelist themselves. In fact, we found one such extension that modifies CSP headers in flight.

Recovering granularity of CSP reports

While older versions of Firefox will report specifically whether an eval or inline violation occurred, newer versions of any browser won’t. In the paper we provide a work-around to detect such errors, which involves sending multiple headers and post-processing the reports.

Facebook

With Facebook, we noticed that headers (including CSP) were generated based on the user agent. This has some advantages, e.g., sending less header data to browsers that don’t support certain features. Also, Firefox and Chrome were sent different CSP headers (the difference being that the Skype extension was whitelisted for Chrome). We also noticed that for some browser versions, no CSP rules were sent out. The likely reason is that CSP handling in some browser versions is buggy. For example, some versions of Chrome enforce the eval() restriction even in report-only mode. However, due to misconfiguration, CSP was only served to browser versions before the bugs were introduced, and none after. As a result, CSP was only in use for a fraction of browsers that in fact support it. After we informed them of this, Facebook quickly fixed the issue. Now, CSP is being served to a wider audience of browsers than before. Also, we were added to the Whitehat “Thanks” list.

phpMyAdmin

We found that phpMyAdmin, which serves CSP rules by default, had a broken configuration on its demo page. The setup prevented loading of Google Analytics code. This turned out to be interesting, as the script was whitelisted in the default-src directive, but script-src was also specified and less permissive. Those two are not considered together, and the more specific script-src directive overrode default-src. Hence, including the Google Analytics code was not allowed and resulted in an error. We pointed out the issue and it resulted in a small commit.
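The fallback behavior behind this bug can be sketched in a few lines. This is an illustration and not the parsing library released with the paper. The policy string is a made-up example in the spirit of the phpMyAdmin setup, and the sketch only covers directives that fall back to default-src.

```python
from typing import Dict, List, Optional


def parse_policy(header: str) -> Dict[str, List[str]]:
    """Split a CSP header value into a mapping of directive -> source list."""
    policy: Dict[str, List[str]] = {}
    for part in header.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        # If a directive is repeated, only its first occurrence counts.
        policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def effective_sources(policy: Dict[str, List[str]],
                      directive: str) -> Optional[List[str]]:
    """Return the source list that applies to directive.

    A specific directive replaces default-src entirely; the two lists are
    never merged. Returns None if neither is present (no restriction).
    """
    if directive in policy:
        return policy[directive]
    return policy.get("default-src")


# Hypothetical policy: analytics host allowed in default-src only.
policy = parse_policy(
    "default-src 'self' https://www.google-analytics.com; script-src 'self'"
)
print(effective_sources(policy, "script-src"))
print(effective_sources(policy, "img-src"))
```

In this example the analytics host is allowed for images, which fall back to default-src, but not for scripts, because script-src is present and does not list it.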

GitHub

We used several sites that deploy CSP as a benchmark to test our tool for devising CSP rules. With GitHub, we noticed that our tool came up with more directives than the site itself. After investigating, we found that one of the posts on their blog was causing violations with the original rules, as it tried to include third party images. This was interesting, as any site which specifies a report-uri would have caught this, but GitHub doesn’t use the feature. While this caused no security issue, it stopped the blog post from working as intended. With report-uri enabled that mistake would have popped up in the logs and could have been fixed instantly. We think this highlights how important usage of the report-uri is. In summary, this was more of an interesting observation about report-uri to us than a problem on their side.