Authors
Muhammad Talha Paracha, Balakrishnan Chandrasekaran, David Choffnes, Dave Levin
Publication date
2020/1
Journal
2020 Network Traffic Measurement and Analysis Conference (TMA'20)
Description
Recent studies of HTTPS adoption have found rapid progress towards an HTTPS-by-default web. However, these studies make two (sometimes tacit) assumptions: (1) that server-side HTTPS support can be inferred from the landing page (“/”) alone, and (2) that a resource hosted over HTTP and HTTPS has the same content over both. In this paper, we empirically show that neither of these assumptions holds universally. We crawl beyond the landing page to better understand HTTPS content unavailability and inconsistency issues that remain in today’s popular HTTPS-supporting websites. Our analysis shows that 1.5% of the HTTPS-supporting websites from the Alexa top 110k have at least one page available via HTTP but not HTTPS. Surprisingly, we also find 3.7% of websites with at least one URL where a server returns substantially different content over HTTP compared to HTTPS. We propose new heuristics for finding these unavailability and inconsistency issues, explore several root causes, and identify mitigation strategies. Taken together, our findings highlight that a low but significant fraction of HTTPS-supporting websites would not function properly if browsers used HTTPS by default, and motivate the need for more work on automating and auditing the process of migrating to HTTPS.
Total citations
[Citations-per-year chart covering 2020, 2021, 2022, 2023, 2024; per-year counts not recoverable]
Scholar articles
MT Paracha, B Chandrasekaran, D Choffnes, D Levin - 2020 Network Traffic Measurement and Analysis …, 2020