How secure is the internet from attacks?

When we use the internet, most of us don’t think twice about entering our credit card numbers and we don’t tend to worry that someone might be looking over our shoulder. Our guest on today’s episode, Jennifer Rexford, knows better than most how the internet works and what kind of vulnerabilities exist that allow hackers to exploit its weaknesses.

Rexford is the Gordon Y.S. Wu professor in engineering, a professor of computer science and the chair of that department. She’s won several awards for her research into the way internet traffic is routed. Jen is a 1991 graduate of Princeton with a degree in electrical engineering. She received her Ph.D. from the University of Michigan. She worked at AT&T Labs before joining the Princeton faculty in 2005. In this episode, she discusses how consumers struggle to keep up with managing the tools that are supposed to make their lives better. Every few weeks, there’s something new consumers should be doing to enhance their privacy, but that requires a lot of technical savvy, and it’s a moving target that requires a lot of work, she says. “I think for consumers, the technology and the misuses of it evolve much more quickly than a reasonable person should be expected to try to follow.”

Links:

Jennifer Rexford interview with “She Roars” podcast, May 3, 2019.

Internet security upgrade borne out of collaboration between Princeton and Let’s Encrypt, February 19, 2020

Princeton Engineering professors elected to National Academy of Sciences, April 30, 2020

Transcript:

Aaron Nathans:

From the Princeton University School of Engineering and Applied Science, this is Cookies, a podcast about technology security and privacy. On this podcast we’ll discuss how technology has transformed our lives from the way we connect with each other, to the way we shop, work and consume entertainment. And we’ll discuss some of the hidden trade-offs we make as we take advantage of these new tools. Cookies, as you know, can be a tasty snack but they can also be something that takes your data. When we use the internet, most of us don’t think twice about entering our credit card numbers and we don’t tend to worry that someone might be looking over our shoulder.

Our guest today, Jennifer Rexford, knows better than most how the internet works and what kind of vulnerabilities exist that allow hackers to exploit its weaknesses. She’s the Gordon Y.S. Wu professor in engineering, a professor of computer science and the chair of that department. She’s won several awards for her research into the way internet traffic is routed. Jen is a 1991 graduate of Princeton with a degree in electrical engineering. She received her Ph.D. from the University of Michigan. She worked at AT&T Labs before joining the Princeton faculty in 2005.

Let’s get started. Jen, welcome to the podcast.

Jennifer Rexford:

Thanks. It’s a pleasure to be here.

Aaron Nathans:

So we tend to trust the internet with our credit card and personal information and we tend to surf freely without fear of having our habits used against us. Should we trust the internet?

Jennifer Rexford:

So the internet’s been this remarkable experiment that escaped from the lab to become critical infrastructure for the world, and really just in all of our lifetimes. And while it’s amazing, it definitely has a lot of vulnerabilities and there are really kind of two different ones. One is people trying to directly trick you into doing something you shouldn’t, like giving your login ID and password to the wrong party. There are things like phishing attacks that take place via email or on the web to get you to click on a link. You think you’re going one place but you’re actually going another. And you enter in information that only your bank should know. And that gives someone the ability to log in, let’s say, to your bank account on your behalf.

And the second is just a routine data collection that a lot of companies do for commercial reasons. So you’re browsing the web, you buy something for Christmas and all of a sudden it starts to show up in your advertising feed on Facebook. It happened to me last year at Christmas. Under the Christmas tree, my partner, my mom and I, there were three boxes, one for each of us, we’d all bought the exact same thing. Whichever one of us bought it first, the other two started seeing it in their advertising feed and bought it for the others. So that’s a benign example, but certainly your personal data is being harvested by companies for commercial gain, not necessarily adversarial reasons, but still (a) privacy risk of sorts too.

Aaron Nathans:

To a consumer, the internet seems like a seamless communication method, but if we look under the internet’s hood, what would we see? What takes place when we turn on our computer and go onto a website?

Jennifer Rexford:

So, the internet is really, as the name might suggest, an inter-network or a network of networks. So, it’s about 60,000 or so separately administered networks that make up the internet. AT&T, Princeton, China Telecom, Telecom Italia and so on. And through a competitive cooperation, these different networks stitch together to let you communicate from a computer in one of these networks to a computer several networks away. And then one level up, if you, let’s say, click on a link in a webpage, there’s a bunch of things that go on to get you from one end of the internet to the other. The first is sort of looking up the name, cnn.com, cs.princeton.edu, in a sort of Yellow Pages for the internet, the domain name system, and that’ll return a computer-readable numerical address, or internet protocol address.

And your computer will use that to reach out to the computer with that address and start communicating and those different networks underneath will make sure that the data you send, chopped up into little packets, will get from your computer to the computer on the other end of the internet. And maybe exchange some information back and forth, including enabling encrypted, secure communication between the two endpoints, which could help a lot with some of the security issues you mentioned, but still doesn’t replace the fact that people might go to the wrong website or that that cryptography might be broken. Or that someone just watching the communication learns a lot by the very fact you’re communicating at all. Even if they don’t know what you’re saying.
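The two steps described above, looking up a name and then sending data chopped into packets, can be sketched in a few lines of Python. This is only a toy illustration: the lookup table `TOY_DNS`, the addresses (taken from ranges reserved for documentation) and the helper names `resolve`, `packetize` and `reassemble` are all assumptions made for this sketch, not how real resolvers or network stacks are written.

```python
# Toy stand-in for the domain name system: a hard-coded lookup table.
# The addresses come from ranges reserved for documentation, not real hosts.
TOY_DNS = {
    "cs.princeton.edu": "192.0.2.10",
    "cnn.com": "198.51.100.7",
}

def resolve(name):
    """Return the numerical address registered for a name."""
    return TOY_DNS[name]

def packetize(data, dst, size=8):
    """Chop data into numbered packets of at most `size` bytes."""
    return [
        {"dst": dst, "seq": seq, "payload": data[start:start + size]}
        for seq, start in enumerate(range(0, len(data), size))
    ]

def reassemble(packets):
    """Put packets back in order and join their payloads."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)

message = b"GET /index.html from cs.princeton.edu"
packets = packetize(message, resolve("cs.princeton.edu"))
packets.reverse()  # packets may arrive out of order
print(len(packets), "packets,", "intact:", reassemble(packets) == message)
```

The sequence numbers are what let the receiver rebuild the message even when the packets arrive in a different order than they were sent.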

Aaron Nathans:

I mean, is anyone in charge?

Jennifer Rexford:

Not really. I mean, there’s some bodies like ICANN, The Internet Corporation for Assigned Names and Numbers, that help in assigning these identifiers that are used to identify networks and computers and so on. So there are authorities that assign names and numbers to computers and to entire networks, but there’s a very decentralized delegation. A company like AT&T may get a bunch of addresses but then give addresses, in turn, to their customers and so on. And so in the end, we don’t even know the exact shape of the internet because any one network, like Princeton, makes its own personal connections to other networks without having to tell some central authority about exactly who’s connecting to whom. And so in the end, it really is a very decentralized structure with some loose oversight but no one party truly in charge.
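The decentralized delegation of addresses described here can be illustrated with Python's standard `ipaddress` module. The block `10.0.0.0/8` (a private-use range) and the customer names are hypothetical stand-ins chosen for this sketch; real allocations are different and far less tidy.

```python
import ipaddress

# Hypothetical allocation: a private-use block stands in for a real assignment.
provider_block = ipaddress.ip_network("10.0.0.0/8")

# The provider carves its block into smaller pieces for its customers.
customer_blocks = list(provider_block.subnets(new_prefix=16))[:3]
customers = dict(zip(["CustomerA", "CustomerB", "CustomerC"], customer_blocks))

# A customer can delegate again, without telling any central authority.
campus_block = customers["CustomerB"]
departments = list(campus_block.subnets(new_prefix=24))[:2]

for name, block in customers.items():
    # Every delegated block still sits inside the provider's original block.
    print(name, block, block.subnet_of(provider_block))
```

Each level only needs to know what it handed out, which is one reason no single party has a complete picture of the whole.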

Aaron Nathans:

If there’s no one in charge, why does it work?

Jennifer Rexford:

So, it works in part because there’s an economic incentive for it to work. So, when Princeton connects to, let’s say one of its providers, it’s paying that company money to connect Princeton to the internet. And that company has an incentive now to make sure that data meant to be delivered to Princeton is, and data that Princeton wants to send elsewhere gets delivered as well. And so, even though the entire path from, let’s say, Princeton to a user in Europe, isn’t owned by a single company, each pair of institutions that are involved in delivering that data have a relationship with the one before and the one after it. And an incentive to make those work because it’s part of the contract that they’ve made with their neighbors to exchange data.

Aaron Nathans:

Roughly how many handoffs are there when there’s communication over the internet?

Jennifer Rexford:

Yeah, that’s a great question. We used to talk about sort of six degrees of separation where you might go through as many as a half a dozen networks, from a little network connecting to a regional provider, to a national provider, to another national provider and back down. But increasingly it’s a lot flatter than that. It’s often just two or three networks your traffic goes through. And that’s because of large companies like Google and Facebook and Amazon that have computers all over the world and have them very close to many of the users that are accessing their services. So as time goes on, as I said, the internet’s decentralized, but increasingly a small number of companies play a really huge role in the delivery of that data. And so it’s not uncommon that just two or three different networks are traversed between you and the services you’re using.

Aaron Nathans:

Is that a positive development?

Jennifer Rexford:

It’s a double-edged sword, I would say. On the positive side, it means the network has better performance because you’re often getting your data from someplace very close by. It can improve security because the number of parties that have to cooperate to deliver data securely is less. The ability of an adversary to get himself in the middle is a lot less. So from a security point of view, it’s a good thing. From a privacy point of view, less clear because those networks now have a tremendous amount of information about the users on the internet. And so from a privacy perspective, one might argue that this industry consolidation is somewhat of a negative.

Aaron Nathans:

Can you speak about the Border Gateway Protocol and your own research in this area? What is the Border Gateway Protocol and why is it important to understand?

Jennifer Rexford:

Yeah, so it has this seemingly innocuous name, but it’s essentially the glue that holds the disparate parts of the internet together. There are the 60,000 separately administered networks that make up the internet, and at the places they’re stitched together, they speak the Border Gateway Protocol. And so what will happen there is a company like AT&T will connect to, let’s say, Sprint, and it’ll say, “Hey, here are the parts of the internet I can reach, feel free to come through me to get there.” And Sprint will do the same in turn. And each of them will take that information and tell their neighbors, “Hey, you can go through me to go through AT&T to get to this destination.” And so the Border Gateway Protocol diffuses information throughout the internet about the best way to go to reach a particular destination on the internet. So it’s what allows a network six hops away to be able to communicate, let’s say with Princeton, even though the network that’s doing the talking to Princeton isn’t part of Princeton or isn’t even part of Princeton’s own provider.
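The neighbor-to-neighbor diffusion described above can be sketched as a toy path-vector simulation. This is a simplified model under stated assumptions: every network simply prefers the shortest path it hears, whereas real Border Gateway Protocol routers also apply business policies. The network named `Provider` and `FarNetwork`, and the helpers `make_links` and `propagate`, are hypothetical names introduced for illustration.

```python
from collections import deque

def make_links(pairs):
    """Build a two-way neighbor map from a list of connected pairs."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links

def propagate(links, origin):
    """Spread 'you can reach origin through me' from neighbor to neighbor."""
    best = {origin: [origin]}
    queue = deque([origin])
    while queue:
        speaker = queue.popleft()
        for neighbor in sorted(links[speaker]):
            if neighbor in best[speaker]:
                continue  # never accept a path that already contains you
            candidate = [neighbor] + best[speaker]
            if neighbor not in best or len(candidate) < len(best[neighbor]):
                best[neighbor] = candidate
                queue.append(neighbor)
    return best

links = make_links([
    ("Princeton", "Provider"),
    ("Provider", "AT&T"),
    ("AT&T", "Sprint"),
    ("Sprint", "FarNetwork"),
])
routes = propagate(links, "Princeton")
print(" -> ".join(routes["FarNetwork"]))
# prints: FarNetwork -> Sprint -> AT&T -> Provider -> Princeton
```

No network in this sketch sees the whole map. Each one only hears what its direct neighbors tell it, which is also why a false announcement can travel so far.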

Aaron Nathans:

How easy is it for a hacker to attack that method and to have users have their personal information stolen?

Jennifer Rexford:

Yeah. So, it’s unfortunately really easy to attack. And the problem is a lot of the early design of the internet assumed that adversaries were on the outside of the internet, that we were worried about robustness to things like physical attack, like a nuclear bomb, for example. And so, the internet’s remarkably robust to parts of it being physically destroyed, but the information that’s propagated around to make it work often isn’t encrypted or even signed. And so it can be relatively easy for someone to say, “Hey, I own Princeton’s address block. I’m over here. I’m the best way to get there.” And the neighbors of that lying part of the internet will often believe the lie. And so, traffic meant for Princeton could, say, get misrouted to this other part of the internet. And it can be relatively difficult to detect when that’s happened, let alone prevent it from happening.
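A rough way to see why the lie works: if announcements are not signed, a network has no way to tell a true claim from a false one and just picks whichever looks closer. The sketch below assumes a purely shortest-path preference, which is a simplification of real routing policy, and uses the hypothetical networks `Provider`, `NetX` and `Attacker` and the hypothetical helpers `hop_counts` and `chosen_origin`.

```python
from collections import deque

def make_links(pairs):
    """Build a two-way neighbor map from a list of connected pairs."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links

def hop_counts(links, origin):
    """Number of network hops from every network to one announcer."""
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def chosen_origin(links, announcers):
    """Each network believes whichever announcer is fewest hops away.
    Ties (none occur in this example) go to the announcer listed first."""
    dists = {a: hop_counts(links, a) for a in announcers}
    return {
        node: min(announcers, key=lambda a: dists[a][node])
        for node in links if node not in announcers
    }

links = make_links([
    ("Princeton", "Provider"), ("Provider", "AT&T"), ("AT&T", "Sprint"),
    ("Sprint", "NetX"), ("NetX", "Attacker"),
])
# Both the real owner and the attacker claim the same address block.
choice = chosen_origin(links, ["Princeton", "Attacker"])
for network in sorted(choice):
    print(network, "sends Princeton's traffic toward", choice[network])
```

In this toy topology the networks nearer the attacker are misled while the ones nearer Princeton are not, which matches the point that a hijack can affect just one part of the internet.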

Aaron Nathans:

Have you seen examples where this has happened?

Jennifer Rexford:

Yeah, there’ve been a bunch of high-profile incidents. A few years ago, there was a big incident: for 20 minutes, 10 percent of the internet was routed through China Telecom, including traffic where both endpoints were in the U.S. So someone’s communicating with someone else in the U.S., but it’s being misdirected through China and coming back. Another really high-profile incident, a number of years ago, Pakistan Telecom accidentally said they were the best way to reach YouTube, and the entire internet believed it. So for two hours nobody could watch videos of cats flushing toilets, and it was a major problem. No, but all kidding aside, there was an attempt at domestic censorship within Pakistan to block access to YouTube and accidentally that attempt to black-hole or drop all YouTube traffic leaked out to the rest of the world. And a domestic attempt at censorship inadvertently became a global censorship of YouTube for a couple hours.

And it sort of illustrates a challenge here that national borders don’t always align with the way internet traffic is routed. And so a country or a region can have a policy and it can be very difficult to contain that entirely within that country. So we think of the internet as a single entity, but given it’s made up of lots of different companies and lots of different countries with different norms and different incentives, sometimes those incentives or policies get misaligned and things like this can happen. In this case it was by accident, but when an adversary is being purposeful, you know, things can happen that are much worse.

Aaron Nathans:

Why do you think it doesn’t happen more?

Jennifer Rexford:

So I think frankly, as much as I work on this sort of plumbing of the internet, I like to say I’m an internet plumber. A lot of the attacks don’t have to be quite as sophisticated as that. I mean, those attacks are a little bit difficult to launch. You have to be yourself participating in these protocols that only people who run network infrastructure actually speak. And in practice, a lot of times you can easily send email that misleads a naive user into visiting the wrong website. And it can be quite effective at attacking individual users without getting all the way into the guts of the underlying plumbing. And frankly, a lot of adversaries want the internet to keep working because that’s exactly how they’re delivering advertisements or stealing credit card information, or launching denial of service attacks. They don’t want the infrastructure to go down. There are exceptions, obviously, people that want to disrupt and particularly in times of war as part of a terrorist attack and so on, but many adversaries are motivated by greed and they don’t want the highways to stop working, if you will.

Aaron Nathans:

You’re listening to Cookies, a podcast about technology security and privacy brought to you by the School of Engineering and Applied Science at Princeton University. We’re speaking with Jennifer Rexford, chair of the Computer Science Department here at Princeton. In our next episode, we’ll speak with Michael Swart, a recent graduate of Princeton, who studied how many YouTube product review videos are actually paid commercials.

But for now, let’s jump back into our conversation with Jennifer Rexford, in which we discuss a project that is remaking how hundreds of millions of websites are authenticated and how government surveillance might be able to evade post 9/11 privacy protections.

Aaron Nathans:

So what’s a digital certificate? And what does domain validation mean?

Jennifer Rexford:

So when you go to a website, increasingly they encrypt the data that you send to them and they send to you for your privacy. And particularly after the Snowden revelations in the U.S. a few years ago, a lot more websites are using encryption. Now there’s a bit of a bootstrapping problem. If I go to a website, I need to know what key to use to encrypt, and they need to know, when I encrypt data that I send to them, how to decrypt it. And so what they do is they go, the website goes to a certificate authority like Let’s Encrypt, which is a really popular one. And they’ll get what’s called a digital certificate. And that will allow them to say, here’s my public key. Here’s my name. I’m really me. And if somebody tries to go to my website and checks with you, please tell them I’m legit.
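As a rough sketch of what a certificate binds together, the toy below has a pretend certificate authority sign a (name, public key) pair so that anyone holding the authority's public values can check it. Everything here is an assumption for illustration: the key is built from two well-known Mersenne primes and uses textbook RSA without padding, so it is not secure, and the functions `issue_certificate` and `verify_certificate` are hypothetical, not the interface of Let’s Encrypt or any real library.

```python
import hashlib

# Toy signing key for a pretend certificate authority. Both primes are
# well-known Mersenne primes, so this is NOT secure; it only shows the shape.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

def digest(name, site_public_key):
    """Hash the name and key together and reduce the hash to fit the modulus."""
    data = f"{name}|{site_public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def issue_certificate(name, site_public_key):
    """The authority signs the binding between a name and a public key."""
    signature = pow(digest(name, site_public_key), D, N)
    return {"name": name, "public_key": site_public_key, "signature": signature}

def verify_certificate(cert):
    """Anyone can check the signature using only the public values N and E."""
    expected = digest(cert["name"], cert["public_key"])
    return pow(cert["signature"], E, N) == expected

cert = issue_certificate("cs.princeton.edu", "site-public-key-placeholder")
forged = dict(cert, public_key="attacker-public-key-placeholder")
print(verify_certificate(cert), verify_certificate(forged))
```

The signature only proves that the authority vouched for the pairing. Whether the authority checked the right party before signing is a separate question.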

Now, domain validation comes in because how does Let’s Encrypt know if I, let’s say, purport to be cs.princeton.edu, I say, “Hey, I’ve got cs.princeton.edu, I want a certificate.” Let’s Encrypt has to be able to validate that I’m really who I say I am. Otherwise, all this digital stuff is just saying something that’s false. It’s validating a false claim rather than a true one. So how do they do that? Well, they’re not going to come and take a blood sample from me, right? They’re going to do something automated so that we can really get a lot of people on, using cryptography quickly. So they’re going to say, “Hey, if you’re really who you say you are, go to the website, www.cs.princeton.edu, and put this content, some random message, at this particular URL.” And my ability to make that web page quickly with those contents on it allows Let’s Encrypt to download it and say, yeah, she must be legit because I just asked her a second ago to put this webpage up and look, she did it.

So she must really be the one that’s in charge of cs.princeton.edu. And therefore, I’m going to give her the certificate she asked for. So that’s domain validation, and it’s used not only by certificate authorities, it’s used by restaurant review sites and so on. So, to make sure that someone that’s purporting to be an institution, actually is that institution, but it’s only as perfect as that automated process is. And so, an attack, a group of us, Prateek Mittal in electrical engineering and myself with an undergrad student, Henry Birge-Lee, we looked at, well, what happens if someone manipulates internet routing intentionally to trick the certificate authority? So now, let’s suppose you’re an adversary and you want to get a certificate for my website. You go to Let’s Encrypt and you say, “Hey, I’m cs.princeton.edu” and they say, “Okay, well, you put up this webpage at Princeton, cs.princeton.edu, right this minute.”

And just at that moment, you trick the routing system with the Border Gateway Protocol, into directing traffic from the certificate authority to a computer you control, instead of my computer. And you know to do it because you’re the one asking for the certificate in a bogus way. And just for a brief moment. And just to that part of the internet, you trick them into thinking the best way to reach my website is through you. And you answer with the webpage you’ve been told to put up. So just for a few seconds of trickery, you’re able to convince the certificate authority that you’re the party that deserves that certificate, not me. And now all the communication that goes on later will mistakenly be able to go through you because you look like you’re the legitimate owner of the website and the keys.
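The challenge and the attack on it can be sketched as a small simulation. This is a toy model under stated assumptions: `ToyServer` stands in for a web server, the second argument of `domain_validate` stands in for "whichever machine the routing system currently delivers the certificate authority's request to", and the path `/challenge` is a made-up placeholder rather than the real challenge location.

```python
import secrets

class ToyServer:
    """A pretend web server that can publish and serve pages."""
    def __init__(self):
        self.pages = {}

    def publish(self, path, content):
        self.pages[path] = content

    def fetch(self, path):
        return self.pages.get(path)

def domain_validate(requester_server, server_reached_by_ca):
    """The authority hands out a random token, the requester publishes it,
    and the authority downloads it from wherever routing sends its request."""
    token = secrets.token_hex(8)
    requester_server.publish("/challenge", token)
    return server_reached_by_ca.fetch("/challenge") == token

real_site = ToyServer()      # the legitimate cs.princeton.edu
attacker = ToyServer()       # a machine the adversary controls

honest = domain_validate(real_site, real_site)     # owner asks, routing is normal
no_hijack = domain_validate(attacker, real_site)   # attacker asks, routing is normal
hijacked = domain_validate(attacker, attacker)     # attacker asks AND diverts the CA's traffic
print(honest, no_hijack, hijacked)
```

The check itself never changes between the three cases. Only the route the download takes does, which is why a few seconds of misrouting are enough.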

Aaron Nathans:

So your team worked with Let’s Encrypt which, by the way, is the world’s largest certificate authority. It serves 200 million websites. You worked on a way to prevent this sort of attack. Can you talk about that method?

Jennifer Rexford:

Yeah, and actually the idea is very simple and it’s, in some ways, not a new idea, but the basic premise is you can fool some of the people some of the time, but you can’t fool all the people all the time. And so, what Let’s Encrypt now does is, when they ask, “Hey, can you put this funny webpage up for me?” And I’m going to download it and make sure you did. They’re not going to ask just from one place on the internet, they’re going to go to lots of different geographically dispersed places in Europe and the U.S. and other places. And they’ll try to do that download from many different vantage points on the internet. And now the adversary would have to trick all of those parts of the internet into believing that they should go the wrong way to get to the content.

And it’s very difficult to fool all of those vantage points at the same time. And so they vote in some sense, and they say, gee, nine out of 10 of us saw the same thing, or all 10 of us saw the same thing. It’s probably fine. But if a bunch of those sites see, let’s say, the legitimate cs.princeton.edu and some see the fake one, they won’t believe the same things. I’d be like, okay, something fishy is going on here. I don’t know what’s going on, but I’m not going to give a certificate to that guy because I think something fishy is going on. And that’s what Let’s Encrypt has deployed. And now we’re working with them to test out their deployment and also to figure out just how many places and what locations are the most effective ones to keep an eye out for these kinds of attacks, to make sure that they don’t give certificates to the wrong parties.
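The voting idea can be written down in a few lines. This is a hedged sketch, not Let’s Encrypt's actual rule: the function name `quorum_validate`, the ten vantage points, and the threshold of nine (borrowed from the "nine out of 10" phrasing above) are assumptions made for illustration.

```python
def quorum_validate(token, observations, min_agree):
    """Accept only if enough vantage points downloaded the expected token."""
    agree = sum(1 for seen in observations.values() if seen == token)
    return agree >= min_agree

token = "random-challenge-123"                    # placeholder challenge value
vantage_points = [f"vp{i}" for i in range(10)]    # hypothetical locations

# Legitimate request: every vantage point reaches the real site and sees the token.
honest_view = {vp: token for vp in vantage_points}

# Attack: the bogus route only reaches part of the internet, so only three
# vantage points are diverted to the attacker's copy of the token. The rest
# reach the real site, where the token was never published.
fooled = {"vp0", "vp1", "vp2"}
attack_view = {vp: (token if vp in fooled else None) for vp in vantage_points}

print(quorum_validate(token, honest_view, min_agree=9))   # prints: True
print(quorum_validate(token, attack_view, min_agree=9))   # prints: False
```

How many vantage points to use and where to put them is exactly the open question mentioned above, so the numbers here should be read as placeholders.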

Aaron Nathans:

Now you spoke, at the top of our conversation, about some of the ordinary, non-illegal vulnerabilities that consumers face on the internet. You talked about the gift. How can people take steps to protect themselves against those sorts of personal vulnerabilities?

Jennifer Rexford:

So there are a number of things. I mean, you can imagine using virtual private network services when you’re doing sensitive transactions. Some web browsers offer sort of a secret mode, where you can operate them without being tracked, which is how I do my Christmas shopping now ’cause I don’t want to reveal what I bought everyone for Christmas. So, there are a number of things like that you can do. And certainly when people visit certain countries that block access to websites, they’ll often use the same virtual private network services.

Like when I visited China a year and a half ago, I used them to be able to access websites that are blocked in China. So that allows you to essentially bounce from your computer to another computer, over an encrypted channel. And then when that computer accesses the website, it’s a little harder for the company to be able to tell that it’s you. Another thing is that companies will use what are called cookies, kind of aptly named given the podcast we’re doing right now, which is something your computer, your browser, tells the website that allows them to remember you from one visit to the next. And so a lot of people will disable cookies or only enable them selectively for websites they trust in order to avoid being remembered each time they go back to a website they don’t want to be remembered by.
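How a cookie lets a website remember a visitor can be sketched with Python's standard `http.cookies` module. The cookie name `visitor_id`, the in-memory `visits` table and the function `handle_request` are hypothetical choices for this illustration; a real site would keep this state in a database and set further cookie attributes.

```python
import secrets
from http.cookies import SimpleCookie

visits = {}  # the website's memory: visitor id -> number of visits

def handle_request(cookie_header):
    """Pretend web server: read the cookie if the browser sent one,
    otherwise hand out a fresh identifier."""
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)
    if "visitor_id" in jar:
        visitor_id = jar["visitor_id"].value
    else:
        visitor_id = secrets.token_hex(8)
    visits[visitor_id] = visits.get(visitor_id, 0) + 1

    reply = SimpleCookie()
    reply["visitor_id"] = visitor_id
    # Return the visit count and the cookie the browser would store.
    return visits[visitor_id], reply["visitor_id"].OutputString()

first_count, stored_cookie = handle_request(None)        # first visit, no cookie yet
second_count, _ = handle_request(stored_cookie)          # browser sends the cookie back
blocked_count, _ = handle_request(None)                  # cookies disabled: looks new again
print(first_count, second_count, blocked_count)          # prints: 1 2 1
```

With cookies disabled the site in this sketch sees a stranger on every visit, which is the effect people are after when they turn cookies off for sites they do not trust.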

Aaron Nathans:

Is it easy for ordinary Americans to enable these kinds of protections on their computers?

Jennifer Rexford:

Yeah, it’s not really and that’s a big challenge. And it does seem like it’s a war of escalation. It seems like every few weeks, there’s some new thing you’re supposed to be doing. Apply this patch. Disable this feature. And I think that’s one of the biggest problems, is it requires a lot of technical savvy and frankly, a lot of attention because this is a moving target. So all of these are a big problem. I think for consumers, the technology and the misuses of it evolve much more quickly than a reasonable person should be expected to try to follow.

Aaron Nathans:

So are there policy prescriptions that could be taken to protect Americans in this area or some of the others that we talked about a moment ago?

Jennifer Rexford:

Indeed. And in fact, a lot of the work that goes on at our center for IT policy does, in fact, look at that. It looks both at policy prescriptions and also at just uncovering particular ways that websites might manipulate users, so that people are more aware and to put pressure on companies to change their practices. So it’s a mix of raising awareness as well as trying to help those in government make sounder prescriptions around what’s allowed and what’s not. Particularly around things like web tracking, where you at least hope, with companies that are not intentionally being adversarial, they’re just being economically driven, that the policy prescriptions can help reduce the incentives for them to collect quite as much data about their users.

Aaron Nathans:

I was kind of surprised to see that the United States government, if it wanted to, could work around this prohibition on spying on its own citizens without a court order. Can you talk a little bit about how that could work in theory?

Jennifer Rexford:

Yeah, so there’s sort of two things. After 9/11, there was what was called the Protect America Act that relaxed some of the restrictions on being able to do wiretapping on the internet. And in particular, if one of two endpoints in the communication is foreign, then it was okay to tap that traffic without a warrant. Now that doesn’t allow tapping of traffic between two domestic users. And so people speculate that the same kind of thing I mentioned that happened in China, where China Telecom had traffic between two U.S. users get misrouted through China and back, so that the two parties are still communicating, but the traffic is leaving the U.S. and coming back in, that those same mechanisms could be used by a government also, to be able to take the traffic to a place they’re allowed to do wiretapping.

So instead of moving the wiretap to the traffic, think of it as moving the traffic to the wiretap and doing so outside the U.S. where there might be fewer prohibitions on tapping that traffic. So, that’s one thing people speculate. We don’t know for sure that that’s happening, but that’s a perfect example of the national borders being very fluid here. Even if two parties are both in the U.S., even if their communication would normally stay within the U.S., routing doesn’t observe national borders particularly if it’s being manipulated to violate those borders.

Aaron Nathans:

So you graduated from Princeton in 1991. What has changed since then in the world of computer science and the challenges that computer scientists are trying to solve?

Jennifer Rexford:

Well, I mean, so much of what’s important now is exactly about the topics we’ve been discussing. Security and privacy were certainly not at the forefront of conversations when I was a college student. Certainly not cybersecurity and online privacy. And of course the internet, although it had been around for a while, wasn’t in significant use yet. And so I think that the rise of the internet and all the things that came from that was certainly still to come. I think for me, the most significant change though, is sort of a meta change, which is that today’s students just know so much about computer science and have such passion for it. And I think it was somewhat of a niche topic when I went to college. I even majored in electrical engineering because it seemed like it might be a better choice than computer science and frankly, the two fields aren’t so different.

So I don’t think that was a bad choice at all, but today’s students come to college, excited about computer science, it’s the most popular major at Princeton by a mile. Our intro course is taken by more than half of Princeton undergraduates. So I think the students come confident that a knowledge of computing will be a lever for them to wield in their professional life, in their hobbies, in their political pursuits. Basically, they think of it as a means to an end in addition to something interesting in its own right. And I think that that sense that computing is empowering, it was not quite there when I was a student. I think those of us that studied it then did so because we just found it interesting in its own right. And that’s of course still true but the sense of it as an enabler for just about anything I think is something much more recent.

Aaron Nathans:

So when you look to the internet of the future, five years from now, are you filled with hope or fear or both? And what needs to be done to get to a better place than we are now?

Jennifer Rexford:

Well, I think the exciting change right now is the connection of physical devices to the internet. Self-driving cars, your baby monitor, things that really affect the physical security and physical safety of you and the people around you. And that’s an exciting development because it can make a lot of things more convenient, a lot of things better, make driving safer. But of course it makes all these cybersecurity and privacy issues we’ve mentioned even more serious. And there have been studies of people hijacking self-driving cars and driving people off the road, people breaking into baby toys to be able to spy on children and so on.

So I think the technology to enable us to really connect the cyber and the physical together is evolving much more quickly than our ability to think about the security and privacy implications of it. And so I think that particularly there, there’s tremendous need for attention to how to make particularly consumer devices more secure and better at protecting user privacy. Again, people setting up a baby cam or a home thermostat shouldn’t have to be security experts and system administrators to be able to enjoy the fruits of these new technologies.

Aaron Nathans:

Well, I want to thank Jennifer Rexford, chair of the computer science department at Princeton University, as well as our recording engineer, Dan Kearns. Cookies is a production of the Princeton University School of Engineering and Applied Science. This podcast is available on iTunes and other platforms, show notes are available on our website, engineering.princeton.edu. The views expressed on this podcast do not necessarily reflect those of Princeton University. I’m Aaron Nathans, digital media editor at Princeton Engineering. Watch your feed for another episode of Cookies soon, when we’ll discuss another aspect of tech, security and privacy. Thanks for listening. Peace.