
The security flaws of online learning: Mihir Kshirsagar

Are online learning platforms really secure? Mihir Kshirsagar co-wrote a paper that spells out in startling detail everything you’ve wondered about, but didn’t want to know, about how online platforms allow students’ personal data to be exploited as the students use them for online learning.

In this episode, he discusses what he and his colleagues discovered, including the one mistake instructors often make that could compromise the security of their students’ data. He has served at the New York Attorney General’s Bureau of Internet and Technology as the lead trial counsel on matters of consumer protection law and technology. Today he’s a Clinical Lead at Princeton’s Center for Information Technology Policy, and a lecturer in Computer Science and the School of Public and International Affairs.

Link:

“Virtual Classrooms and Real Harms: Remote Learning at U.S. Universities,” June 2021.

Transcript:

Aaron Nathans:
From the Princeton University School of Engineering and Applied Science, this is Cookies, a podcast about technology, privacy, and security. I’m Aaron Nathans. On this podcast, we’ll discuss how technology has transformed our lives, from the way we connect with each other, to the way we shop, work, and consume entertainment, and we’ll discuss some of the hidden trade-offs we make as we take advantage of these new tools. Cookies, as you know, can be a tasty snack, but they can also be something that takes your data.

Aaron Nathans:
On today’s episode, we’ll talk with Mihir Kshirsagar. He’s a clinical lead at the Center for Information Technology Policy at Princeton, and a lecturer in computer science. He is a co-author of a recent paper that spells out in startling detail everything you’ve wondered about, but didn’t want to know, about how online platforms are allowing students to have their personal data exploited, as the students use them for online learning. Mihir has served at the New York Attorney General’s Bureau of Internet and Technology as the Lead Trial Counsel on matters of consumer protection law and technology. Let’s get started. Mihir, welcome to the podcast.

Mihir Kshirsagar:
Thank you, Aaron. Nice to be here.

Aaron Nathans:
Awesome. So you co-wrote a recent research paper on the vulnerabilities of online learning platforms, some of which are the same platforms that we’ve come to use a lot during the pandemic in other ways, including Zoom, Webex, and BlueJeans. Are these platforms inherently insecure, or is it more about how they’re managed and governed?

Mihir Kshirsagar:
That’s a great question, Aaron, and let me step back for a second to talk about what it means to provide a high-quality educational product, right? We’re using these platforms as a classroom, and so we’re putting enormous demands on these systems to serve our needs, yet these platforms were designed for a very different context. So when we examined how they were being used in the educational context, we did so from an interdisciplinary perspective, and that’s something that CITP specializes in. We had computer scientists, we had people who were specialists in governance, we had lawyers like myself come in and evaluate: how did this system work? How did it protect information? How did it preserve the qualities that we wanted to preserve? And so, with that in mind, I don’t think these platforms are inherently insecure, or that they should be. It is very much a question of how they’re managed and governed, and we as educational institutions ought to exercise our abilities, and the government ought to exercise its abilities, to ensure that these platforms are serving our needs as the users of the product.

Aaron Nathans:
Was there much of a precedent for online classroom learning when the pandemic began? I mean, what was the industry standard? Why did so many educators feel like they were caught off guard when they were suddenly forced to go online last year?

Mihir Kshirsagar:
There was online education before. Mostly it was asynchronous, so people recorded videos. There were some platforms where people were communicating, but really, the demand with the pandemic was to switch overnight to an online format using video conferencing technology that had mostly been used in a totally different situation, by companies dealing with remote work; that was the primary use of those technologies. And schools had sometimes experimented with it. Actually, Princeton had an account with Zoom before this, but they got it, I believe, because it was connected to the Alexander Bridge closure. They knew that some people would be remote, administrators would be remote, and they needed a way to communicate, and so that’s how they had acquired a Zoom license.

Aaron Nathans:
So when you say the Alexander Bridge closure, that’s a local bridge that tied up traffic, so they knew that people would be working from home more often?

Mihir Kshirsagar:
That’s right. And so, in that setting, really, we had not ever thought of moving our entire educational infrastructure online, right? I mean, you step back and you think about the beautiful buildings at Princeton, the classrooms with all their history, and all the ways in which we’ve come to understand how to be respectful and work in a classroom setting, and now we’re just saying, “Okay, you’re all online now, and you’re all going to look at these little boxes, and we’re going to teach,” and that’s obviously an enormous challenge.

Aaron Nathans:
It seems like at a lot of institutions, the assumption was, “Why would we want to go online?”

Mihir Kshirsagar:
That’s right. And frankly, as a teacher myself, I pine for the time when we can actually go back to being in person, because there’s really no substitute for in-person learning and being with your students and learning together. But we were obviously forced to do it in the online context, and I think some people found some interesting ways to use the online tools to enhance classroom discussions or to make some things easier, and I can give a few examples if you like.

Aaron Nathans:
Sure.

Mihir Kshirsagar:
Yeah. One very good example for me, I’ve found, is that I do a lot of one-on-one sessions with students where I talk to them about their projects and where they are. Previously, they would have to wait in office hours, and they’d have to come in not knowing how long the wait was or how many other people were there. Now you can just schedule a time, they can sit wherever they are, they’re on a video screen, we have a discussion, and then they go back to wherever they were, and I can go back to wherever I was. It’s a lot easier to communicate that way. But when you’re doing in-person learning and communicating in a classroom setting, that is very, very hard to replicate in an online world.

Aaron Nathans:
When you’re in a classroom, there’s a lot of information that you’re not sharing, personal data, you’re not taking out your credit card or anything, but I mean, what sort of personal data or information about a student is more likely to be shared in a virtual setting than when you’re in person?

Mihir Kshirsagar:
Great question. First of all, you have the ability to make a permanent recording of that interaction, so you can preserve that information. If I had a camera on every student’s face in a classroom, I doubt they’d be very happy, and as a teacher, I probably would not be very happy if every one of my classes was videotaped and evaluated later on. There are also all your movements on the screen and how you interact with information; these software platforms are designed to collect every last detail of your interactions: how you’re moving your mouse, how large your windows are, how many other things are going on at the same time on your screen. These are details that we leave to trust in the offline world, where we develop relationships and expect the communication to go on in a respectful manner. Trying to do that in an online setting gets to be quite hard.

Mihir Kshirsagar:
And then there’s the other aspect of the online world. So far, that’s just the interaction between the student and the teacher in a classroom. Once you create an online world, there’s a possibility for intruders and outsiders to come in. And so, you had this phenomenon of Zoom-bombing, which I’m sure people have heard about, where malicious outsiders would come in and disrupt classrooms. So those are different ways in which the online world poses quite a few new challenges compared to the offline world.

Aaron Nathans:
You performed an analysis to see which third parties were receiving information on these platforms. So how did you do this, and what did you find?

Mihir Kshirsagar:
So we used a few different technologies, and I should say that this is something the computer scientists in our group did. What they were able to do was to see, on an app, what information was being sent to third parties, and we compared that to what the privacy policies required. We used a framework called contextual integrity to analyze where the information flows were going and how they were taking place. And we also set up a few computers where we replicated the software and were able to examine the binary code and see what kind of security vulnerabilities were there.
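
To make that methodology concrete, here is a minimal, hypothetical sketch of the kind of check described above: comparing the third-party destinations observed in an app’s network traffic against the recipients a privacy policy declares. It is not the authors’ actual tooling; the domain names, file name, and file format are invented placeholders.

```python
# Hypothetical sketch: flag observed third-party domains that a platform's
# privacy policy never declares. All domains and file names are placeholders.
import json
from urllib.parse import urlparse

# Recipients the (hypothetical) privacy policy says may receive data.
DECLARED_RECIPIENTS = {
    "example-videoplatform.com",
    "cdn.example-videoplatform.com",
    "analytics.example-partner.com",
}

def registered_domain(host_or_url: str) -> str:
    """Crudely reduce a URL or hostname to its last two labels,
    e.g. 'https://tracker.ads.example.com/x' -> 'example.com'."""
    host = urlparse(host_or_url).hostname or host_or_url
    parts = host.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host

def undeclared_flows(capture_path: str) -> set:
    """Return observed destination domains that the policy does not mention."""
    with open(capture_path) as f:
        # Assume one JSON record per line with a 'url' field, e.g. exported
        # from an HTTPS interception proxy used to observe the app's traffic.
        observed = {registered_domain(json.loads(line)["url"]) for line in f}
    declared = {registered_domain(d) for d in DECLARED_RECIPIENTS}
    return observed - declared

if __name__ == "__main__":
    for domain in sorted(undeclared_flows("observed_requests.jsonl")):
        print("Not covered by the privacy policy:", domain)
```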

Mihir Kshirsagar:
But I think to step back for a second, we came in with the fundamental question, which is not a question that you often see asked of these platforms. We said, “We have an educational context, we understand the rules in an educational context. Are we applying that appropriately when we move to the online context?” Usually, the world works by saying, “What’s happening online, it’s brand new. It’s the new thing. It’s resetting relationships. How do we just live with that world?” And we came in saying, “No, no, no, we have to look at what we’ve come to develop over a long period of time, and ensure that those values are respected when we make that transition.”

Aaron Nathans:
This is a good time to talk about the conclusions and the findings.

Mihir Kshirsagar:
Yeah. Yeah, sure. We discovered a few things that I think will make a big difference. So one is that it’s really important for administrators when they go out to acquire the software, and in turn for the software companies, to understand what users expect. How do they expect their privacy to be treated? What are the norms around whether you turn video recording on or not? What are the norms about leaving your screen on? How do students in different situations help educators understand that in certain settings they may not be comfortable with the video on, and how do you deal with that? There are also specific practices that universities can adopt. For example, there’s an option within Zoom to have the data hosted locally on the university’s servers, and not on Zoom’s servers. So that was a big issue.

Mihir Kshirsagar:
And then there are certain instances of noncompliance. Zoom, for example: my former office, the New York Attorney General’s Office, reached a settlement with them because they had promised that the information would be encrypted as it went over their systems, but in fact, those promises were not matched by their practices. And so, they reached a settlement where they agreed to change their practices. So you see the need for regulators and universities to identify where these platforms are not serving our needs, and then figure out how to develop appropriate penalties for noncompliance. And finally, we can learn from some of the security failures that we’re talking about and make sure that, okay, if one company is having this problem, let’s make sure that all the other companies offering this service don’t have the same set of problems.

Aaron Nathans:
How secure are these platforms against attacks from outside adversaries? I mean, is the danger more outsiders taking advantage of the vulnerabilities, or is it the platforms taking advantage of their customers?

Mihir Kshirsagar:
It’s hard to quantify who’s the bigger threat. We actually tried to develop a unique threat model in this paper, because really, there are a few different potential adversaries. There’s the platform itself, which can misuse the information, perhaps sell your location data to private parties. There are the pure third-party adversaries, right? Somebody wants to Zoom-bomb and disrupt a classroom. And there are also internal parties, either the educator or the student, who may make surreptitious recordings or disrupt a classroom setting; those are the internal threats, and they can use this technology to do things in a way that they could not have done in the offline context.

Aaron Nathans:
Have there been known cases where hackers have taken students’ personal information?

Mihir Kshirsagar:
We haven’t documented that. I believe it has happened, but we were not looking for that specifically. We were looking for whether there were vulnerabilities, not for specific instances of hackers taking information.

Aaron Nathans:
Well, let’s take a little break here. You’re listening to Cookies, a podcast about technology, security, and privacy. We’re speaking with Mihir Kshirsagar, a Clinical Lead at the Center for Information Technology Policy, and a lecturer in computer science here at Princeton University. On next week’s episode, we’ll speak with Mihir’s colleague at CITP, Kevin Lee, who wrote a fascinating study about how easy it is for an attacker to gain control of another person’s cell phone.

Aaron Nathans:
It’s the hundredth anniversary of Princeton’s School of Engineering and Applied Science. To celebrate, we’re providing 100 facts about our past, our present and our future, including some quiz questions to test your knowledge about the people, places, and discoveries that have made us who we are. Join the conversation by following us on Instagram @eprinceton. That’s the letter E, Princeton. But for now, let’s go back to our conversation with Mihir Kshirsagar.

Aaron Nathans:
So I know that during the pandemic, we’ve all had to improvise, but by and large, are educators using the institutional platforms, the ones provided by their university, or are many of them using their personal platforms instead?

Mihir Kshirsagar:
It’s a great question, and it’s one where we actually surveyed the educators to understand what they were doing and why they were doing it, because one of the key aspects of our study was to not start with assumptions about what people were doing, but to really try to uncover data about what their practices were. And what we discovered is that educators were not using institutional platforms in many cases, and that’s because sometimes they didn’t know that there was an institutional option available to them. Sometimes they thought that the institutional platform was just the same as the private platform, and that since they privately had a free license to it, they didn’t need to use the institutional platform. What they didn’t understand in that situation, though, was that the institutional platform has various legal protections that the personal platform does not have: protections about how the platform may use the data, who it can share the data with, and what happens if the platform fails to respect your preferences. The institutional licenses are much better than the private licenses in that context.

Aaron Nathans:
So the professors could actually be putting their students’ personal information in danger by using a platform that’s not protected by some of these agreements with the institution?

Mihir Kshirsagar:
That’s correct. There were requirements about encryption and how the data is protected, which can be quite different between the institutional version and the free license version. And so, it is really important for professors to know where the institutional versions are. Then there’s a bit of a challenge because, of course, there may not be an institutional version of a particular kind of software you’re trying to use, and that’s a whole other question about how you make sure that there is an institutional version of that software.

Aaron Nathans:
So in the paper, you discuss how large institutions are able to overlay their own rules onto platforms. So Zoom, for instance, provides a template for hospitals to add on their HIPAA obligations. So in the paper, you looked at 50 different addenda to the privacy policies used by higher educational institutions. How much protection did these individual addenda provide?

Mihir Kshirsagar:
They provided additional protections. Again, it’s hard to specifically quantify the amount, but there were important obligations under federal law, the Family Educational Rights and Privacy Act, FERPA.

Aaron Nathans:
Okay.

Mihir Kshirsagar:
And so, the federal law provides specific protections for student data. And really, one of the things we looked at is how that law was built around a paper-based record concept of what needs protecting. So it looked at particular kinds of information and information recipients. It said if you had (inaudible) data, if you had data about your location and certain other things, name, address, telephone number, and so on, that could be shared under certain circumstances. But data that goes into your student record about your grades, or findings about administrative decisions about you, or other records about your psychological status or something like that that an institution might keep, those had to be afforded the highest level of protection and were only allowed to be shared under certain circumstances, with certain people. And that compliance with the federal law is the protection that Zoom or whatever online platform would provide in the specific addenda they reached with large institutions.

Aaron Nathans:
So if a professor is using their own personal platform to teach a class, it may not necessarily have that FERPA protection-

Mihir Kshirsagar:
That’s right.

Aaron Nathans:
… whereas if they’re using their institutional account, it would?

Mihir Kshirsagar:
That’s right. The institutional account would’ve specifically negotiated for the FERPA protection in a way that the private account may or may not have, right? It would require an analysis, but one of the advantages of using the institutional account is that the institution has vetted the software to ensure that it is compliant with the laws it’s meant to be compliant with.

Aaron Nathans:
How often do universities go through the process of creating these addenda, especially with the pandemic sneaking up on us last year? Were there universities that just didn’t bother to do this?

Mihir Kshirsagar:
We didn’t find that. It’s quite commonplace, and I think when a university goes out to license a product, they do ensure that it complies with the law. I mean, they have a whole acquisition department, and they make sure that the standards are met. It’s when a professor goes out and gets their own software that they may not appreciate how much work actually goes into evaluating software for use in an institutional context. And so, it is important for educators to understand the work that IT administrators do in vetting software and ensuring that it meets the standards for an educational setting.

Aaron Nathans:
So has FERPA, have these other privacy laws themselves kept up with the digital age? Are there any burdens on the platforms for the transmission of protected information beyond what we’ve already discussed?

Mihir Kshirsagar:
So FERPA and the other laws have not, unfortunately, kept up with the digital age, and there are some very nice papers that have been written, in fact, by Elana Zeide, a former fellow at CITP who is now a law professor. They examine how the real problem with FERPA is that it was built for a paper-based records system, and now we have this endless trail of digital exhaust, right? Wherever you go, there are records about what you looked at, who you’re interacting with, and your chats; there are a variety of different ways in which information about your interactions is preserved for all time. And those issues, how you deal with that data, who is responsible for protecting it, when you delete it, and how you make use of that information, are things that FERPA and the states really haven’t picked up and haven’t been able to regulate.

Aaron Nathans:
It sounds like you think maybe they need to start rethinking this.

Mihir Kshirsagar:
Absolutely, absolutely. I think they need to think about how to regulate it, and we have some specific ideas; this paper starts to make those points. It borrows from another framework, contextual integrity, and the idea is that we have to start looking at the information flows and ask, “Who’s getting this information? What are they going to do with it? Does that match our expectations for how the information will be treated?” rather than focusing so much on, “What’s the information type and who are the recipients?” So, it is a governance framework that looks at the software use as a whole and how it fits into the educational context, rather than just looking at, “Okay, you have this little bit of data about me. Can you share it? Do you have consent to share it or not?” That’s not going to be sufficient to address the challenges of the digital platforms.
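
For readers unfamiliar with contextual integrity, the framework describes each information flow by five parameters (sender, recipient, data subject, information type, and transmission principle) and asks whether the flow matches the entrenched norms of the context. The sketch below is only an illustration of that idea; the norms and the example flow are invented, not taken from the paper.

```python
# Illustrative sketch of contextual-integrity-style flow checking.
# The norms and flows below are invented examples, not the paper's data.
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    sender: str       # who transmits the information
    recipient: str    # who receives it
    subject: str      # whom the information is about
    attribute: str    # what kind of information
    principle: str    # transmission principle governing the flow

# Flows considered appropriate in a (hypothetical) classroom context.
CLASSROOM_NORMS = {
    Flow("platform", "university", "student", "attendance", "for course administration"),
    Flow("platform", "instructor", "student", "chat messages", "during class only"),
}

def violates_context(flow: Flow) -> bool:
    """A flow is flagged if it matches no entrenched norm of the context."""
    return flow not in CLASSROOM_NORMS

# A platform sending student location data to advertisers matches no
# classroom norm, so it is flagged as a contextual integrity violation.
suspect = Flow("platform", "advertiser", "student", "location", "for targeted ads")
print(violates_context(suspect))  # True
```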

Aaron Nathans:
You looked into 23 platforms and read their privacy policies. That must have been a lot of heavy reading there. There were some really interesting policies about with whom they could share data. Could you tell me a little bit about that?

Mihir Kshirsagar:
Sure. Yeah, and this was done by three of my colleagues, and they-

Aaron Nathans:
This might be a good opportunity to name them.

Mihir Kshirsagar:
Sure. Oh, great. My colleagues Madelyn Sanfilippo, Yan Shvartzshnaider, and Shaanan Cohney were the ones who really dug into the privacy policies and tried to understand and manually code them to see, “Okay, who’s getting this information? How do they view it, and what do they allow?” And so, they examined these policies, and we saw that there were restrictions on which third parties could get the information, and some-

Aaron Nathans:
You’re talking about the platforms?

Mihir Kshirsagar:
The platforms, correct.

Aaron Nathans:
Who the platforms can share it with?

Mihir Kshirsagar:
Correct, sorry. Yeah, who the platforms can share this information with. In the third-party sharing context, we found, for example, that nine of the platforms put the burden on the user to figure out which third parties could access the information. Seven of the platforms had provisions in their privacy policies that may have allowed data to be shared with advertisers, and six of them allowed advertisers to share information with the platform. And if you think about it, what advertisers are doing in the educational context to start with is a very big question. Of course, when you step back, you see that these privacy policies were designed for all kinds of contexts; they were not specifically designed for the educational context, right?

Mihir Kshirsagar:
The privacy policies we were examining here are the general-purpose privacy policies, not the ones specifically tailored to the institutional setting. And so, we see that if you were an educator who went out to get your own private license to something, you may actually be allowing advertisers to collect information about your students, and allowing the advertisers to share what they have on the students with the platform. That’s a very surprising fact, I think, for an educator, if they didn’t fully appreciate that they were not getting the kinds of protection they thought they were getting when they signed up for one or the other of these platforms.
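
As a small illustration of the manual coding exercise described above, the toy tally below counts how many platforms received each policy annotation. The platform names and codes are placeholders, not the study’s actual coding.

```python
# Toy tally of manually coded privacy-policy annotations.
# Platform names and codes are placeholders, not the study's data.
from collections import Counter

# Each platform maps to the set of codes an annotator assigned to its policy.
CODED_POLICIES = {
    "PlatformA": {"burden_on_user", "may_share_with_advertisers"},
    "PlatformB": {"may_share_with_advertisers", "advertisers_share_back"},
    "PlatformC": {"burden_on_user"},
}

# Count how many platforms received each code.
tally = Counter(code for codes in CODED_POLICIES.values() for code in codes)
for code, count in tally.most_common():
    print(f"{code}: {count} platform(s)")
```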

Aaron Nathans:
I mean, yeah, that is very surprising. If I’m using a free Zoom account to chat with a friend and an advertiser is harvesting my information, maybe I’m not that surprised, but if I’m in a university class setting and the same thing is happening, that is pretty surprising. Do some of these addenda guard against that?

Mihir Kshirsagar:
That’s right. Yeah, the addenda guard against exactly that. And so, I can’t remember whether Zoom was one of these, but for example, if you looked at Google Hangouts, if you had a free account, you might be signing up for all that Google does to collect and share your information, whereas if you signed up for an institutional account, then there are additional protections about how Google can share that data.

Aaron Nathans:
Is it pretty universal when you have an institutional account that these platforms are not selling your information to advertisers?

Mihir Kshirsagar:
It depends on what the institution demands of the account, but certainly, under FERPA, they’re not allowed to share information with advertisers without explicit permission, and so on. So that is something that the institutional account would provide protection against.

Aaron Nathans:
So what are some examples of information that’s being shared on the platforms, that students may not recognize as being harmful to share? I mean, you talk a little bit about location sharing. If somebody knows that I’m in Philadelphia, so what?

Mihir Kshirsagar:
Right. There are different kinds of location data that they’re collecting. One is that they may be collecting the location data to know which server is closest to you, so that they can send you the highest quality video feed, and that’s location sharing designed to deliver the service. It’s quite a bit different if they were using that location data to sell advertisers information about where your students are and how they’re interacting with the platform. If you’re working remotely and the educational institution knows where you’re logging in from and keeps track of where you are when you log in, that might come as a surprise to students. That might not be something that they would necessarily want the institution, or their professor, to know.

Mihir Kshirsagar:
Because one of the things we really haven’t talked about thus far is that, of course, the pandemic has put a lot of strain on families and people in very vulnerable circumstances. Many students have had to work while they’re back at home; they’ve had to find places to work where they can be quiet and can participate in the educational setting. There’s a huge distributional problem in how people from higher income backgrounds can deal with the pandemic in ways that are very different from people from lower income strata, and those disparities get accentuated in this transition to online learning. Just for example, the single biggest manifestation of that is the quality of broadband access and the digital divide: if you’re in a situation where it’s hard for you to get the internet access, or the quality of access, that allows you to participate in an educational setting, that makes it very, very difficult.

Aaron Nathans:
Do you think there will continue to be a robust market for online education going forward, or is this simply a phenomenon of the pandemic era?

Mihir Kshirsagar:
I think there is going to be a robust market for online educational tools. It may not necessarily be video that’s the main tool, but as I was talking about earlier, there are the one-on-one meetings and some of the remote collaborations that have taken place; actually, Shaanan, who’s the first author on this paper, is in Australia, and we were able to communicate and work together on these issues. I think there are enormous opportunities for using these online platforms to create new hybrid models of learning, and there is going to be a real need for such programs and platforms, and for people to find ways to communicate that are safe, secure, and respect the educational norms.

Aaron Nathans:
So, I mean, what do universities and regulators need to do, beyond what we’ve already discussed, to tighten the bolts on security in this area?

Mihir Kshirsagar:
I think the biggest thing they have to do, and it’s difficult and complicated but absolutely central, is to develop systems to collect feedback from their users, including students. One of the things we weren’t able to do in our paper, but which would be important to do, is to understand how students view the use of these platforms; we only got that thirdhand, through what educators thought their students cared about. Institutions and regulators need to understand how these platforms are being used and how they affect the concerns of educators and students. They need a process where they’re collecting this information, not speculating about how people might use these platforms but actually collecting, in real time, how people are using them, and then a mechanism to address gaps that come up and to iterate quickly to close those gaps. And separately, as we’ve talked about, the federal laws and the state laws have to come up to speed with the new digital era.

Aaron Nathans:
All right. Well, thank you. It’s been a pleasure talking with you.

Mihir Kshirsagar:
Thank you, Aaron. Great questions, and I’ve enjoyed being here.

Aaron Nathans:
Thank you. Well, we’ve been speaking with Mihir Kshirsagar, the Clinical Lead at the Center for Information Technology Policy and a lecturer in computer science, here at Princeton University. I want to thank Mihir, as well as our recording engineer, Dan Kearns.

Aaron Nathans:
Cookies is a production of Princeton University’s School of Engineering and Applied Science. This podcast is available on iTunes, Spotify, Stitcher, and other platforms. Show notes and an audio recording of this podcast are available at our website, engineering.princeton.edu. If you get a chance, please leave a review. It helps. The views expressed on this podcast do not necessarily reflect those of Princeton University. I’m Aaron Nathans, Digital Media Editor at Princeton Engineering. Watch your feed for another episode of Cookies soon. Peace.
