For most of us, shopping receipts are an annoyance, little bits of paper stuffed into the back of a wallet or dropped to the bottom of a bag.
Not to the Cohen-Shohet sisters of Princeton University. For them, there is untapped value in those pocketed papers. Not money, but something even better: data.
“It seems so archaic to be using paper receipts when we are digitizing everything else,” said Danielle Cohen-Shohet, a senior. “There is so much valuable information about who we are that vanishes when we throw them away.”
In a class taught for the first time last fall, Cohen-Shohet and her twin sister, Leah, developed a computer program that can analyze that data and offer shoppers and retailers a way to balance buying habits and bargains. If it works, it could be the smart bomb of advertising – offering shoppers a notification on their smartphones of exactly the next thing they want to buy before they even realize they want it. The sisters, both economics majors, plan to roll out a prototype of the project, called Rili (pronounced “really”), this spring.
This is just the outcome that Princeton electrical engineering professor Mung Chiang was hoping for when he introduced the undergraduate class, “Networks: Friends, Money and Bytes.” The course, open to students across the University with sufficient math skills, examines the common foundation governing the networks that wind throughout modern life. A key part of the coursework is a two-week mini-project that students completed and presented in January, for which they were mentored by Chiang and teaching assistants Jiasi Chen, Felix Wong and Pei-yuan Wu, who are graduate students in electrical engineering.
“I was blown away by the quality of these mini-projects and how well the teaching assistants worked with the undergraduates,” Chiang said. Projects completed by the students included a smartphone app that lets students rate Princeton nightlife and parties in real time, a computer simulation that showed that information networks can slow the spread of disease outbreaks (and how well-meaning censorship can backfire) and a comparison of radio signal jammers that showed that a jammer that reacts to a specific transmission can be more effective than one that constantly blocks a frequency.
Chiang, an expert in communication networking, said he came up with the idea for the class when he was thinking about how important networks have become to daily life and – despite the huge amount of writing and discussion about social, economic and technical networks – how little is taught that cuts across these different types of networks.
“We cut through the buzzwords and get to the fundamentals,” Chiang said. “We teach the key concepts that help formulate and address central questions.”
Chiang uses a “just-in-time” teaching method that draws in mathematical machinery only when the students are learning about the corresponding real-world applications. The entire course is structured around a list of 20 questions about the networked life, from how Google sells ad spaces to how cloud computing services scale up; from how Netflix recommends movies to how Skype runs online calls; and from why cellular operators charge $10 per gigabyte now to why WiFi is often slower at a hotspot than at home.
Several of the students entered the class with an idea for a project in mind. The Cohen-Shohets said they used the class to transform an idea into a working business plan.
“The class was critical in the project’s development,” Leah said. “Not only in the steps needed to make it successful but for understanding how the project could really work.”
Students said that on the first day of class, Chiang told them two things: that it would be a difficult and time-consuming course; and that he would measure its success by how many of their projects would eventually become published scholarly works or spin out of the classroom and make a difference in the real world. “These were very high bars of expectation,” Chiang said. “I am happy to say that many of the students met that challenge, and all of them did very impressive work.”
The following projects are a sample of the 24 completed for the course.
On YouTube, it’s who you know
On YouTube, it helps to be liked, but it really doesn’t matter if you are well liked.
“Like” and “dislike” are the polar opposites that govern the YouTube universe. With a click of a mouse, users can either recommend a video for everyone else on the site, or consign it to the bin of dislikes.
But when sophomore Samantha Anderson tracked sets of videos for her class project, she found something interesting: Likes and dislikes don’t have much to do with the number of times a video is actually watched – the main standard of popularity.
Far more important is the number of followers belonging to a person who posts a particular video. On YouTube, viewers can choose to be followers of people or organizations that regularly post videos, and can view other videos they have posted as well.
“I was expecting that a video’s popularity would be heavily influenced by likes,” said Anderson, a computer science major. “But it was more a side effect of the view count, rather than the driver.”
Anderson evaluated sets of videos to get at the factors that determine popularity. Because she wanted to weed out external factors, such as videos created with corporate marketing budgets, she eliminated music videos and movie trailers and focused on amateur videos such as sledding and card tricks.
In the end, her research bolstered the adage that there is no such thing as bad publicity: People who had a lot of followers tended to have popular videos whether or not they had a lot of likes. However, there is still hope for the less noticed. Anderson found that a breakout video that drew a lot of viewers could also pull up a poster’s less popular video submissions.
“And even if your first video is not that popular, don’t give up because it only takes one popular video to make your channel successful,” she said.
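The comparison Anderson describes, view counts against likes versus view counts against follower counts, is the kind of question a correlation coefficient can frame. The article does not say which statistics she used, so the sketch below is only an illustration: it assumes Pearson correlation, and every number in it is invented for the example rather than taken from her data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented toy data for four hypothetical amateur videos.
views = [100, 200, 300, 400]          # times each video was watched
followers = [10, 20, 30, 40]          # followers of each video's poster
like_share = [0.9, 0.5, 0.5, 0.9]     # fraction of ratings that are likes

print("views vs. followers: ", pearson(views, followers))
print("views vs. like share:", pearson(views, like_share))
```

In this made-up data set, views rise in lockstep with followers and have no linear relationship with the share of likes, which mirrors the pattern Anderson reported without reproducing her measurements.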
Easing mobile backlog
As mobile phones become more powerful, the amount of data users are dragging through the airwaves is putting increasing strain on the telecommunications networks. Senior Ryan Corey, an electrical engineering major, wanted to see if there was a simple way for phone companies to manage a portion of that data to ease the burden without having to spend a lot on new hardware.
Chip designers, facing a similar problem of transferring data between a processor and computer memory, rely on a technique called caching – placing commonly used data in an easily accessed area. Corey examined whether phone companies could use a similar technique by storing certain data closer to callers. In that way, a request for data from a smartphone would not have to travel through the entire network and back again.
In the past, caching has not been popular on the Internet because data tended to be particular to an individual user. But as the use of mobile applications changes, that is changing, too.
“With mobile video streaming, it has a lot of potential because a lot of people watch the same videos,” Corey said. “It would make sense to cache this type of thing.”
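The idea of keeping popular videos close to the people requesting them can be sketched as a small cache that evicts whatever was used least recently. This is a minimal illustration, not Corey’s experimental setup: the `EdgeCache` class, the `origin` function and the request list are all hypothetical, and capacity is counted in items rather than bytes.

```python
from collections import OrderedDict

class EdgeCache:
    """Tiny least-recently-used (LRU) cache placed near the users."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of cached items
        self.items = OrderedDict()    # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch_from_origin):
        if key in self.items:
            # Served locally: no trip across the whole network.
            self.items.move_to_end(key)
            self.hits += 1
            return self.items[key]
        # Not cached: fetch from the distant server and keep a copy.
        self.misses += 1
        value = fetch_from_origin(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used
        return value

def origin(video_id):
    """Stand-in for the far-away video server (hypothetical)."""
    return f"bytes-of-{video_id}"

cache = EdgeCache(capacity=2)
requests = ["cat", "cat", "dog", "cat", "bird", "cat"]
for video_id in requests:
    cache.get(video_id, origin)

print(cache.hits, cache.misses)   # prints: 3 3
```

Because several requests are for the same video, half of them never reach the origin server. With requests that are all different, which is closer to the older pattern of user-specific data, the same cache would score no hits at all.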
Corey ran a series of experiments to examine where performance degradation happens along the path from a person’s iPhone to the video server and how caching would affect networks. He looked at current data usage and also future use that involves more data-intensive services such as video streaming.
“My conclusion is that it has potential, with a few caveats. It would only be helpful for specific data and specific kinds of users,” he said. “For future networks, it would really increase the capacity available to the users beyond what the networks can provide right now.”
Know who your friends are
Arpan Ghosh can use data to tell who your friends really are. At least the ones on Facebook.
Ghosh, a first-year graduate student in computer science, has come up with a way to use the data underlying Facebook pages to find connections and commonalities between users within their existing friend networks. His program, called Frappe, analyzes those connections and creates a mini social network for users. He hopes to develop the program to the point where it can recommend new friends as well.
“Facebook gives you hints of people you might already know,” he said. “My idea was to go away from that and recommend new people you might never have met before that you might be compatible with or get along with.”
To do this, Ghosh relies on the information posted on users’ pages. A picture, for example, can have data about locations, activities and people. Text postings have even more information, but the challenge is filtering out less important messages. Ghosh calls that “the birthday problem” – the people who post happy birthday messages who might not have a real connection with a recipient.
“There are thousands of other posts like, ‘You have the highest level in Farmville,’” he said, referring to a popular game played on Facebook.
Although it is an undergraduate course, Ghosh said he was attracted to Chiang’s class because it dealt with the algorithms behind networking rather than “the more traditional computer science and programming knowledge needed to build such systems.”
“The course also took very theoretical topics and analyzed them in the context of applications that are very relevant today,” he said.
Currently, Ghosh is refining his program by presenting testers with possible friends based on their Facebook activity. The testers mark each result as correct, possibly correct or wrong.
“After three iterations, we are identifying eight out of 10 people correctly,” he said.
A bargain just for you
There are a number of apps, like Lemon or Slice, that can track receipts for users. The difference between those and Rili is what happens on the back end.
When a customer with Rili swipes his or her smartphone at checkout, the receipt code is sent to a central computer. Working with the Cohen-Shohets, Tian Long Wang, a graduate student in computer science at Princeton who was not a member of the class, has written a program to analyze the data and predict future purchases by the customer. Rili then matches the recommendations with deals offered by a variety of merchants and sends the results back to the customer’s smartphone.
“If the program sees you have bought shoes from this store, it recognizes a pattern and sends you deals at this or other stores,” Leah Cohen-Shohet said. “One of the reasons it is so powerful is that another customer might match the same pattern as well – and the deals are sent there also.”
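The article does not describe how Rili’s analysis works internally. As one hedged illustration of matching customers by purchase pattern, the sketch below compares shopping histories with Jaccard similarity (shared items divided by all items) and passes along deals on things a similar shopper has bought. The customers, items, deals and the 0.5 threshold are all invented for the example.

```python
def jaccard(a, b):
    """Overlap between two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical purchase histories and merchant deals.
purchases = {
    "ana": {"running shoes", "socks", "water bottle"},
    "ben": {"running shoes", "socks"},
    "cy": {"blender", "cookbook"},
}
deals = {
    "water bottle": "10% off water bottles",
    "cookbook": "2-for-1 cookbooks",
}

def deals_for(customer, threshold=0.5):
    """Deals on items bought by shoppers whose pattern resembles this one."""
    mine = purchases[customer]
    suggested = set()
    for other, theirs in purchases.items():
        if other != customer and jaccard(mine, theirs) >= threshold:
            suggested |= theirs - mine    # items the customer lacks
    return sorted(deals[item] for item in suggested if item in deals)

print(deals_for("ben"))   # prints: ['10% off water bottles']
```

Here “ben” shares two of three items with “ana,” so he receives the deal on the one item she bought and he did not, while “cy,” whose purchases overlap with no one’s, receives nothing.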
Retailers currently track customers’ purchases for the same reason. But those efforts typically don’t reach across many stores, and they don’t look for patterns among many groups of buyers, explained the Cohen-Shohets.
For users, the application would be free. The Cohen-Shohets said the data also would remain on the Rili server and not be shared. Customers would need to actively enter a purchase into Rili’s system – the program would not track any items unless they had been flagged by the buyer. For retailers, the system would allow better targeted advertising and require no additional equipment. It is intended to work for online stores as well as brick-and-mortar operations.
The Cohen-Shohets credited the class and Chiang with helping to transform their project from a good idea to a working business plan.
“From the first day, Mung was a very supportive resource. We bounced ideas off him, he welcomed suggestions,” Leah said. “It was amazing to see a professor inspired by our idea as much as we were.”