“Stitching” together a web user from scattered, messy data

Even though we interact with different web services in different ways, there are clues in the data that can indicate trends and identify a unique profile.

Modern internet users submit a massive trove of personal details to the web, but they scatter that data across dozens of websites, accounts, and devices with very little continuity. Our activity on a smart TV at home, or while logged out of a site at work, remains separate from our social media posts and mobile browser history. Taken together, this data could be used to assemble a more complete profile of our preferences and interests over time, making our online experience more pleasant and continuous across devices, with better product and service recommendations.

Prof. Danai Koutra will work to “stitch” these personal details together into a cohesive, useful whole in a project called “User Stitching: A Representation Learning and Hashing Perspective.” To undertake this research, she’s been awarded an Adobe Digital Experience Research Award.

The fragmentation of user details is more than an individual problem. It produces data that is sparse, that poorly represents the population and its broad trends, and that is insufficient for the predictions businesses need to be more responsive to user interests. Additional problems arise when considering user privacy across platforms.

Koutra’s project will take on two big challenges: finding solutions that remain privacy-aware, and designing methods that can scale to huge volumes of ever-changing data. She has proposed a hashing-based approach to capture the behavioral patterns of users from mixed sources, including devices, services, and content metadata.

Even though we interact with different web services in different ways, there are clues in the data that can indicate trends and identify a unique profile. These interactions can be modeled as a network connecting users to device types, to websites or apps, and to content-specific features such as website metadata, which could reveal similarities across websites.
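To make the network idea concrete, here is a minimal sketch in Python. It is an illustration only, not Koutra's method: the interaction log, the account names, and the `build_graph` helper are all hypothetical. Each account is linked to the devices, sites, and IP addresses it touched, and shared neighbors hint that two accounts may belong to one person.

```python
from collections import defaultdict

# Hypothetical interaction log: (account, entity_type, entity_value).
# All names and addresses are invented for illustration.
interactions = [
    ("tv_account", "device", "smart_tv"),
    ("tv_account", "site", "news.example"),
    ("tv_account", "ip", "203.0.113.7"),
    ("work_browser", "device", "laptop"),
    ("work_browser", "site", "news.example"),
    ("phone_app", "device", "phone"),
    ("phone_app", "site", "news.example"),
    ("phone_app", "ip", "203.0.113.7"),
]

def build_graph(records):
    """Return an undirected adjacency map over account nodes and typed entity nodes."""
    graph = defaultdict(set)
    for account, entity_type, value in records:
        account_node = ("account", account)
        entity_node = (entity_type, value)
        graph[account_node].add(entity_node)
        graph[entity_node].add(account_node)
    return graph

graph = build_graph(interactions)

# Accounts that share an IP address are candidates for the same user.
shared_ip_accounts = sorted(name for _, name in graph[("ip", "203.0.113.7")])
print(shared_ip_accounts)
```

In this toy log, the smart TV account and the phone app share an IP address, while all three accounts visit the same site. A popular site alone is weak evidence, which is why combining several kinds of signal matters.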

Koutra suggests that users access similar content across devices and create similar metadata, such as IP addresses. She will use these similarities to find “user profiles” embedded in this network of interactions.
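One well-known hashing technique for comparing such behavior at scale is MinHash, which summarizes a set as a short signature whose agreement rate estimates Jaccard similarity. The sketch below is an assumption chosen for illustration; the article does not say which hashing scheme the project uses. The profiles, token names, and helper functions are hypothetical.

```python
import hashlib

def _hash(seed: int, token: str) -> int:
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def minhash_signature(tokens, num_hashes=64):
    """Keep the minimum hash per seed; tokens must be a non-empty set."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

# Hypothetical per-account behavior sets (sites visited, IP addresses seen).
profiles = {
    "tv_account": {"site:news.example", "site:video.example", "ip:203.0.113.7"},
    "phone_app": {"site:news.example", "site:video.example", "ip:203.0.113.7",
                  "site:maps.example"},
    "stranger": {"site:shop.example", "ip:198.51.100.9"},
}
signatures = {name: minhash_signature(tokens) for name, tokens in profiles.items()}

likely_same = estimated_jaccard(signatures["tv_account"], signatures["phone_app"])
likely_different = estimated_jaccard(signatures["tv_account"], signatures["stranger"])
print(likely_same, likely_different)
```

Because signatures have a fixed length, accounts can be compared without storing or exchanging their full histories, which suits both the scale and the privacy goals described above. A real system would need far more care on both counts.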

The Adobe Digital Experience Research Award grants $50,000 to faculty working to promote the understanding and use of data science in the area of marketing. The goal is both theoretical and empirical development of solutions to problems in marketing.