Jason Brancazio is a Senior Software Engineer on the Digital Platforms Team of Red Bull Media House. He is currently building a recommendation engine for media content on Red Bull’s website and in its TV apps. He previously worked for TripAdvisor as a data engineer supporting data ingestion and transformation processes at scale. Jason also worked for 13 years as a forensic accountant, helping attorneys value businesses and model losses in civil litigation. He holds a bachelor’s degree in economics from Princeton.
In his talk, Jason will share the development strategies he employed to productionize the first phase of a recommendation engine for Red Bull TV. He will tell the story of how he transformed a Python script provided by his collaborator into a modularized, containerized, cloud-hosted batch processing application, and how quick iteration in Jupyter notebooks sharing the same code repository and environment remained essential to the task.
Josh Muncke leads the Data Science practice for Red Bull North America. His team works across the beverage, distribution and media businesses of Red Bull in the US to deliver growth-focused initiatives using machine learning and statistical modelling. Josh’s team is also responsible for developing and delivering the data training curriculum at Red Bull, which aims to better equip analysts and managers with the skills and knowledge to up-level their own use of data. Prior to joining Red Bull, Josh spent 7+ years in data science consulting at Deloitte and IBM. Josh holds a BSc in Physics from the University of Manchester in the UK.
James thinks about himself too much. He has other faults too, but that’s the biggest one. Now, with that out of the way, we can talk about his skills and accomplishments. He studied applied physics and computer science at BYU. His last class was STATS 431. Everyone told him he’d hate it. He loved it. On the first day of class, Professor Whiting (https://www.linkedin.com/in/davidwhiting/) started a discussion on the existence of absolute truth. Things like the speed of light or Avogadro’s number are absolute — but we will never know them exactly because our instruments will never be perfect. But we can estimate them! Statistics is where math hits the road … and James was hooked. A few years later he completed his first graduate degree.
Big data can be daunting. “With all this data I should be doing something important!” you might think. “I mean, the government has big data and they are definitely doing something important with it, right? — well at least Google does… and is…” Often we interpret “something important” as making big decisions. But sometimes it really means making lots of little good decisions. At Simpli.fi we process tens of billions of bid requests from billions of devices each day as we interact with 20 ad exchanges and service the diverse needs of our approximately 70,000 active campaigns and over 400,000 creatives (at last count). We should be doing something important with this data — and we are. Essentially our task is to make billions of good bid decisions to optimize the effectiveness of our active campaigns. A good bid decision involves quickly analyzing the bid request dimensions, matching them to an active campaign and then setting the bid price just right to capture the opportunity value presented. All of these little good decisions add up to something important. They provide value to our customers.
This presentation introduces a method that uses a Bayesian beta-distribution model to adjust bid prices based on the performance of the discrete decision cell in which the bid request falls.
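To make the idea concrete, here is a minimal sketch of bid adjustment driven by a Beta posterior. This is an illustration only, not Simpli.fi’s actual system: the scaling rule, the uniform Beta(1, 1) prior, and the function name are all assumptions. Sampling from the posterior (Thompson sampling) naturally explores cells with little data while exploiting cells that have proven their value.

```python
import random

def beta_adjusted_bid(base_bid, successes, trials, alpha0=1.0, beta0=1.0):
    """Adjust a bid using the Beta posterior over a decision cell's
    success rate (e.g., the conversion rate of bids matching this cell)."""
    alpha = alpha0 + successes              # prior + observed successes
    beta = beta0 + (trials - successes)     # prior + observed failures
    sampled_rate = random.betavariate(alpha, beta)
    prior_mean = alpha0 / (alpha0 + beta0)
    # scale the bid by how the sampled rate compares to the prior mean
    return base_bid * sampled_rate / prior_mean
```

On average, a cell with 8 successes in 10 trials will bid more aggressively than one with 1 success in 10 trials, while individual sampled bids still vary enough to keep gathering evidence about every cell.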
Anna is a Data Scientist at The New York Times. She works on identifying drivers of reader engagement and builds recommendation models that serve personalized article suggestions to users of the website. Prior to working at the Times, she received her PhD in Cognitive Psychology at New York University, where she used behavioral experiments and computational models to understand how people learn, gather information, and make decisions. She has also worked as a research analyst conducting psychological experiments to help businesses understand customer decisions.
The Data Science group at The New York Times develops and deploys machine learning solutions to newsroom and business problems. Re-framing real-world questions as machine learning tasks requires not only adapting and extending models and algorithms to new or special cases but also sufficient breadth to know the right method for the right challenge. I’ll first outline how unsupervised, supervised, and reinforcement learning methods are increasingly used in human applications for description, prediction, and prescription, respectively. I’ll then focus on the ‘prescriptive’ cases, showing how methods from the reinforcement learning and causal inference literatures can be of direct impact in engineering, business, and decision-making more generally.
Chris Wiggins is an associate professor of applied mathematics at Columbia University and the Chief Data Scientist at The New York Times. At Columbia he is a founding member of the executive committee of the Data Science Institute, and of the Department of Systems Biology, and is affiliated faculty in Statistics. He is a co-founder and co-organizer of hackNY (http://hackNY.org), a nonprofit which since 2010 has organized once-a-semester student hackathons and the hackNY Fellows Program, a structured summer internship at NYC startups. Prior to joining the faculty at Columbia he was a Courant Instructor at NYU (1998-2001) and earned his PhD at Princeton University (1993-1998) in theoretical physics. He is a Fellow of the American Physical Society and is a recipient of Columbia’s Avanessians Diversity Award.
Bio coming soon.
Abstract coming soon.
Slater Victoroff is Indico’s founder and CTO. He has been building AI, machine learning and deep learning solutions for the enterprise for the better part of the past decade having worked with everyone from the federal government to two-person startups to the Fortune 100. Slater has educated hundreds of business users in the successful implementation of deep learning through a simple framework that helps executives rapidly accelerate the adoption of the technology in their businesses. Slater attended Olin College of Engineering before pursuing Indico full-time after its acceptance into the Techstars Boston Program.
Sam Vidal has over twenty-five years of experience in analytics and programming, ranging from database design to cloud-based architecture and front-end interfaces, with specific expertise in predictive and media analytics. During his six years at Nielsen, Sam was lead developer and subject matter expert for custom projects, which frequently included custom algorithm solutions and advanced analytic design. Sam has experience working within a broad spectrum of analytic verticals, including outdoor measurement, digital advertising, TV and cable networks, local and national political campaigns, automotive, consumer product goods, and technology giants like Google, Microsoft and Yahoo. Most notably, Sam helped a prominent financial institution gather data from multiple stock exchanges, build predictive models, and report the data with state-of-the-art data visualization techniques across a multi-platform mobile application.
The future of AI-powered immersive media should be a monthly discussion topic on the agenda of every entertainment property’s strategy meeting. With consumers making purchases on over 3 billion smartphones worldwide, it’s becoming apparent that the bridge into the Internet of Things is closer than we think. Smartphones are just the tip of the immersive media iceberg, with wearables and smart speaker advertising on the horizon. Brands are becoming more savvy about how to monitor and track customer behavior more efficiently from these devices to help them craft content that best suits their target market. This type of machine learning is helping corporations make better decisions about their products. We’d like to discuss examples of how immersive media helps brands stay ahead of the curve.
Jen Walraven leads Physical Production Science and Analytics at Netflix. Her team drives data science initiatives throughout the Netflix Studio, partnering with creative teams to inform and empower global production of Netflix Original content. Prior to joining Netflix, Jen led the Data Engineering team at Nomis Solutions, focusing on data strategy and scalable infrastructure in the financial services industry. She has also worked on customer and financial fraud analytics at several consulting firms. Jen holds a BA in Computer Science from UC Berkeley.
Netflix entered the world of content production with its first Original title in 2012 and has since grown to produce over 700 Original titles around the world. Spanning pre-production, production, post-production, and localization and quality control, content production is a complex operation that consumes and generates significant amounts of data. Translating data insight into actionable recommendations alongside creative teams can introduce tremendous efficiency and scalability into the production process. In this talk, we’ll discuss how data science can help tackle critical challenges in the production space, as well as opportunities on the horizon in a transforming entertainment industry.
Ambrish is Director of Product for the Data Platform and Personal Loans business unit at Credit Karma, where he has been leading efforts to transform financial underwriting and architecting a recommendation system that helps over 80M Americans make the right financial decisions at every step. Prior to Credit Karma, Ambrish co-founded a digital health startup that provided a smart digital health assistant for patients with pre-diabetes and obesity-related conditions. He also held product and engineering leadership roles building the digital acquisition platform at Electronic Arts and one of the largest search & advertising platforms, Bing.com, at Microsoft. Ambrish holds a Master’s in Computer Science from the University of Southern California and an MBA from the Haas School of Business at UC Berkeley.
People spend more time shopping for hotels where they stay for 2 nights than shopping for a loan that they carry for 2 years. By applying machine learning models to terabytes of financial data for over 80M Americans, Credit Karma helps people make better financial decisions at every step in their lives. The more specific you can get in your recommendations for users, the more trust and, ultimately, loyalty you’re able to maintain. Credit Karma only surfaces the most appropriate products for each member to ensure they are matched with the product that makes the most financial sense for their profile. On a macro level, Credit Karma is using this data to address the issue of mispriced financial products Americans live with, often helping them find alternatives that make financial progress possible for everyone. For example, with over $1 trillion in credit card debt, many Americans are paying more than 3x the interest they could get with a personal loan, and if Americans refinanced their auto loans, they could save over $30 billion. Examples like these show how Credit Karma is changing the way consumers make their financial decisions.
Dr. Arun Verma joined the Bloomberg Quantitative Research group in 2003. Prior to that, he earned his Ph.D. from Cornell University in the areas of computer science and applied mathematics. At Bloomberg, Arun’s work initially focused on stochastic volatility models for pricing and hedging derivatives and exotic financial instruments. More recently, he has enjoyed working at the intersection of diverse areas such as data science, cross-asset quantitative finance models and machine learning & AI methods to help reveal embedded signals in traditional & alternative data.
The high volume and time sensitivity of news and social media stories require automated processing to quickly extract actionable information. However, the unstructured nature of textual information presents challenges that machine learning techniques are well suited to address. This talk will cover the following topics:
• The application of machine learning in finance
• Extracting sentiment from news stories and social media content using machine learning algorithms
• Quantitative techniques for constructing aggregated sentiment scores and other derived metrics (e.g., sentiment dispersion)
• Demonstrating sentiment-signal-based trading strategies that have high risk-adjusted returns
• Illustrating how the sensitivity of sentiment varies with industry sector, market cap, trading volume, etc.
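One plausible construction of an aggregated sentiment score and a dispersion metric is a confidence-weighted mean plus the spread of the underlying story-level scores. This is purely illustrative; the abstract does not specify Bloomberg’s actual formulas, and the function name and weighting scheme here are assumptions.

```python
import statistics

def aggregate_sentiment(scores, confidences):
    """Combine per-story sentiment scores (e.g., in [-1, 1]) into a
    confidence-weighted aggregate, plus a dispersion metric capturing
    how much the individual stories disagree."""
    total = sum(confidences)
    aggregate = sum(s * c for s, c in zip(scores, confidences)) / total
    dispersion = statistics.pstdev(scores)  # population std. deviation
    return aggregate, dispersion
```

High dispersion flags names where the news flow is mixed, which is one way the sensitivity of sentiment can vary across sectors and market caps as the talk describes.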
Abstract coming soon.
Madhav Khurana has a vast amount of experience in data science and has worked in India, the UK, Sweden and Germany. He is a Senior Data Scientist at Careem in Berlin, currently leading efforts to optimize the dispatching of drivers to customers. As a Data Scientist at King, he worked on a variety of projects to increase player retention in Candy Crush games. He also created a dynamic level-difficulty model to improve the games’ engagement. Prior to applying his data science expertise and skills in the mobile applications industry, Madhav helped numerous telecom operators in Asia, Africa, North America and Europe with their customer retention strategies. He did that while working as a Data Scientist at IMImobile, by predicting customer churn, identifying churn triggers, calculating customer lifetime value and devising plans for effective customer relationship management.
The talk is about using machine learning to optimize the mobile application experience. Work done by the speaker on Candy Crush games and Careem’s ride-hailing application is showcased in detail, including how a major failure established the need to predict user behavior. An elaborate use case from Careem is also presented to illustrate best practices in making predictions with mobile app data. A major part of the talk is on feature engineering, arguably the most important aspect of applied machine learning. Various alternative experiment designs are discussed for dealing with interference bias when testing new mobile application features. While the discussed cases are about mobile applications, all of these practices can be employed for prediction problems in any other industry.
Colleen M. Farrelly is a data scientist at Graham Holdings (Kaplan Higher and Professional Education) whose research focuses on applications of topology and differential geometry in machine learning and data science. Her industry experience spans healthcare, genomics, education, and business. Lately, she has enjoyed writing articles for lay audiences, which can be found on KDnuggets and Quora.
Identifying trends within high-dimensional datasets can be difficult, and visualization can guide further exploration of the data or provide an easy-to-use check of analysis results. Manifold learning is a broad class of algorithms that map high-dimensional data to lower-dimensional spaces without making assumptions about linearity the way principal component analysis does. R provides many good packages that wrangle high-dimensional data for easy visualization of trends and subgroups. This talk will demonstrate a couple of useful methods on an open-source multivariate time series dataset using a few lines of R code.
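The talk works in R, but the flavor of such an algorithm can be shown in any language. Below is a minimal, numpy-only sketch of Isomap — one standard member of the manifold-learning family — built from a k-nearest-neighbor graph, graph-geodesic distances, and classical multidimensional scaling. It is an illustration of the idea, not the optimized implementations the R packages in the talk provide.

```python
import numpy as np

def isomap(X, n_neighbors=6, n_components=2):
    """Embed points X (n_samples, n_features) into n_components dims,
    preserving geodesic distances along the data manifold."""
    n = len(X)
    # pairwise Euclidean distances
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # k-nearest-neighbor graph: keep only edges to each point's neighbors
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        idx = np.argsort(D[i])[1:n_neighbors + 1]
        G[i, idx] = D[i, idx]
        G[idx, i] = D[i, idx]
    # geodesic distances via Floyd-Warshall shortest paths
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # classical MDS on the squared geodesic distance matrix
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_components]
    return V[:, order] * np.sqrt(np.maximum(w[order], 0.0))
```

Applied to points sampled along a curved 1-D manifold, a one-component embedding effectively “unrolls” the curve — exactly the kind of structure PCA’s linearity assumption would miss.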
Anna Veronika graduated from the Faculty of Computational Mathematics and Cybernetics of Lomonosov Moscow State University and Yandex School of Data Analysis. She used to work at ABBYY, Microsoft, Bing and Google, and has been working at Yandex since 2015, where she currently holds the position of the head of Machine Learning Systems group.
Gradient boosting is a powerful machine-learning technique that achieves state-of-the-art results in a variety of practical tasks. For a number of years, it has remained the primary method for learning problems with heterogeneous features, noisy data, and complex dependencies: web search, recommendation systems, weather forecasting, and many others. CatBoost (http://catboost.yandex) is a new open-source gradient boosting library that outperforms existing publicly available implementations of gradient boosting in terms of quality. It also has a set of additional advantages.
1. CatBoost is able to incorporate categorical features in your data (like music genre, URL, search query, etc.) in predictive models with no additional preprocessing. For more details on our approach please refer to our NIPS 2017 ML Systems Workshop paper (http://learningsys.org/nips17/assets/papers/paper_11.pdf).
2. CatBoost inference is 20-60 times faster than in other open-source gradient boosting libraries, which makes it possible to use CatBoost for latency-critical tasks.
3. CatBoost has the fastest GPU and multi-GPU training implementations of all the openly available gradient boosting libraries.
4. CatBoost requires no hyperparameter tuning in order to get a model with good quality.
The talk will give a broad description of gradient boosting and its areas of usage, as well as the differences between CatBoost and other gradient boosting libraries. We will also briefly explain the details of the proprietary algorithm that leads to a boost in quality.
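To give a flavor of the categorical-feature handling described in the paper linked above, here is a heavily simplified, dependency-free sketch of ordered target statistics: each example is encoded using only the target values of examples that appear before it in a random permutation, which avoids the target leakage of naive mean-target encoding. CatBoost’s production algorithm does considerably more (e.g., multiple permutations, feature combinations), so treat this as an illustration, not the library’s implementation.

```python
import random

def ordered_target_encoding(categories, targets, prior=0.5, weight=1.0):
    """Encode a categorical feature with ordered target statistics."""
    n = len(categories)
    order = list(range(n))
    random.shuffle(order)       # the random "time" ordering
    sums = {}                   # running sum of targets per category
    counts = {}                 # running count per category
    encoded = [0.0] * n
    for i in order:
        c = categories[i]
        s, k = sums.get(c, 0.0), counts.get(c, 0)
        # smoothed mean of targets seen so far for this category
        encoded[i] = (s + prior * weight) / (k + weight)
        sums[c] = s + targets[i]
        counts[c] = k + 1
    return encoded
```

The first occurrence of each category in the permutation falls back to the prior, and later occurrences shrink toward the category’s observed target mean — with no preprocessing step that peeks at the example’s own label.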
An in-depth look at what differentiates deep learning from the rest of ML. In this session, we will look at similarities between the human brain and deep learning, including the necessary hardware and learning techniques. Deep learning is associated with various forefront ML initiatives such as autonomous vehicles and intelligent digital assistants. This session includes theory about key topics as well as a demo using Python, TensorFlow and Keras.
At Hulu, our focus is always on the viewer, and using data to understand our viewers’ behavior is crucial to our business. With our data models, we can predict and understand retention risk, understand viewer behavior and leverage this data to improve our service. In this session, we’ll dig into how we built these predictive models and discuss how they positively impact our business.
Barry Cassidy is a 15-year veteran of the FT who has held diverse leadership roles running campaign management and planning teams as well as data infrastructure projects. His role as Head of Campaign Planning from 2014 led him to confront the impact of legacy systems and outdated practices on the ability of his teams to deliver timely reporting and impactful insight to customers, an effort that led to his current position as Head of Advertising Data Operations. Prior to the FT, Barry worked for Express Newspapers and News International (now News UK) as well as several creative and design agency start-ups. He is a graduate of the University of Sheffield.
“As the premier global financial media brand, the FT’s Advertising Operations team has a history of award-winning innovation. For years, managing the FT’s growing volume of multi-channel data has been a top priority requiring new tools and techniques. In 2016, FT advertising launched its most ambitious data initiative to date: deploying Data Operations to automatically normalize and unify high-value data across the FT portfolio. The FT’s goal with this project was to improve operational excellence in data management and free AdOps resources from manual data tasks so they could build more innovative ad products faster.”
Joe is a serial entrepreneur, with founding roles in Diamond and Global Accessibility Awareness Day. Joe has more than 20 years of development experience in high-profile projects spanning digital media, machine learning, search engines, and performance management, for Internet backbone providers, investment banks, telcos, Big Pharma and Big Media. In addition to starting several companies, Joe serves on the boards of Cross Campus, LAX Chamber of Commerce, Cloud CMS & Dock.
Xavier Kochhar – Venture Investor, Board Member, Data Architect, AI, Structured Data, and UX/UI Advisor, and Founder, The Video Genome Project (Hulu): Xavier Kochhar is a structured data, blockchain, and AI expert who advises C-level executives of MVPD and video distribution companies on their monetization, product, user experience, and data architecture strategies, while sitting on the boards of video OTT, IP rights management, and blockchain technology companies. Mr. Kochhar is also the Founder of The Video Genome Project® (The VGP), a company whose mission is deeply rooted in the belief that the world’s data should be accessible for all to use. The VGP is the largest, broadest, and most granular structured database of video content (film, TV, and online video) in the industry. The VGP and its underlying structuring technology have had massive implications not only for video content curation tools (such as search, recommendation, and personalization), but also for developers, publishers, content providers, marketers, and users all over the world who utilize any form of video content. Additionally, the company’s Insights division provides analytical lens tools which it layers on top of its structured databases so that its customers can glean actionable insights relating to video content that are used to drive business decisions. The Video Genome Project was acquired by Hulu, the OTT premium video distribution joint venture between 21st Century Fox, The Walt Disney Company, Comcast NBCUniversal, and Time Warner, in order to power personalization, recommendation, search and analytics products in advance of the company’s recently launched live, linear + VOD internet-delivered TV (dMVPD) service. Prior to The Video Genome Project, Mr. Kochhar was Managing Partner of MediaLink (acquired by Ascential plc), a strategic advisory firm operating at the intersection of the media, advertising, and technology communities.
As one of the early executives and part of the original leadership team at MediaLink, Mr. Kochhar helped to build the firm into the media and advertising industry’s preeminent strategy and implementation firm. Mr. Kochhar joined MediaLink from The Walt Disney Company, where he served as a senior executive within Disney Corporate. Before Disney, Mr. Kochhar was a corporate packaging agent at the William Morris Agency in Beverly Hills and a strategy consultant in the Media and Entertainment group for London-based L.E.K. Consulting. Mr. Kochhar began his career in corporate finance working as an investment banker for Tucker Anthony Sutro (Royal Bank of Canada).
As a leader in open data, education, community-building, and civic innovation, Jeanne Holm empowers people to discover new knowledge and collaborate to improve life on Earth and beyond. Jeanne is the Deputy CIO and Senior Technology Advisor to the Mayor for the City of Los Angeles, bringing technical innovations to 4 million people. Her work in Los Angeles focuses on delivering great city services like 311, public television, and social media, and public-private collaborations for technology innovations ranging from data science to broadband. She was formerly the Evangelist for open data for the White House, the leader for Africa open data for the World Bank, and the Chief Knowledge Architect at NASA. She is a Distinguished Instructor at UCLA and a Fellow of the International Academy of Astronautics. She directs two startups that promote peace and social justice through education for innovators throughout the world.
The future holds a more connected ecosystem where government, citizens, and businesses share information in more intertwined ways. Getting access to that information, equitably, is a challenge throughout the world. In the City of Los Angeles, public-private partnerships are helping to pave the way to connect all 4,000,000 residents with the information and services they need to thrive. Learn how Los Angeles is using leading-edge technology to create better access and to connect all of our communities, citizens, and businesses.
Robert Parviainen is leading the data science team at Seriously Digital Entertainment, a gaming and entertainment company with a mission of marrying the creative with the data. Before his career in gaming, Robert held research positions at the University of Melbourne and Reykjavik University, and received a Ph.D. in Mathematical Statistics from Uppsala University in Sweden.
One of the exciting parts about working in data science today is being exposed to, and learning from, the wide variety of backgrounds and experiences in the community. Today I want to contribute a little bit by sharing things from my background in statistics.
I completed my master’s in statistics & applied mathematics at the University of California, Santa Cruz. I worked as a research assistant at the University of Washington in the computer science department. Before coming to UCSC, I worked for Accenture for four years, and I have a B.S. degree in CS.
João Fiadeiro is a Product Manager at Google working in YouTube’s Music team, where he builds tools that help artists connect with their fans, understand their audience and maximize the reach of their music.
Before becoming a Product Manager, João was a data scientist working in YouTube’s Strategy team, where he became an expert on data analysis and thrived when applying structured thinking to ambiguous business challenges. João obtained his master’s degree from the University of Oxford, where he researched the nature of social networks using massive data sets. He also holds a degree in Computer Science and Business from University College London (UCL).
In his spare time João is likely travelling: he grew up in 10 different countries and continues to explore many new places every year. He spends his time at the intersection between art, technology and education: trying to build emotionally sensitive robots or just listening to the newest electronic music tracks.
YouTube’s goal is to drive artist audience size and artist engagement simultaneously. But how do you decouple an organically growing audience from the incremental reach that a social action may result in? This talk explores an ML-based approach to isolating the incremental effect of social activations.
Clara Shin is a Business Insight Analyst on the Data Science team at Disney. Her work mainly focuses on building statistical and machine learning models for audience segmentation. She is also experienced in game analytics, media forecasting and data engineering. Clara earned a Master’s degree in Statistics from the University of Minnesota. Outside of work, she enjoys hiking, biking and playing video games.
Subash DSouza is a respected tech community evangelist. He has been at the forefront of evangelizing the use of data & analytics in the SoCal region and has spent several years honing his database & engineering skills. He currently runs the Los Angeles Big Data Users Group, with over 3,000 members, and the Los Angeles Apache Spark users group, with over 1,500 members. He also organizes Big Data Day LA, a community-driven data conference held annually that draws over 2,000 people every year. By day, Subash runs Big Data & DevSecOps for the Data Intelligence team at WB.
A major aspect of ingesting and processing data is understanding how to bring it together. Today, over 70% of data analytics effort is actually spent cleaning and parsing the data so value can be derived from it, and this is not trivial given the large volumes of datasets we deal with. This talk will go over what it takes to set up data governance principles that everyone can adhere to in order to create a better and quicker analytics system.
Will is a Senior Data Scientist at Netflix in Los Angeles, where he builds machine learning models to predict demand for the movies and TV shows Netflix might want to stream to its subscribers globally. He also occasionally serves as a Data Ambassador for DataKind, bringing state-of-the-art data science practices to bear on problems facing non-profits in the health, education and water sectors since 2013. Previously he worked in the online digital advertising domain in New York. Will received a PhD from Harvard and a BA from Berkeley and has conducted research at Caltech and the University of Chicago, where he specialized in gravitational lensing studies of dark matter and dark energy in the fields of astrophysics and cosmology.
Natasha is a statistician by training with a focus on statistical modeling, programming and theory. She has 14+ years of analytical and data science experience in business applications, consulting and strategy, including vendor management and analyst mentorship, across multiple industries and verticals. Natasha has an applied mathematics background, a programming specialization and a Master’s in Statistics from UCLA. As an LA native, she loves everything about Southern California and has been surrounded by entertainment news and culture most of her life. Her work at NRG directly impacts major entertainment studios and streaming and social media companies in the areas of market research, concept and content optimization and positioning, revenue forecasting and measurement.
“As streaming products begin to proliferate in the entertainment space, moviegoing has been in a constant state of flux. Moviegoers have evolved into highly aware and particular consumers whose entertainment share of wallet has become more and more competitive for theatrical films. This talk will highlight how NRG utilizes AI to help shed light on what motivates theatrical attendance, leading to accurate opening box office forecasts as far as a year before a film’s release.”
Scott has been working in the data science space for over fifteen years — ever since he escaped from social science grad school in the late 90s, narrowly avoiding being branded a Ph.D. (still stinging after the M.S.). He feels lucky to have waited long enough for the business world to wake up to the power of combining math, operations research, and computer science to make more money (or at least look really cool).
NBC.com (and associated apps on 15 or so platforms) has transformed over the last decade from what was once ‘brochureware for the Fall schedule’ to being one of the primary ways people consume our entertainment and news content. After making great gains in performance and scalability, during the last year the focus has been on increasing personalization. This meant moving away from editorially-curated lists of shows to user-specific recommendations powered by an algorithm. Not wanting to reinvent the wheel for this first version, we went with the gold standard of collaborative filtering. I’ll discuss how we implemented it.
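As a rough sketch of that gold standard — illustrative item-item collaborative filtering over a user-item interaction matrix, not NBC’s actual implementation (the function name and scoring rule are assumptions):

```python
import numpy as np

def item_recommendations(ratings, user, k=2):
    """Recommend items for `user` via item-item collaborative filtering.

    ratings: (n_users, n_items) array; 0 means no interaction.
    Returns up to k item indices ranked by predicted score, excluding
    items the user has already interacted with."""
    # cosine similarity between item columns
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0                  # guard against empty items
    sim = (ratings.T @ ratings) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)               # ignore self-similarity
    # predicted score: similarity-weighted sum of the user's interactions
    scores = sim @ ratings[user]
    scores[ratings[user] > 0] = -np.inf      # don't recommend seen items
    return np.argsort(scores)[::-1][:k]
```

The appeal for a first version is exactly what the talk suggests: no content metadata or feature engineering is needed — viewing behavior alone drives the recommendations.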
Talk abstract coming soon.
Vipin is a Principal at Work-Bench, an enterprise-tech-focused VC fund and startup community in New York. Work-Bench helps corporate executives across various industry verticals find the solutions they need for their biggest technology pain points, investing in the startups whose solutions resonate across the board. Vipin covers the firm’s investments in enterprise infrastructure, DevOps and AI. Prior to joining Work-Bench, Vipin worked in the Office of the CIO at Bank of America, where his team vetted hundreds of startups a year and onboarded many as vendors. He’s been involved with the majority of Work-Bench’s investments over the last 3 years, sourced investments in CoreOS and Semmle, and currently serves as Board Observer at Algorithmia, an open marketplace for algorithms. Vipin was named to the 2018 Forbes 30 Under 30 list in the Venture Capital category.
Scott Breitenother is an investor and advisor who specializes in building data-driven organizations. He was employee #16 at direct-to-consumer mattress startup Casper and founded the company’s industry-leading Data & Analytics team. In a former life, Scott was a Management Consultant at L.E.K. Consulting (which is probably where he developed his love of frameworks and structure). He has a BS in Business Management from Babson College and an MSc in International Management from the London School of Economics. When he’s not blogging about analytics trends at LocallyOptimistic.com, you can find him walking around Brooklyn with his wife and daughter.
“The fastest way to doom an Analytics team (and any hope of building a data-driven organization) is to frequently present unreliable data and analyses. But how do you ensure data quality at scale? Analytics leaders from Casper and Harry’s talk about how to build the Data Quality Flywheel – a scalable approach to data quality that empowers everyone in the organization to promote data quality.”
Michael cut his teeth applying econometric research methods to a variety of fields including environmental economics, child welfare policy, healthcare outcomes, and medical treatment efficacy. Michael was the founding member of the analytics team at new-wave men’s grooming brand Harry’s where he drove data science, data engineering, and anything else needed to empower the organization to “make better decisions faster.” In his spare time Michael drinks too much coffee, reads, and pets dogs around Bed-Stuy.
Gilad is the Head of Data Science and Analytics at BuzzFeed, where he leads a team tasked with building state-of-the-art analytics and data products to support entertainment, news, business and tech. Previously Gilad led Data Science at betaworks and built data products at Microsoft. Gilad’s work has been covered across a wide range of academic journals and publications, most recently focused on propaganda and bias within algorithmic ranking systems.
On a daily basis, BuzzFeed produces and distributes thousands of pieces of content across the largest digital media network in the world, fueled by a world class technology team. In this talk we’ll discuss how data science, machine learning and neural networks help power decisions, run forecasts, quantify audiences, and build unique tools that give our content creators and editors superpowers.
Jennifer Shin is the Founder of 8 Path Solutions, a data science, analytics, and technology company based in NYC. She is an experienced data scientist and management consultant who has successfully led complex, large-scale, and high-profile projects as a Product Director at NBCUniversal, Director of Data Science at Comcast, Senior Principal Data Scientist at The Nielsen Company, and Management Consultant at GE Capital, the Carlyle Group, Fortress Investment Group, the City of New York, and Columbia University.
“Similarity measures are the driving force for nearly every machine learning algorithm and AI-driven technology. From cleaning customer information, to ranking product recommendations, to identifying audiences, the applications of machine learning and AI in media and entertainment are limitless.”
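As a small illustration of the kind of measures the abstract refers to (the specific measures covered in the talk are an assumption), here are two of the most common: cosine similarity for numeric feature vectors and Jaccard similarity for sets such as audience segments.

```python
import numpy as np

def cosine_similarity(a, b):
    """Angle-based similarity between two numeric feature vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def jaccard_similarity(a, b):
    """Overlap-based similarity between two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Two product-feature vectors at a 45-degree angle.
print(cosine_similarity(np.array([1.0, 0.0]), np.array([1.0, 1.0])))  # ≈ 0.707
# Two audience-interest sets sharing one of three distinct interests.
print(jaccard_similarity({"sports", "music"}, {"music", "film"}))     # ≈ 0.333
```

Which measure is appropriate depends on the data: cosine ignores magnitude and suits sparse ratings or embeddings, while Jaccard suits binary membership data.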
Christopher Whitely is Senior Director of Data Science and Research at Comcast / Freewheel, working on deep-dive viewership modeling efforts and data-driven advertising campaigns. Previously, as part of Comcast EBI, he led applied analysis projects and developed business intelligence tools to identify drivers of content performance and enhance networks’ programming strategies. Chris also led cross-platform research efforts at the Weather Channel and ad sales research at the national cable networks of NBCU. He holds an MBA from the University of Michigan Ross School of Business and a BA from Middlebury College.
“Marketers are increasingly using data to make more effective media buying decisions and assess the impact of their campaigns. While this is already producing significant benefits, data sets are often limited and have their own biases, making it hard to assess the true impact without a deep understanding of the data. How can data scientists follow a rigorous process for this, and ultimately deliver models and insights to their clients that drive the business? What additional ways can data science and machine learning be applied to the media/entertainment space in the future?”
Igor Uzilevskiy is a Sr. Data Scientist on the Data Integration team at Nielsen, where he works on combining Nielsen data from different sources using statistical methodologies. Igor holds an M.S. in Analytics from Northwestern University.
“Data Integration at Nielsen is used to combine data from different sources, such as various Nielsen panels and surveys, to create modeled single source data. The approach most widely used at Nielsen is referred to as “data fusion” and relies on nearest neighbor matching. For this project we wished to integrate the Nielsen NScore Survey of Consumer Attitudes to Celebrities with three other Nielsen data sources, measuring respectively Consumer Lifestyle Behavior, TV Viewing, and CPG Purchasing. We compared data fusion to logistic regression, and found that logistic regression performed better for this use case according to commonly used metrics.”
Eyal Pfeifel is the CTO and Co-Founder of imperson, a Disney Accelerator alum and developer of conversational AI technology that powers premium conversational bots via text, voice, and video.
Olivia manages data products within the Data Strategy team (data engineering, data science, BI, audience activation, marketing sciences). Her primary focus is building in-house tools for analytics, predictive modeling + enhanced audience segmentation to create unique, personalized experiences on-site across all CN brands. A subset of this focus is an offering called Spire, a highly valued advertising product which enables augmented insights into O+O audiences (marrying purchase data with on-site behavior) + campaign opportunities, as well as partners like Vox, NBCU, Advanced Digital, etc.
Ling is a Senior Research Scientist at Tumblr working with a Data Science and Analytics Team focused on user interest and user behavior analytics with big data. Ling holds her Ph.D. in Statistics from Iowa State University. At Tumblr, she works on many R&D initiatives connecting millions of diverse social media data by leveraging the latest business trends and data science techniques–including machine learning, data mining and predictive models–to create a holistic user experience.
“Tumblr is one of the most popular and vibrant social networks, with over 400 million blogs and 160 billion posts, where your interests connect you with your people. In this talk Ling will provide a brief overview of how Tumblr’s data science and analytics team applies data science techniques to drive product development and operations. She will then discuss recent projects including real-time anomaly detection, user retention and segmentation, and funnel analysis.”
Luis Capelo leads Forbes Media’s Data Products team. His team is responsible for investigating how Forbes articles are distributed and read, identifying patterns that are then used to improve business metrics via new models and algorithms. Its solutions include an AI agent that collaborates with writers, a swarm of bots that help editors distribute content, and a new analysis tool that traces and predicts how content is shared.
“At Forbes, we believe that there is great potential in humans and machines working together. We think that machines ought to enhance human abilities, making human work better. That affects how we write stories. In this talk we will introduce Bertie, our new publishing platform. Bertie is an AI assistant that learns from writers at all times and works with them to suggest improvements to their stories. We will discuss Bertie’s features, architecture, and ultimate goals. We will give special attention to how we implement an ensemble of machine learning models that, together, make up the skill set and personality of an AI assistant.”
Sophia Tee is a Principal Data Scientist at Verizon, where she helps guide supply chain strategy in the Planning Analytics Group. She is a native of the tiny island nation of Singapore and graduate of Northwestern University. After beginning her career in finance, Sophia obtained a Masters Degree in Statistics at Yale University purely so that she can tell people she “models professionally.”
A discussion of the pros and cons of older, current, and emerging machine learning methodologies used for forecasting.
Friederike Schüür is a research engineer at Cloudera Fast Forward Labs, where she imagines what applied machine learning in industry will look like in two years’ time, a time horizon that fosters ambition and yet provides grounding. She dives into new machine learning capabilities and builds fully functioning prototypes that showcase state-of-the-art technology applied to real use cases. She advises clients on how to make use of new machine learning capabilities, from strategy advising to hands-on collaboration with in-house technical teams. She earned a PhD in Cognitive Neuroscience from University College London and is a long-time data science for social good volunteer with DataKind.
“When we humans learn new tasks, we take advantage of knowledge we gained from learning, or having learned, related tasks. Machines tend to struggle to take advantage of such task relationships. Most machine learning algorithms are trained to master one and only one task. Multi-task learning is an approach to problem solving that allows supervised algorithms to master more than one objective (or task). It works by exposing algorithms to not just one but multiple sets of labels, one for each task. Multi-task trained algorithms learn task relationships in an unsupervised fashion for better performance, akin to human learning. Exposure to multiple sets of labels also nudges algorithms to learn more abstract representations of input data that tend to generalize better. In this talk, I introduce multi-task learning. I cover example applications of multi-task learning to image and text data to explain why it has become an exciting approach for practicing data scientists, and I go over the building blocks of a multi-task neural network for text classification (implemented in pytorch) to demonstrate how to build and train multi-task models.”
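The core building block the abstract describes is a shared representation feeding several task-specific heads, trained against multiple label sets at once. The talk uses pytorch; the toy sketch below uses plain numpy for self-containment, with synthetic data and two related binary tasks (both invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inputs and two related label sets (one per task).
n, d, h = 200, 10, 8
X = rng.normal(size=(n, d))
y1 = (X[:, 0] > 0).astype(float)             # task 1 labels
y2 = (X[:, 0] + X[:, 1] > 0).astype(float)   # related task 2 labels

# Shared hidden layer plus one linear head per task.
W_shared = rng.normal(scale=0.1, size=(d, h))
w1 = rng.normal(scale=0.1, size=h)
w2 = rng.normal(scale=0.1, size=h)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

lr = 0.5
for _ in range(500):
    H = np.tanh(X @ W_shared)                 # shared representation
    p1, p2 = sigmoid(H @ w1), sigmoid(H @ w2)
    # Gradients of the summed mean cross-entropy losses of both tasks.
    g1, g2 = (p1 - y1) / n, (p2 - y2) / n
    grad_H = np.outer(g1, w1) + np.outer(g2, w2)
    W_shared -= lr * X.T @ (grad_H * (1 - H**2))  # both tasks update shared weights
    w1 -= lr * H.T @ g1
    w2 -= lr * H.T @ g2

H = np.tanh(X @ W_shared)
acc1 = ((sigmoid(H @ w1) > 0.5) == y1).mean()
acc2 = ((sigmoid(H @ w2) > 0.5) == y2).mean()
print(acc1, acc2)
```

The key line is the shared-weight update: gradients from both heads flow into `W_shared`, so each task shapes the representation the other uses.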
Amy Yu is Senior Director of Product Strategy & Data Science in the Audience Science team at Viacom, where she leads the design and development of visual data science platforms that enable data-driven decisions across Viacom’s portfolio brands. Amy holds a M.S. in Media Arts and Sciences from MIT Media Lab and graduated from the Jerome Fisher Program in Management and Technology at the University of Pennsylvania.
“In today’s rapidly evolving media landscape, data is emerging as the differentiating factor driving critical innovations and decisions within the content industry. Data science is a critical source of new insights into audience behavior, and is a growing force in shaping how content is created and delivered. This talk will review the key challenges of data science at scale in media, and discuss how Viacom is innovating in the space of big data and advanced audience analytics.”
Bio coming soon
Data Scientist and Visualization Engineer
Mollie wears many technical hats including that of a data scientist, a data visualization engineer, and an instructor of both fields. In her career, Mollie has worked on projects involving a wide variety of problems including but not limited to interactive data visualizations, exploratory data analysis, machine learning, corporate data tool creation, course development, instruction, and ideation. Mollie previously worked with Datascope Analytics as a data scientist / consultant and at Metis as a data visualization (D3.js) and data science instructor. In addition to freelance work, she is currently working as a technical mentor with the Data Science for Social Good Fellowship with the University of Chicago. When Mollie is not being a technical nerd, she swing dances as much as possible, listens to educational podcasts, and strives to be all-around fabulous.