Note: The TDS podcast’s current run has ended.
Researchers and business leaders at the forefront of the field unpack the most pressing questions around data science and AI.
Copyright: © The TDS team
Progress in AI has been accelerating dramatically in recent years, and even months. It seems like every other day, there’s a new, previously-believed-to-be-impossible feat of AI that’s achieved by a world-leading lab. And increasingly, these breakthroughs have been driven by the same, simple idea: AI scaling.
For those who haven’t been following the AI scaling saga, scaling means training AI systems with larger models, using increasingly absurd quantities of data and processing power. So far, empirical studies by the world’s top AI labs seem to suggest that scaling is an open-ended process that can lead to ever more capable and intelligent systems, with no clear limit.
And that’s led many people to speculate that scaling might usher in a new era of broadly human-level or even superhuman AI — the holy grail AI researchers have been after for decades.
And while that might sound cool, an AI that can solve general reasoning problems as well as or better than a human might actually be an intrinsically dangerous thing to build.
At least, that’s the conclusion that many AI safety researchers have come to following the publication of a new line of research that explores how modern AI systems tend to solve problems, and whether we should expect more advanced versions of them to perform dangerous behaviours like seeking power.
This line of research in AI safety is called “power-seeking”, and although it’s not yet well understood outside the frontier of AI safety and AI alignment research, it’s starting to draw a lot of attention. The first major theoretical study of power-seeking was led by Alex Turner, who’s appeared on the podcast before, and was published at NeurIPS (the world’s top AI conference), for example.
And today, we’ll be hearing from Edouard Harris, an AI alignment researcher and one of my co-founders at the AI safety company Gladstone AI. Ed’s just completed a significant piece of AI safety research that extends Alex Turner’s original power-seeking work, and that shows what seems to be the first experimental evidence suggesting that we should expect highly advanced AI systems to seek power by default.
What does power-seeking really mean, though? And what does all this imply for the safety of future, general-purpose reasoning systems? That’s what this episode will be all about.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 0:00 Intro
- 4:00 Alex Turner's research
- 7:45 What technology wants
- 11:30 Universal goals
- 17:30 Connecting observations
- 24:00 Micro power seeking behaviour
- 28:15 Ed's research
- 38:00 The human as the environment
- 42:30 What leads to power seeking
- 48:00 Competition as a default outcome
- 52:45 General concern
- 57:30 Wrap-up
It’s no secret that a new generation of powerful and highly scaled language models is taking the world by storm. Companies like OpenAI, AI21Labs, and Cohere have built models so versatile that they’re powering hundreds of new applications, and unlocking entire new markets for AI-generated text.
In light of that, I thought it would be worth exploring the applied side of language modelling — to dive deep into one specific language model-powered tool, to understand what it means to build apps on top of scaled AI systems. How easily can these models be used in the wild? What bottlenecks and challenges do people run into when they try to build apps powered by large language models? That’s what I wanted to find out.
My guest today is Amber Teng, and she’s a data scientist who recently published a blog that got quite a bit of attention, about a resume cover letter generator that she created using GPT-3, OpenAI’s powerful and now-famous language model. I thought her project would make for a great episode, because it exposes so many of the challenges and opportunities that come with the new era of powerful language models that we’ve just entered.
So today we’ll be exploring exactly that: looking at the applied side of language modelling and prompt engineering, understanding how large language models have made new apps not only possible but also much easier to build, and the likely future of AI-powered products.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 0:00 Intro
- 2:30 Amber’s background
- 5:30 Using GPT-3
- 14:45 Building prompts up
- 18:15 Prompting best practices
- 21:45 GPT-3 mistakes
- 25:30 Context windows
- 30:00 End-to-end time
- 34:45 The cost of one cover letter
- 37:00 The analytics
- 41:45 Dynamics around company-building
- 46:00 Commoditization of language modelling
- 51:00 Wrap-up
Imagine you’re a big hedge fund, and you want to go out and buy yourself some data. Data is really valuable for you — it’s literally going to shape your investment decisions and determine your outcomes.
But the moment you receive your data, a cold chill runs down your spine: how do you know your data supplier gave you the data they said they would? From your perspective, you’re staring down 100,000 rows in a spreadsheet, with no way to tell if half of them were made up — or maybe more, for that matter.
This might seem like an obvious problem in hindsight, but it’s one most of us haven’t even thought of. We tend to assume that data is data, and that 100,000 rows in a spreadsheet is 100,000 legitimate samples.
The challenge of making sure you’re dealing with high-quality data, or at least that you have the data you think you do, is called data observability, and it’s surprisingly difficult to solve for at scale. In fact, there are now entire companies that specialize in exactly that — one of which is Zectonal, whose co-founder Dave Hirko will be joining us for today’s episode of the podcast.
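To make that concrete, here’s a minimal sketch (in Python with pandas; the file name, expected row count and checks are hypothetical) of the kind of basic trust checks a data buyer might run on a delivered file — the sort of thing data observability tooling automates and extends.

```python
import pandas as pd

def sanity_check(path: str, expected_rows: int) -> dict:
    """Run a few basic trust checks on a delivered dataset."""
    df = pd.read_csv(path)
    return {
        # Did we get roughly the number of rows we were promised?
        "row_count_ok": len(df) >= expected_rows,
        # Exact duplicate rows can be a sign of padded or fabricated data.
        "duplicate_rows": int(df.duplicated().sum()),
        # How much of each column is missing?
        "null_fraction": df.isna().mean().round(3).to_dict(),
    }

# Hypothetical usage: the supplier promised 100,000 rows.
# report = sanity_check("supplier_data.csv", expected_rows=100_000)
```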
Dave has spent his career understanding how to evaluate and monitor data at massive scale. He did that first at AWS in the early days of cloud computing, and now through Zectonal, where he’s working on strategies that allow companies to detect issues with their data — whether they’re caused by intentional data poisoning, or unintentional data quality problems. Dave joined me to talk about data observability, data as a new vector for cyberattacks, and the future of enterprise data management on this episode of the TDS podcast.
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
Today, we live in the era of AI scaling. It seems like everywhere you look, people are pushing to make large language models larger or more multi-modal, leveraging ungodly amounts of processing power to do it.
But although that’s one of the defining trends of the modern AI era, it’s not the only one. At the far opposite extreme from the world of hyperscale transformers and giant dense nets is the fast-evolving world of TinyML, where the goal is to pack AI systems onto small edge devices.
My guest today is Matthew Stewart, a deep learning and TinyML researcher at Harvard University, where he collaborates with the world’s leading IoT and TinyML experts on projects aimed at getting small devices to do big things with AI. Recently, along with his colleagues, Matt co-authored a paper that introduced a new way of thinking about sensing.
The idea is to tightly integrate machine learning and sensing on one device. For example, today we might have a sensor like a camera embedded on an edge device, and that camera would have to send data about all the pixels in its field of view back to a central server that might take that data and use it to perform a task like facial recognition. But that’s not great because it involves sending potentially sensitive data — in this case, images of people’s faces — from an edge device to a server, introducing security risks.
So instead, what if the camera’s output was processed on the edge device itself, so that all that had to be sent to the server was much less sensitive information, like whether or not a given face was detected? These systems — where edge devices harness onboard AI, and share only processed outputs with the rest of the world — are what Matt and his colleagues call ML sensors.
ML sensors really do seem like they’ll be part of the future, and they introduce a host of challenging ethical, privacy, and operational questions that I discussed with Matt on this episode of the TDS podcast.
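As a rough illustration of that idea (a toy sketch, not the actual ML sensor designs from Matt’s paper), the key property is that raw pixels never leave the device — only a processed, low-sensitivity output does:

```python
import numpy as np

class FaceDetectionSensor:
    """Toy ML sensor: frames are processed on-device; only a flag is emitted."""

    def __init__(self, detector):
        # `detector` is any callable mapping a frame to a face probability.
        self._detector = detector

    def read(self, frame: np.ndarray) -> dict:
        prob = float(self._detector(frame))
        # The raw frame is never part of the returned payload.
        return {"face_detected": prob > 0.5}

# Hypothetical stand-in for an on-device model.
toy_detector = lambda frame: frame.mean()
sensor = FaceDetectionSensor(toy_detector)
print(sensor.read(np.random.rand(64, 64)))  # True or False, never the pixels
```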
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 3:20 Special challenges with TinyML
- 9:00 Most challenging aspects of Matt’s work
- 12:30 ML sensors
- 21:30 Customizing the technology
- 24:45 Data sheets and ML sensors
- 31:30 Customers with their own custom software
- 36:00 Access to the algorithm
- 40:30 Wrap-up
Deep learning models — transformers in particular — are defining the cutting edge of AI today. They’re based on an architecture called an artificial neural network, as you probably already know if you’re a regular Towards Data Science reader. And if you are, then you might also already know that as their name suggests, artificial neural networks were inspired by the structure and function of biological neural networks, like those that handle information processing in our brains.
So it’s a natural question to ask: how far does that analogy go? Today, deep neural networks can master an increasingly wide range of skills that were historically unique to humans — skills like creating images, or using language, planning, playing video games, and so on. Could that mean that these systems are processing information like the human brain, too?
To explore that question, we’ll be talking to JR King, a CNRS researcher at the Ecole Normale Supérieure, affiliated with Meta AI, where he leads the Brain & AI group. There, he works on identifying the computational basis of human intelligence, with a focus on language. JR is a remarkably insightful thinker, who’s spent a lot of time studying biological intelligence, where it comes from, and how it maps onto artificial intelligence. And he joined me to explore the fascinating intersection of biological and artificial information processing on this episode of the TDS podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
It’s no secret that the US and China are geopolitical rivals. And it’s also no secret that that rivalry extends into AI — an area both countries consider to be strategically critical.
But in a context where potentially transformative AI capabilities are being unlocked every few weeks, many of which lend themselves to military applications with hugely destabilizing potential, you might hope that the US and China would have robust agreements in place to deal with things like runaway conflict escalation triggered by an AI powered weapon that misfires. Even at the height of the cold war, the US and Russia had robust lines of communication to de-escalate potential nuclear conflicts, so surely the US and China have something at least as good in place now… right?
Well, they don’t, and to understand the reason why — and what we should do about it — I’ll be speaking to Ryan Fedashuk, a Research Analyst at Georgetown University’s Center for Security and Emerging Technology and Adjunct Fellow at the Center for a New American Security. Ryan recently wrote a fascinating article for Foreign Policy Magazine, where he outlines the challenges and importance of US-China collaboration on AI safety. He joined me to talk about the US and China’s shared interest in building safe AI, how each side views the other, and what realistic China AI policy looks like on this episode of the TDS podcast.
There’s a website called thispersondoesnotexist.com. When you visit it, you’re confronted by a high-resolution, photorealistic AI-generated picture of a human face. As the website’s name suggests, there’s no human being on the face of the earth who looks quite like the person staring back at you on the page.
Each of those generated pictures is a piece of data that captures so much of the essence of what it means to look like a human being. And yet it does so without telling you anything whatsoever about any particular person. In that sense, it’s fully anonymous human face data.
That’s impressive enough, and it speaks to how far generative image models have come over the last decade. But what if we could do the same for any kind of data?
What if I could generate an anonymized set of medical records or financial transaction data that captures all of the latent relationships buried in a private dataset, without the risk of leaking sensitive information about real people? That’s the mission of Alex Watson, the Chief Product Officer and co-founder of Gretel AI, where he works on unlocking value hidden in sensitive datasets in ways that preserve privacy.
What I realized talking to Alex was that synthetic data is about much more than ensuring privacy. As you’ll see over the course of the conversation, we may well be heading for a world where most data can benefit from augmentation via data synthesis — where synthetic data brings privacy value almost as a side-effect of enriching ground truth data with context imported from the wider world.
Alex joined me to talk about data privacy, data synthesis, and what could be the very strange future of the data lifecycle on this episode of the TDS podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
My guests today are two ML researchers with world-class pedigrees who decided to build a company that puts AI on the blockchain. Now to most people — myself included — “AI on the blockchain” sounds like a winning entry in some kind of startup buzzword bingo. But what I discovered talking to Jacob and Ala was that they actually have good reasons to combine those two ingredients.
At a high level, doing AI on a blockchain allows you to decentralize AI research and reward labs for building better models, and not for publishing papers in flashy journals with often biased reviewers.
And that’s not all — as we’ll see, Ala and Jacob are taking on some of the thorniest current problems in AI with their decentralized approach to machine learning. Everything from the problem of designing robust benchmarks to rewarding good AI research and even the centralization of power in the hands of a few large companies building powerful AI systems — these problems are all in their sights as they build out Bittensor, their AI-on-the-blockchain-startup.
Ala and Jacob joined me to talk about all those things and more on this episode of the TDS podcast.
---
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
---
As you might know if you follow the podcast, we usually talk about the world of cutting-edge AI capabilities, and some of the emerging safety risks and other challenges that the future of AI might bring. But I thought that for today’s episode, it would be fun to change things up a bit and talk about the applied side of data science, and how the field has evolved over the last year or two.
And I found the perfect guest to do that with: her name is Sadie St. Lawrence, and among other things, she’s the founder of Women in Data — a community that helps women enter the field of data and advance throughout their careers — and she’s also the host of the Data Bytes podcast, a seasoned data scientist and a community builder extraordinaire. Sadie joined me to talk about her founder’s journey, what data science looks like today, and even the possibilities that blockchains introduce for data science on this episode of the Towards Data Science podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
If the name data2vec sounds familiar, that’s probably because it made quite a splash on social and even traditional media when it came out, about two months ago. It’s an important entry in what is now a growing list of strategies that are focused on creating individual machine learning architectures that handle many different data types, like text, image and speech.
Most self-supervised learning techniques involve getting a model to take some input data (say, an image or a piece of text) and mask out certain components of those inputs (say by blacking out pixels or words) in order to get the models to predict those masked out components.
That “filling in the blanks” task is hard enough to force AIs to learn facts about their data that generalize well, but it also means training models to perform tasks that are very different depending on the input data type. Filling in blacked out pixels is quite different from filling in blanks in a sentence, for example.
So what if there was a way to come up with one task that we could use to train machine learning models on any kind of data? That’s where data2vec comes in.
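To make the “filling in the blanks” setup concrete, here’s a minimal, hypothetical sketch of how a masked training example can be constructed for text. (data2vec’s actual objective differs in an important way: the model predicts latent representations produced by a teacher network rather than the raw masked tokens, which is what lets the same task work across modalities.)

```python
import random

MASK = "<mask>"

def make_masked_example(tokens, mask_prob=0.15):
    """Hide random tokens; the model is trained to recover what was hidden."""
    inputs, targets = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)    # loss is computed on these positions
        else:
            inputs.append(tok)
            targets.append(None)   # no loss on visible positions
    return inputs, targets

sentence = "the quick brown fox jumps over the lazy dog".split()
print(make_masked_example(sentence))
```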
For this episode of the podcast, I’m joined by Alexei Baevski, a researcher at Meta AI and one of the creators of data2vec. In addition to data2vec, Alexei has been involved in quite a bit of pioneering work on text and speech models, including wav2vec, Facebook’s widely publicized unsupervised speech model. Alexei joined me to talk about how data2vec works and what’s next for that research direction, as well as the future of multi-modal learning.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
AI scaling has really taken off. Ever since GPT-3 came out, it’s become clear that one of the things we’ll need to do to move beyond narrow AI and towards more generally intelligent systems is going to be to massively scale up the size of our models, the amount of processing power they consume and the amount of data they’re trained on, all at the same time.
That’s led to a huge wave of highly scaled models that are incredibly expensive to train, largely because of their enormous compute budgets. But what if there was a more flexible way to scale AI — one that allowed us to decouple model size from compute budgets, so that we can track a more compute-efficient course to scale?
That’s the promise of so-called mixture of experts models, or MoEs. Unlike more traditional transformers, MoEs don’t update all of their parameters on every training pass. Instead, they route inputs intelligently to sub-models called experts, which can each specialize in different tasks. On a given training pass, only those experts have their parameters updated. The result is a sparse model, a more compute-efficient training process, and a new potential path to scale.
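Here’s a minimal sketch of the routing idea in plain NumPy (made-up dimensions, top-1 routing, no load balancing): a learned gate scores the experts for each token, and only the winning expert’s parameters touch that token.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts = 8, 4

# One tiny linear "expert" per slot, plus a gating matrix.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
gate = rng.normal(size=(d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its single highest-scoring expert (top-1 routing)."""
    scores = x @ gate                    # (n_tokens, n_experts)
    chosen = scores.argmax(axis=-1)      # winning expert index per token
    out = np.zeros_like(x)
    for i, e in enumerate(chosen):
        out[i] = x[i] @ experts[e]       # only this expert's weights are used
    return out

tokens = rng.normal(size=(5, d_model))   # a batch of 5 token vectors
print(moe_layer(tokens).shape)           # (5, 8)
```

In a real MoE such as the Switch Transformer, the gate is trained jointly with the experts and an auxiliary load-balancing loss keeps tokens spread across experts, but the sparsity principle is the same.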
Google has been pushing the frontier of research on MoEs, and my two guests today in particular have been involved in pioneering work on that strategy (among many others!). Liam Fedus and Barrett Zoph are research scientists at Google Brain, and they joined me to talk about AI scaling, sparsity and the present and future of MoE models on this episode of the TDS podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
There’s an idea in machine learning that most of the progress we see in AI doesn’t come from new algorithms or model architectures. Instead, some argue, progress comes almost entirely from scaling up compute power, datasets and model sizes — and besides those three ingredients, nothing else really matters.
Through that lens, the history of AI becomes the history of processing power and compute budgets. And if that turns out to be true, then we might be able to do a decent job of predicting AI progress by studying trends in compute power and their impact on AI development.
And that’s why I wanted to talk to Jaime Sevilla, an independent researcher and AI forecaster, and affiliate researcher at Cambridge University’s Centre for the Study of Existential Risk, where he works on technological forecasting and understanding trends in AI in particular. His work’s been cited in a lot of cool places, including Our World In Data, who used his team’s data to put together an exposé on trends in compute. Jaime joined me to talk about compute trends and AI forecasting on this episode of the TDS podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Generating well-referenced and accurate Wikipedia articles has always been an important problem: Wikipedia has essentially become the Internet's encyclopedia of record, and hundreds of millions of people use it to understand the world.
But over the last decade Wikipedia has also become a critical source of training data for data-hungry text generation models. As a result, any shortcomings in Wikipedia’s content are at risk of being amplified by the text generation tools of the future. If one type of topic or person is chronically under-represented in Wikipedia’s corpus, we can expect generative text models to mirror — or even amplify — that under-representation in their outputs.
Through that lens, the project of Wikipedia article generation is about much more than it seems — it’s quite literally about setting the scene for the language generation systems of the future, and empowering humans to guide those systems in more robust ways.
That’s why I wanted to talk to Meta AI researcher Angela Fan, whose latest project is focused on generating reliable, accurate, and structured Wikipedia articles. She joined me to talk about her work, the implications of high-quality long-form text generation, and the future of human/AI collaboration on this episode of the TDS podcast.
---
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Trustworthy AI is one of today’s most popular buzzwords. But although everyone seems to agree that we want AI to be trustworthy, definitions of trustworthiness are often fuzzy or inadequate. Maybe that shouldn’t be surprising: it’s hard to come up with a single set of standards that add up to “trustworthiness”, and that apply just as well to a Netflix movie recommendation as a self-driving car.
So maybe trustworthy AI needs to be thought of in a more nuanced way — one that reflects the intricacies of individual AI use cases. If that’s true, then new questions come up: who gets to define trustworthiness, and who bears responsibility when a lack of trustworthiness leads to harms like AI accidents, or undesired biases?
Through that lens, trustworthiness becomes a problem not just for algorithms, but for organizations. And that’s exactly the case that Beena Ammanath makes in her upcoming book, Trustworthy AI, which explores AI trustworthiness from a practical perspective, looking at what concrete steps companies can take to make their in-house AI work safer, better and more reliable. Beena joined me to talk about defining trustworthiness, explainability and robustness in AI, as well as the future of AI regulation and self-regulation on this episode of the TDS podcast.
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
Until recently, very few people were paying attention to the potential malicious applications of AI. And that made some sense: in an era where AIs were narrow and had to be purpose-built for every application, you’d need an entire research team to develop AI tools for malicious applications. Since it’s more profitable (and safer) for that kind of talent to work in the legal economy, AI didn’t offer much low-hanging fruit for malicious actors.
But today, that’s all changing. As AI becomes more flexible and general, the link between the purpose for which an AI was built and its potential downstream applications has all but disappeared. Large language models can be trained to perform valuable tasks, like supporting writers, translating between languages, or writing better code. But a system that can write an essay can also write a fake news article, or power an army of humanlike text-generating bots.
More than any other moment in the history of AI, the move to scaled, general-purpose foundation models has shown how AI can be a double-edged sword. And now that these models exist, we have to come to terms with them, and figure out how to build societies that remain stable in the face of compelling AI-generated content, and increasingly accessible AI-powered tools with malicious use potential.
That’s why I wanted to speak with Katya Sedova, a former Congressional Fellow and Microsoft alumna who now works at Georgetown University’s Center for Security and Emerging Technology, where she recently co-authored some fascinating work exploring current and likely future malicious uses of AI. If you like this conversation I’d really recommend checking out her team’s latest report — it’s called “AI and the future of disinformation campaigns”.
Katya joined me to talk about malicious AI-powered chatbots, fake news generation and the future of AI-augmented influence campaigns on this episode of the TDS podcast.
***
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Imagine, for example, an AI that’s trained to identify cows in images. Ideally, we’d want it to learn to detect cows based on their shape and colour. But what if the cow pictures we put in the training dataset always show cows standing on grass?
In that case, we have a spurious correlation between grass and cows, and if we’re not careful, our AI might learn to become a grass detector rather than a cow detector. Even worse, we might only realize that’s happened once we’ve deployed it in the real world and it encounters a cow that isn’t standing on grass for the first time.
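Here’s a tiny synthetic sketch of that failure mode (made-up features and numbers): a classifier trained on data where “grass” almost perfectly co-occurs with “cow” leans on the shortcut, and its accuracy drops once that correlation disappears at test time.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
y = rng.integers(0, 2, n)                          # 1 = cow, 0 = not a cow

# "Shape" feature: genuinely informative but noisy (85% agreement with label).
shape = np.where(rng.random(n) < 0.85, y, 1 - y)
# "Grass" feature: spurious, but nearly perfect in the training set (98%).
grass_train = np.where(rng.random(n) < 0.98, y, 1 - y)
# At test time the correlation is gone: grass is unrelated to the label.
grass_test = rng.integers(0, 2, n)

clf = LogisticRegression().fit(np.c_[shape, grass_train], y)
print("in-distribution accuracy:", clf.score(np.c_[shape, grass_train], y))
print("out-of-distribution accuracy:", clf.score(np.c_[shape, grass_test], y))
```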
So how do you build AI systems that can learn robust, general concepts that remain valid outside the context of their training data?
That’s the problem of out-of-distribution generalization, and it’s a central part of the research agenda of Irina Rish, a core member of Mila, the Quebec AI Research Institute, and the Canada Excellence Research Chair in Autonomous AI. Irina’s research explores many different strategies that aim to overcome the out-of-distribution problem, from empirical AI scaling efforts to more theoretical work, and she joined me to talk about just that on this episode of the podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Google the phrase “AI over-hyped”, and you’ll find literally dozens of articles from the likes of Forbes, Wired, and Scientific American, all arguing that “AI isn’t really as impressive as it seems from the outside,” and that “we still have a long way to go before we come up with *true* AI, don’t you know.”
Amusingly, despite the universality of the “AI is over-hyped” narrative, the statement that “We haven’t made as much progress in AI as you might think™️” is often framed as somehow being an edgy, contrarian thing to believe.
All that pressure not to over-hype AI research really gets to people — researchers included. And they adjust their behaviour accordingly: they over-hedge their claims, cite outdated and since-resolved failure modes of AI systems, and generally avoid drawing straight lines between points that clearly show AI progress exploding across the board. All, presumably, to avoid being perceived as AI over-hypers.
Why does this matter? Well for one, under-hyping AI allows us to stay asleep — to delay answering many of the fundamental societal questions that come up when widespread automation of labour is on the table. But perhaps more importantly, it reduces the perceived urgency of addressing critical problems in AI safety and AI alignment.
Yes, we need to be careful that we’re not over-hyping AI. “AI startups” that don’t use AI are a problem. Predictions that artificial general intelligence is almost certainly a year away are a problem. Confidently prophesying major breakthroughs over short timescales absolutely does harm the credibility of the field.
But at the same time, we can’t let ourselves be so cautious that we’re not accurately communicating the true extent of AI’s progress and potential. So what’s the right balance?
That’s where Sam Bowman comes in. Sam is a professor at NYU, where he does research on AI and language modeling. But most important for today’s purposes, he’s the author of a paper titled, “When combating AI hype, proceed with caution,” in which he explores a trend he calls under-claiming — a common practice among researchers that consists of under-stating the extent of current AI capabilities, and over-emphasizing failure modes in ways that can be (unintentionally) deceptive.
Sam joined me to talk about under-claiming and what it means for AI progress on this episode of the Towards Data Science podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
It’s no secret that AI systems are being used in more and more high-stakes applications. As AI eats the world, it’s becoming critical to ensure that AI systems behave robustly — that they don’t get thrown off by unusual inputs, and start spitting out harmful predictions or recommending dangerous courses of action. If we’re going to have AI drive us to work, or decide who gets bank loans and who doesn’t, we’d better be confident that our AI systems aren’t going to fail because of a freak blizzard, or because some intern missed a minus sign.
We’re now past the point where companies can afford to treat AI development like a glorified Kaggle competition, in which the only thing that matters is how well models perform on a testing set. AI-powered screw-ups aren’t always life-or-death issues, but they can harm real users, and cause brand damage to companies that don’t anticipate them.
Fortunately, AI risk is starting to get more attention these days, and new companies — like Robust Intelligence — are stepping up to develop strategies that anticipate AI failures, and mitigate their effects. Joining me for this episode of the podcast was Yaron Singer, a former Googler, professor of computer science and applied math at Harvard, and now CEO and co-founder of Robust Intelligence. Yaron has the rare combination of theoretical and engineering expertise required to understand what AI risk is, and the product intuition to know how to integrate that understanding into solutions that can help developers and companies deal with AI risk.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Until very recently, the study of human disease involved looking at big things — like organs or macroscopic systems — and figuring out when and how they can stop working properly. But that’s all started to change: in recent decades, new techniques have allowed us to look at disease in a much more detailed way, by examining the behaviour and characteristics of single cells.
One class of those techniques is now known as single-cell genomics — the study of gene expression and function at the level of single cells. Single-cell genomics is creating new, high-dimensional datasets consisting of tens of millions of cells whose gene expression profiles and other characteristics have been painstakingly measured. And these datasets are opening up exciting new opportunities for AI-powered drug discovery — opportunities that startups are now starting to tackle head-on.
Joining me for today’s episode is Tali Raveh, Senior Director of Computational Biology at Immunai, a startup that’s using single-cell level data to perform high resolution profiling of the immune system at industrial scale. Tali joined me to talk about what makes the immune system such an exciting frontier for modern medicine, and how single-cell data and AI might be poised to generate unprecedented breakthroughs in disease treatment on this episode of the TDS podcast.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
0:00 Intro
2:00 Tali’s background
4:00 Immune systems and modern medicine
14:40 Data collection technology
19:00 Exposing cells to different drugs
24:00 Labeled and unlabelled data
27:30 Dataset status
31:30 Recent algorithmic advances
36:00 Cancer and immunology
40:00 The next few years
41:30 Wrap-up
If you were scrolling through your newsfeed in late September 2021, you may have caught this splashy headline from The Times of London that read, “Can this man save the world from artificial intelligence?”. The man in question was Mo Gawdat, an entrepreneur and senior tech executive who spent several years as the Chief Business Officer at GoogleX (now called X Development), Google’s semi-secret research facility that experiments with moonshot projects like self-driving cars, flying vehicles, and geothermal energy. At X, Mo was exposed to the absolute cutting edge of many fields — one of which was AI. His experience seeing AI systems learn and interact with the world raised red flags for him — hints of the potentially disastrous failure modes of the AI systems we might just end up with if we don’t get our act together now.
Mo writes about his experience as an insider at one of the world’s most secretive research labs and how it led him to worry about AI risk, but also about AI’s promise and potential in his new book, Scary Smart: The Future of Artificial Intelligence and How You Can Save Our World. He joined me to talk about just that on this episode of the TDS podcast.
Today’s episode is somewhat special, because we’re going to be talking about what might be the first solid quantitative study of the power-seeking tendencies that we can expect advanced AI systems to have in the future.
For a long time, there’s been a debate between two camps in the AI safety world.
Unfortunately, recent work in AI alignment — and in particular, a spotlighted 2021 NeurIPS paper — suggests that the AI takeover argument might be stronger than many had realized. In fact, it’s starting to look like we ought to expect to see power-seeking behaviours from highly capable AI systems by default. These behaviours include things like AI systems preventing us from shutting them down, repurposing resources in pathological ways to serve their objectives, and even in the limit, generating catastrophes that would put humanity at risk.
As concerning as these possibilities might be, it’s exciting that we’re starting to develop a more robust and quantitative language to describe AI failures and power-seeking. That’s why I was so excited to sit down with AI researcher Alex Turner, the author of the spotlighted NeurIPS paper on power-seeking, and discuss his path into AI safety, his research agenda and his perspective on the future of AI on this episode of the TDS podcast.
***
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 2:05 Interest in alignment research
- 8:00 Two camps of alignment research
- 13:10 The NeurIPS paper
- 17:10 Optimal policies
- 25:00 Two-piece argument
- 28:30 Relaxing certain assumptions
- 32:45 Objections to the paper
- 39:00 Broader sense of optimization
- 46:35 Wrap-up
Until recently, AI systems have been narrow — they’ve only been able to perform the specific tasks that they were explicitly trained for. And while narrow systems are clearly useful, the holy grail of AI is to build more flexible, general systems.
But that can’t be done without good performance metrics that we can optimize for — or that we can at least use to measure generalization ability. Somehow, we need to figure out what number needs to go up in order to bring us closer to generally capable agents. That’s the question we’ll be exploring on this episode of the podcast, with Danijar Hafner. Danijar is a PhD student in artificial intelligence at the University of Toronto, working with Jimmy Ba and Geoffrey Hinton, and a researcher at Google Brain and the Vector Institute.
Danijar has been studying the problem of performance measurement and benchmarking for RL agents with generalization abilities. As part of that work, he recently released Crafter, a tool that can procedurally generate complex environments that are a lot like Minecraft, featuring resources that need to be collected, tools that can be developed, and enemies who need to be avoided or defeated. In order to succeed in a Crafter environment, agents need to robustly plan, explore and test different strategies, which allow them to unlock certain in-game achievements.
Crafter is part of a growing set of strategies that researchers are exploring to figure out how we can benchmark and measure the performance of general-purpose AIs, and it also tells us something interesting about the state of AI: increasingly, our ability to define tasks that require the right kind of generalization abilities is becoming just as important as innovating on AI model architectures. Danijar joined me to talk about Crafter, reinforcement learning, and the big challenges facing AI researchers as they work towards general intelligence on this episode of the TDS podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
2021 has been a wild ride in many ways, but its wildest features might actually be AI-related. We’ve seen major advances in everything from language modeling to multi-modal learning, open-ended learning and even AI alignment.
So, we thought, what better way to take stock of the big AI-related milestones we’ve reached in 2021 than a cross-over episode with our friends over at the Last Week In AI podcast.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Imagine for a minute that you’re running a profitable business, and that part of your sales strategy is to send the occasional mass email to people who’ve signed up to be on your mailing list. For a while, this approach leads to a reliable flow of new sales, but then one day, that abruptly stops. What happened?
You pore over logs, looking for an explanation, but it turns out that the problem wasn’t with your software; it was with your data. Maybe the new intern accidentally added a character to every email address in your dataset, or shuffled the names on your mailing list so that Christina got a message addressed to “John”, or vice-versa. Versions of this story happen surprisingly often, and when they happen, the cost can be significant: lost revenue, disappointed customers, or worse — an irreversible loss of trust.
Today, entire products are being built on top of datasets that aren’t monitored properly for critical failures — and an increasing number of those products are operating in high-stakes situations. That’s why data observability is so important: the ability to track the origin, transformations and characteristics of mission-critical data to detect problems before they lead to downstream harm.
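As a minimal illustration (a hypothetical check, not Metaplane’s product), the mailing-list story above could have been caught by comparing a simple data-quality metric against its historical baseline before any email goes out:

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def email_validity_rate(emails):
    return sum(bool(EMAIL_RE.match(e)) for e in emails) / max(len(emails), 1)

def check_batch(emails, baseline_rate=0.99, tolerance=0.02):
    """Raise an alert when today's batch looks meaningfully worse than usual."""
    rate = email_validity_rate(emails)
    if rate < baseline_rate - tolerance:
        raise ValueError(f"Email validity fell to {rate:.1%} vs baseline {baseline_rate:.1%}")
    return rate

# Hypothetical batch where a stray space was prepended to every address.
corrupted = [" " + addr for addr in ["ana@example.com", "john@example.org"]]
# check_batch(corrupted)  # would raise, flagging the issue before any send
```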
And it’s also why we’ll be talking to Kevin Hu, the co-founder and CEO of Metaplane, one of the world’s first data observability startups. Kevin has a deep understanding of data pipelines, and the problems that can pop up if they aren’t properly monitored. He joined me to talk about data observability, why it matters, and how it might be connected to responsible AI on this episode of the TDS podcast.
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
Historically, AI systems have been slow learners. For example, a computer vision model often needs to see tens of thousands of hand-written digits before it can tell a 1 apart from a 3. Even game-playing AIs like DeepMind’s AlphaGo, or its more recent descendant MuZero, need far more experience than humans do to master a given game.
So when someone develops an algorithm that can reach human-level performance at anything as fast as a human can, it’s a big deal. And that’s exactly why I asked Yang Gao to join me on this episode of the podcast. Yang is an AI researcher with affiliations at Berkeley and Tsinghua University, who recently co-authored a paper introducing EfficientZero: a reinforcement learning system that learned to play Atari games at the human-level after just two hours of in-game experience. It’s a tremendous breakthrough in sample-efficiency, and a major milestone in the development of more general and flexible AI systems.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:50 Yang’s background
- 6:00 MuZero’s activity
- 13:25 MuZero to EfficientZero
- 19:00 Sample efficiency comparison
- 23:40 Leveraging algorithmic tweaks
- 27:10 Importance of evolution to human brains and AI systems
- 35:10 Human-level sample efficiency
- 38:28 Existential risk from AI in China
- 47:30 Evolution and language
- 49:40 Wrap-up
There once was a time when AI researchers could expect to read every new paper published in the field on the arXiv, but today, that’s no longer the case. The recent explosion of research activity in AI has turned keeping up to date with new developments into a full-time job.
Fortunately, people like YouTuber, ML PhD and sunglasses enthusiast Yannic Kilcher make it their business to distill ML news and papers into a digestible form for mortals like you and me to consume. I highly recommend his channel to any TDS podcast listeners who are interested in ML research — it’s a fantastic resource, and literally the way I finally managed to understand the Attention is All You Need paper back in the day.
Yannic joined me to talk about what he’s learned from years of following, reporting on and doing AI research, including the trends, the challenges and the opportunities that he expects will shape the course of AI history in the coming years.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:20 Yannic’s path into ML
- 7:25 Selecting ML news
- 11:45 AI ethics → political discourse
- 17:30 AI alignment
- 24:15 Malicious uses
- 32:10 Impacts on persona
- 39:50 Bringing in human thought
- 46:45 Math with big numbers
- 51:05 Metrics for generalization
- 58:05 The future of AI
- 1:02:58 Wrap-up
Today, most machine learning algorithms use the same paradigm: set an objective, and train an agent, a neural net, or a classical model to perform well against that objective. That approach has given good results: these types of AI can hear, speak, write, read, draw, drive and more.
But they’re also inherently limited: because they optimize for objectives that seem interesting to humans, they often avoid regions of parameter space that are valuable, but that don’t immediately seem interesting to human beings or to the objective functions we set. That poses a challenge for researchers like Ken Stanley, whose goal is to build broadly superintelligent AIs — intelligent systems that outperform humans at a wide range of tasks. Among other things, Ken is a former startup founder and AI researcher, whose career has included work in academia, at Uber AI Labs, and most recently at OpenAI, where he leads the open-ended learning team.
Ken joined me to talk about his 2015 book Why Greatness Cannot Be Planned: The Myth of the Objective, what open-endedness could mean for humanity, the future of intelligence, and even AI safety on this episode of the TDS podcast.
It’s no secret that governments around the world are struggling to come up with effective policies to address the risks and opportunities that AI presents. And there are many reasons why that’s happening: many people — including technical people — think they understand what frontier AI looks like, but very few actually do, and even fewer are interested in applying their understanding in a government context, where salaries are low and stock compensation doesn’t even exist.
So there’s a critical policy-technical gap that needs bridging, and failing to address that gap isn’t really an option: it would mean flying blind through the most important test of technological governance the world has ever faced. Unfortunately, policymakers have had to move ahead with regulating and legislating with that dangerous knowledge gap in place, and the result has been less-than-stellar: widely criticized definitions of privacy and explainability, and definitions of AI that create exploitable loopholes are among some of the more concerning results.
Enter Gillian Hadfield, a Professor of Law and Professor of Strategic Management, and Director of the Schwartz Reisman Institute for Technology and Society. Gillian’s background is in law and economics, which has led her to AI policy, and to the definitional problems with recent and emerging regulations on AI and privacy. But — as I discovered during the podcast — she also happens to be related to Dylan Hadfield-Menell, an AI alignment researcher whom we’ve had on the show before. Partly through Dylan, Gillian has also been exploring how principles of AI alignment research can be applied to AI policy, and to contract law. Gillian joined me to talk about all that and more on this episode of the podcast.
---
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
---
AI ethics is often treated as a dry, abstract academic subject. It doesn’t have the kinds of consistent, unifying principles that you might expect from a quantitative discipline like computer science or physics.
But somehow, the ethics rubber has to meet the AI road, and where that happens — where real developers have to deal with real users and apply concrete ethical principles — is where you find some of the most interesting, practical thinking on the topic.
That’s why I wanted to speak with Wendy Foster, the Director of Engineering and Data Science at Shopify. Wendy’s approach to AI ethics is refreshingly concrete and actionable. And unlike more abstract approaches, it’s based on clear principles like user empowerment: the idea that you should avoid forcing users to make particular decisions, and instead design user interfaces that frame AI-recommended actions as suggestions that can be ignored or acted on.
Wendy joined me to discuss her practical perspective on AI ethics, the importance of user experience design for AI products, and how responsible AI gets baked into product at Shopify on this episode of the TDS podcast.
---
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:40 Wendy’s background
- 4:40 What does practice mean?
- 14:00 Different levels of explanation
- 19:05 Trusting the system
- 24:00 Training new folks
- 30:02 Company culture
- 34:10 The core of AI ethics
- 40:10 Communicating with the user
- 44:15 Wrap-up
Over the last two years, the capabilities of AI systems have exploded. AlphaFold2, MuZero, CLIP, DALLE, GPT-3 and many other models have extended the reach of AI to new problem classes. There’s a lot to be excited about.
But as we’ve seen in other episodes of the podcast, there’s a lot more to getting value from an AI system than jacking up its capabilities. And increasingly, one of these additional missing factors is becoming trust. You can make all the powerful AIs you want, but if no one trusts their output — or if people trust it when they shouldn’t — you can end up doing more harm than good.
That’s why we invited Ayanna Howard on the podcast. Ayanna is a roboticist, entrepreneur and Dean of the College of Engineering at Ohio State University, where she focuses her research on human-machine interactions and the factors that go into building human trust in AI systems. She joined me to talk about her research, its applications in medicine and education, and the future of human-machine trust.
---
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:30 Ayanna’s background
- 6:10 The interpretability of neural networks
- 12:40 Domain of machine-human interaction
- 17:00 The issue of preference
- 20:50 Gell-Mann amnesia
- 26:35 Assessing a person’s persuadability
- 31:40 Doctors and new technology
- 36:00 Responsibility and accountability
- 43:15 The social pressure aspect
- 47:15 Is Ayanna optimistic?
- 53:00 Wrap-up
On the face of it, there’s no obvious limit to the reinforcement learning paradigm: you put an agent in an environment and reward it for taking good actions until it masters a task.
And by last year, RL had achieved some amazing things, including mastering Go, various Atari games, Starcraft II and so on. But the holy grail of AI isn’t to master specific games, but rather to generalize — to make agents that can perform well on new games that they haven’t been trained on before.
Fast forward to July of this year, though, and a team at DeepMind published a paper called “Open-Ended Learning Leads to Generally Capable Agents”, which takes a big step in the direction of general RL agents. Joining me for this episode of the podcast is one of the co-authors of that paper, Max Jaderberg. Max came into the Google ecosystem in 2014 when they acquired his computer vision company, and more recently, he started DeepMind’s open-ended learning team, which is focused on pushing machine learning further into the territory of cross-task generalization. I spoke to Max about open-ended learning, the path ahead for generalization and the future of AI.
---
Intro music by:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:30 Max’s background
- 6:40 Differences in procedural generations
- 12:20 The qualitative side
- 17:40 Agents’ mistakes
- 20:00 Measuring generalization
- 27:10 Environments and loss functions
- 32:50 The potential of symbolic logic
- 36:45 Two distinct learning processes
- 42:35 Forecasting research
- 45:00 Wrap-up
Bias gets a bad rap in machine learning. And yet, the whole point of a machine learning model is that it biases certain inputs to certain outputs — a picture of a cat to a label that says “cat”, for example. Machine learning is bias-generation.
So removing bias from AI isn’t an option. Rather, we need to think about which biases are acceptable to us, and how extreme they can be. These are questions that call for a mix of technical and philosophical insight that’s hard to find. Luckily, I managed to find just that by inviting onto the podcast none other than Margaret Mitchell, a former Senior Research Scientist in Google’s Research and Machine Intelligence Group, whose work has focused on practical AI ethics. And by practical, I really do mean the nuts and bolts of how AI ethics can be baked into real systems, and how to navigate the complex moral issues that come up when the AI rubber meets the road.
***
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 0:00 Intro
- 1:20 Margaret’s background
- 8:30 Meta learning and ethics
- 10:15 Margaret’s day-to-day
- 13:00 Sources of ethical problems within AI
- 18:00 Aggregated and disaggregated scores
- 24:02 How much bias will be acceptable?
- 29:30 What biases does the AI ethics community hold?
- 35:00 The overlap of these fields
- 40:30 The political aspect
- 45:25 Wrap-up
As impressive as they are, language models like GPT-3 and BERT all have the same problem: they’re trained on reams of internet data to imitate human writing. And human writing is often wrong, biased, or both, which means language models are trying to emulate an imperfect target.
Language models often babble, or make up answers to questions they don’t understand, which can make them unreliable sources of truth. That’s why there’s been increased interest in alternative ways to retrieve information from large datasets — approaches that include knowledge graphs.
Knowledge graphs encode entities like people, places and objects into nodes, which are then connected to other entities via edges, which specify the nature of the relationship between the two. For example, a knowledge graph might contain a node for Mark Zuckerberg, linked to another node for Facebook, via an edge that indicates that Zuck is Facebook’s CEO. Both of these nodes might in turn be connected to dozens, or even thousands of others, depending on the scale of the graph.
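As a toy sketch of that structure (not Diffbot’s system), a knowledge graph can be represented as a set of (subject, relation, object) triples and queried by following labeled edges:

```python
from collections import defaultdict

class KnowledgeGraph:
    """Tiny triple store: nodes connected by labeled, directed edges."""

    def __init__(self):
        self._edges = defaultdict(list)   # subject -> [(relation, object), ...]

    def add(self, subject, relation, obj):
        self._edges[subject].append((relation, obj))

    def query(self, subject, relation):
        """Return every object linked to `subject` by `relation`."""
        return [o for r, o in self._edges[subject] if r == relation]

kg = KnowledgeGraph()
kg.add("Mark Zuckerberg", "ceo_of", "Facebook")
kg.add("Facebook", "industry", "Social media")

print(kg.query("Mark Zuckerberg", "ceo_of"))   # ['Facebook']
```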
Knowledge graphs are an exciting path ahead for AI capabilities, and the world’s largest knowledge graphs are built by a company called Diffbot, whose CEO Mike Tung joined me for this episode of the podcast to discuss where knowledge graphs can improve on more standard techniques, and why they might be a big part of the future of AI.
---
Intro music by:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
0:00 Intro
1:30 The Diffbot dynamic
3:40 Knowledge graphs
7:50 Crawling the internet
17:15 What makes this time special?
24:40 Relation to neural networks
29:30 Failure modes
33:40 Sense of competition
39:00 Knowledge graphs for discovery
45:00 Consensus to find truth
48:15 Wrap-up
Corporate governance of AI doesn’t sound like a sexy topic, but it’s rapidly becoming one of the most important challenges for big companies that rely on machine learning models to deliver value for their customers. More and more, they’re expected to develop and implement governance strategies to reduce the incidence of bias, and to increase the transparency of their AI systems and development processes. Those expectations have historically come from consumers, but governments are starting to impose hard requirements, too.
So for today’s episode, I spoke to Anthony Habayeb, founder and CEO of Monitaur, a startup focused on helping businesses anticipate and comply with new and upcoming AI regulations and governance requirements. Anthony’s been watching the world of AI regulation very closely over the last several years, and was kind enough to share his insights on the current state of play and future direction of the field.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:45 Anthony’s background
- 6:20 Philosophies surrounding regulation
- 14:50 The role of governments
- 17:30 Understanding fairness
- 25:35 AI’s PR problem
- 35:20 Governments’ regulation
- 42:25 Useful techniques for data science teams
- 46:10 Future of AI governance
- 49:20 Wrap-up
The more powerful our AIs become, the more we’ll have to ensure that they’re doing exactly what we want. If we don’t, we risk building AIs that use dangerously creative solutions that have side-effects that could be undesirable, or downright dangerous. Even a slight misalignment between the motives of a sufficiently advanced AI and human values could be hazardous.
That’s why leading AI labs like OpenAI are already investing significant resources into AI alignment research. Understanding that research is important if you want to understand where advanced AI systems might be headed, and what challenges we might encounter as AI capabilities continue to grow — and that’s what this episode of the podcast is all about. My guest today is Jan Leike, head of AI alignment at OpenAI, and an alumnus of DeepMind and the Future of Humanity Institute. As someone who works directly with some of the world’s largest AI systems (including OpenAI’s GPT-3), Jan has a unique and interesting perspective to offer, both on the current challenges facing alignment researchers and on the most promising future directions the field might take.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
0:00 Intro
1:35 Jan’s background
7:10 Timing of scalable solutions
16:30 Recursive reward modeling
24:30 Amplification of misalignment
31:00 Community focus
32:55 Wireheading
41:30 Arguments against the democratization of AIs
49:30 Differences between capabilities and alignment
51:15 Research to focus on
1:01:45 Formalizing an understanding of personal experience
1:04:04 OpenAI hiring
1:05:02 Wrap-up
The recent success of large transformer models in AI raises new questions about the limits of current strategies: can we expect deep learning, reinforcement learning and other prosaic AI techniques to get us all the way to humanlike systems with general reasoning abilities?
Some think so, and others disagree. One dissenting voice belongs to Francesca Rossi, a former professor of computer science, and now AI Ethics Global Leader at IBM. Much of Francesca’s research is focused on deriving insights from human cognition that might help AI systems generalize better. Francesca joined me for this episode of the podcast to discuss her research, her thinking, and her thinking about thinking.
AI research is often framed as a kind of human-versus-machine rivalry that will inevitably lead to the defeat — and even wholesale replacement — of human beings by artificial superintelligences that have their own sense of agency, and their own goals.
Divya Siddarth disagrees with this framing. Instead, she argues, this perspective leads us to focus on applications of AI that are neither as profitable as they could be, nor safe enough to protect us from the potentially catastrophic consequences of dangerous AI systems in the long run. And she ought to know: Divya is an associate political economist and social technologist in the Office of the CTO at Microsoft.
She’s also spent a lot of time thinking about what governments can do — and are doing — to shift the framing of AI away from centralized systems that compete directly with humans, and toward a more cooperative model, which would see AI as a kind of facilitation tool that gets leveraged by human networks. Divya points to Taiwan as an experiment in digital democracy that’s doing just that.
2020 was an incredible year for AI. We saw powerful hints of the potential of large language models for the first time thanks to OpenAI’s GPT-3, DeepMind used AI to solve one of the greatest open problems in molecular biology, and Boston Dynamics demonstrated their ability to blend AI and robotics in dramatic fashion.
Progress in AI is accelerating exponentially, and though we’re just over halfway through 2021, this year is already turning into another one for the books. So we decided to partner with our friends over at Let’s Talk AI, a podcast covering current events in AI, co-hosted by Stanford PhD and former Googler Sharon Zhou and Stanford PhD student Andrey Kurenkov.
This was a fun chat, and a format we’ll definitely be playing with more in the future :)
Many AI researchers think it’s going to be hard to design AI systems that continue to remain safe as AI capabilities increase. We’ve seen already on the podcast that the field of AI alignment has emerged to tackle this problem, but a related effort is also being directed at a separate dimension of the safety problem: AI interpretability.
Our ability to interpret how AI systems process information and make decisions will likely become an important factor in assuring the reliability of AIs in the future. And my guest for this episode of the podcast has focused his research on exactly that topic. Daniel Filan is an AI safety researcher at Berkeley, where he’s supervised by AI pioneer Stuart Russell. Daniel also runs AXRP, a podcast dedicated to technical AI alignment research.
Cruise is a self-driving car startup founded in 2013 — at a time when most people thought of self-driving cars as the stuff of science fiction. And yet, just three years later, the company was acquired by GM for over a billion dollars, having shown itself to be a genuine player in the race to make autonomous driving a reality. Along the way, the company has had to navigate and adapt to a rapidly changing technological landscape, mixing and matching old ideas from robotics and software engineering with cutting edge techniques like deep learning.
My guest for this episode of the podcast was one of Cruise’s earliest employees. Peter Gao is a machine learning specialist with deep experience in the self-driving car industry, and is also the co-founder of Aquarium Learning, a Y Combinator-backed startup that specializes in improving the performance of machine learning models by fixing problems with the data they’re trained on. We discussed Peter’s experiences in the self-driving car industry, including the innovations that have spun out of self-driving car tech, as well as some of the technical and ethical challenges that need to be overcome to bring self-driving cars into mainstream use around the world.
There are a lot of reasons to pay attention to China’s AI initiatives. Some are purely technological: Chinese companies are producing increasingly high-quality AI research, and they’re poised to become even more important players in AI over the next few years. For example, Huawei recently put together their own version of OpenAI’s massive GPT-3 language model — a feat of massive-scale compute that pushed the limits of current systems and called for deep engineering and technical know-how.
But China’s AI ambitions are also important geopolitically. In order to build powerful AI systems, you need a lot of compute power. And in order to get that, you need a lot of computer chips, which are notoriously hard to manufacture. But most of the world’s computer chips are currently made in democratic Taiwan, which China claims as its own territory. You can see how quickly this kind of thing can lead to international tension.
Still, the story of US-China AI isn’t just one of competition and decoupling, but also of cooperation — or at least, that’s the case made by my guest today, China AI expert and Stanford researcher Jeffrey Ding. In addition to studying the Chinese AI ecosystem as part of his day job, Jeff publishes the very popular China AI newsletter, which offers a series of translations and analyses of Chinese-language articles about AI. Jeff acknowledges the competitive dynamics of AI research, but argues that focusing only on controversial applications of AI — like facial recognition and military applications — causes us to ignore or downplay areas where real collaboration can happen, like language translation.
This special episode of the Towards Data Science podcast is a cross-over with our friends over at the Banana Data podcast. We’ll be zooming out and talking about some of the most important current challenges AI creates for humanity, and some of the likely future directions the technology might take.
Few would disagree that AI is set to become one of the most important economic and social forces in human history.
But along with its transformative potential has come concern about a strange new risk that AI might pose to human beings. As AI systems become exponentially more capable of achieving their goals, some worry that even a slight misalignment between those goals and our own could be disastrous. These concerns are shared by many of the most knowledgeable and experienced AI specialists, at leading labs like OpenAI, DeepMind, CHAI Berkeley, Oxford and elsewhere.
But they’re not universal: I recently had Melanie Mitchell — computer science professor and author who famously debated Stuart Russell on the topic of AI risk — on the podcast to discuss her objections to the AI catastrophe argument. And on this episode, we’ll continue our exploration of the case for AI catastrophic risk skepticism with an interview with Oren Etzioni, CEO of the Allen Institute for AI, a world-leading AI research lab that’s developed many well-known projects, including the popular AllenNLP library, and Semantic Scholar.
Oren has a unique perspective on AI risk, and the conversation was lots of fun!
How can you know that a super-intelligent AI is trying to do what you asked it to do?
The answer, it turns out, is: not easily. And unfortunately, an increasing number of AI safety researchers are warning that this is a problem we’re going to have to solve sooner rather than later, if we want to avoid bad outcomes — which may include a species-level catastrophe.
The type of failure mode whereby AIs end up optimizing for things other than what we asked them to is known in AI safety as an inner alignment failure. It’s distinct from outer alignment failure, which is what happens when you ask your AI to do something that turns out to be dangerous, and it was only recognized by AI safety researchers as its own category of risk in 2019. The researcher who led that effort is my guest for this episode of the podcast, Evan Hubinger.
Evan is an AI safety veteran who’s done research at leading AI labs like OpenAI, and whose experience also includes stints at Google, Ripple and Yelp. He currently works at the Machine Intelligence Research Institute (MIRI) as a Research Fellow, and joined me to talk about his views on AI safety, the alignment problem, and whether humanity is likely to survive the advent of superintelligent AI.
When OpenAI announced the release of their GPT-3 API last year, the tech world was shocked. Here was a language model, trained only to perform a simple autocomplete task, which turned out to be capable of language translation, coding, essay writing, question answering and many other tasks that previously would each have required purpose-built systems.
What accounted for GPT-3’s ability to solve these problems? How did it beat state-of-the-art AIs that were purpose-built to solve tasks it was never explicitly trained for? Was it a brilliant new algorithm? Something deeper than deep learning?
Well… no. As algorithms go, GPT-3 was relatively simple, and was built using a by-then fairly standard transformer architecture. Instead of a fancy algorithm, the real difference between GPT-3 and everything that came before was size: GPT-3 is a simple-but-massive, 175B-parameter model, about 10X bigger than the next largest AI system.
GPT-3 is only the latest in a long line of results that now show that scaling up simple AI techniques can give rise to new behavior, and far greater capabilities. Together, these results have motivated a push toward AI scaling: the pursuit of ever larger AIs, trained with more compute on bigger datasets. But scaling is expensive: by some estimates, GPT-3 cost as much as $5M to train. As a result, only well-resourced companies like Google, OpenAI and Microsoft have been able to experiment with scaled models.
That’s a problem for independent AI safety researchers, who want to better understand how advanced AI systems work, and what their most dangerous behaviors might be, but who can’t afford a $5M compute budget. That’s why a recent paper by Andy Jones, an independent researcher who specializes in AI scaling, is so promising: Andy’s paper shows that, at least in some contexts, the capabilities of large AI systems can be predicted from those of smaller ones. If the result holds more broadly, it could give independent researchers the ability to run cheap experiments on small systems whose findings nonetheless transfer to expensive, scaled AIs like GPT-3. Andy was kind enough to join me for this episode of the podcast.
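To give a rough sense of what that kind of extrapolation can look like in practice, here’s a minimal sketch with invented numbers rather than Andy’s actual data or methodology: fit a power law to the performance of a few small, cheap training runs, then read off a prediction for a much larger compute budget.

```python
import numpy as np

# Hypothetical results from small, cheap training runs:
# compute budget (arbitrary units) vs. test loss. These numbers are
# invented for illustration; they are not from the paper discussed here.
compute = np.array([1e-3, 1e-2, 1e-1, 1e0])
loss = np.array([4.2, 3.6, 3.1, 2.7])

# A power law, loss ~ a * compute**k, is a straight line in log-log space.
k, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
a = np.exp(log_a)

# Extrapolate to a budget several orders of magnitude larger.
big_compute = 1e3
predicted = a * big_compute**k
print(f"Fitted exponent k = {k:.3f}")
print(f"Predicted loss at compute = {big_compute:g}: {predicted:.2f}")
```

The whole trick rests on the fitted curve continuing to hold at scales you haven’t tested, which is exactly the question the paper investigates.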
In 2016, OpenAI published a blog post describing the results of one of their AI safety experiments. In it, they describe how an AI that was trained to maximize its score in a boat racing game ended up discovering a strange hack: rather than completing the race circuit as fast as it could, the AI learned that it could rack up an essentially unlimited number of bonus points by looping around a series of targets, in a process that required it to ram into obstacles, and even travel in the wrong direction through parts of the circuit.
This is a great example of the alignment problem: if we’re not extremely careful, we risk training AIs that find dangerously creative ways to optimize whatever thing we tell them to optimize for. So building safe AIs — AIs that are aligned with our values — involves finding ways to very clearly and correctly quantify what we want our AIs to do. That may sound like a simple task, but it isn’t: humans have struggled for centuries to define “good” metrics for things like economic health or human flourishing, with very little success.
Today’s episode of the podcast features Brian Christian — the bestselling author of several books related to the connection between humanity and computer science & AI. His most recent book, The Alignment Problem, explores the history of alignment research, and the technical and philosophical questions that we’ll have to answer if we’re ever going to safely outsource our reasoning to machines. Brian’s perspective on the alignment problem links together many of the themes we’ve explored on the podcast so far, from AI bias and ethics to existential risk from AI.
We all value privacy, but most of us would struggle to define it. And there’s a good reason for that: the way we think about privacy is shaped by the technology we use. As new technologies emerge, which allow us to trade data for services, or pay for privacy in different forms, our expectations shift and privacy standards evolve. That shifting landscape makes privacy a moving target.
The challenge of understanding and enforcing privacy standards isn’t novel, but it’s taken on a new importance given the rapid progress of AI in recent years. Data that would have been useless just a decade ago — unstructured text data and many types of images come to mind — are now a treasure trove of value. Should companies have the right to keep using data they originally collected at a time when its value was limited, now that it’s worth far more? Do companies have an obligation to provide maximum privacy without charging their customers directly for it? Privacy in AI is as much a philosophical question as a technical one, and to discuss it, I was joined by Eliano Marques, Executive VP of Data and AI at Protegrity, a company that specializes in privacy and data protection for large companies. Eliano has worked in data privacy for the last decade.
When OpenAI developed its GPT-2 language model in early 2019, they initially chose not to publish the algorithm, owing to concerns over its potential for malicious use, as well as the need for the AI industry to experiment with new, more responsible publication practices that reflect the increasing power of modern AI systems.
This decision was controversial, and remains that way to some extent even today: AI researchers have historically enjoyed a culture of open publication and have defaulted to sharing their results and algorithms. But whatever your position may be on algorithms like GPT-2, it’s clear that at some point, if AI becomes arbitrarily flexible and powerful, there will be contexts in which limits on publication will be important for public safety.
The issue of publication norms in AI is complex, which is why it’s a topic worth exploring with people who have experience both as researchers, and as policy specialists — people like today’s Towards Data Science podcast guest, Rosie Campbell. Rosie is the Head of Safety Critical AI at Partnership on AI (PAI), a nonprofit that brings together startups, governments, and big tech companies like Google, Facebook, Microsoft and Amazon, to shape best practices, research, and public dialogue about AI’s benefits for people and society. Along with colleagues at PAI, Rosie recently finished putting together a white paper exploring the current hot debate over publication norms in AI research, and making recommendations for researchers, journals and institutions involved in AI research.
Automated weapons mean fewer casualties, faster reaction times, and more precise strikes. They’re a clear win for any country that deploys them. You can see the appeal.
But they’re also a classic prisoner’s dilemma. Once many nations have deployed them, humans no longer have to be persuaded to march into combat, and the barrier to starting a conflict drops significantly.
The real risks that come from automated weapons systems like drones aren’t always the obvious ones. Many of them take the form of second-order effects — the knock-on consequences that come from setting up a world where multiple countries have large automated forces. But what can we do about them? That’s the question we’ll be taking on during this episode of the podcast with Jakob Foerster, an early pioneer in multi-agent reinforcement learning, and incoming faculty member at the University of Toronto. Jakob has been involved in the debate over weaponized drone automation for some time, and recently wrote an open letter to German politicians urging them to consider the risks associated with the deployment of this technology.
In December 1938, a frustrated nuclear physicist named Leo Szilard wrote a letter to the British Admiralty telling them that he had given up on his greatest invention — the nuclear chain reaction.
"The idea of a nuclear chain reaction won’t work. There’s no need to keep this patent secret, and indeed there’s no need to keep this patent too. It won’t work." — Leo Szilard
What Szilard didn’t know when he licked the envelope was that, on that very same day, a research team in Berlin had just split the uranium atom for the very first time. Within a few years, the Manhattan Project would begin, and by 1945, the first atomic bomb was dropped on the Japanese city of Hiroshima. It was only four years later — barely a decade after Szilard had written off the idea as impossible — that Russia successfully tested its first atomic weapon, kicking off a global nuclear arms race that continues in various forms to this day.
It’s a surprisingly short jump from cutting edge technology to global-scale risk. But although the nuclear story is a high-profile example of this kind of leap, it’s far from the only one. Today, many see artificial intelligence as a class of technology whose development will lead to global risks — and as a result, as a technology that needs to be managed globally. In much the same way that international treaties have allowed us to reduce the risk of nuclear war, we may need global coordination around AI to mitigate its potential negative impacts.
One of the world’s leading experts on AI’s global coordination problem is Nicolas Miailhe. Nicolas is the co-founder of The Future Society, a global nonprofit whose primary focus is encouraging responsible adoption of AI, and ensuring that countries around the world come to a common understanding of the risks associated with it. Nicolas is a veteran of the prestigious Harvard Kennedy School of Government, an appointed expert to the Global Partnership on AI, and advises cities, governments, and international organizations on AI policy.
We’ve recorded quite a few podcasts recently about the problems AI does and may create, now and in the future. We’ve talked about AI safety, alignment, bias and fairness.
These are important topics, and we’ll continue to discuss them, but I also think it’s important not to lose sight of the value that AI and tools like it bring to the world in the here and now. So for this episode of the podcast, I spoke with Dr Yan Li, a professor who studies data management and analytics, and the co-founder of Techies Without Borders, a nonprofit dedicated to using tech for humanitarian good. Yan has firsthand experience developing and deploying technical solutions for use in under-resourced parts of the world, from Tibet to Haiti.
AI safety researchers are increasingly focused on understanding what AI systems want. That may sound like an odd thing to care about: after all, aren’t we just programming AIs to want certain things by providing them with a loss function, or a number to optimize?
Well, not necessarily. It turns out that AI systems can have incentives that aren’t necessarily obvious based on their initial programming. Twitter, for example, runs a recommender system whose job is nominally to figure out what tweets you’re most likely to engage with. And while that might make you think that it should be optimizing for matching tweets to people, another way Twitter can achieve its goal is by matching people to tweets — that is, making people easier to predict, by nudging them towards simplistic and partisan views of the world. Some have argued that’s a key reason that social media has had such a divisive impact on online political discourse.
So the incentives of many current AIs already deviate from those of their programmers in important and significant ways — ways that are literally shaping society. But there’s a bigger reason they matter: as AI systems continue to develop more capabilities, inconsistencies between their incentives and our own will become more and more important. That’s why my guest for this episode, Ryan Carey, has focused much of his research on identifying and controlling the incentives of AIs. Ryan is a former medical doctor, now pursuing a PhD in machine learning and doing research on AI safety at Oxford University’s Future of Humanity Institute.
As AI systems have become more powerful, an increasing number of people have been raising the alarm about their potential long-term risks. As we’ve covered on the podcast before, many now argue that those risks could even extend to the annihilation of our species by superhuman AI systems that are slightly misaligned with human values.
There’s no shortage of authors, researchers and technologists who take this risk seriously — and they include prominent figures like Eliezer Yudkowsky, Elon Musk, Bill Gates, Stuart Russell and Nick Bostrom. And while I think the arguments for existential risk from AI are sound, and aren’t widely enough understood, I also think that it’s important to explore more skeptical perspectives.
Melanie Mitchell is a prominent and important voice on the skeptical side of this argument, and she was kind enough to join me for this episode of the podcast. Melanie is the Davis Professor of Complexity at the Santa Fe Institute, a Professor of computer science at Portland State University, and the author of Artificial Intelligence: A Guide for Thinking Humans — a book in which she explores arguments for AI existential risk through a critical lens. She’s an active player in the existential risk conversation, and recently participated in a high-profile debate with Stuart Russell, arguing against his AI risk position.
Powered by Moore’s law, and a cluster of related trends, technology has been improving at an exponential pace across many sectors. AI capabilities in particular have been growing at a dizzying pace, and it seems like every year brings us new breakthroughs that would have been unimaginable just a decade ago. GPT-3, AlphaFold and DALL-E were developed in the last 12 months — and all of this in a context where the leading machine learning models have been increasing in size tenfold every year for the last decade.
To many, there’s a sharp contrast between the breakneck pace of these advances and the rate at which the laws that govern technologies like AI evolve. Our legal systems are chock full of outdated laws, and politicians and regulators often seem almost comically behind the technological curve. But while there’s no question that regulators face an uphill battle in trying to keep up with a rapidly changing tech landscape, my guest today thinks they have a good shot at doing so — as long as they start to think about the law a bit differently.
His name is Josh Fairfield, and he’s a law and technology scholar and former director of R&D at pioneering edtech company Rosetta Stone. Josh has consulted with U.S. government agencies, including the White House Office of Technology and the Homeland Security Privacy Office, and literally wrote a book about the strategies policymakers can use to keep up with tech like AI.
Paradoxically, it may be easier to predict the far future of humanity than to predict our near future.
The next fad, the next Netflix special, the next President — all are nearly impossible to anticipate. That’s because they depend on so many trivial factors: the next fad could be triggered by a viral video someone filmed on a whim, and well, the same could be true of the next Netflix special or President for that matter.
But when it comes to predicting the far future of humanity, we might oddly be on more solid ground. That’s not to say predictions can be made with confidence, but at least they can be made based on economic analysis and first principles reasoning. And most of that analysis and reasoning points to one of two scenarios: we either attain heights we’ve never imagined as a species, or everything we care about gets wiped out in a cosmic scale catastrophe.
Few people have spent more time thinking about the possible endgame of human civilization than my guest for this episode of the podcast, Stuart Armstrong. Stuart is a Research Fellow at Oxford University’s Future of Humanity Institute, where he studies the various existential risks that face our species, focusing most of his work specifically on risks from AI. Stuart is a fascinating and well-rounded thinker with a fresh perspective to share on just about everything you could imagine, and I highly recommend giving the episode a listen.
For the past decade, progress in AI has mostly been driven by deep learning — a field of research that draws inspiration directly from the structure and function of the human brain. By drawing an analogy between brains and computers, we’ve been able to build computer vision, natural language processing and other predictive systems that would have been inconceivable just ten years ago.
But analogies work two ways. Now that we have self-driving cars and AI systems that regularly outperform humans at increasingly complex tasks, some are wondering whether reversing the usual approach — and drawing inspiration from AI to inform our approach to neuroscience — might be a promising strategy. This more mathematical approach to neuroscience is exactly what today’s guest, Georg Northoff, is working on. Georg is a professor of neuroscience, psychiatry, and philosophy at the University of Ottawa, and as part of his work developing a more mathematical foundation for neuroscience, he’s explored a unique and intriguing theory of consciousness that he thinks might serve as a useful framework for developing more advanced AI systems that will benefit human beings.
Most AI researchers are confident that we will one day create superintelligent systems — machines that can significantly outperform humans across a wide variety of tasks.
If this ends up happening, it will pose some potentially serious problems. Specifically: if a system is superintelligent, how can we maintain control over it? That’s the core of the AI alignment problem — the problem of aligning advanced AI systems with human values.
A full solution to the alignment problem will have to involve at least two things. First, we’ll have to know exactly what we want superintelligent systems to do, and make sure they don’t misinterpret us when we ask them to do it (the “outer alignment” problem). But second, we’ll have to make sure that those systems are genuinely trying to optimize for what we’ve asked them to do, and that they aren’t trying to deceive us (the “inner alignment” problem).
Creating systems that are inner-aligned and creating systems that are superintelligent might seem like different problems — and many think that they are. But in the last few years, AI researchers have been exploring a new family of strategies that some hope will allow us to achieve both superintelligence and inner alignment at the same time. Today’s guest, Ethan Perez, is using these approaches to build language models that he hopes will form an important part of the superintelligent systems of the future. Ethan has done frontier research at Google, Facebook, and MILA, and is now working full-time on developing learning systems with generalization abilities that could one day exceed those of human beings.
There’s a minor mystery in economics that may suggest that things are about to get really, really weird for humanity.
And that mystery is this: many economic models predict that, at some point, human economic output will become infinite.
Now, infinities really don’t tend to happen in the real world. But when they’re predicted by otherwise sound theories, they tend to indicate a point at which the assumptions of these theories break down in some fundamental way. Often, that’s because of things like phase transitions: when gases condense or liquids evaporate, some of their thermodynamic parameters go to infinity — not because anything “infinite” is really happening, but because the equations that define a gas cease to apply when those gases become liquids and vice-versa.
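To make the point about infinities a little more concrete, here is a toy sketch of hyperbolic growth, in which output grows slightly faster than exponentially and therefore blows up at a finite time rather than merely growing without bound. The parameter values are invented, and this is just an illustration of the mathematical structure, not a real economic model.

```python
# Toy sketch: if output grows slightly faster than exponentially,
# dY/dt = a * Y**(1 + eps) with eps > 0, then the closed-form solution
# Y(t) = (Y0**(-eps) - a*eps*t)**(-1/eps) diverges at a *finite* time t*.
# All parameter values below are invented for illustration.
a, eps, Y0 = 0.02, 0.1, 1.0
t_star = Y0 ** (-eps) / (a * eps)  # the time at which the denominator hits zero

for frac in (0.5, 0.9, 0.99, 0.999):
    t = frac * t_star
    Y = (Y0 ** (-eps) - a * eps * t) ** (-1.0 / eps)
    print(f"t = {t:8.1f}   Y = {Y:.3e}")

print(f"Y(t) diverges as t approaches t* = {t_star:.1f}")
```

Ordinary exponential growth only reaches infinity in the limit of infinite time; it is that small extra exponent that pulls the singularity into finite time, and that is roughly the kind of structure behind the "infinite output" predictions.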
So how should we think of economic models that tell us that human economic output will one day reach infinity? Is it reasonable to interpret them as predicting a phase transition in the human economy — and if so, what might that transition look like? These are hard questions to answer, but they’re questions that my guest David Roodman, a Senior Advisor at Open Philanthropy, has thought about a lot.
David has centered his investigations on what he considers to be a plausible culprit for a potential economic phase transition: the rise of transformative AI technology. His work explores a powerful way to think about how, and even when, transformative AI might fundamentally change the way the economy works.
As AI systems have become more ubiquitous, people have begun to pay more attention to their ethical implications. Those implications are potentially enormous: Google’s search algorithm and Twitter’s recommendation system each have the ability to meaningfully sway public opinion on just about any issue. As a result, Google and Twitter’s choices have an outsized impact — not only on their immediate user base, but on society in general.
That kind of power comes with risk of intentional misuse (for example, Twitter might choose to boost tweets that express views aligned with their preferred policies). But while intentional misuse is an important issue, equally challenging is the problem of avoiding unintentionally bad outputs from AI systems.
Unintentionally bad AIs can lead to various biases that make algorithms perform better for some people than for others, or more generally to systems that are optimizing for things we actually don’t want in the long run. For example, platforms like Twitter and YouTube have played an important role in the increasing polarization of their US (and worldwide) user bases. They never intended to do this, of course, but their effect on social cohesion is arguably the result of internal cultures based on narrow metric optimization: when you optimize for short-term engagement, you often sacrifice long-term user well-being.
The unintended consequences of AI systems are hard to predict, almost by definition. But their potential impact makes them very much worth thinking and talking about — which is why I sat down with Stanford professor, co-director of the Women in Data Science (WiDS) initiative, and host of the WiDS podcast Margot Gerritsen for this episode of the podcast.
As we continue to develop more and more sophisticated AI systems, an increasing number of economists, technologists and futurists have been trying to predict what the likely end point of all this progress might be. Will human beings be irrelevant? Will we offload all of our decisions — from what we want to do with our spare time, to how we govern societies — to machines? And what does the emergence of highly capable and highly general AI systems mean for the future of democracy and governance?
These questions are impossible to answer completely and directly, but it may be possible to get some hints by taking a long-term view of the history of human technological development. That’s a strategy that my guest, Ben Garfinkel, is applying in his research on the future of AI. Ben is a physicist and mathematician who now does research on forecasting risks from emerging technologies at Oxford’s Future of Humanity Institute.
Apart from his research on forecasting the future impact of technologies like AI, Ben has also spent time exploring some classic arguments for AI risk, many of which he disagrees with. Since we’ve had a number of guests on the podcast who do take these risks seriously, I thought it would be worth speaking to Ben about his views as well, and I’m very glad I did.
There’s no question that AI ethics has received a lot of well-deserved attention lately. But ask the average person what AI ethics means, and you’re as likely as not to get a blank stare. I think that’s largely because every data science or machine learning problem comes with a unique ethical context, so it can be hard to pin down ethics principles that generalize to a wide class of AI problems.
Fortunately, there are researchers who focus on just this issue — and my guest today, Sarah Williams, is one of them. Sarah is an associate professor of urban planning and the director of the Civic Data Design Lab at MIT’s School of Architecture and Planning. Her job is to study applications of data science to urban planning, and to work with policymakers on applying AI in an ethical way. Through that process, she’s distilled several generalizable AI ethics principles that have practical and actionable implications.
This episode was a wide-ranging discussion about everything from the way our ideologies can colour our data analysis to the challenges governments face when trying to regulate AI.
The apparent absence of alien life in our universe has been a source of speculation and controversy in scientific circles for decades. If we assume that there’s even a tiny chance that intelligent life might evolve on a given planet, it seems almost impossible to imagine that the cosmos isn’t brimming with alien civilizations. So where are they?
That’s what Anders Sandberg calls the “Fermi Question”: given the unfathomable size of the universe, how come we have seen no signs of alien life? Anders is a researcher at the University of Oxford’s Future of Humanity Institute, where he tries to anticipate the ethical, philosophical and practical questions that human beings are going to have to face as we approach what could be a technologically unbounded future. That work focuses to a great extent on superintelligent AI and the existential risks it might create. As part of that work, he’s studied the Fermi Question in great detail, and what it implies for the scarcity of life and the value of the human species.
One of the consequences of living in a world where we have every kind of data we could possibly want at our fingertips is that we have far more data available to us than we could possibly review. Wondering which university program you should enter? You could visit any one of a hundred thousand websites that each offer helpful insights, or take a look at ten thousand different program options on hundreds of different universities’ websites. The only snag is that, by the time you finish that review, you probably could have graduated.
Recommender systems allow us to take controlled sips from the information fire hose that’s pointed our way every day of the week, by highlighting a small number of particularly relevant or valuable items from a vast catalog. And while they’re incredibly valuable pieces of technology, they also have some serious ethical failure modes — many of which arise because companies tend to build recommenders to reflect user feedback, without thinking of the broader implications these systems have for society and human civilization.
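To make the mechanics a little more concrete, here’s a minimal sketch of the core ranking step that many recommenders share: score every item in the catalog against a representation of the user, then surface only the top few. The embeddings below are randomly generated placeholders rather than any real platform’s model; the point is simply that whatever objective produces those scores is chosen by the platform, which is where the ethical questions creep in.

```python
import numpy as np

# Hypothetical user and item embeddings (invented for illustration).
# Real recommenders learn these from engagement data at a vastly larger scale.
rng = np.random.default_rng(0)
item_embeddings = rng.normal(size=(10_000, 32))  # one row per catalog item
user_embedding = rng.normal(size=32)             # one vector per user

# Score every item for this user, then keep only the top handful.
scores = item_embeddings @ user_embedding
top_k = np.argsort(scores)[::-1][:5]
print("Recommended item ids:", top_k.tolist())
```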
Those implications are significant, and growing fast. Recommender algorithms deployed by Twitter and Google regularly shape public opinion on the key moral issues of our time — sometimes intentionally, and sometimes even by accident. So rather than allowing society to be reshaped in the image of these powerful algorithms, perhaps it’s time we asked some big questions about the kind of world we want to live in, and worked backward to figure out what our answers would imply for the way we evaluate recommendation engines.
That’s exactly why I wanted to speak with Silvia Milano, my guest for this episode of the podcast. Silvia is an expert on the ethics of recommender systems, and a researcher at Oxford’s Future of Humanity Institute and at the Oxford Internet Institute, where she’s been involved in work aimed at better understanding the hidden impact of recommendation algorithms, and what can be done to mitigate their more negative effects. Our conversation led us to consider complex questions, including the definition of identity, the human right to self-determination, and the interaction of governments with technology companies.
Facebook routinely deploys recommendation systems and predictive models that affect the lives of billions of people every day. That kind of reach comes with great responsibility — among other things, the responsibility to develop AI tools that are ethical, fair and well characterized.
This isn’t an easy task. Human beings have spent thousands of years arguing about what “fairness” and “ethics” mean, and haven’t come close to a consensus. Which is precisely why the responsible AI community has to involve as many disparate perspectives as possible in determining what policies to explore and recommend — a practice that Facebook’s Responsible AI team has itself adopted.
For this episode of the podcast, I’m joined by Joaquin Quiñonero-Candela, the Distinguished Tech Lead for Responsible AI at Facebook. Joaquin has been at the forefront of the AI ethics and fairness movements for years, and has overseen the formation of Facebook’s responsible AI team. As a result, he’s one of relatively few people with hands-on experience making critical AI ethics decisions at scale, and seeing their effects.
Our conversation covered a lot of ground, from philosophical questions about the definition of fairness, to practical challenges that arise when implementing certain ethical AI frameworks.
Most researchers agree we’ll eventually reach a point where our AI systems begin to exceed human performance at virtually every economically valuable task, including the ability to generalize from what they’ve learned to take on new tasks that they haven’t seen before. These artificial general intelligences (AGIs) would in all likelihood have transformative effects on our economies, our societies and even our species.
No one knows what these effects will be, or when AGI systems will be developed that can bring them about. But that doesn’t mean these things aren’t worth predicting or estimating. The more we know about the amount of time we have to develop robust solutions to important AI ethics, safety and policy problems, the more clearly we can think about what problems should be receiving our time and attention today.
That’s the thesis that motivates a lot of work on AI forecasting: the attempt to predict key milestones in AI development, on the path to AGI and super-human artificial intelligence. It’s still early days for this space, but it’s received attention from an increasing number of AI safety and AI capabilities researchers. One of those researchers is Owain Evans, whose work at Oxford University’s Future of Humanity Institute is focused on techniques for learning about human beliefs, preferences and values from observing human behavior or interacting with humans. Owain joined me for this episode of the podcast to talk about AI forecasting, the problem of inferring human values, and the ecosystem of research organizations that support this type of research.
With every new technology comes the potential for abuse. And while AI is clearly starting to deliver an awful lot of value, it’s also creating new systemic vulnerabilities that governments now have to worry about and address. Self-driving cars can be hacked. Speech synthesis can make traditional ways of verifying someone’s identity less reliable. AI can be used to build weapons systems that are less predictable.
As AI technology continues to develop and become more powerful, we’ll have to worry more about safety and security. But competitive pressures risk encouraging companies and countries to focus on capabilities research rather than responsible AI development. Solving this problem will be a big challenge, and it’s probably going to require new national AI policies, and international norms and standards that don’t currently exist.
Helen Toner is Director of Strategy at the Center for Security and Emerging Technology (CSET), a US policy think tank that connects policymakers to experts on the security implications of new technologies like AI. Her work spans national security and technology policy, and international AI competition, and she’s become an expert on AI in China, in particular. Helen joined me for a special AI policy-themed episode of the podcast.
What does a neural network system want to do?
That might seem like a straightforward question. You might imagine that the answer is “whatever the loss function says it should do.” But when you dig into it, you quickly find that the answer is much more complicated than that might imply.
In order to accomplish their primary goal of optimizing a loss function, algorithms often develop secondary objectives (known as instrumental goals) that are tactically useful for that main goal. For example, a computer vision algorithm designed to tell faces apart might find it beneficial to develop the ability to detect noses with high fidelity. Or in a more extreme case, a very advanced AI might find it useful to monopolize the Earth’s resources in order to accomplish its primary goal — and it’s been suggested that this might actually be the default behavior of powerful AI systems in the future.
So, what does an AI want to do? Optimize its loss function — perhaps. But a sufficiently complex system is likely to also manifest instrumental goals. And if we don’t develop a deep understanding of AI incentives, and reliable strategies to manage those incentives, we may be in for an unpleasant surprise when unexpected and highly strategic behavior emerges from systems with simple and desirable primary goals. Which is why it’s a good thing that my guest today, David Krueger, has been working on exactly that problem. David studies deep learning and AI alignment at MILA, and joined me to discuss his thoughts on AI safety, and his work on managing the incentives of AI systems.
The leap from today’s narrow AI to a more general kind of intelligence seems likely to happen at some point in the next century. But no one knows exactly how: at the moment, AGI remains a significant technical and theoretical challenge, and expert opinion about what it will take to achieve it varies widely. Some think that scaling up existing paradigms — like deep learning and reinforcement learning — will be enough, but others think these approaches are going to fall short.
Geordie Rose is in the latter camp, and his voice is one that’s worth listening to: he has deep experience with hard tech, from founding D-Wave (the world’s first quantum computing company), to building Kindred Systems, a company pioneering applications of reinforcement learning in industry that was recently acquired for $350 million.
Geordie is now focused entirely on AGI. Through his current company, Sanctuary AI, he’s working on an exciting and unusual thesis. At the core of this thesis is the idea that one of the easiest paths to AGI will be to build embodied systems: AIs with physical structures that can move around in the real world and interact directly with objects. Geordie joined me for this episode of the podcast to discuss his AGI thesis, as well as broader questions about AI safety and AI alignment.
The fields of AI bias and AI fairness are still very young. And just like most young technical fields, they’re dominated by theoretical discussions: researchers argue over what words like “privacy” and “fairness” mean, but don’t do much in the way of applying these definitions to real-world problems.
Slowly but surely, this is all changing though, and government oversight has had a big role to play in that process. Laws like GDPR — passed by the European Union in 2016 — are starting to impose concrete requirements on companies that want to use consumer data, or build AI systems with it. There are pros and cons to legislating machine learning, but one thing’s for sure: there’s no looking back. At this point, it’s clear that government-endorsed definitions of “bias” and “fairness” in AI systems are going to be applied to companies (and therefore to consumers), whether they’re well-developed and thoughtful or not.
Keeping up with the philosophy of AI is a full-time job for most, but actually applying that philosophy to real-world corporate data is its own additional challenge. My guest for this episode of the podcast is doing just that: Nicolai Baldin is a former Cambridge machine learning researcher, and now the founder and CEO of Synthesized, a startup that specializes in helping companies apply privacy, AI fairness and bias best practices to their data. Nicolai is one of relatively few people working on concrete problems in these areas, and has a unique perspective on the space as a result.
No one knows for sure what it’s going to take to make artificial general intelligence work. But that doesn’t mean that there aren’t prominent research teams placing big bets on different theories: DeepMind seems to be hoping that a brain emulation strategy will pay off, whereas OpenAI is focused on achieving AGI by scaling up existing deep learning and reinforcement learning systems with more data, more compute.
Ben Goertzel — a pioneering AGI researcher, and the guy who literally coined the term “AGI” — doesn’t think either of these approaches is quite right. His alternative approach is the strategy currently being used by OpenCog, an open-source AGI project he first released in 2008. Ben is also a proponent of decentralized AI development, due to his concerns about centralization of power through AI, as the technology improves. For that reason, he’s currently working on building a decentralized network of AIs through SingularityNET, a blockchain-powered AI marketplace that he founded in 2017.
Ben has some interesting and contrarian views on AGI, AI safety, and consciousness, and he was kind enough to explore them with me on this episode of the podcast.
Progress in AI capabilities has consistently surprised just about everyone, including the very developers and engineers who build today’s most advanced AI systems. AI can now match or exceed human performance in everything from speech recognition to driving, and one question that’s increasingly on people’s minds is: when will AI systems be better than humans at AI research itself?
The short answer, of course, is that no one knows for sure — but some have taken educated guesses, including Nick Bostrom and Stuart Russell. One common hypothesis is that once AI systems are better than humans at improving their own performance, we can expect at least some of them to do so. In the process, these self-improving systems would become even more powerful than they were previously — and therefore, even more capable of further self-improvement. With each additional self-improvement step, improvements in a system’s performance would compound. Where this all ultimately leads, no one really has a clue, but it’s safe to say that if there’s a good chance that we’re going to be creating systems that are capable of this kind of stunt, we ought to think hard about how we should be building them.
This concern, among many others, has led to the development of the rich field of AI safety, and my guest for this episode, Robert Miles, has been popularizing AI safety research for more than half a decade through his own very successful YouTube channel and his videos for Computerphile. He joined me on the podcast to discuss how he’s thinking about AI safety, what AI means for the course of human evolution, and what our biggest challenges will be in taming advanced AI.
When it comes to machine learning, we’re often led to believe that bigger is better. It’s now pretty clear that all else being equal, more data, more compute, and larger models add up to give more performance and more generalization power. And cutting edge language models have been growing at an alarming rate — by up to 10X each year.
But size isn’t everything. While larger models are certainly more capable, they can’t be used in all contexts: take, for example, the case of a cell phone or a small drone, where on-device memory and processing power just isn’t enough to accommodate giant neural networks or huge amounts of data. The art of doing machine learning on small devices with significant power and memory constraints is pretty new, and it’s now known as “tiny ML”. Tiny ML unlocks an awful lot of exciting applications, but also raises a number of safety and ethical questions.
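A quick back-of-the-envelope sketch, using assumed but representative numbers, shows just how tight those constraints are: a GPT-3-scale model’s weights alone run to hundreds of gigabytes, while a typical microcontroller might offer a few hundred kilobytes of RAM.

```python
# Rough arithmetic only; the parameter count is GPT-3's published size,
# and the 256 KB figure is an assumed, but typical, microcontroller RAM budget.
gpt3_params = 175_000_000_000
bytes_per_param_fp16 = 2
print(f"GPT-3 weights alone: ~{gpt3_params * bytes_per_param_fp16 / 1e9:.0f} GB")

mcu_ram_bytes = 256 * 1024          # e.g. a hobbyist-class microcontroller
bytes_per_param_int8 = 1            # aggressive 8-bit quantization
max_params = mcu_ram_bytes // bytes_per_param_int8
print(f"Ceiling for an int8 model in 256 KB of RAM: ~{max_params:,} parameters")
```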
And that’s why I wanted to sit down with Matthew Stewart, a Harvard PhD researcher focused on applying tiny ML to environmental monitoring. Matthew has worked with many of the world’s top tiny ML researchers, and our conversation focused on the possibilities and potential risks associated with this promising new field.
In the early 1900s, all of our predictions were the direct product of human brains. Scientists, analysts, climatologists, mathematicians, bankers, lawyers and politicians did their best to anticipate future events, and plan accordingly.
Take physics, for example, where every task we think of as part of the learning process, from data collection to cleaning to feature selection to modeling, had to happen inside a physicist’s head. When Einstein introduced gravitational fields, what he was really doing was proposing a new feature to be added to our model of the universe. And the gravitational field equations that he put forward at the same time were an update to that very model.
Einstein didn’t come up with his new model (or “theory” as physicists call it) of gravity by running model.fit() in a Jupyter notebook. In fact, he never outsourced any of the computations that were needed to develop it to machines.
Today, that’s somewhat unusual, and most of the predictions that the world runs on are generated in part by computers. But only in part — until we have fully general artificial intelligence, machine learning will always be a mix of two things: first, the constraints that human developers impose on their models, and second, the calculations that go into optimizing those models, which we outsource to machines.
The human touch is still a necessary and ubiquitous component of every machine learning pipeline, but it’s ultimately limiting: the more of the learning pipeline that can be outsourced to machines, the more we can take advantage of computers’ ability to learn faster and from far more data than human beings. But designing algorithms that are flexible enough to do that requires serious outside-of-the-box thinking — exactly the kind of thinking that University of Toronto professor and researcher David Duvenaud specializes in. I asked David to join me for the latest episode of the podcast to talk about his research on more flexible and robust machine learning strategies.
Human beings are collaborating with artificial intelligences on an increasing number of high-stakes tasks. I’m not just talking about robot-assisted surgery or self-driving cars here — every day, social media apps recommend content to us that quite literally shapes our worldviews and our cultures. And very few of us even have a basic idea of how these all-important recommendations are generated.
As time goes on, we’re likely going to become increasingly dependent on our machines, outsourcing more and more of our thinking to them. If we aren’t thoughtful about the way we do this, we risk creating a world that doesn’t reflect our current values or objectives. That’s why the domain of human/AI collaboration and interaction is so important — and it’s the reason I wanted to speak to Berkeley AI researcher Dylan Hadfield-Menell for this episode of the Towards Data Science podcast. Dylan’s work is focused on designing algorithms that could allow humans and robots to collaborate more constructively, and he’s one of a small but growing cohort of AI researchers focused on the area of AI ethics and AI alignment.
As AI systems have become more powerful, they’ve been deployed to tackle an increasing number of problems.
Take computer vision. Less than a decade ago, one of the most advanced applications of computer vision algorithms was to classify hand-written digits on mail. And yet today, computer vision is being applied to everything from self-driving cars to facial recognition and cancer diagnostics.
Practically useful AI systems have now firmly moved from “what if?” territory to “what now?” territory. And as more and more of our lives are run by algorithms, an increasing number of researchers from domains outside computer science and engineering are starting to take notice. Most notably among these are philosophers, many of whom are concerned about the ethical implications of outsourcing our decision-making to machines whose reasoning we often can’t understand or even interpret.
One of the most important voices in the world of AI ethics has been that of Dr Annette Zimmermann, a Technology & Human Rights Fellow at the Carr Center for Human Rights Policy at Harvard University, and a Lecturer in Philosophy at the University of York. Annette has focused a lot of her work on exploring the overlap between algorithms, society and governance, and I had the chance to sit down with her to discuss her views on bias in machine learning, algorithmic fairness, and the big picture of AI ethics.
If you walked into a room filled with objects that were scattered around somewhat randomly, how important or expensive would you assume those objects were?
What if you walked into the same room, and instead found those objects carefully arranged in a very specific configuration that was unlikely to happen by chance?
These two scenarios hint at something important: human beings have shaped our environments in ways that reflect what we value. You might just learn more about what I value by taking a 10 minute stroll through my apartment than by spending 30 minutes talking to me as I try to put my life philosophy into words.
And that’s a pretty important idea, because as it turns out, one of the most important challenges in advanced AI today is finding ways to communicate our values to machines. If our environments implicitly encode part of our value system, then we might be able to teach machines to observe it, and learn about our preferences without our having to express them explicitly.
The idea of deriving human values from the state of a human-inhabited environment was first developed in a paper co-authored by Berkeley PhD and incoming DeepMind researcher Rohin Shah. Rohin has spent the last several years working on AI safety, and publishes the widely read AI alignment newsletter — and he was kind enough to join us for this episode of the Towards Data Science podcast, where we discussed his approach to AI safety, and his thoughts on risk mitigation strategies for advanced AI systems.
Reinforcement learning can do some pretty impressive things. It can optimize ad targeting, help run self-driving cars, and even win StarCraft games. But current RL systems are still highly task-specific. Tesla’s self-driving car algorithm can’t win at StarCraft, and DeepMind’s AlphaZero algorithm can win Go matches against grandmasters, but can’t optimize your company’s ad spend.
So how do we make the leap from narrow AI systems that leverage reinforcement learning to solve specific problems, to more general systems that can orient themselves in the world? Enter Tim Rocktäschel, a Research Scientist at Facebook AI Research London and a Lecturer in the Department of Computer Science at University College London. Much of Tim’s work has been focused on ways to make RL agents learn with relatively little data, using strategies known as sample efficient learning, in the hopes of improving their ability to solve more general problems. Tim joined me for this episode of the podcast.
Where do we want our technology to lead us? How are we falling short of that target? What risks might advanced AI systems pose to us in the future, and what potential do they hold? And what does it mean to build ethical, safe, interpretable, and accountable AI that’s aligned with human values?
That’s what this year is going to be about for the Towards Data Science podcast. I hope you join us for that journey, which starts today with an interview with my brother Ed, who apart from being a colleague who’s worked with me as part of a small team to build the SharpestMinds data science mentorship program, is also collaborating with me on a number of AI safety, alignment and policy projects. I thought he’d be a perfect guest to kick off this new year for the podcast.
Networking is the most valuable career advancement skill in data science. And yet, almost paradoxically, most data scientists don’t spend any time on it at all. In some ways, that’s not terribly surprising: data science is a pretty technical field, and technical people often prefer not to go out of their way to seek social interactions. We tend to think of networking with other “primates who code” as a distraction at best, and an anxiety-inducing nightmare at worst.
So how can data scientists overcome that anxiety, tap into the value of network-building, and develop a brand for themselves in the data science community? That’s the question that brings us to this episode of the podcast. To answer it, I spoke with repeat guest Sanyam Bhutani, a top Kaggler, host of the Chai Time Data Science Show, and Machine Learning Engineer and AI Content Creator at H2O.ai, about the unorthodox networking strategies he’s leveraged to become a fixture in the machine learning community, and to land his current role.
We’ve talked a lot about “full stack” data science on the podcast. To many, going full-stack is one of those long-term goals that we never get to. There are just too many algorithms and data structures and programming languages to know, and not enough time to figure out software engineering best practices around deployment and building app front-ends.
Fortunately, a new wave of data science tooling is now making full-stack data science much more accessible by allowing people with no software engineering background to build data apps quickly and easily. And arguably no company has had more explosive success at building this kind of tooling than Streamlit, which is why I wanted to sit down with Streamlit founder Adrien Treuille and gamification expert Tim Conkling to talk about their journey, and the importance of building flexible, full-stack data science apps.
It’s no secret that data science is an area where brand matters a lot.
In fact, if there’s one thing I’ve learned from A/B testing ways to help job-seekers get hired at SharpestMinds, it’s that blogging, having a good presence on social media, making open-source contributions, podcasting and speaking at meetups are among the best ways to get noticed by employers.
Brand matters. And if there’s one person who has a deep understanding of the value of brand in data science — and how to build one — it’s data scientist and YouTuber Ken Jee. Ken not only has experience as a data scientist and sports analyst, having worked at DraftKings and GE, but he’s also founded a number of companies — and his YouTube channel, with over 60 000 subscribers, is one of his main projects today.
For today’s episode, I spoke to Ken about brand-building strategies in data science, as well as job search tips for anyone looking to land their first data-related role.
If you’re interested in upping your coding game, or your data science game in general, then it’s worth taking some time to understand the process of learning itself.
And if there’s one company that’s studied the learning process more than almost anyone else, it’s Codecademy. With over 65 million users, Codecademy has developed a deep understanding of what it takes to get people to learn how to code, which is why I wanted to speak to their Head of Data Science, Cat Zhou, for this episode of the podcast.
Data science is about much more than jupyter notebooks, because data science problems are about more than machine learning.
What data should I collect? How good does my model need to be to be “good enough” to solve my problem? What form should my project take for it to be useful? Should it be a dashboard, a live app, or something else entirely? How do I deploy it? How do I make sure something awful and unexpected doesn’t happen when it’s deployed in production?
None of these questions can be answered by importing sklearn and pandas and hacking away in a Jupyter notebook. Data science problems take a unique combination of business savvy and software engineering know-how, and that’s why Emmanuel Ameisen wrote a book called Building Machine Learning Powered Applications: Going from Idea to Product. Emmanuel is a machine learning engineer at Stripe, and formerly worked as Head of AI at Insight Data Science, where he oversaw the development of dozens of machine learning products.
Our conversation was focused on the missing links in most online data science education: business instinct, data exploration, model evaluation and deployment.
Project-building is the single most important activity that you can get up to if you’re trying to keep your machine learning skills sharp or break into data science. But a project won’t do you much good unless you can show it off effectively and get feedback to iterate on it — and until recently, there weren’t many places you could turn to to do that.
A recent open-source initiative called MadeWithML is trying to change that, by creating an easily shareable repository of crowdsourced data science and machine learning projects, and its founder, former Apple ML researcher and startup founder Goku Mohandas, sat down with me for this episode of the TDS podcast to discuss data science projects, his experiences doing research in industry, and the MadeWithML project.
It’s cliché to say that data cleaning accounts for 80% of a data scientist’s job, but it’s directionally true.
That’s too bad, because fun things like data exploration, visualization and modelling are the reason most people get into data science. So it’s a good thing that there’s a major push underway in industry to automate data cleaning as much as possible.
One of the leaders of that effort is Ihab Ilyas, a professor at the University of Waterloo and founder of two companies, Tamr and Inductiv, both of which are focused on the early stages of the data science lifecycle: data cleaning and data integration. Ihab knows an awful lot about data cleaning and data engineering, and has some really great insights to share about the future direction of the space — including what work is left for data scientists, once you automate away data cleaning.
There’s been a lot of talk about the future direction of data science, and for good reason. The space is finally coming into its own, and as the Wild West phase of the mid-2010s well and truly comes to an end, there’s keen interest among data professionals to stay ahead of the curve, and understand what their jobs are likely to look like 2, 5 and 10 years down the road.
And amid all the noise, one trend is clearly emerging, and has already materialized to a significant degree: as more and more of the data science lifecycle is automated or abstracted away, data professionals can afford to spend more time adding value to companies in more strategic ways. One way to do this is to invest your time in deepening your subject matter expertise and mastering the business side of the equation. Another is to double down on technical skills, and focus on owning more and more of the data stack — particularly the productionization and deployment stages.
My guest for today’s episode of the Towards Data Science podcast has been down both of these paths, first as a business-focused data scientist at Spotify, where he spent his time defining business metrics and evaluating products, and second as a data engineer at Better.com, where his focus has shifted towards productionization and engineering. During our chat, Kenny shared his insights about the relative merits of each approach, and the future of the field.
Reinforcement learning has gotten a lot of attention recently, thanks in large part to systems like AlphaGo and AlphaZero, which have highlighted its immense potential in dramatic ways. And while the RL systems we’ve developed have accomplished some impressive feats, they’ve done so in a fairly naive way. Specifically, they haven’t tended to confront multi-agent problems, which require collaboration and competition. But even when multi-agent problems have been tackled, they’ve been addressed using agents that just assume other agents are an uncontrollable part of the environment, rather than entities with rich internal structures that can be reasoned and communicated with.
That’s all finally changing, with new research into the field of multi-agent RL, led in part by OpenAI, Oxford and Google alum, and current FAIR research scientist Jakob Foerster. Jakob’s research is aimed specifically at understanding how reinforcement learning agents can learn to collaborate better and navigate complex environments that include other agents, whose behavior they try to model. In essence, Jakob is working on giving RL agents a theory of mind.
Data science can look very different from one company to the next, and it’s generally difficult to get a consistent opinion on the question of what a data scientist really is.
That’s why it’s so important to speak with data scientists who apply their craft at different organizations — from startups to enterprises. Getting exposure to the full spectrum of roles and responsibilities that data scientists are called on to execute is the only way to distill data science down to its essence.
That’s why I wanted to chat with Ian Scott, Chief Science Officer at Deloitte Omnia, Deloitte’s AI practice. Ian was doing data science as far back as the late 1980s, when he was applying statistical modeling to data from experimental high energy physics as part of his PhD work at Harvard. Since then, he’s occupied strategic roles at a number of companies, most recently Deloitte, where he leads significant machine learning and data science projects.
Machine learning in grad school and machine learning in industry are very different beasts. In industry, deployment and data collection become key, and the only thing that matters is whether you can deliver a product that real customers want, fast enough to meet internal deadlines. In grad school, there’s a different kind of pressure, focused on algorithm development and novelty. It’s often difficult to know which path you might be best suited for, but that’s why it can be so useful to speak with people who’ve done both — and bonus points if their academic research experience comes from one of the top universities in the world.
For today’s episode of the Towards Data Science podcast, I sat down with Will Grathwohl, a PhD student at the University of Toronto, student researcher at Google AI, and alum of MIT and OpenAI. Will has seen cutting edge machine learning research in industry and academic settings, and has some great insights to share about the differences between the two environments. He’s also recently published an article on the fascinating topic of energy models in which he and his co-authors propose a unique way of thinking about generative models that achieves state-of-the-art performance in computer vision tasks.
One of the themes that I’ve seen come up increasingly in the past few months is the critical importance of product thinking in data science. As new and aspiring data scientists deepen their technical skill sets and invest countless hours doing practice problems on leetcode, product thinking has emerged as a pretty serious blind spot for many applicants. That blind spot has become increasingly critical as new tools have emerged that abstract away a lot of what used to be the day-to-day gruntwork of data science, allowing data scientists more time to develop subject matter expertise and focus on the business value side of the product equation.
If there’s one company that’s made a name for itself for leading the way on product-centric thinking in data science, it’s Shopify. And if there’s one person at Shopify who’s spent the most time thinking about product-centered data science, it’s Shopify’s Head of Data Science and Engineering, Solmaz Shahalizadeh. Solmaz has had an impressive career arc, which included joining Shopify in its pre-IPO days, back in 2013, and seeing the Shopify data science team grow from a handful of people to a pivotal organization-wide effort that tens of thousands of merchants rely on to earn a living today.
Machine learning isn’t rocket science, unless you’re doing it at NASA. And if you happen to be doing data science at NASA, you have something in common with David Meza, my guest for today’s episode of the podcast.
David has spent his NASA career focused on optimizing the flow of information through NASA’s many databases, and ensuring that that data is harnessed with machine learning and analytics. His current focus is on people analytics, which involves tracking the skills and competencies of employees across NASA, to detect people who have abilities that could be used in new or unexpected ways to meet needs that the organization has or might develop.
Nick Pogrebnyakov is a Senior Data Scientist at Thomson Reuters, an Associate Professor at Copenhagen Business School, and the founder of Leverness, a marketplace where experienced machine learning developers can find contract work with companies. He’s a busy man, but he agreed to sit down with me for today’s TDS podcast episode to talk about his day job at Reuters, as well as the machine learning and data science job landscape.
One Thursday afternoon in 2015, I got a spontaneous notification on my phone telling me how long it would take to drive to my favourite restaurant under current traffic conditions. This was alarming, not only because it implied that my phone had figured out what my favourite restaurant was without ever asking explicitly, but also because it suggested that my phone knew enough about my eating habits to realize that I liked to go out to dinner on Thursdays specifically.
As our phones, our laptops and our Amazon Echos collect increasing amounts of data about us — and impute even more — data privacy is becoming a greater and greater concern for research as well as government and industry applications. That’s why I wanted to speak to Harvard PhD student and frequent Towards Data Science contributor Matthew Stewart, to get an introduction to some of the key principles behind data privacy. Matthew is a prolific blogger, and his research work at Harvard is focused on applications of machine learning to environmental sciences, a topic we also discuss during this episode.
There’s been a lot of talk in data science circles about techniques like AutoML, which are dramatically reducing the time it takes for data scientists to train and tune models, and create reliable experiments. But that trend towards increased automation, greater robustness and reliability doesn’t end with machine learning: increasingly, companies are focusing their attention on automating earlier parts of the data lifecycle, including the critical task of data engineering.
Today, many data engineers are unicorns: they not only have to understand the needs of their customers, but also how to work with data, and what software engineering tools and best practices to use to set up and monitor their pipelines. Pipeline monitoring in particular is time-consuming, and just as important, isn’t a particularly fun thing to do. Luckily, people like Sean Knapp — a former Googler turned founder of data engineering startup Ascend.io — are leading the charge to make automated data pipeline monitoring a reality.
We had Sean on this latest episode of the Towards Data Science podcast to talk about data engineering: where it’s at, where it’s going, and what data scientists should really know about it to be prepared for the future.
For the last decade, advances in machine learning have come from two things: improved compute power and better algorithms. These two areas have become somewhat siloed in most people’s thinking: we tend to imagine that there are people who build hardware, and people who make algorithms, and that there isn’t much overlap between the two.
But this picture is wrong. Hardware constraints can and do inform algorithm design, and algorithms can be used to optimize hardware. Increasingly, compute and modelling are being optimized together, by people with expertise in both areas.
My guest today is one of the world’s leading experts on hardware/software integration for machine learning applications. Max Welling is a former physicist and currently works as VP Technologies at Qualcomm, a world-leading chip manufacturer, in addition to which he’s also a machine learning researcher with affiliations at UC Irvine, CIFAR and the University of Amsterdam.
Coronavirus quarantines fundamentally change the dynamics of learning, and the dynamics of the job search. Just a few months ago, in-person bootcamps, college programs and live networking events where people exchanged handshakes and business cards were the way the world worked; now, that’s no longer true. With that in mind, many aspiring techies are asking themselves how they should be adjusting their game plan to keep up with learning or land that next job, given the constraints of an ongoing pandemic and an impending economic downturn.
That’s why I wanted to talk to Rubén Harris, CEO and co-founder of Career Karma, a startup that helps aspiring developers find the best coding bootcamp for them. He’s got a great perspective to share on the special psychological and practical challenges of navigating self-learning and the job search, and he was kind enough to make the time to chat with me for this latest episode of the Towards Data Science podcast.
One great way to get ahead in your career is to make good bets on what technologies are going to become important in the future, and to invest time in learning them. If that sounds like something you want to do, then you should definitely be paying attention to graph databases.
Graph databases aren’t exactly new, but they’ve become increasingly important as graph data (data that describe interconnected networks of things) has become more widely available than ever. Social media, supply chains, mobile device tracking, economics and many more fields are generating more graph data than ever before, and buried in these datasets are potential solutions for many of our biggest problems.
That’s why I was so excited to speak with Denise Gosnell and Matthias Broecheler, respectively the Chief Data Officer and Chief Technologist at DataStax, a company specialized in solving data engineering problems for enterprises. Apart from their extensive experience working with graph databases at DataStax, Denise and Matthias have also recently written a book called The Practitioner’s Guide to Graph Data, and were kind enough to make the time for a discussion about the basics of data engineering and graph data for this episode of the Towards Data Science Podcast.
One of the most interesting recent trends in machine learning has been the combination of different types of data in order to be able to unlock new use cases for deep learning. If the 2010s were the decade of computer vision and voice recognition, the 2020s may very well be the decade we finally figure out how to make machines that can see and hear the world around them, making them that much more context-aware and potentially even humanlike.
The push towards integrating diverse data sources has received a lot of attention, from academics as well as companies. And one of those companies is Twenty Billion Neurons, and its founder Roland Memisevic, is our guest for this latest episode of the Towards Data Science podcast. Roland is a former academic who’s been knee-deep in deep learning since well before the hype that was sparked by AlexNet in 2012. His company has been working on deep learning-powered developer tools, as well as an automated fitness coach that combines video and audio data to keep users engaged throughout their workout routines.
If I were to ask you to explain why you’re reading this blog post, you could answer in many different ways.
For example, you could tell me “it’s because I felt like it”, or “because my neurons fired in a specific way that led me to click on the link that was advertised to me”. Or you might go even deeper and relate your answer to the fundamental laws of quantum physics.
The point is, explanations need to be targeted to a certain level of abstraction in order to be effective.
That’s true in life, but it’s also true in machine learning, where explainable AI is getting more and more attention as a way to ensure that models are working properly, in a way that makes sense to us. Understanding explainability and how to leverage it is becoming increasingly important, and that’s why I wanted to speak with Bahador Khaleghi, a data scientist at H2O.ai whose technical focus is on explainability and interpretability in machine learning.
Most of us want to change our identities. And we usually have an idealized version of ourselves that we aspire to become — one who’s fitter, smarter, healthier, more famous, wealthier, more centered, or whatever.
But you can’t change your identity in a fundamental way without also changing what you do in your day-to-day life. You don’t get fitter without working out regularly. You don’t get smarter without studying regularly.
To change yourself, you must first change your habits. But how do you do that?
Recently, books like Atomic Habits and Deep Work have focused on answering that question in general terms, and they’re definitely worth reading. But habit formation in the context of data science, analytics, machine learning, and startups comes with a unique set of challenges, and deserves attention in its own right. And that’s why I wanted to sit down with today’s guest, Russell Pollari.
Russell may now be the CTO of the world’s largest marketplace for income share mentorships (and the very same company I work at every day!) but he was once — and not too long ago — a physics PhD student with next to no coding ability and a classic case of the grad school blues. To get to where he is today, he’s had to learn a lot, and in his quest to optimize that process, he’s focused a lot of his attention on habit formation and self-improvement in the context of tech, data science and startups.
Revenues drop unexpectedly, and management pulls the data science team into a room. The team is given its marching orders: “your job,” they’re told, “is to find out what the hell is going on with our purchase orders.”
That’s a very open-ended question, of course, because revenues and signups could drop for any number of reasons. Prices may have increased. A new user interface might be confusing potential customers. Seasonality effects might have to be considered. The source of the problem could be, well, anything.
That’s often the position data scientists find themselves in: rather than having a clear A/B test to analyze, they frequently are in the business of combing through user funnels to ensure that each stage is working as expected.
It takes a very detail-oriented and business-savvy team to pull off an investigation with that broad a scope, but that’s exactly what Medium has: a group of product-minded data scientists dedicated to investigating anomalies and identifying growth opportunities hidden in heaps of user data. They were kind enough to chat with me and talk about how Medium does data science for this episode of the Towards Data Science podcast.
If you want to know where data science is heading, it helps to know where it’s been. Very few people have that kind of historical perspective, and even fewer combine it with an understanding of cutting-edge tooling that hints at the direction the field might be taking in the future.
Luckily for us, one of them is Cameron Davidson-Pilon, the former Director of Data Science at Shopify. Cameron has been knee-deep in data science and estimation theory since 2012, when the space was still coming into its own. He’s got a great high-level perspective not only on technical issues but also on hiring and team-building, and he was kind enough to join us for today’s episode of the Towards Data Science podcast.
It’s easy to think of data science as a purely technical discipline: after all, it exists at the intersection of a number of genuinely technical topics, from statistics to programming to machine learning.
But there’s much more to data science and analytics than solving technical problems — and there’s much more to the data science job search than coding challenges and Kaggle competitions as well. Landing a job or a promotion as a data scientist calls on a ton of career skills and soft skills that many people don’t spend nearly enough time honing.
On this episode of the podcast, I spoke with Emily Robinson, an experienced data scientist and blogger with a pedigree that includes Etsy and DataCamp, about career-building strategies. Emily’s got a lot to say about the topic, particularly since she just finished authoring a book entitled “Build a Career in Data Science” with her co-author Jacqueline Nolis. The book explores a lot of great, practical strategies for moving data science careers forward, many of which we discussed during our conversation.
Most of us believe that decisions that affect us should be made rationally: they should be reached by following a reasoning process that combines data we trust with a logic that we find acceptable.
As long as human beings are making these decisions, we can probe at that reasoning to find out whether we agree with it. We can ask why we were denied that bank loan, or why a judge handed down a particular sentence, for example.
Today however, machine learning is automating away more and more of these important decisions, and as a result, our lives are increasingly governed by decision-making processes that we can’t interrogate or understand. Worse, machine learning algorithms can exhibit bias or make serious mistakes, so a black-box-ocracy risks becoming more like a dystopia than even the most imperfect human-designed systems we have today.
That’s why AI ethics and AI safety have drawn so much attention in recent years, and why I was so excited to talk to Alayna Kennedy, a data scientist at IBM whose work is focused on the ethics of machine learning, and the risks associated with ML-based decision-making. Alayna has consulted with key players in the US government’s AI effort, and has expertise applying machine learning in industry as well, through previous work on neural network modelling and fraud detection.
In mid-January, China launched an official investigation into a string of unusual pneumonia cases in Hubei province. Within two months, that cluster of cases would snowball into a full-blown pandemic, with hundreds of thousands — perhaps even millions — of infections worldwide, and the potential to unleash a wave of economic damage not seen since the 1918 Spanish influenza or the Great Depression.
The exponential growth that led us from a few isolated infections to where we are today is profoundly counterintuitive. And it poses many challenges for the epidemiologists who need to pin down the transmission characteristics of the coronavirus, and for the policy makers who must act on their recommendations, and convince a generally complacent public to implement life-saving social distancing measures.
With the coronas in full bloom, I thought now would be a great time to reach out to Jeremy Howard, co-founder of the incredibly popular Fast.ai machine learning education site. Along with his co-founder Rachel Thomas, Jeremy authored a now-viral report outlining a data-driven case for concern regarding the coronavirus.
It’s easy to think of data scientists as “people who explore and model data”. But in reality, the job description is much more flexible: your job as a data scientist is to solve problems that people actually have with data.
You’ll notice that I wrote “problems that people actually have” rather than “build models”. It’s relatively rare that the problems people have actually need to be solved using a predictive model. Instead, a good visualization or interactive chart is almost always the first step of the problem-solving process, and can often be the last as well.
And you know who understands visualization strategy really, really well? Plotly, that’s who. Plotly is a company that builds a ton of great open-source visualization, exploration and data infrastructure tools (and some proprietary commercial ones, too). Today, their tooling is being used by over 50 million people worldwide, and they’ve developed a number of tools and libraries that are now industry standard. So you can imagine how excited I was to speak with Plotly co-founder and Chief Product Officer Chris Parmer.
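If you’ve never tried it, it’s striking how little code that first exploratory chart can take. Here’s a minimal sketch using Plotly’s open-source plotly.express module and one of its built-in sample datasets (a toy illustration of the point above, not anything from the episode):

    import plotly.express as px

    # A small built-in sample dataset, standing in for your own dataframe
    df = px.data.iris()

    # One call produces an interactive, hover-enabled scatter plot
    fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species")
    fig.show()

A chart like this opens in the browser or renders inline in a notebook, which is often all it takes to answer the question a stakeholder actually asked.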
Chris had some great insights to share about data science and analytics tooling, including the future direction he sees the space moving in. But as his job title suggests, he’s also focused on another key characteristic that all great data scientists develop early on: product instinct (AKA: “knowing what to build next”).
Most machine learning models are used in roughly the same way: they take a complex, high-dimensional input (like a data table, an image, or a body of text) and return something very simple (a classification or regression output, or a set of cluster centroids). That makes machine learning ideal for automating repetitive tasks that might historically have been carried out only by humans.
But this strategy may not be the most exciting application of machine learning in the future: increasingly, researchers and even industry players are experimenting with generative models that produce much more complex outputs, like images and text, from scratch. These models are effectively carrying out a creative process — and mastering that process hugely widens the scope of what can be accomplished by machines.
My guest today is Xander Steenbrugge, and his focus is on the creative side of machine learning. In addition to consulting with large companies to help them put state-of-the-art machine learning models into production, he’s focused a lot of his work on more philosophical and interdisciplinary questions — including the interaction between art and machine learning. For that reason, our conversation went in an unusually philosophical direction, covering everything from the structure of language, to what makes natural language comprehension more challenging than computer vision, to the emergence of artificial general intelligence, and how all these things connect to the current state of the art in machine learning.
I can’t remember how many times I’ve forgotten something important.
I’m sure it’s a regular occurrence though: I constantly forget valuable life lessons, technical concepts and useful bits of statistical theory. What’s worse, I often forget these things after working bloody hard to learn them, so my forgetfulness is just a giant waste of time and energy.
That’s why I jumped at the chance to chat with Iain Harlow, VP of Science at Cerego — a company that helps businesses build training courses for their employees by optimizing the way information is served to maximize retention and learning outcomes.
Iain knows a lot about learning and has some great insights to share about how you can optimize your own learning, but he’s also got a lot of expertise solving data science problems and hiring data scientists — two things that he focuses on in his work at Cerego. He’s also a veteran of the academic world, and has some interesting observations to share about the difference between research in academia and research in industry.
You train your model. You check its performance with a validation set. You tweak its hyperparameters, engineer some features and repeat. Finally, you try it out on a test set, and it works great!
Problem solved? Well, probably not.
Five years ago, your job as a data scientist might have ended here, but increasingly, the data science life cycle is expanding to include the steps after basic testing. This shouldn’t come as a surprise: now that machine learning models are being used for life-or-death and mission-critical applications, there’s growing pressure on data scientists and machine learning engineers to ensure that effects like feature drift are addressed reliably, that data science experiments are replicable, and that data infrastructure is reliable.
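To make “feature drift” slightly more concrete, here’s one minimal, hedged sketch of a drift check: compare a feature’s training-time distribution against a recent sample from production using a two-sample Kolmogorov–Smirnov test. The function name, threshold and toy data are all illustrative assumptions, not something from the episode or any particular tool:

    import numpy as np
    from scipy.stats import ks_2samp

    def feature_has_drifted(train_values: np.ndarray,
                            live_values: np.ndarray,
                            alpha: float = 0.01) -> bool:
        # Flag drift when the two samples likely come from different distributions
        _statistic, p_value = ks_2samp(train_values, live_values)
        return p_value < alpha

    # Toy example: the live feature has shifted upward relative to training data
    rng = np.random.default_rng(0)
    train = rng.normal(loc=0.0, scale=1.0, size=5_000)
    live = rng.normal(loc=0.4, scale=1.0, size=5_000)
    print(feature_has_drifted(train, live))  # True: the feature has probably drifted

In practice a real monitoring setup would track many features, correct for multiple comparisons, and alert rather than print, but the basic idea is the same.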
This episode’s guest is Luke Marsden, and he’s made these problems the focus of his work. Luke is the founder and CEO of Dotscience, a data infrastructure startup that’s creating a git-like tool for data science version control. Luke has spent most of his professional life working on infrastructure problems at scale, and has a lot to say about the direction data science and MLOps are heading in.
When I think of the trends I’ve seen in data science over the last few years, perhaps the most significant and hardest to ignore has been the increased focus on deployment and productionization of models. Not all companies need models deployed to production, of course, but at those that do, there’s increasing pressure on data science teams to deliver software engineering along with machine learning solutions.
That’s why I wanted to sit down with Adam Waksman, Head of Core Technology at Foursquare. Foursquare is a company built on data and machine learning: they were one of the first fully scaled social media-powered recommendation services that gained real traction, and now help over 50 million people find restaurants and services in countries around the world.
Our conversation covered a lot of ground, from the interaction between software engineering and data science, to what he looks for in new hires, to the future of the field as a whole.
In this podcast interview, YK (CS Dojo) interviews Chanchal Chatterjee, who’s an AI leader at Google.
Podcast interview with one of our top data science writers, Will Koehrsen.
Let’s go! Here’s Will’s article about what he learned from writing a data science article every week for a year: https://towardsdatascience.com/what-i-learned-from-writing-a-data-science-article-every-week-for-a-year-201c0357e0ce
This episode was hosted by YK from CS Dojo: https://www.instagram.com/ykdojo/
Getting hired as a data scientist, machine learning engineer or data analyst is hard. And if there’s one person who’s spent a *lot* of time thinking about why that is, and what you can do about it if you’re trying to break into the field, it’s Edouard Harris.
Ed is the co-founder of SharpestMinds, a data science mentorship program that’s free until you get a job. He also happens to be my brother, which makes this our most nepotistic episode yet.
If there’s one trend that not nearly enough data scientists seem to be paying attention to heading into 2020, it’s this: data scientists are becoming product people.
Five years ago, that wasn’t the case at all: data science and machine learning were all the rage, and managers were impressed by fancy analytics and over-engineered predictive models. Today, a healthy dose of reality has set in, and most companies see data science as a means to an end: it’s a way of improving the experience of real users and real, paying customers, and not a magical tool whose coolness is self-justifying.
At the same time, as more and more tools continue to make it easier and easier for people who aren’t data scientists to build and use predictive models, data scientists are going to have to get good at new things. And that means two things: product instinct, and data storytelling.
That’s why we wanted to chat with Nate Nichols, a data scientist turned VP of Product Architecture at Narrative Science — a company that’s focused on addressing data communication. Nate is also the co-author of Let Your People Be People, a (free) book on data storytelling.
In this podcast episode, Helen Ngo and YK (aka CS Dojo) discuss deepfakes, NLP, and women in data science.
In this podcast interview, YK (aka CS Dojo) asks Ian Xiao about why he thinks machine learning is more boring than you may think.
Original article: https://towardsdatascience.com/data-science-is-boring-1d43473e353e
The other day, I interviewed Jeremie Harris, a SharpestMinds cofounder, for the Towards Data Science podcast and YouTube channel. SharpestMinds is a startup that helps people who are looking for data science jobs by finding mentors for them.
In my opinion, their system is interesting in that a mentor only gets paid when their mentee lands a data science job. I wanted to interview Jeremie because I had previously spoken to him on a different occasion, and I wanted to personally learn more about his story, as well as his thoughts on today’s data science job market.
Hi! It's YK here from CS Dojo. In this episode, I interviewed Jessica Li from Kaggle about how she worked with NASA to predict snowmelt patterns using deep learning. Hope you enjoy!
One question I’ve been getting a lot lately is whether graduate degrees — especially PhDs — are necessary in order to land a job in data science. Of course, education requirements vary widely from company to company, which is why I think the most informative answers to this question tend to come not from recruiters or hiring managers, but from data scientists with those fancy degrees, who can speak to whether they were actually useful.
That’s far from the only reason I wanted to sit down with Rachael Tatman for this episode of the podcast, though. In addition to holding a PhD in computational sociolinguistics, Rachael is a data scientist at Kaggle, and a popular livestreaming coder (check out her Twitch stream here). She has a lot of great insights about breaking into data science, how to get the most out of Kaggle, the future of NLP, and yes, the value of graduate degrees for data science roles.
One thing that you might not realize if you haven’t worked as a data scientist in very large companies is that the problems that arise at enterprise scale (as well as the skills needed to solve them) are completely different from those you’re likely to run into at a startup.
Scale is a great thing for many reasons: it means access to more data sources, and usually more resources for compute and storage. But big companies can take advantage of these things only by fostering successful collaboration between and among large teams (which is really, really hard), and have to contend with unique data sanitation challenges that can’t be addressed without reinventing practically the entire data science life cycle.
So I’d say it’s a good thing we booked Sanjeev Sharma, Vice President of Data Modernization and Strategy at Delphix, for today’s episode. Sanjeev’s specialty is helping huge companies with significant technical debt modernize and upgrade their data pipelines, and he’s seen the ins and outs of data science at enterprise scale for longer than almost anyone.
A few years ago, there really wasn’t much of a difference between data science in theory and in practice: a jupyter notebook and a couple of imports were all you really needed to do meaningful data science work. Today, as the classroom overlaps less and less with the realities of industry, it’s becoming more and more important for data scientists to develop the ability to learn independently and go off the beaten path.
Few people have done so as effectively as Sanyam Bhutani, who among other things is an incoming ML engineer at H2O.ai, a top-1% Kaggler, popular blogger and host of the Chai Time Data Science Podcast. Sanyam has a unique perspective on the mismatch between what’s taught in the classroom and what’s required in industry: he started doing ML contract work while still in undergrad, and has interviewed some of the world’s top-ranked Kagglers to better understand where the rubber meets the data science road.
The trend towards model deployment, engineering and just generally building “stuff that works” is just the latest step in the evolution of the now-maturing world of data science. It’s almost guaranteed not to be the last one though, and staying ahead of the data science curve means keeping an eye on what trends might be just around the corner. That’s why we asked Ben Lorica, O’Reilly Media’s Chief Data Scientist, to join us on the podcast.
Not only does Ben have a mile-high view of the data science world (he advises about a dozen startups and organizes multiple world-class conferences), but he also has a perspective that spans two decades of data science evolution.
Each week, I have dozens of conversations with people who are trying to break into data science. The main topic of the conversations varies, but it’s rare that I walk away without getting a question like, “Do you think I have a shot in data science given my unusual background in [finance/physics/stats/economics/etc]?”.
From now on, my answer to that question will be to point them to today’s guest, George John Jordan Thomas Aquinas Hayward.
George [names omitted] Hayward’s data science career is a testament to the power of branding and storytelling. After completing a JD/MBA at Stanford and reaching top-ranked status in Hackerrank’s SQL challenges, he went on to work on contract for a startup at Google, and subsequently for a number of other companies. Now, you might be tempted to ask how comedy and law could possibly lead to a data science career.
For today’s podcast, we spoke with someone who is laser-focused on one possibility in particular: the idea that data science is becoming an engineer’s game. Serkan Piantino served as the Director of Engineering for Facebook AI Research, and now runs machine learning infrastructure startup Spell. Their goal is to make dev tools for data scientists that make it as easy to train models on the cloud as it is to train them locally. That experience, combined with his time at Facebook, has given him a unique perspective on the engineering best practices that data scientists should use, and on the future of the field as a whole.
I’ve said it before and I’ll say it again: “data science” is an ambiguous job title. People use the term to refer to data science, data engineering, machine learning engineering and analytics roles, and that’s bad enough. But worse still, being a “data scientist” means completely different things depending on the scale and stage of the company you’re working at. A data scientist at a small startup might have almost nothing in common with a data scientist at a massive enterprise company, for example.
So today, we decided to talk to someone who’s seen data science at both scales. Jay Feng started his career working in analytics and data science at Jobr, which was acquired by Monster.com (which was itself acquired by an even bigger company). Among many other things, his story sheds light on a question that you might not have thought about before: what happens to data scientists when their company gets acquired?
Most software development roles are pretty straightforward: someone tells you what to build (usually a product manager), and you build it. What’s interesting about data science is that although it’s a software role, it doesn’t quite follow this rule.
That’s because data scientists are often the only people who can understand the practical business consequences of their work. There’s only one person on the team who can answer questions like, “What does the variance in our cluster analysis tell us about user preferences?” and “What are the business consequences of our model’s ROC score?”, and that person is the data scientist. In that sense, data scientists have a very important responsibility not to leave any insights on the table, and to bring business instincts to bear even when they’re dealing with deeply technical problems.
For today’s episode, we spoke with Rocio Ng, a data scientist at LinkedIn, about the need for strong partnerships between data scientists and product managers, and the day-to-day dynamic between those roles at LinkedIn. Along the way, we also talked about one of the most common mistakes that early career data scientists make: focusing too much on that first role.
If you’ve been following developments in data science over the last few years, you’ll know that the field has evolved a lot since its Wild West phase in the early/mid 2010s. Back then, a couple of Jupyter notebooks with half-baked modeling projects could land you a job at a respectable company, but things have since changed in a big way.
Today, as companies have finally come to understand the value that data science can bring, more and more emphasis is being placed on the implementation of data science in production systems. And as these implementations have required models that can perform on larger and larger datasets in real-time, an awful lot of data science problems have become engineering problems.
That’s why we sat down with Akshay Singh, who among other things has worked in and managed data science teams at Amazon, League and the Chan-Zuckerberg Initiative (formerly Meta.com).
It’s easy to think of data science as a technical discipline, but in practice, things don’t really work out that way. If you’re going to be a successful data scientist, people will need to believe that you can add value in order to hire you, people will need to believe in your pet project in order to endorse it within your company, and people will need to make decisions based on the insights you pull out of your data.
Although it’s easy to forget about the human element, managing it is one of the most useful skills you can develop if you want to climb the data science ladder, and land that first job, or that promotion you’re after. And that’s exactly why we sat down with Susan Holcomb, the former Head of Data at Pebble, the world’s first smartwatch company.
When Pebble first hired her, Susan was fresh out of grad school in physics, and had never led a team, or interacted with startup executives. As the company grew, she had to figure out how to get Pebble’s leadership to support her effort to push the company in a more data-driven direction, at the same time as she managed a team of data scientists for the first time.
You import your data. You clean your data. You make your baseline model.
Then, you tune your hyperparameters. You go back and forth from random forests to XGBoost, add feature selection, and tune some more. Your model’s performance goes up, and up, and up.
And eventually, the thought occurs to you: when do I stop?
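Here’s one minimal sketch of that loop, and of one rough way to frame the “when do I stop?” question: compare a baseline model against a tuned one with cross-validation and ask whether the improvement still matters in practice. The dataset, models and hyperparameters below are placeholder assumptions, not anything from the episode:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Synthetic stand-in for your own training data
    X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)

    baseline = RandomForestClassifier(random_state=0)
    tuned = RandomForestClassifier(n_estimators=500, max_depth=10, random_state=0)

    baseline_score = cross_val_score(baseline, X, y, cv=5).mean()
    tuned_score = cross_val_score(tuned, X, y, cv=5).mean()

    # If further tuning buys less than your margin of practical significance,
    # that's one (rough) signal that it may be time to stop.
    print(f"baseline: {baseline_score:.3f}  tuned: {tuned_score:.3f}")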
Most data scientists struggle with the question of when to stop on a regular basis, and from what I’ve seen working with SharpestMinds, the vast majority of aspiring data scientists get the answer wrong. That’s why we sat down with Tan Vachiramon, a member of the Spatial AI team at Oculus, and a former data scientist at Airbnb.
Tan has seen data science applied in two very different industry settings: once, as part of a team whose job was to figure out how to understand its customer base in the middle of a whirlwind of out-of-control user growth (at Airbnb); and again in a context where he’s had the luxury of conducting far more rigorous data science experiments under controlled circumstances (at Oculus).
My biggest take-home from our conversation was this: if you’re interested in working at a company, it’s worth taking some time to think about its business context, because that’s the single most important factor driving the kind of data science you’ll be doing there.
To most data scientists, the Jupyter notebook is a staple tool: it’s where they learned the ropes, it’s where they go to prototype models or explore their data — basically, it’s the default arena for all their data science work.
But Joel Grus isn’t like most data scientists: he’s a former hedge fund manager and former Googler, and author of Data Science From Scratch. He currently works as a research engineer at the Allen Institute for Artificial Intelligence, and maintains a very active Twitter account.
Oh, and he thinks you should stop using Jupyter notebooks. Now.
When you ask him why, he’ll provide many reasons, but a handful really stand out. Chief among them is the problem of hidden state: say you assign a = 1 in the first cell of your notebook, and in a later cell you assign it a new value, say a = 3. That results in fairly predictable behavior as long as you run your notebook in order, from top to bottom. But if you don’t — or worse still, if you run the a = 3 cell and then delete it — it can be hard, or even impossible, to tell from a simple inspection of the notebook what the true state of your variables is. Overall, Joel’s objections to Jupyter notebooks seem to come in large part from his somewhat philosophical view that data scientists should follow the same set of best practices that any good software engineer would. For instance, Joel stresses the importance of writing unit tests (even for data science code), and is a strong proponent of using type annotations (if you aren’t familiar with them, you should definitely learn about them here).
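To make that concrete, here’s a minimal, hypothetical sketch of the kind of typed, unit-tested helper Joel has in mind; the function, column names and test are illustrative, not taken from the episode:

    import pandas as pd

    def add_session_length(events: pd.DataFrame) -> pd.DataFrame:
        # Return a copy of `events` with a session_length_s column, in seconds
        out = events.copy()
        out["session_length_s"] = (
            out["session_end"] - out["session_start"]
        ).dt.total_seconds()
        return out

    def test_add_session_length() -> None:
        # A one-row dataframe is enough to pin down the expected behavior
        events = pd.DataFrame({
            "session_start": pd.to_datetime(["2020-01-01 10:00:00"]),
            "session_end": pd.to_datetime(["2020-01-01 10:05:00"]),
        })
        result = add_session_length(events)
        assert result.loc[0, "session_length_s"] == 300.0

Code like this lives happily in a plain .py file, runs under pytest, and is a lot easier to review and reuse than the same logic buried in a notebook cell.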
But even Joel thinks Jupyter notebooks have a place in data science: if you’re poking around at a pandas dataframe to do some basic exploratory data analysis, it’s hard to think of a better way to produce helpful plots on the fly than the trusty ol’ Jupyter notebook.
Whatever side of the Jupyter debate you’re on, it’s hard to deny that Joel makes some compelling points. I’m not personally shutting down my Jupyter kernel just yet, but I’m guessing I’ll be firing up my favorite IDE a bit more often in the future.