From September 5-7, I attended the O’Reilly Artificial Intelligence (AI) Conference in San Francisco. Wednesday was a tutorial day, where sessions were intended to be very hands-on. We built models in each of the tutorials that would have been jaw-dropping even two years ago, but now are increasingly commonplace. Thursday and Friday had keynotes in the morning and breakout sessions in the late morning/afternoon. As usual, the conference was a fire hose of information, and I was exhausted by the end of each day.
I’ve been reflecting on the themes that emerged from the keynotes and the sessions I attended. The overarching theme, in contrast to 2016, was something like “now AI is really happening – what do we need to think about?”. The answers included these points:
- Heavy infrastructure and other capital investments – cloud platforms, specialized chipsets, even physical infrastructure to support upcoming AI-powered products.
- Significant chipset development on multiple fronts, including an open-source effort from UC Berkeley. The chips being developed (and some already shipping) are domain-specific and often coupled with software, as in Intel’s Nervana initiative. In many ways, this seems analogous to AGI yielding to domain-specific solutions.
- Continued framework development to support “citizen data science” – the barrier to entry to creating some amazing working models is getting lower and lower.
- Recognition of a serious need for thought and policy around ethics and bias, mindful of our collective responsibility and the risk of not planning ahead.
- I heard at least three people use the term “unicorn” in place of “purple squirrel” – a sign the newer term has been normalized.
- As for the core algorithms powering AI: in addition to deep learning (essentially stacked neural networks) for all kinds of classification, especially vision-based, there was a focus on genetic/evolutionary algorithms to discover models, and GANs for a variety of purposes including optimizing models and synthesizing data.
- Python seems to be winning the “R vs. Python” war: R was not mentioned in any of the sessions I attended.
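As a concrete touchstone for the “stacked neural networks” shorthand above, here is a minimal forward pass in NumPy – layers of linear maps with nonlinearities between them, stacked. All sizes and weights here are illustrative, not from any talk:

```python
import numpy as np

def relu(x):
    # The nonlinearity between layers; without it, stacking would collapse
    # into a single linear map.
    return np.maximum(0, x)

def forward(x, W1, b1, W2, b2):
    # "Deep learning" at its core: stacked layers of (linear map + nonlinearity).
    h = relu(x @ W1 + b1)   # hidden layer
    return h @ W2 + b2      # output layer (classification logits)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                       # batch of 4 inputs, 8 features
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)   # layer 1 parameters
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)    # layer 2 parameters
logits = forward(x, W1, b1, W2, b2)               # shape (4, 3): 3-way scores
```

Training (adjusting the weights to reduce a loss) is what the frameworks automate; the structure itself is this simple.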
Wednesday Tutorial Sessions
Bringing AI Into the Enterprise
I had originally planned to go to this all-day session, but it was cancelled at the last minute. This blurb intrigued me: “Kristian Hammond provides a practical framework for understanding your role in problem solving and decision making… Rather than focusing on the technologies alone, this framework ensures that as you build, evaluate, and compare different systems, you’ll understand and be able to articulate how they work and the resulting impact.”
Through my colleague Glenn Kapetansky (thanks Glenn!), I met Kristian a few years ago and thought he was extremely thoughtful and intelligent. I ended up going to two half-day sessions instead.
Distributed deep learning in the cloud: Build an end-to-end application involving computer vision and geospatial data
Presenters Mary Wahl and Banibrata De (both of Microsoft) were enthusiastic and clearly engaged in their topic. Microsoft recently formed an “AI for Earth” team that “helps NPOs apply AI to challenges in conservation biology and environmental science”.
Using Jupyter notebooks hosted on a Microsoft Azure instance of their Data Science Virtual Machine, we were able to quickly build a model to classify ground cover in aerial photos as grass, water, trees, etc. The power and speed of creating those models, and using Jupyter notebooks to capture both Python code and text with contextual information, was impressive.
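As a rough illustration of what a land-cover classifier does (the tutorial used a deep model on Azure; this toy sketch substitutes nearest-reference-color matching, and the class colors are my own invented guesses):

```python
import math

# Illustrative reference colors (mean RGB) for each ground-cover class.
# A real model learns features from aerial imagery; these values are invented.
CLASS_COLORS = {
    "grass": (80, 160, 60),
    "water": (40, 90, 180),
    "trees": (30, 90, 40),
}

def classify_patch(mean_rgb):
    """Label an image patch by the closest reference color (Euclidean distance)."""
    return min(CLASS_COLORS, key=lambda c: math.dist(mean_rgb, CLASS_COLORS[c]))

print(classify_patch((45, 95, 175)))   # a bluish patch -> "water"
```

The real notebooks replace the hand-picked colors with learned features, but the input-to-label shape of the problem is the same.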
Building intelligent mobile applications in healthcare
In this session, the presenters walked us through building a deep learning model, using a working example that helps clinicians identify possible lung diseases. The tutorial yielded a finished model that could classify radiological images into about a dozen primary diseases with accuracy at or above that of an expert (human) panel.
*Pictured: Dr. Mazen Zawaideh, University of Washington Medical Center
A question from the audience about model evaluation was illuminating to me. The question was “how do you know when the model is optimized?” This was interesting because it’s how a data scientist thinks when he or she isn’t mentally embedding the work in the context of a real-life solution. The answer, from Dr. Zawaideh: it doesn’t have to be fully optimized to be valuable, just better than human experts.
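Dr. Zawaideh’s answer amounts to a deployment criterion you could encode directly: don’t ask whether the model is optimized, ask whether it beats the expert baseline. A minimal sketch – the case labels below are invented, not from the tutorial:

```python
def accuracy(preds, truth):
    """Fraction of cases where the prediction matches the ground-truth label."""
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

# Invented labels for ten radiology cases (disease codes 0-2).
truth   = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]
model   = [0, 1, 2, 1, 0, 2, 1, 1, 2, 1]   # 9/10 correct
experts = [0, 1, 2, 0, 0, 2, 1, 1, 2, 0]   # 7/10 correct

# The model need not be fully optimized to be valuable --
# it only needs to beat the human-expert baseline.
print(accuracy(model, truth) > accuracy(experts, truth))  # -> True
```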
A few of the Thursday keynotes I found most interesting included the following:
Meredith Whittaker – Founder, AI Now Institute; NYU
Meredith Whittaker is a distinguished research scientist at New York University and the founder of Google’s Open Research Group. She’s also the co-founder and co-director of the AI Now Institute at NYU, dedicated to researching the social implications of AI.
She was the first speaker at the conference to address head-on the need to be clear-eyed about the potential downsides of AI. She pointed out that only about seven companies have the ability to create AI at large scale – putting the future of AI into the hands of a small group of leaders at those businesses. The talk ended with relevant cautions on ethics, normalization of behavior, and bias in data.
Soups Ranjan – Coinbase
Soups talked about how Coinbase uses a serious ML workflow, powered by AWS (especially SageMaker), to detect potential fraud. Moving to cloud platforms cut the training time for their machine learning models from 20 hours to 20 minutes. Amazon, Microsoft, and Google are running full speed ahead, creating platforms with a huge amount of processing power backed by continually optimized algorithms accessible via API. I have to believe this is the future of AI at scale.
Greg Brockman – OpenAI
In the only talk to buck the trend of narrowly defined, domain-specific solutions, Greg wanted to talk about “Artificial General Intelligence” (AGI) – broad, humanlike intelligence rather than point solutions like playing chess. He pointed out that other major advances faced plenty of historical skepticism, and used that as a basis for believing AGI could come soon. Despite that, he continued the trend of equating AGI with deep learning, a major assumption that, in my opinion, is premature at this point. The money quote: “It’s hard to put your finger on what deep learning can’t do”. While I applaud trying to move the AI community back towards its roots in AGI, I wasn’t clear what his point was, and I don’t believe it’s a foregone conclusion that AGI depends on deep learning.
David Martinez – MIT Lincoln Laboratory
David Martinez, from MIT’s Lincoln Laboratory, gave a great talk on a topic he called “Human-Machine Teaming”. His decades of experience and pragmatic approach to cybersecurity (his specialty) resonated with me.
He spent much of his talk laying out the case that the last step in a canonical AI architecture (or workflow) is Human-Machine Teaming, and how to assess how much human involvement is required in an AI workflow based on consequences of actions in a domain, our confidence in an algorithm, and how much labeled data exists for that domain.
One striking point was his belief that Adversarial AI (that is, using algorithms to act as “attackers” of other algorithms and models) is important to address fragility. As one example, we no longer know how to debug AI/ML models, but an intelligent adversary could uncover issues that need to be addressed.
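The adversarial-probing idea can be sketched in miniature: an automated “attacker” searches for small input perturbations that flip a model’s decision, surfacing fragility that manual debugging would miss. Everything below – the toy model, the perturbation budget – is illustrative:

```python
import random

def toy_model(x):
    # A deliberately fragile "classifier": a hard linear threshold.
    return "positive" if 2.0 * x[0] - 1.0 * x[1] > 0.5 else "negative"

def find_adversarial(x, budget=0.2, tries=1000, seed=0):
    """Randomly search for a small perturbation (within +/-budget per feature)
    that changes the model's output -- an automated fragility probe."""
    rng = random.Random(seed)
    original = toy_model(x)
    for _ in range(tries):
        perturbed = [v + rng.uniform(-budget, budget) for v in x]
        if toy_model(perturbed) != original:
            return perturbed
    return None

x = [0.35, 0.1]           # model says "positive" (score 0.6 > 0.5)
adv = find_adversarial(x)
print(adv is not None)    # -> True: a nearby input flips the decision
```

Real adversarial methods use gradients rather than random search, but the conclusion is the same: near-duplicate inputs can produce different answers, and an adversary can find them systematically.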
Explaining Machine Learning Models
By far the best-attended session I went to was one called “Explaining Machine Learning Models”. Because many ML models are opaque, everyone wants to know how to create an explainable (transparent) model. A few key points:
- A single loss function is not enough for some industries, e.g. finance, insurance.
- To explain their “reasoning”, models can output visualizations (usually the most intuitive), text, or examples by analogy.
- It turns out there are ways to create interpretable models, even neural-network-based ones; there’s a lot of research in this area.
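One common, model-agnostic way to explain an otherwise opaque model is a feature-importance probe: degrade one input feature and measure how much accuracy drops. The sketch below uses a deterministic ablation (replacing a feature with its mean – a simple cousin of permutation importance); the model and data are invented:

```python
def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def ablation_importance(model, X, y, feature):
    """Accuracy drop when one feature is replaced by its column mean --
    a simple, model-agnostic signal of which inputs the model relies on."""
    mean = sum(row[feature] for row in X) / len(X)
    X_ablate = [row[:feature] + [mean] + row[feature + 1:] for row in X]
    return accuracy(model, X, y) - accuracy(model, X_ablate, y)

# A toy "opaque" model that in fact depends only on feature 0.
model = lambda row: row[0] > 0.5
X = [[0.1, 0.9], [0.8, 0.2], [0.9, 0.9], [0.2, 0.1], [0.7, 0.3], [0.4, 0.6]]
y = [False, True, True, False, True, False]

print(ablation_importance(model, X, y, 0))  # -> 0.5 (feature 0 matters)
print(ablation_importance(model, X, y, 1))  # -> 0.0 (feature 1 is irrelevant)
```

The output is an explanation a non-specialist can read: which inputs the model actually uses, and by how much.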
Peter Norvig – Google
One of the people I respect the most in the field is Peter Norvig. He gave a talk outlining some great case studies of AI at work. From his session blurb: “…the most exciting aspect is the diversity of applications in fields far astray from the original breakthrough areas, as well as the diversity of the people making these applications”. He pointed the way to the future, as befits someone who is heavily credentialed and currently a Director of Research at Google.
Evolutionary Algorithms
The speaker was engaging and brought a breath of creativity into the conference mix. Evolutionary (a.k.a. genetic) algorithms search an “unknown space” instead of trying to model the world using what we already know. As you might guess from the name, they attempt to mimic the process of evolution. Though the resulting models are evaluated the same way as traditional models, they’re created automatically rather than by hand by a data scientist.
Knowledge Graphs for AI
In one of the last sessions I attended, Mike Tung made a few inflammatory statements like (paraphrasing): “despite well-known successes, AI is stupid and easily tricked because it lacks an underlying knowledge base.” He revealed himself to be in the “Symbolists” tribe described by Pedro Domingos in his book The Master Algorithm. That is, and I have to agree, blind methods of “intelligence” not backed by real-world knowledge are fragile. There have been several large-scale attempts to capture real-world knowledge, including Cyc – the largest fact-based knowledge base in existence.
One area of focus from multiple enterprises including the speaker’s is automating knowledge graph generation from existing documents.
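At its simplest, “automating knowledge graph generation” means pulling (subject, relation, object) triples out of text. Production systems use dependency parsing and entity resolution; this toy sketch (the pattern and sentences are invented) just matches a single verb phrase:

```python
import re

# Naive triple extractor: matches "<Subject> <relation> <Object>." sentences.
# Real systems use NLP parsing and entity linking; this is only a sketch.
PATTERN = re.compile(r"(\w[\w ]*?) (is a|founded|acquired) (\w[\w ]*)\.")

def extract_triples(text):
    """Return (subject, relation, object) edges for a knowledge graph."""
    return [(s.strip(), rel, o.strip()) for s, rel, o in PATTERN.findall(text)]

doc = "Paris is a city. Acme acquired Widgets."
print(extract_triples(doc))
```

Each triple becomes an edge in the graph; accumulate enough of them from documents and the “blind” statistical model gains the real-world grounding Tung argued is missing.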
I’m glad I went to the conference and thank Trexin for the opportunity. Believe it or not, I’ve omitted a lot of my notes in the interest of space. One thing is certain: there’s an astounding amount of time, energy, and money being invested in AI globally, and that will likely only increase in the coming decades.
It’s easy to predict two things with fairly high confidence: one, the next few years will bring major breakthroughs in AI applied to a variety of domains; and two, the kind of algorithms that would support the original definition of Artificial Intelligence – general intelligence – are not likely to arrive anytime soon.
I welcome your questions and comments.