A Perfect Match: Artificial Intelligence and Crowdsourcing

Over the past decade, the field of artificial intelligence (AI) has seen striking developments. There are now dozens of domains in which AI programs perform at least as well as (if not better than) humans. Crowdsourcing, on the other hand, has matured and reached a certain saturation; and while companies and organizations do see crowdsourcing as a valuable and cost-efficient mechanism, there is a growing awareness of its limitations.

This is now changing with the convergence of AI and crowdsourcing: the latter has evolved into a more pragmatic approach for corporations and organizations, which access the crowd not only for its ingenuity or for the co-creation of products but also as trainers for AI systems. The other side of the equation is the growing awareness of the potential AI has in empowering crowds and enhancing their value.

In fact, AI and crowdsourcing are a perfect match. Take for example machine learning (ML) – a subset of AI. ML algorithms build a mathematical model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to perform the task. This outcome is the result of two powerful forces in the evolution of information retrieval and analysis: natural language processing (NLP), and crowdsourcing.

Today we are accessing the knowledge of the crowd to label data – for instance, to label a picture as a cat or a dog. But this is just the beginning; crowds can also provide data sets – for example, an individual or a group of people could provide data regarding their health, which an AI system could collect and analyze, thus supporting the development of new treatments. And much like current crowd-based labor markets, where people are paid for performing micro-tasks (e.g. Amazon’s Mechanical Turk, or Witkey), the crowd will either get paid for its data or will provide data for free to help advance a good cause for humanity (as in the case of CancerBase). Another exciting example is that of sbv IMPROVER – a platform that is developing a crowdsourced and AI-based classification challenge on microbiome samples from inflammatory bowel disease patients.

Annotating and encoding information has traditionally been performed by small numbers of people (usually experts). Combining AI with the insights of annotators and the information provided by large numbers of humans can allow companies and organizations to acquire large labeled datasets at lower cost, thus making R&D and production processes more efficient and less costly.
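To make the labeling idea concrete, here is a minimal sketch of one common way to turn several crowd answers per item into a single training label: a simple majority vote. It is an illustration only, not a description of how any platform mentioned in this article works. The function name `aggregate_labels`, the `(item_id, worker_id, label)` tuple layout, and the sample data are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def aggregate_labels(annotations):
    """Majority-vote aggregation of crowd labels (illustrative sketch).

    `annotations` is an iterable of (item_id, worker_id, label) tuples;
    this layout is a hypothetical choice for the example.
    Returns {item_id: (winning_label, share_of_votes)}.
    """
    votes = defaultdict(Counter)
    for item_id, _worker_id, label in annotations:
        votes[item_id][label] += 1

    result = {}
    for item_id, counts in votes.items():
        # most_common(1) returns the label with the highest vote count.
        top_label, top_count = counts.most_common(1)[0]
        total = sum(counts.values())
        result[item_id] = (top_label, top_count / total)
    return result


# Hypothetical sample: three workers label img1, two label img2.
sample = [
    ("img1", "w1", "cat"),
    ("img1", "w2", "cat"),
    ("img1", "w3", "dog"),
    ("img2", "w1", "dog"),
    ("img2", "w2", "dog"),
]
labels = aggregate_labels(sample)
```

The share of votes gives a rough agreement score, so items with low agreement could be sent back to the crowd or to an expert before being used as training data. Real systems often go further, for example by weighting workers by their past accuracy.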

The other side of the equation is how ML can be harnessed to leverage the complementary strengths of humans and computational agents to solve crowdsourcing tasks. One of the main challenges of managing large online communities, especially communities that produce large quantities of data and information, is synthesizing this information, i.e. creating knowledge: insights, patterns, predictions and more. Most communities rely heavily on human moderators, and can use analytics tools to analyze quantifiable participation patterns, but not the (sometimes fuzzy) content produced by humans.

Another problem with manually managing online crowds is the need to identify the right target audience, a sub-group if you will, for a specific ask, task or message. Take marketers who manage online communities, for example: they need to reach people who can contribute to their campaigns, not trolls or simply irrelevant participants. Remember that communication from a marketer that has no value to its recipients is perceived as spam and could cause reputational damage.

Crowdsourced AI, however, helps to overcome such problems. It enables human-like intelligence, allowing individuals and crowd subsets to be better analyzed and targeted. Moreover, AI-based engines can identify patterns in crowd behavior as well as discursive patterns (as in the case of an exciting Israeli startup, Epistema). AI-based crowdsourcing, or crowd-based AI, would make it possible to bridge the basic level of data and information, which machines can analyze much better than human beings, and the knowledge domain, which sits at the highest cognitive level and is currently handled only by humans.

This vision of crowdsourcing our opinions and combining them with AI’s superhuman ability to crunch data and come up with patterns lies behind IBM’s “Project Debater – Speech by Crowd”, which IBM describes as a “new and experimental cloud-based AI platform for crowdsourcing decision support.” It solicits arguments for and against a specific topic from as many humans as possible and then uses them to create debate speeches.

Looking even deeper into the future, there is a third type of AI-and-crowdsourcing convergence, only in this case, the crowd isn’t composed of people, but machines. This is what I call “Machine Sourcing” – the fourth component of the “Big Knowledge” revolution, along with AI, crowds and advanced data visualization. Machine Sourcing is based on machines that can not only write code, program and develop new algorithms, but create an algorithm that can analyze other algorithms, decide which ones serve a certain purpose, and use them to create another set of algorithms – or neural networks. Google’s AutoML is an early example of such existing technology: “neural nets that can design neural nets”. Wow!

sbv IMPROVER is sponsoring the Boldest Scientific Project in the international BOLD Awards. Details on the final five nominees in this category are available here. The winner will be announced at a black-tie award ceremony being held very soon, on April 5, 2019, at the campus home of accelerator hub H-FARM near Venice, Italy. If you think you’re BOLD enough, you can apply for one of the remaining event tickets for a unique gala dinner evening with the award winners, sponsors and the teams from H-FARM and Crowdsourcing Week.

Author: Dr Shay Hershkovitz
