How do you go about organizing information produced by thousands or tens of thousands of people? This is exactly what Wikipedia, for instance, is doing with its more than 100 000 editors, and doing remarkably well. Similarly, a citizen science project called Zooniverse has over the years managed to coordinate the input of over a million participants to projects ranging from investigating the wildlife in Michigan to discovering variable stars out there in the sky. This is what crowdsourcing is all about: harnessing the power of many to do something significant together, supported by some type of novel software solution.
Coined in 2006, crowdsourcing is a portmanteau of crowds and outsourcing - outsourcing to the crowds. Crowdsourcing is typically done online and facilitated by some web-based tool or service. Obviously, working together is nothing new as a concept for humans but, if you think of it, as an Internet-era technique crowdsourcing is just a late teenager. Surely we can expect to hear more about this novel form of work in the future!
Crowdsourcing is a pivotal technique in crowd computing, a field of science that focuses on work that is too difficult for computers alone. This is a comforting thought. For us humans, it feels good to be needed right now, in the age of AI. We’re reminded daily about how AI is slowly taking all our jobs, including soon even the creative ones. Solutions suggested to this potentially devastating problem range from universal basic income to the creation of new types of jobs that we cannot yet even dream of.
Our hopes are not lost though, not just yet. To build services and digital tools for humans, AI has only humans to learn from and imitate. As Brabham puts it, “No one knows everything, everyone knows something and all knowledge resides in humanity; digitalisation and communication technologies must become central in this coordination of far flung genius”. Personally, I like that idea a lot. To build for us humans, even the master algorithm needs us humans. To this end, crowd computing comes in as a field of science that is inherently interested in building the future together, in smooth collaboration between humans and AI. Now, all we have to ensure is that the algorithm actually builds not just for us but indeed for all of us.
Crowdsourcing, the Good, the Bad and the Ugly
So, with crowdsourcing, you basically take a large information-heavy task, distribute it to the crowds in smaller subtasks, aggregate the results, and you have effectively crowdsourced something. Silicon Valley has perfected this type of data collection as a byproduct. I’ll be the first to admit, I’m a bit of a Google fanboy. But Google is really rather undeniably brilliant in their crowdsourcing game. For instance, back in the day Google Voice was kind of a big thing. They offered people free landline numbers for their calling needs - way before WhatsApp or Skype calls were a thing. Free! It’s a magic word. Or think of Google Maps - a brilliant map service, again yours for free! Or Google Translate - yet another free much-needed service! All these services are great examples of how Google leverages the power of crowds to gather masses of data to improve their services. Voice calls were ultimately transcribed to perfect voice recognition algorithms. Google Maps is enabled by its users, who provide real-time traffic data and help complete the information layers by providing knowledge about, for example, opening hours, or even real-world pictures of locations. And translation? They provided translation services for free for an entire decade, during which millions of website users helped Google perfect the product, again for free, by suggesting better translations on websites.
The big companies need a lot of data and they need it labeled. Academics and smaller companies often have the same needs: data must first be processed so that machines can learn to understand it. It’s just as described earlier: algorithms are merely learning to imitate us. This has given rise to a plethora of paid labour markets online, where you can literally pay pennies for a datapoint. The biggest ones, such as Amazon’s Mechanical Turk, boast hundreds of thousands of available workers who are accessible via a simple user interface or even programmatically through Application Programming Interfaces (APIs). The thing is, these global workforces have no unions, and nobody is looking after, for example, the minimum wages or working conditions of a globally distributed online workforce with no names but just worker identifiers. It’s a wild west out there, described even as a “Poorly Paid Hell” by The Atlantic just two years ago.
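For the programmatically minded, the split, distribute and aggregate loop from the start of this section can be sketched in a few lines of Python. This is only a toy illustration and not how any real platform works: the function names, the simulated workers and the cat/dog items are all made up for the example, and majority voting is just one simple way to aggregate answers.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

# A "worker" here is just a function that labels one item (a hypothetical stand-in
# for a real person answering through some crowdsourcing tool).
Worker = Callable[[str], str]


def split_into_subtasks(items: Sequence[str], size: int) -> List[List[str]]:
    """Cut one large task into batches small enough for a single sitting."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def collect_labels(batches: List[List[str]], workers: List[Worker]) -> Dict[str, List[str]]:
    """Hand every batch to every worker and gather all answers per item."""
    answers: Dict[str, List[str]] = {}
    for batch in batches:
        for worker in workers:
            for item in batch:
                answers.setdefault(item, []).append(worker(item))
    return answers


def aggregate(answers: Dict[str, List[str]]) -> Dict[str, str]:
    """Majority vote: keep the most common answer for each item."""
    return {item: Counter(labels).most_common(1)[0][0] for item, labels in answers.items()}


def careful_worker(item: str) -> str:
    # Simulated worker who reads the (made-up) item name correctly.
    return item.split("_")[0]


def sloppy_worker(item: str) -> str:
    # Simulated worker who answers "cat" no matter what.
    return "cat"


items = ["cat_01", "dog_02", "cat_03", "dog_04", "cat_05"]
workers = [careful_worker, careful_worker, sloppy_worker]

batches = split_into_subtasks(items, size=2)
result = aggregate(collect_labels(batches, workers))
print(result)
```

With two careful workers outvoting one sloppy one, every item ends up with its intended label, which is the basic reason for asking several people the same question.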
Doing it the Oulu Way
As academics, we have a moral responsibility not only to pay well when conducting research with - or directly about - this novel form of digital work but also to find ways to create systems that benefit everyone. My all-time favourite crowdsourcing systems are those that work two ways: they collect data but also provide benefits, in a true win-win fashion. This is what the Crowd Computing Research Group at the Center for Ubiquitous Computing is also focusing on. We build systems that rely on crowdsourcing but that also aim to offer something back to the users.
One example is our new project, ICON: Interventions and Contextual Understanding for Low Back Pain Research. In ICON, we crowdsource various types of data using people’s smartphones for Low Back Pain research. Low back pain is globally one of the most burdensome diseases, and experts estimate that it affects the lives of up to 80% of all people at some point in their lives. We’re working toward a holistic understanding of the lived lives of not only people who suffer from low back pain but also those who are not yet experiencing any serious symptoms, i.e. potential future patients.
ICON mixes together various types of crowdsourcing: mobile sensing for collecting objective sensor data about people’s lives in a passive fashion, active crowdsourcing for collecting self-reports on how back pain is affecting daily lives, and online crowdsourcing for building a repository of potential self-care techniques. These data are analysed with various machine learning approaches in order to suggest treatment interventions to the participants: what should they do, and when exactly, in an attempt to mitigate back pain.
This project is largely built on top of our earlier successful ventures together with the Finnish Institute of Occupational Health, Melbourne University in Australia, and the University of Tokyo, Japan, in which we essentially leveraged the wisdom of the crowds to figure out what self-care techniques people employ for mitigating low back pain.
In addition to digital health solutions, our group works on various other topics, such as how to augment creativity with the crowds. Since the labour markets have plenty of people available on-demand for various types of creative work, we have looked into helping writers be more creative about their craft and into providing feedback to web designers using the crowds. And stepping outside the box of online platforms, we have been looking into crowdsourcing novel forms of health data that could one day be useful in building AI-based diagnostic solutions. We are also interested in the ethics and ownership of all this data, which we really see as play dough for scientists: it is the fundamental building block that powers the future solutions, but it can only be harvested from us complicated humans, making it quite a fascinating dance of human factors and technological considerations together.
Onwards, My Noble Steed
Where do we go next? There’s no competition - algorithms are far better than us humans in so many things. Healthcare innovations are a great example: medical doctors benefit greatly from the help of algorithms in diagnosis, but their contextual understanding - as well as their understanding of the patient, the human factors - is needed to deliver the message and fine-tune the treatment suggestions to fit the patient’s life situation. It certainly seems there’s room for everyone, and looking at AI as a competitor is perhaps not the right thing to do. We must let it do what it does best and see this revolution as an opportunity to do what we do best: guide it and focus on helping the algorithm to best understand and work for us.
 Brabham DC, "Crowdsourcing as a Model for Problem Solving", in Convergence: The International Journal of Research into New Media Technologies, 2008
Photo: Vlad Tchompalov, Unsplash
Last updated: 13.10.2020