FaceApp, the app behind the latest viral craze (uploading our photos to see approximations of what we will look like in several decades, or as the opposite sex), is also making headlines because of the threat it poses to our privacy.
Among the many and varied possible uses of these images, beyond the usual advertising, one stands out: using our photos to train facial recognition algorithms. The more face images we feed these AI systems, the better they will 'understand' the defining patterns of the human face: how it moves, how it evolves over time, and so on.
FaceApp is one more example of how we voluntarily renounce our privacy
“We can remove pieces of data that can identify you and share anonymous data with other parties.”
“We may also combine your information with other information so that it is no longer associated with you and share that aggregated information.”
If you are one of those who have resisted the temptation to use FaceApp, you should not lower your guard either: you still cannot be sure that your face is not being used as training material for facial recognition AIs.
The fact is that dozens of databases, compiled by companies and academic researchers alike, currently hold thousands or even millions of images, not always obtained with the users' consent. And each of them feeds several different artificial intelligence projects, in many cases because they are publicly accessible.
What is the source of all those images? Social networks, apps, photo storage and editing websites, publicly broadcast webcams, online dating services, and more.
The controversy around FaceApp is not new: we all remember the viral #10yearchallenge, launched at the beginning of this year. Kate O'Neill of Wired reminded us that, by using this hashtag, we were making it easy for the big social media companies to use our faces to train facial and image recognition algorithms.
The US media recently revealed that Ever, a free app offering unlimited space for backing up our photos and videos, was not content with its premium subscriptions: since 2013 it had been monetizing the multimedia material of millions of free-account users, without their knowledge.
Ever thus built a "continuously expanding private dataset of 13 billion photos and videos", which it boasted about on its website, although without publicly linking it to the mobile app. It then used the dataset to train the company's facial recognition technology, which it sold to law enforcement agencies and private companies.
Even now, the reference in its privacy policy may be too vague even for the few users who bother to read it:
“To allow you to organize your files and share them with the right people, Ever uses facial recognition technologies as part of its service. Your files may be used to help improve and train our products and technologies.
Some of these technologies may be used in our other products and services for corporate clients, including our facial recognition offerings for companies.”
It was also recently revealed that Microsoft had quietly deleted its MS Celeb database, presented in 2016 as the world's largest dataset for facial recognition: it contained more than 10 million photos of approximately 100,000 people, collected without asking permission from the people who appeared in them, on the grounds that they were all 'public figures'... it just turned out that not all of them were.
While it was active, several large companies (such as Nvidia, Hitachi, IBM, Panasonic, and the Chinese giants SenseTime, Megvii and Alibaba) used the contents of MS Celeb for their own facial recognition projects.
The scandal that followed this revelation brought to light that two other large datasets ('Brainwash' from Stanford University and 'Duke MTMC' from Duke University) had also been deleted in recent months, for similar reasons.
And at least the first of them was used, like MS Celeb, by Megvii, an AI vendor to the Chinese government linked to the Uighur ethnic profiling project. There are also references to the use of material from both datasets in numerous academic papers published by institutions on four continents.
But the activist who uncovered these three cases points out that the damage has already been done, because
“You can’t make a dataset disappear, once you publish it and people download it, it exists on hard drives around the world [and] there is no way to stop them from continuing to publish it, or use it for their own purposes.”
Kim Zetter, an American journalist specializing in cybersecurity, was one of the people whose face became part of the Microsoft dataset without her knowledge:
“All of us are just fodder to feed all these surveillance systems. The idea that all this could be shared with foreign governments and militaries is simply egregious.”
Matt Zeiler, founder and CEO of the AI startup Clarifai, has stated that his company built a dataset of face images using OkCupid as a source, a website he had access to because some of its founders were investors in Clarifai. These images were used to develop a platform capable of identifying the age, sex and race of detected faces.
That would also have been the destination of the images Clarifai gathered through its Insecam platform. The platform, so named in reference to the 'insecure cameras' that broadcast their feeds openly on the Internet without their owners' knowledge, was shut down before the image collection process began, thanks to protests from the company's own employees.