Computer vision meets Dutch music video

RNW archive

This article is part of the RNW archive. RNW is the former Radio Netherlands Worldwide or Wereldomroep, which was founded as the Dutch international public broadcaster in 1947. In 2011, the Dutch government decided to cut funding and shift RNW from the ministry of Education, Culture and Science to the ministry of Foreign Affairs. More information about RNW Media’s current activities can be found at

New York University is using a video clip by Dutch electro band C-Mon & Kypski for research purposes. The clip for the track More Is Less contains footage of thousands of different people. NYU’s Courant Institute of Mathematical Sciences is using the video in a computer vision project that teaches computers to recognise human shapes.

The clip, made a year and a half ago by video artists Roel Wouters and Jonathan Puckey, shows the band members demonstrating a number of poses, gestures and facial expressions. The website was used to invite members of the public to submit a webcam shot of themselves imitating the band. An updated version of the video appeared on the website once an hour until it contained over 33,000 different people making the same movements against totally different backgrounds.


Researchers working on computer vision at New York University saw the video clip, which Google had picked up as a good example of crowdsourcing: using the audience as a creative source. Graham Taylor, one of the researchers, was immediately enthusiastic.

“We want to teach a computer to recognize people in different poses. And we wanted to be able to recognize different people and different genders and different types of clothing and recognize that those people are all in the same pose, regardless of what the actual content of the image is. So this video of C-Mon & Kypski is wonderful because it has this type of data. It has many different people performing the same pose, but under a wide variety of settings.”

New concept
The researchers used the video to develop an algorithm that helps computers recognise human shapes and movements. Computer vision is used in the games industry, in surveillance cameras and in cars that brake automatically for pedestrians.

Simon Akkermans, alias C-Mon, was pleasantly surprised to have contributed to scientific research, albeit indirectly.

“It’s really great, of course. The kind of thing you can only dream about. Mind you, we get all kinds of requests all year round, often for art projects. The video will soon be shown at the Vienna Biennale, for instance. There are always people who have seen it and want to do something with it. As soon as the clip appeared, it attracted a lot of interest. Fair enough, I think, since it is kind of a new concept.”

The combination of music clip and computer science was new to Graham Taylor too and has generated excellent publicity for his project.

“I definitely haven’t learned yet of any other collaborations between progressive electro bands and computer scientists, so I think it’s something that is pretty unique. And I am very happy about it, because it allows us to get the word out a little bit more easily about machine learning and computer vision and what’s going on in this field, at a level that most people identify with. It’s a fun project and it’s fun to talk about, as opposed to a lot of the technical details we go into sometimes when we discuss our work.”

The researchers have almost finished their work with the video clip and plan to announce their findings at the IEEE Conference on Computer Vision and Pattern Recognition in Colorado Springs in two weeks' time.
