Nesta’s Big and open data for the common good event this week coincided with the publication of their new report, Data for good. The event brought together many of the contributors to the report, who presented some interesting projects applying “big data” methods to third sector data and issues. The report is well worth reading.
The most interesting discussions at the event were not about statistical methods, though. One question was about the infrastructure needed to use all the big and open data becoming available – who is going to fund and build it? And what format will the data be in? If it is too complex to obtain and process, it won’t really be available “for all”. Very good points.
There are also still unsolved issues with anonymisation, especially with health data, which is of special interest to me at the moment. A proper debate needs to take place about the ethics of sharing and using public data, often for commercial purposes. Something else I’ve been pondering is the use of Twitter data. Yes, it is an interesting data source, but there must be a danger that it will be used too much, just because it is so easy to get hold of. For more on this topic, Emma Uprichard has interesting things to say from a social science perspective, in a paper with the great title Big data, little questions?