Reducing data friction with Data Packages

Anyone who has sent data from one user to another or shared it between applications knows that friction is always involved. Some questions are a recurring pain to answer: Who created this? When was it updated? How is the data licensed?

Figure 1: Moving data always involves friction. This data shows 40+ million cell tower locations from OpenCellID. Source: Gispo Ltd.

Frictionless is an open-source toolkit that aims to bring simplicity to data flows and to answer these questions. Containerization has been a growing trend in software development for several years, and Frictionless Data Packages aim to bring the same approach to data.

A Data Package is a simple container format used to describe and package a collection of data. The container comes with the metadata and the actual dataset (or a URL/POSIX path pointing to it). A Data Package can contain any kind of data. At the same time, Data Packages can be specialized and enriched for specific types of data. The idea of Frictionless Data has been around for some time, and the full Data Package spec was written in 2017, but Data Packages are still very rare, especially in the geospatial field.
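To make the container idea concrete, here is a minimal sketch of a Data Package descriptor built with nothing but Python's standard library. The property names (name, title, created, licenses, contributors, resources, path) follow the Data Package specification; the dataset name, file path, license and author are placeholder values invented for illustration, and the helper function build_descriptor is hypothetical.

```python
import json
from datetime import datetime, timezone


def build_descriptor(name, title, data_path, license_name, author):
    """Return a minimal Data Package descriptor as a plain dict.

    Hypothetical helper for illustration, not part of any Frictionless library.
    """
    return {
        "name": name,
        "title": title,
        # When was this created?
        "created": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        # How is the data licensed?
        "licenses": [{"name": license_name}],
        # Who created this?
        "contributors": [{"title": author, "role": "author"}],
        # The data itself is referenced by a URL or a relative POSIX path.
        "resources": [
            {
                "name": name + "-data",
                "path": data_path,
                "format": "geojson",
            }
        ],
    }


# Placeholder values, chosen only for this example.
descriptor = build_descriptor(
    name="example-points",
    title="Example point dataset",
    data_path="data/points.geojson",
    license_name="CC-BY-4.0",
    author="Jane Doe",
)

# By convention the descriptor is stored as datapackage.json next to the data.
print(json.dumps(descriptor, indent=2))
```

The descriptor is plain JSON, so the questions about author, date and license travel with the data instead of living in a separate email or README.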

For QGIS users it is good to note that, despite the name, a Data Package is not the same thing as a GeoPackage. But just as with GeoPackages, it is now possible to create Frictionless Data Packages with QGIS.

Spatial Data Packages with QGIS 

Last year we developed the first version of the Spatial Data Package plugin for QGIS. The plugin can also export the styles used in QGIS along with the Data Package. We have been working on the development with cividi, a Swiss company focused on solving urban planning challenges with a data-driven approach.

With the plugin, users can export their QGIS projects to the interactive platform cividi has been developing. Besides the geometry and the entered metadata, the plugin also embeds the QGIS styles from the project inside the data container. This is a good example of how the Data Package specification can be extended if necessary.
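As a rough illustration of this kind of extension, the sketch below adds a custom property holding style information to a resource entry. The key name x_qgis_style, the style fields and the helper attach_style are all hypothetical and chosen only for this example; they are not the format the plugin actually writes.

```python
import copy
import json


def attach_style(resource, style):
    """Return a copy of a resource dict with style info under a custom key.

    The key "x_qgis_style" is a made-up name for this sketch, not the
    plugin's real property.
    """
    enriched = copy.deepcopy(resource)
    enriched["x_qgis_style"] = style
    return enriched


# A plain resource entry as it could appear in a descriptor (placeholder values).
resource = {
    "name": "buildings",
    "path": "data/buildings.geojson",
    "format": "geojson",
}

# Simplified, invented style description.
style = {"fill_color": "#3388ff", "stroke_width": 0.5}

styled = attach_style(resource, style)
print(json.dumps(styled, indent=2))
```

Because the standard properties are left untouched, a consumer that does not know about the extra key can still read the resource, while one that does can restore the styling.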

The plugin closely resembles the Unfolded plugin we also developed last year.

Figure 2: Sample of a Data Package created with the plugin.

You can read a description of the plugin workflow on cividi’s blog.

We are currently developing a few new features for the plugin, and it would be great to hear feedback from Frictionless Data experts and Data Package users. Leave us comments on future improvements you would like to see! Open an issue or a pull request on GitHub or send us an email at info@gispo.fi.

Topi Tjukanov holds an MSc and a Bachelor of Business Administration and is interested in data crushing, mind-boggling visualizations and open source software. Free-time activities include travelling, reading and football.