If you follow someone from #TeamEBP on Twitter, you may have noticed that last week we installed a LoRaWAN gateway of The Things Network in our office building. And like some of my colleagues you may have wondered (or wonder now): What is this all about?
While mobile and WiFi networks drain your mobile phone battery ever more quickly with increasing data transfer rates, LoRa takes the opposite approach: only very little data can be sent over the network, which minimizes power consumption. Take, for example, the optimization of garbage collection by installing sensors on waste bins, a solution that is already more widespread than I expected. You would certainly use batteries, maybe combined with energy harvesting, rather than connect every garbage container throughout a city to the power grid.
Have you ever noticed the amazing anticipation of IoT ideas in „Frau Holle“? The bread calling out: „Oh, take me out. Take me out, or I’ll burn. I’ve been thoroughly baked for a long time.“ (Image source: public domain).
LoRaWAN Gateways serve as transparent bridges for the end-to-end encrypted communication between sensors and devices out in the field and central network servers (you can read more about the technology here). One big advantage of LoRa is that you only need a few of these gateways to cover a whole city.
While commercial companies are working on LoRa networks (e.g. Swisscom or Digimondo), the aforementioned The Things Network (which EBP is now a part of) is an interesting open initiative. With The Things Network, an enthusiastic community is building LoRa networks in cities all around the world. These networks are free and open for everybody to use. At EBP, we immediately felt favourably towards that idea and are excited to share some of our company’s bandwidth with the community behind The Things Network.
As an additional benefit, we thus expand our playground to experiment with IoT and new networking technologies. Our order for additional hardware to build some LoRa test devices is out and we are looking forward to doing some soldering. So stay tuned for more LoRa news here. Or indeed, join the revolution yourself!
When new concepts and trends in technology emerge, it is a good idea to gain first-hand practical experience before adding to that first rising slope of the hype cycle. So, over the last months I have run some experiments that I now want to share with you.
Traditionally, two members of #TeamEBP visit Esri’s annual DevSummit in order to hear the latest from the world of ArcGIS – and beyond. This year, my colleague Sarah Schöni and I had the chance to fly to California. In this post, we’d like to summarize the highlights from our point of view:
The overall theme: „Web GIS is a System of Engagement“
The Keynote: Douglas Crockford
The State of Esri Technology
Python is now a first class citizen in Esri’s world
What else is new and cool? Insights and Vector Tiles!
One more thing…
The overall theme: „Web GIS is a System of Engagement“
Esri usually has an overall theme that they want to get across, such as mobile in 2011, online in 2012 or platform in 2014. This year’s theme „engagement“ is based on Geoffrey Moore’s paper on „Systems of Engagement and the Future of Enterprise IT“: In the past, organizations have built transactional tools and systems specifically designed for their business processes. These systems are mostly static, very accurate, mostly complete and tightly controlled – they are systems of record. With the advent of consumer IT, we’re moving closer to systems of engagement, where the focus is on interaction, collaboration, openness and immediate answers.
Esri has transferred Moore’s theory of systems of engagement to GIS: They use the term „Web GIS“ as a synonym for a geo-information system of engagement: In this sense, a Web GIS is built on distributed servers, web clients, several focussed apps and it provides an open, real-time environment for engagement in your organization. If you are interested, you can read Jack Dangermond’s post about Esri’s vision.
The Keynote: Douglas Crockford
The State of Esri Technology
It seems that Esri’s server technology has reached maturity. ArcGIS for Server consists of two core components: the backend (the actual ArcGIS Server software) and the frontend (the so-called Portal for ArcGIS). The backend has been around for nearly a decade (does anyone remember 9.0?) and the frontend is basically a self-hosted version of ArcGIS Online.
Desktop: ArcGIS Pro was announced two years ago and is now at version 1.2. It is close to becoming mainstream, but Esri stresses that ArcMap – the long-running desktop solution – will continue to be developed and supported for the next 10 years. However, new features (like the generation of vector tiles) are unlikely to be developed for the „old“ platform.
Runtime: For developing standalone GIS applications, ArcGIS Engine was the go-to solution in Esri’s world. With ArcGIS Runtime and the announcement of the Quartz architecture, there is now a new architecture to build on for the future. At the time of writing, there is no final release yet (though beta versions for mobile are available). Versions for iOS and Android are expected in the second quarter, while the other versions (.Net, Xamarin, Java, Qt) will follow in Q3.
My recommendation regarding the transition of the three Esri components mentioned above: For every new project, you now have to choose carefully between the old and the new technology. There is no general advice on what is best, because it depends on the requirements of your project. If in doubt, you may consider asking us to help you out ;-).
Python is now a first class citizen in Esri’s world
Speaking of migration: Python is now recommended as the first option for extending the functionality of the ArcGIS platform. One reason is that migrating Python code from ArcMap to ArcGIS Pro is much simpler than migrating .Net code, because the ArcPy library has not changed much (except for arcpy.mapping and, of course, some adaptations necessitated by the shift from Python 2.x to Python 3.x). So, to quote an Esri staff member: „Use more Python and less ArcObjects“.
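Many of the adaptations for the Python 2 to 3 shift are generic language changes rather than anything ArcPy-specific. A minimal sketch of the kind of adjustments a migrated geoprocessing script faces (plain Python, no ArcPy involved):

```python
# Typical Python 2 -> 3 changes that affect migrated scripts.
# This is plain Python 3 illustrating the new behaviour.

# 1. print is a function now (Python 2 allowed the statement form `print "hello"`):
print("hello")

# 2. The / operator performs true division; use // for floor division:
quotient = 7 / 2         # 3.5 in Python 3 (was 3 in Python 2)
floor_quotient = 7 // 2  # 3, matching Python 2's integer division

# 3. Text is Unicode by default; bytes are a separate type:
text = "Zürich"              # str is Unicode in Python 3
data = text.encode("utf-8")  # explicit conversion to bytes
```

Changes like these are mechanical; it is the renamed and restructured arcpy.mapping module that typically needs the most manual attention.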
But there was a lot more on Python, like the integration of ArcGIS with the package manager Conda and the outlook that Jupyter notebooks (formerly known as IPython notebooks) will become part of the ArcGIS platform (probably late 2016, maybe early 2017). I’m quite excited about the Jupyter integration, because you will then be able to edit, explore and share your ArcGIS Python sessions and even take advantage of the power of SciPy, pandas and other great Python modules. Unfortunately, there weren’t too many details available on this.
What else is new and cool? Insights and Vector Tiles!
Last, but not least, we want to talk about two new cool things that have been unveiled at this year’s DevSummit:
Insights for ArcGIS: This demonstration was the most impressive one and was much talked about during the conference: It is basically „GIS for Data Scientists“. Just have a look at the product page or watch the 8-minute video and you get a glimpse of how easy GIS can be: Just drag-n-drop a county outline on a map of points and you get an aggregated view. Or select a slice of a histogram and the corresponding features in the map as well as on a scatter plot are highlighted.
Vector Tiles: Vector tiles were announced last year, but now you can generate them from ArcGIS Pro and publish them directly on your ArcGIS Portal. At least with vector tiles, the old saying „Raster is faster, but vector is corrector“ no longer holds: Publishing the entire world as vector tiles takes 8 hours on a desktop machine (with 16 GB RAM and an SSD) and consumes about 13 GB of disk space. Compare this to weeks of processing and dozens of terabytes of disk space for traditional raster tiles. As Esri adopted the Mapbox specification for vector tiles, the tiles should eventually be consumable by non-Esri clients (and non-Esri tiles by ArcGIS clients). But these setups are apparently work in progress and may yield unexpected results at the moment.
In the „projects“ series we occasionally highlight some of our projects. Today, these projects encompass a geodata portal for the Swiss Army, a metadata portal for meteorologists, a cloud-based aggregation infrastructure for geoinformation and process support for a biodiversity research team.
Geographic information portal for the Swiss Army
The Swiss Army was looking to standardise its use of geodata. EBP was commissioned to help the Army develop a geographic information portal.
The so-called „Mil User Platform“ is to be realised on the basis of the Swiss Federal Office of Topography’s (swisstopo) „Federal Administration User Platform“.
EBP supported the Military Geographic Information Service of the Armed Forces Joint Staff in the conception, initiation and installation phases of the Mil User Platform project. We manage the relevant business analyses and also model the requirements to be met by the GeoInfo Portal V in SPARX Enterprise Architect.
OSCAR: Metadata for meteorology – convenient global access
To manage metadata, the World Meteorological Organization (WMO) is setting up the so-called Observing Systems Capability Analysis and Review tool (OSCAR). This tool promises to facilitate the proper use of meteorological measurement data, to provide a global overview of the available weather stations, and to help the WMO member states administer these stations.
Working closely with the WMO, MeteoSwiss is developing and operating the OSCAR surface module. EBP helps MeteoSwiss realise the project using the HERMES-5 methodology.
Working in the capacity of a project manager, EBP has designed and realised the scalable operation of the portal. The software components used for the system are based on open-source technologies and were developed by Sourcepole AG.
Working in close cooperation with the cloud-service provider CloudSigma, we set up the infrastructure for the application’s operation in a Switzerland-based computing centre. Thanks to the use of dynamic scaling, our solution can react flexibly to load and request volume fluctuations.
Process consulting and implementation of the ALL-EMA database
In the context of its ALL-EMA long-term study, the Swiss research institute Agroscope is gaining a better understanding of biodiversity in Switzerland by gathering field data on flora and habitat types. Before launching the first season of fieldwork, Agroscope wanted to improve its ALL-EMA data system.
EBP supported Agroscope in migrating its ALL-EMA project infrastructure to a comprehensive system with a central repository and efficient processes for data management and analysis.
The scope of the development included tools for importing field data, for sampling design and for exporting derived data in relevant exchange formats. The ALL-EMA architecture, data sources, workflows, responsibilities and IT security measures were recorded in a system manual and data documentation.
Today, an interview about the story behind his map was published on GeoHipster. If you are even remotely interested in maps and information visualization, I strongly urge you to read the short interview here.
If you follow my private blog, you might have seen that I have also made Twitter maps from time to time, e.g. here for GeoHipster (thumbs up for Atanas & Co.’s initiative!) and here for SwissGIS:
The day before yesterday I updated the SwissGIS Twitter map. In doing so I thought: heck, I should probably renew the old network visualisation of a few dozen Twitter accounts as well! I keep adding people to the list when I come across their accounts; hence the list has now grown to over 200 members.
So, I dusted off my Python code for querying the Twitter API, obtaining profile metrics and building the follower network between the accounts on the SwissGIS list. I plugged the resulting dataset into Gephi, configured the visualisation, and used the superb add-on by OII’s Scott Hale to export the whole shebang to sigma.js.
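In case you are curious, the core of the network-building step can be sketched as follows. This is a simplified stand-in for my actual script: the data structures and the function name are assumptions for illustration, and fetching the follower lists from the Twitter API is omitted.

```python
def build_follower_network(list_members, followers_of):
    """Build the directed follower network restricted to a list of accounts.

    list_members: set of account names on the Twitter list.
    followers_of: dict mapping an account name to the set of accounts
                  that follow it (e.g. as collected via the Twitter API).
    Returns the (follower, followee) edges within the list, plus the
    in-network follower count per account (used for node sizing).
    """
    edges = []
    in_network_followers = {name: 0 for name in list_members}
    for followee in list_members:
        for follower in followers_of.get(followee, set()):
            if follower in list_members:  # ignore followers outside the list
                edges.append((follower, followee))
                in_network_followers[followee] += 1
    return edges, in_network_followers
```

From such an edge list, a graph file for Gephi (e.g. GEXF) can be written directly.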
You can find the result by clicking here or on this graphic (best viewed on desktop, tablet also okay):
Each node in this network is a Twitter account. Links represent follower-relationships between the accounts, the link having the colour of the account that follows the other. The network is clustered into so-called modularity classes based on its topology. Similarly to the last time I plotted a (much younger) SwissGIS network, you can find, e.g., that the blue cluster encompasses mostly French-speaking Twitter users. Also similarly to last time, Esri Switzerland becomes a rather distinct and marked cluster (in purple) with very few errors of omission and commission. This is the inherent (and at times very revealing) power of networks and the strong homophily in all of us – also, the origin of concepts like that of the filter bubble.
The nodes in the visualisation are sized according to the number of followers a node or account has within the SwissGIS network. Not within Twitter at large! E.g., in ‚general Twitter‘, @swiss_geoportal has many more followers than @geobeerch, however, within SwissGIS the two are very similar regarding this metric.
Clicking onto a node reveals additional attributes such as the account name, the profile picture, the age of the account, number of tweets, and average number of tweets per month. It also shows mutual following relationships, which followers follow this account, and which accounts this account follows (both one-directional). The accounts in these lists are themselves clickable, i.e. you can navigate through the network via the users that are contained in it. There’s also a very basic search function that acts on account names for when you can’t find a user that you are interested in.
Importantly, Twitter accounts who were not accessible at the time of data collection (e.g., accounts that are configured to be private) cannot show up in this network, as – simplifying here – no data can be collected about them through the Twitter API.
Enjoy exploring the network of Switzerland-based geo and GIS enthusiasts. And shoot me a tweet or an e-mail if you discover anything interesting (or if you simply enjoyed the visualisation or this post)!
PS: You can easily subscribe to the SwissGIS Twitter list in, for example, Tweetdeck or Hootsuite in order to stay on top of geo/GIS news from Switzerland (expect a mix of (predominantly) English, German, French and a little Italian). By the way: following a list means you get to see all the tweets by the list members, whether you follow them personally or not.
In the „projects“ series we occasionally highlight some of the projects our company has conducted. Today, these projects encompass online collaboration platforms for projects, geodata infrastructures and services, and spatial ETL using FME.
Jinsha: Collaboration platform for an international project team
As part of an international team, our experts investigate the influence of climate change on water management in China. In order to support the project team, EBP built a collaboration platform based on Microsoft Sharepoint.
The Sharepoint platform facilitates communication between team members, supports project documentation and simplifies project management. At any time, all team members can access common assets, and documents can be edited collaboratively and simultaneously.
Project initiation of the Swiss Federal Department of Foreign Affairs Geodata Infrastructure
The Swiss Federal Department of Foreign Affairs (FDFA) requires a networked information landscape in order to fulfill its tasks. Geographic information and data are an essential part of this information landscape for operational awareness. EBP assisted the FDFA in the initiation phase of the project „Geodata Infrastructure FDFA“ according to the federal standard for project management, Hermes 5.
We derived the requirements for such a system using interviews and stakeholder workshops. In a Hermes study we documented the situation analysis, aims, requirements and approaches, suggested and described various solutions and formulated a recommendation.
The geodata of the cadastral survey in the Canton of Schwyz is managed by the municipalities. The canton publishes these data centrally. In order to facilitate the canton’s task, we assisted Schwyz in developing an automated import of Interlis data into the cantonal geodata infrastructure (Oracle and PostGIS), using FME as a state-of-the-art ETL tool.
Using our tool, the Canton of Schwyz can import survey data at the press of a button. The data is then served to the authorities and the public, e.g. in the cantonal WebGIS, from the central databases.
These days, data and data scientists (and data engineers?) seem to rule the world. Companies are data-driven, problems are solved using data-driven methods and national intelligence agencies (arguably: also online retailers) extensively collect all the data they can get hold of.
The data-value stack is to be read from bottom to top. The idea of the stack suggests that the value of the data arises from raw data through various steps up the pyramid. The link to Maslow’s hierarchy of needs that the authors make implies that the upper levels of the pyramid build and rely upon the lower levels, i.e. you cannot effect actions without first collecting data at the records level, then cleaning, aggregating, exploring and inferring. In my opinion, this is a feasible approach and obviously the framework works well for some cases.
However, looking at the stack, the approach reminds me of a blind chicken that randomly pecks and pecks until it eventually finds a valuable grain to eat. More intelligent animals have some expertise to enhance the „random peck“ – i.e., purely bottom-up – approach: based on its experience, intelligence and/or gut feeling, the intelligent chicken efficiently picks the most valuable food right from the start.
I admit, I know nothing about behavioural biology to support the claims in the previous paragraph. And yes, millions of blind chickens may help. But what I really want to say is: expertise matters, also in the data-driven world – we cannot yet proclaim the end of theory.
But how does expertise come into play in the above-mentioned data-value stack? Its design principle is that higher levels depend on lower levels. I would propose a similarly shaped expertise-value stack that sits alongside the data-value stack. That stack would look as follows (on the left):
The expertise-value stack complements the steps in the data-value stack with the following levels of expertise:
Wisdom: Use your wisdom for strategic decisions.
Application of Interdisciplinary Knowledge: Use and combine your knowledge from different subject matter domains.
Application of Domain Knowledge: Apply your subject matter knowledge to the problem.
Information Collection: Conduct targeted collection and filtering of relevant information, like reports, opinions or results of relevant research.
Problem Comprehension: Before doing anything, make sure you understand the problem at hand from one or several perspectives: e.g. from the perspective of the user, provider or politician.
Obviously, the idea of domain experts collaborating with, and supporting, data scientists is not new. Indeed it has been noted that subject experts may make the difference. And this is why an interdisciplinary approach (edit 2016-02-23: i.e. leveraging both expertise-value and data-value) has advantages over a pure data driven approach. Unfortunately, the benefit of including subject experts does not come for free: It takes much time to talk to each other and you need to find good counterparts to succeed. But in the long run, this interaction will pay off.
In the early 1980s, GIS was based primarily on scripts. Using scripts, GI specialists cleaned, edited and visualized spatial data. Some readers might recall the ARC/INFO era and its scripting language, the Arc Macro Language (AML).
About 20 years later, at the end of the 1990s, the first GUI-centric, object-oriented GIS appeared on the stage (for example, ArcGIS Desktop in 1998). This second step, with its more efficient programming paradigm, was enabled by more performant hardware.
New technologies for providing data and services emerged with the rapid advent and development of the Web. One building block of these service-oriented architectures (SOAs) was, for example, the Web Map Service (WMS) specification, adopted in 2000 (version 1.0).
Finally, virtualization of hardware and centralization of computing centers initiated the fourth phase leading to cloud-based GIS portals. Storage space and computing power have become scalable commodities. ArcGIS Online, launched in 2012, is a prominent example of this fourth phase.
Now the question is: what comes next?
Smart and connected systems
From the past we can learn that new technological capabilities lead to new applications, and these substantially influence the further evolution of GIS. Among the contenders for the most relevant (to GIS) technologies and developments I see:
the Internet of Things (IoT) and
Future GIS applications will be increasingly smart and networked. They will require a technical infrastructure composed of several layers: embedded components, network communications, a cloud-based platform or system, tools providing authentication and authorization, and gateways to include external data sources as well as in-house data (see the figure below, adapted from Porter and Heppelmann).
Geoinformation systems have evolved quite rapidly in recent years and the future seems more exciting than ever: All major IT trends such as the Internet of Things (IoT), big data or real-time systems are directly related to our professional domain: Smart devices are spatially located or even moving in time; big data and real-time systems almost always call for locational analytics. This is why we got interested when we heard about the „Sense Your City Art Challenge“, a competition to make sense of a network of DIY sensors spread over 7 cities on 3 continents. To be honest, our interest was not drawn so much to the „art“ aspect; in the end we are engineers and feel more at home with data and technology. And there is real-time sensor data available within the challenge: About 14 sensor nodes in every city deliver approximately 5 measurements every 10 seconds, such as temperature, humidity or air quality. The sensor data is freely available. When we looked at the numbers, we realized that the data had some surprising properties; for example, the temperature varies quite a bit within one city.
Our goal: Live data quality assessment for a sensor network
So, we took the challenge a bit differently, more from an engineering perspective: How does one implement a real-time quality assessment system for sensor data? As examples, we took the following questions, which need to be re-evaluated as new sensor data comes in:
Are there enough sensors that provide information about each city?
How much do the sensor measurements vary within a city?
How do the sensor measurements compare to external data?
A heatmap showing a condensed view of the analysis for each of the cities, labeled with numbers 2 to 8. For example, we want to visualize the number of sensor values for each city within a time frame of 30 seconds. The darker the blue bucket, the more sensor signals we got. Light buckets indicate a low number of signals in the time frame.
A gauge showing the number of sensor signals for a city and a linechart with the minimum, maximum and average temperature for a city.
We haven’t yet got around to showing real weather data, although it is already processed internally.
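The bucketing behind the heatmap view can be sketched in a few lines of plain Python. The function below is an illustrative simplification, not our production code:

```python
from collections import Counter

def count_signals_per_bucket(timestamps, bucket_seconds=30):
    """Count sensor signals per fixed time bucket.

    timestamps: iterable of Unix timestamps (in seconds) of incoming signals.
    Returns a Counter mapping bucket start time -> number of signals,
    i.e. the values behind one row of the heatmap.
    """
    buckets = Counter()
    for ts in timestamps:
        # Snap each timestamp to the start of its 30-second bucket.
        bucket_start = int(ts) - int(ts) % bucket_seconds
        buckets[bucket_start] += 1
    return buckets
```

Mapping the resulting counts onto a colour ramp (dark blue for many signals, light for few) then yields the heatmap cells described above.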
Some implementation details
For the technically inclined: Our implementation is based on Microsoft Azure, one of many cloud computing platforms available. Specifically, we used three main building blocks: Event Hubs, Stream Analytics and WebSockets.
We started building our solution using Azure Event Hubs, a highly scalable publish-subscribe infrastructure. It can take in millions of events per second, so we have room to grow with our mere 170’000 data points per hour. Every ten seconds, we pull the raw data from the official data sources and push the resulting data stream to an Azure Event Hub.
For the real-time analysis, we tried Azure Stream Analytics, a fully managed stream processing solution that can take event hubs as an input source. With Stream Analytics, you can analyze incoming data within a certain time window and immediately push the result back to another event hub. In our example, Stream Analytics aggregates the raw signal data every 3 to 4 seconds, calculating the average, minimum, maximum and standard deviation over a 30-second window for each city.
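In Stream Analytics itself, this aggregation is expressed in a SQL-like query over a tumbling window. For illustration, here is the equivalent per-window computation emulated in plain Python (a sketch of the logic, not our actual query):

```python
import math

def aggregate_window(values):
    """Aggregate one 30-second window of sensor readings for a city,
    mirroring what the Stream Analytics query computes: average,
    minimum, maximum and (population) standard deviation."""
    n = len(values)
    if n == 0:
        return None  # no signals arrived in this window
    avg = sum(values) / n
    variance = sum((v - avg) ** 2 for v in values) / n
    return {
        "avg": avg,
        "min": min(values),
        "max": max(values),
        "stddev": math.sqrt(variance),
    }
```

The advantage of Stream Analytics is that exactly this kind of computation is re-run automatically on every new window, at scale, without us managing any servers.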
Finally, there is a server component, which transforms the event hub into WebSockets. With WebSockets, we can establish a direct connection between the data stream and a (modern) browser client.
Admittedly, this is a very early version of a live quality assessment system for real-time sensor data. However, it shows the potential: We can define a set of data quality indicators, like the number of active sensors or the coefficient of variation. These indicators can be computed as the data streams into the system. Using Azure Stream Analytics, we could incorporate tens of thousands of sensors instead of only about a hundred, and would still get the same performance without changing a line of code.
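Such indicators are cheap to compute per time window. A small sketch (the indicator names and the function shape are our own choices for illustration):

```python
def quality_indicators(readings, expected_sensors):
    """Compute two simple data-quality indicators for one time window.

    readings: dict mapping sensor id -> measured value in the window.
    expected_sensors: number of sensors that should be reporting.
    Returns the share of active sensors ("coverage") and the
    coefficient of variation (stddev / mean) of the reported values.
    """
    n = len(readings)
    coverage = n / expected_sensors if expected_sensors else 0.0
    values = list(readings.values())
    if n == 0 or sum(values) == 0:
        return {"coverage": coverage, "cv": None}
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    cv = (variance ** 0.5) / mean
    return {"coverage": coverage, "cv": cv}
```

A low coverage value would flag silent sensors, while an unusually high coefficient of variation would flag implausibly scattered measurements within a city.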
Of course, there is room for improvements:
Ideally, the sensors would push their data directly into the Azure Event Hub instead of going through a polling service as an intermediary.
Exploiting historical data, for example comparing the live data with data from a week ago.
Integrating more and different data sources for the data analysis.
Do you have any question? Send me an e-mail at email@example.com or leave a comment.