LoRaWAN: IoT Network for the Future?

If you follow someone from #TeamEBP on Twitter, you may have noticed that last week we installed a LoRaWAN gateway of The Things Network in our office building. And like some of my colleagues you may have wondered (or wonder now): What is this all about?

Is EBP now into selling parrots (of course we could call our parrot Polly, not Lora)? Or are we supporting an alternative Zurich radio station? Good guesses. But it is of course neither of those two: LoRaWAN stands for Long Range Wide Area Network, a technology for low power wireless telecommunication networks. LoRaWAN gateways are intended to be used by battery-operated sensors and other low power devices, nowadays better known as the Internet of Things (IoT), to transfer their data to the internet.

While mobile and WiFi networks drain your mobile phone battery quickly with increasing data transfer rates, LoRa takes the opposite approach: Only very little data can be sent over the network, in order to minimize power consumption. Take, for example, the optimization of garbage collection by installing sensors on waste bins – a solution that is already more widespread than I expected. You would certainly use batteries, maybe combined with energy harvesting, rather than connect every garbage container throughout a city to the power grid.


Have you ever noticed the amazing anticipation of IoT ideas in „Frau Holle“? Bread calling out: „Oh, take me out. Take me out, or I’ll burn. I’ve been thoroughly baked for a long time.“ (Image source: public domain)

LoRaWAN Gateways serve as transparent bridges for the end-to-end encrypted communication between sensors and devices out in the field and central network servers (you can read more about the technology here). One big advantage of LoRa is that you only need a few of these gateways to cover a whole city.

While commercial companies are working on LoRa networks (e.g. Swisscom or Digimondo), the aforementioned The Things Network (which EBP is now a part of) is an interesting open initiative. With The Things Network, an enthusiastic community is building LoRa networks in cities all around the world. These networks are free and open for everybody to use. At EBP, we immediately felt favourably towards that idea and are excited to share some of our company’s bandwidth with the community behind The Things Network.

The Things Network Zurich coverage map with the EBP gateway

As an additional benefit, we thus expand our playground for experimenting with IoT and new networking technologies. Our order for additional hardware to build some LoRa test devices is out and we are looking forward to doing some soldering. So stay tuned for more LoRa news here. Or indeed, join the revolution yourself!

Forget Apps – Bots Are the Future of Geo

Update 2016-12-01: We have a second version of our bot and a landing page at http://www.traindelaybot.ch. It now supports Facebook Messenger, Skype and Slack.

The title of this blog post may seem hyperbolic, but during the last few months there has been increasing buzz about personal assistants and bots like Amazon’s Echo, Apple’s Siri or Facebook’s „M“ – the latter being a shopping assistant in Facebook’s Messenger app. Some people proclaim that we are experiencing a new generation of user interfaces: the conversational UX, where users interact with a system by „simply saying“ what they want. Last week, Microsoft introduced developer frameworks for conversational bots as a core part of their cloud strategy. And just yesterday, Facebook announced the Bot API for their Messenger.

When new concepts and trends in technology occur, it is a good idea to get first-hand practical experience before adding to that first rising slope in the hype cycle. So, during the last months I made some experiments that I now want to share with you.
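As a taste of what such an experiment can look like: at its core, a simple text bot is just a function that maps incoming messages to intents and replies. Below is a minimal, purely illustrative Python sketch of a rule-based intent matcher (all names, patterns and responses here are made up for this example; a production bot would sit behind a messenger platform’s API and use proper natural language understanding):

```python
import re

def handle_message(text: str) -> str:
    """Tiny rule-based intent matcher, as a stand-in for a conversational
    bot backend. Purely illustrative – patterns and replies are invented."""
    text = text.lower().strip()
    # Intent 1: greeting
    if re.search(r"\b(hello|hi|hey)\b", text):
        return "Hi! Ask me about train delays, e.g. 'delays from Zurich'."
    # Intent 2: train delay query with a city as a slot
    match = re.search(r"delays? (?:from|in) (\w+)", text)
    if match:
        city = match.group(1).capitalize()
        # A real bot would query a timetable API here.
        return f"Looking up current delays around {city}..."
    # Fallback intent
    return "Sorry, I did not understand that."
```

Wiring a function like this up to a messaging channel is essentially what the bot frameworks mentioned above take care of for you.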

Not yet an uprising, but the bots are slowly coming… photo: Francesco Mondada, Michael Bonani, CC-SA3.0, Source


2016 Esri Partner Conference and Developer Summit

Traditionally, two members of #TeamEBP visit Esri’s annual DevSummit in order to hear the latest from the world of ArcGIS – and beyond. This year, my colleague Sarah Schöni and I had the chance to fly to California. In this post, we’d like to summarize the highlights from our point of view:

  • The overall theme: „Web GIS is a System of Engagement“
  • The Keynote: Douglas Crockford
  • The State of Esri Technology
  • Python is now a first class citizen in Esri’s world
  • What else is new and cool? Insights and Vector Tiles!
  • One more thing…
Sarah and I with two friendly developers…

The overall theme: „Web GIS is a System of Engagement“

Esri usually has an overall theme that they want to get across, such as mobile in 2011, online in 2012 or platform in 2014. This year’s theme, „engagement“, is based on Geoffrey Moore’s paper „Systems of Engagement and the Future of Enterprise IT“: In the past, organizations have built transactional tools and systems specifically designed for their business processes. These systems are mostly static, very accurate, mostly complete and tightly controlled – they are systems of record. With the advent of consumer IT, we’re moving closer to systems of engagement, where the focus is on interaction, collaboration, openness and immediate answers.

Esri has transferred Moore’s theory of systems of engagement to GIS: They use the term „Web GIS“ as a synonym for a geo-information system of engagement: In this sense, a Web GIS is built on distributed servers, web clients, several focussed apps and it provides an open, real-time environment for engagement in your organization. If you are interested, you can read Jack Dangermond’s post about Esri’s vision.

Slide for Web GIS as a System of Engagement

The Keynote: Douglas Crockford

One highlight of a conference is the keynote and this year we were fortunate to be able to listen to Douglas Crockford who is one of the leading figures in the development of the JavaScript language. His keynote was both entertaining and insightful. Although my main programming language of choice is not JavaScript, I highly enjoyed his talk. You can re-watch the keynote here. One highlight was the comparison between the relationship of Java and JavaScript and the relationship of Star Trek and Star Wars:


Of course, JavaScript has to be Star Wars!

The State of Esri Technology

It seems that Esri’s server technology has reached maturity. ArcGIS for Server consists of two core components: the backend (the actual ArcGIS server software) and the frontend (the so-called Portal for ArcGIS). The backend has been around for nearly a decade (anyone remember 9.0?) and the frontend is basically a self-hosted version of ArcGIS Online.

Currently, Esri is in a transition phase for three important technology components, namely Desktop, Runtime and JavaScript API:

  • Desktop: ArcGIS Pro was announced two years ago and is now at version 1.2. It is close to becoming mainstream, but Esri stresses that ArcMap – the long-running desktop solution – will continue to be developed and supported for the next 10 years. However, new features (like the generation of vector tiles) are unlikely to be developed for the „old“ platform.
  • Runtime: For developing independent GIS applications, ArcGIS Engine used to be the go-to solution in Esri’s world. With ArcGIS Runtime and the announcement of the Quartz architecture, there is now a new architecture to build on in the future. At the time of writing, there is no final release yet (though beta versions for mobile are available). Versions for iOS and Android are expected in the second quarter, while the other versions (.Net, Xamarin, Java, Qt) will be released in Q3.
  • JavaScript API: The ArcGIS JavaScript API is currently at version 3. I always recommend that developers have a look at the sample code page to get a feel for what the API can do for them. There is a lot to explore, but one thing you might be missing in version 3 is 3D (no pun intended). Last month, we already wrote about the upcoming version 4, which handles 2D and 3D equivalently and lets you easily switch between the two dimensions in code. Additionally, the API calls are much simpler now – with the drawback that older code probably has to be rewritten. For this reason I think it is more than a change in version numbers; it is actually a transition as big as the ones we are experiencing with Desktop and Runtime. Again, I recommend having a look at the sample pages for the beta version to get a feel for what can be done now and in the future. The nice Esri folks at the DevSummit told me that there will be a comparison page between the functionalities of the two API versions, so stay tuned for more info. Update 2016-05-09: The page is now available and very comprehensive.

My recommendation regarding the transition of the three Esri components mentioned above: For every new project, you now have to choose carefully between the old and the new technology. There is no general advice on what is best, because it depends on the requirements of your project. If in doubt, you may consider asking us to help you out ;-).

Python is now a first class citizen in Esri’s world

Speaking of migration: Python is now recommended as the first option for extending ArcGIS platform functionality. One reason is that migrating Python code from ArcMap to ArcGIS Pro is much simpler than migrating .Net code, because the ArcPy library has not changed much (except arcpy.mapping and, of course, some necessary adaptations due to the shift from Python 2.x to Python 3.x). So, to quote an Esri staff member: „Use more Python and less ArcObjects“.
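To give a feel for what that shift means in practice, here is a generic sketch (plain Python, not actual ArcPy code) of the typical Python 2 → 3 changes you run into when porting geoprocessing scripts from ArcMap to ArcGIS Pro:

```python
# Typical changes when moving scripts from Python 2 (ArcMap) to
# Python 3 (ArcGIS Pro). Generic Python – no ArcPy calls involved.

# 1. print is a function, not a statement:
print("processing parcels")      # Python 2 code often reads: print "processing parcels"

# 2. integer division now returns a float:
ratio = 3 / 2                    # 1.5 in Python 3, but 1 in Python 2
int_ratio = 3 // 2               # use // to keep the old truncating behaviour

# 3. strings are unicode by default; encode explicitly when bytes are needed:
name = "Zürich"
encoded = name.encode("utf-8")

# 4. dict iteration helpers changed (iteritems() is gone):
counts = {"roads": 120, "rivers": 45}
for layer, n in counts.items():  # Python 2 code often used counts.iteritems()
    print(layer, n)
```

Tools like `2to3` automate most of these mechanical rewrites; the remaining effort in an ArcPy migration is mainly the arcpy.mapping changes mentioned above.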

But there was a lot more on Python, like ArcGIS integration with the packaging manager Conda and the outlook that Jupyter notebooks (formerly known as IPython notebooks) will be part of the ArcGIS platform (probably late 2016, maybe early 2017). I’m quite excited about the Jupyter integration, because then you may edit, explore and share your ArcGIS Python sessions and even take advantage of the power of SciPy, pandas and other great Python modules. Unfortunately, there weren’t too many details available on this.

A screenshot of an ArcGIS Jupyter notebook.

What else is new and cool? Insights and Vector Tiles!

Last, but not least, we want to talk about two new cool things that have been unveiled at this year’s DevSummit:

  • Insights for ArcGIS: This demonstration was the most impressive one and was much talked about during the conference: It is basically „GIS for Data Scientists“. Just have a look at the product page or watch the 8-minute video and you get a glimpse of how easy GIS can be: Just drag-n-drop a county outline on a map of points and you get an aggregated view. Or select a slice of a histogram and the corresponding features in the map as well as on a scatter plot are highlighted.
  • Vector Tiles: Vector tiles were announced last year, but now you can generate them from ArcGIS Pro and publish them directly on your ArcGIS Portal. At least with vector tiles, the old saying „Raster is faster, but vector is corrector“ no longer holds: Publishing the entire world as vector tiles takes 8 hours on a desktop machine (with 16 GB RAM and an SSD) and consumes about 13 GB of disk space. Compare this to weeks of processing and dozens of terabytes of disk space for traditional raster tiles. As Esri adopted the Mapbox specification for vector tiles, the tiles should eventually be consumable by non-Esri clients (and non-Esri tiles by ArcGIS clients). But these setups are apparently work in progress and may yield unexpected results at the moment.
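The orders of magnitude become plausible when you consider how tile counts grow: a standard XYZ tiling scheme has 4^z tiles at zoom level z, and vector tiles can typically stop at a lower maximum zoom because clients can „overzoom“ them without visible quality loss. A small illustrative calculation (the zoom levels are chosen for illustration, not taken from Esri’s benchmark):

```python
def tiles_up_to_zoom(max_zoom: int) -> int:
    """Total number of tiles in an XYZ tiling scheme, summed over
    zoom levels 0..max_zoom (each level z holds 4**z tiles)."""
    return sum(4 ** z for z in range(max_zoom + 1))

# Raster basemaps are often pre-rendered down to high zoom levels:
print(tiles_up_to_zoom(17))   # 22'906'492'245 – tens of billions of tiles

# Vector tiles can stop earlier and be overzoomed on the client:
print(tiles_up_to_zoom(14))   # 357'913'941 – about 64x fewer tiles
```

Every additional zoom level quadruples the work, which is why stopping a few levels earlier saves so much processing time and storage.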

One more thing

Where to go from here? I recommend having a look at the presentation videos that are already published on Esri’s video portal – for example, start with the ArcGIS platform overview.

But there is one more thing, on a personal note: I would like to plug the lightning talk that I gave during the DevSummit. It was about a topic that I am planning to expand on in this blog in the future:


Stay tuned…

Example projects: Military and meteo data, geoinformation aggregation and process consulting

In the „projects“ series we occasionally highlight some of our projects. Today, these projects encompass a geodata portal for the Swiss Army, a metadata portal for meteorologists, a cloud-based aggregation infrastructure for geoinformation and a process support for a biodiversity research team.

Geographic information portal for the Swiss Army

The Swiss Army was looking to standardise its use of geodata. EBP was commissioned to help the Army develop a geographic information portal.

The so-called „Mil User Platform“ is to be realised on the basis of the Swiss Federal Office of Topography’s (swisstopo) „Federal Administration User Platform“.

EBP supported the Military Geographic Information Service of the Armed Forces Joint Staff in the conception, initiation and installation phases of the Mil User Platform project. We managed the relevant business analyses and also modelled the requirements to be met by the GeoInfo Portal V in Sparx Enterprise Architect.

→ find out more


OSCAR: Metadata for meteorology – convenient global access

© MeteoSchweiz

To manage metadata, the World Meteorological Organization (WMO) is setting up what is known as the Observing Systems Capability Analysis and Review tool (OSCAR). This tool promises to facilitate the proper use of meteorological measurement data, provide a global overview of the available weather stations and help the WMO member states administer these stations.

Working closely with the WMO, MeteoSwiss is developing and operating the OSCAR surface module. EBP helps MeteoSwiss realise the project using the HERMES-5 methodology.

→ find out more


KKGEO: Operation of the cloud-based aggregation infrastructure

The Office of the Conference of Cantonal Geoinformation Service Providers (KKGEO) has established an aggregation infrastructure for the Switzerland-wide publication of harmonised cantonal spatial data: geodienste.ch.

Working in the capacity of a project manager, EBP has designed and realised the scalable operation of the portal. The software components used for the system are based on open-source technologies and were developed by Sourcepole AG.

Working in close cooperation with the cloud-service provider CloudSigma, we set up the infrastructure for the application’s operation in a Switzerland-based computing centre. Thanks to the use of dynamic scaling, our solution can react flexibly to load and request volume fluctuations.

→ find out more


Process consulting and implementation of the ALL-EMA database

In the context of its ALL-EMA long-term study, the Swiss research institute Agroscope is gaining a better understanding of biodiversity in Switzerland by gathering field data on flora and habitat types. Before launching the first season of fieldwork, Agroscope wanted to improve its ALL-EMA data system.

EBP supported Agroscope in migrating its ALL-EMA project infrastructure to a comprehensive system with a central repository and efficient processes for data management and analysis.

The scope of the development included tools for importing field data, sampling design and exporting derived data in relevant exchange formats. The ALL-EMA architecture, data sources, workflows, responsibilities and IT security measures were recorded in a system manual and data documentation.

→ find out more

GeoHipster interview with Ralph

Did you know that our very own Ralph Straumann is on the advisory board of the international GIS community website „GeoHipster“? And if you also want to become an advisor to an international community, you may have to start with producing brilliant maps: One of Ralph’s maps has been published last year in the 2015 GeoHipster calendar (check out the month of February).

Today, an interview about the story behind his map was published on GeoHipster. If you are even remotely interested in maps and information visualization, I strongly urge you to read the short interview here.

Infoviz: Population of Swiss cities and cantons

By the way: GeoHipster has published another calendar for 2016 with new maps submitted by the community. Check it out and order it here (spoiler: this one will also contain a map by Ralph).


Swiss GIS network on Twitter

Out of curiosity, 2.5 years ago I analysed the network of Swiss GIS twitterers (article in German, French, Italian). That analysis inspired the creation of the GeoBeer event series (of which we had the 11th instalment just a few days ago) and the Twitter list by the name of ‚SwissGIS‘. You can find that one here.

If you follow my private blog, you might have seen that I have also made Twitter maps from time to time, e.g. here for GeoHipster (thumbs up for Atanas & Co.’s initiative!) and here for SwissGIS:

The day before yesterday I updated the SwissGIS Twitter map. In doing so I thought: heck, I should probably renew the old network visualisation of a few dozen Twitter accounts as well! I keep adding people to the list when I come across their accounts; hence the list has now grown to over 200 members.

So, I dusted off my Python code for querying the Twitter API, obtaining profile metrics and building the follower network between the accounts on the SwissGIS list. I plugged the resulting dataset into Gephi, configured the visualisation, and used the superb add-on by OII’s Scott Hale to export the whole shebang to sigma.js.
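For readers who want to try something similar: the pipeline boils down to collecting (follower, followed) pairs among the list members and writing a graph file that Gephi can open. Here is a minimal, standard-library-only sketch of the export step (the function name and data layout are my own; the real script also fetched profile metrics from the Twitter API):

```python
import xml.etree.ElementTree as ET

def follower_network_to_gexf(edges, path):
    """Write a directed follower network as a GEXF file, which Gephi
    can open directly. `edges` is an iterable of (follower, followed)
    screen-name pairs. Sketch only – data collection happens elsewhere."""
    accounts = sorted({name for edge in edges for name in edge})
    index = {name: str(i) for i, name in enumerate(accounts)}

    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="directed")

    nodes = ET.SubElement(graph, "nodes")
    for name in accounts:
        ET.SubElement(nodes, "node", id=index[name], label=name)

    edges_el = ET.SubElement(graph, "edges")
    for i, (follower, followed) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(i),
                      source=index[follower], target=index[followed])

    ET.ElementTree(gexf).write(path, encoding="utf-8", xml_declaration=True)
```

From there, Gephi handles the layout and modularity clustering, and the sigma.js exporter takes care of the web visualisation.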

You can find the result by clicking here or on this graphic (best viewed on desktop, tablet also okay):

Each node in this network is a Twitter account. Links represent follower-relationships between the accounts, the link having the colour of the account that follows the other. The network is clustered into so-called modularity classes based on its topology. Similarly to the last time I plotted a (much younger) SwissGIS network, you can find, e.g., that the blue cluster encompasses mostly French-speaking Twitter users. Also similarly to last time, Esri Switzerland becomes a rather distinct and marked cluster (in purple) with very few errors of omission and commission. This is the inherent (and at times very revealing) power of networks and the strong homophily in all of us – also, the origin of concepts like that of the filter bubble.

The nodes in the visualisation are sized according to the number of followers a node or account has within the SwissGIS network. Not within Twitter at large! E.g., in ‚general Twitter‘, @swiss_geoportal has many more followers than @geobeerch, however, within SwissGIS the two are very similar regarding this metric.
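In graph terms, that sizing metric is simply each node’s in-degree within the list-member subgraph. A tiny sketch of that counting step (account names are illustrative):

```python
from collections import Counter

def in_network_followers(edges):
    """Count, for every account, how many *list members* follow it.
    `edges` are (follower, followed) pairs restricted to list members,
    so the result ignores followers outside the network."""
    return Counter(followed for _, followed in edges)

# Example: "a" is followed by two list members, "b" by one –
# regardless of their follower counts on Twitter at large.
sizes = in_network_followers([("a", "b"), ("b", "a"), ("c", "a")])
```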

Clicking onto a node reveals additional attributes such as the account name, the profile picture, the age of the account, number of tweets, and average number of tweets per month. It also shows mutual following relationships, which followers follow this account, and which accounts this account follows (both one-directional). The accounts in these lists are themselves clickable, i.e. you can navigate through the network via the users that are contained in it. There’s also a very basic search function that acts on account names for when you can’t find a user that you are interested in.

Importantly, Twitter accounts who were not accessible at the time of data collection (e.g., accounts that are configured to be private) cannot show up in this network, as – simplifying here – no data can be collected about them through the Twitter API.

Enjoy exploring the network of Switzerland-based geo and GIS enthusiasts. And shoot me a tweet or an e-mail if you discover anything interesting (or if you simply enjoyed the visualisation or this post)!


PS: You can easily subscribe to the SwissGIS Twitter list in, for example, Tweetdeck or Hootsuite in order to stay on top of geo/GIS news from Switzerland (expect a mix of (predominantly) English, German, French and a little Italian). By the way: following a list means you get to see all the tweets by the list members, whether you follow them personally or not.

Example projects: Project platforms, geodata for the DFA and automated data import

In the „projects“ series we occasionally highlight some of our projects. Today, these projects encompass online collaboration platforms, geodata infrastructures and services, and spatial ETL using FME.

Jinsha: Collaboration platform for an international project team

As part of an international team, our experts investigate the influence of climate change on water management in China. In order to support the project team, EBP built a collaboration platform based on Microsoft SharePoint.

The SharePoint platform facilitates communication between team members, supports project documentation and simplifies project management. At any time, all team members can access common assets, and documents can be edited collaboratively and simultaneously.

→ find out more


Project initiation of the Swiss Federal Department of Foreign Affairs Geodata Infrastructure

The Swiss Federal Department of Foreign Affairs (FDFA) requires a networked information landscape in order to fulfill its tasks. Geographic information and data are an essential part of this information landscape for operational awareness. EBP assisted the FDFA in the initiation phase of the project „Geodata Infrastructure FDFA“ according to the federal standard for project management, Hermes 5.

We derived the requirements for such a system using interviews and stakeholder workshops. In a Hermes study we documented the situation analysis, aims, requirements and approaches, suggested and described various solutions and formulated a recommendation.

→ find out more


Cadastral surveying: Data import using FME

The geodata of the cadastral survey in the Canton of Schwyz are managed by the municipalities. The canton publishes these data centrally. In order to facilitate the canton’s task, we assisted Schwyz in developing an automated import of Interlis data into the cantonal geodata infrastructure (Oracle and PostGIS) using FME as a state-of-the-art ETL tool.

Using our tool, the Canton of Schwyz can import survey data at the press of a button. The data is then served to the authorities and the public, e.g. in the cantonal WebGIS, from the central databases.

→ find out more

Data Value and Expertise Value

These days, data and data scientists (and data engineers?) seem to rule the world. Companies are data-driven, problems are solved using data-driven methods and national intelligence agencies (arguably: also online retailers) extensively collect all the data they can get hold of.

The data-driven approach is formalised in the Jurney-Warden Data-Value Stack:

The Jurney-Warden Data-Value stack (source: https://www.safaribooksonline.com/library/view/agile-data-science/9781449326890/ch05.html)

The data-value stack is read from the bottom up. The idea of the stack is that the value of data arises from raw data through the various steps up the pyramid. The link to Maslow’s hierarchy of needs that the authors make implies that the upper levels of the pyramid build and rely upon the lower levels, i.e. you cannot effect actions without first collecting data at the records level, then cleaning and aggregating, exploring and inferring. In my opinion, this is a sensible approach and the framework obviously works well in some cases.

However: looking at the stack, the approach reminds me of a blind chicken that randomly pecks and pecks until it eventually finds a valuable grain to eat. More intelligent animals have some expertise to enhance the „random-peck“ – i.e., purely bottom-up – approach: Based on its experience, intelligence and/or guts, the intelligent chicken efficiently picks the most valuable food right from the start.

I admit, I know nothing about behavioural biology to support the claims in the previous paragraph. And yes, millions of blind chickens may help. But what I really want to say is: expertise matters, also in the data-driven world – we cannot yet proclaim the end of theory.

But how does expertise come into play in the above mentioned data-value stack? Its design principle is that higher levels depend on lower levels. I would propose a similarly shaped expertise-value stack, which aligns alongside the data-value stack. That stack would look as follows (on the left):

Expertise-Value stack (left) and Data-Value stack (right)

The expertise-value stack complements the steps in the data-value stack with the following levels of expertise:

  • Wisdom: Use your wisdom for strategic decisions.
  • Application of Interdisciplinary Knowledge: Use and combine your knowledge from different subject matter domains.
  • Application of Domain Knowledge: Apply your subject matter knowledge to the problem.
  • Information Collection: Conduct targeted collection and filtering of relevant information, like reports, opinions or results of relevant research.
  • Problem Comprehension: Before doing anything, make sure you understand the problem at hand from one or several perspectives: e.g. from the perspective of the user, provider or politician.

Obviously, the idea of domain experts collaborating with, and supporting, data scientists is not new. Indeed, it has been noted that subject experts can make the difference. And this is why an interdisciplinary approach (edit 2016-02-23: i.e. leveraging both expertise-value and data-value) has advantages over a purely data-driven approach. Unfortunately, the benefit of including subject experts does not come for free: It takes time to talk to each other and you need to find good counterparts to succeed. But in the long run, this interaction will pay off.

If you are interested in talking to Swiss data and information experts with an interdisciplinary approach, come and talk to the team at EBP. Contact me for details. (And thanks to Ralph for editing this post.)

GIS 5.0 – Smart and connected

Recently I came across an interesting article by Dave Peters. He outlines the evolution of GIS in four development phases:

  1. In the early 1980s, GIS was based primarily on scripts. Using scripts, GI specialists cleaned, edited and visualized spatial data. Some readers might recall the ARC/INFO era and its scripting language, the Arc Macro Language (AML).
  2. About 20 years later, in the late 1990s, the first GUI-centric, object-oriented GIS appeared on the stage (for example, ArcGIS Desktop in 1998). This second phase, with its more efficient programming technique, was enabled by more powerful hardware.
  3. New technologies to provide data and services emerged with the rapid advent and development of the Web. A building block of these service-oriented architectures (SOAs) was, for example, the Web Map Service (WMS) specification, whose version 1.0 was adopted in 2000.
  4. Finally, virtualization of hardware and centralization of computing centers initiated the fourth phase leading to cloud-based GIS portals. Storage space and computing power have become scalable commodities. ArcGIS Online, launched in 2012, is a prominent example of this fourth phase.

Now the question is: what comes next?

The steps in GIS software evolution. What’s next?

Smart and connected systems

From the past we can learn that new technological abilities lead to new applications, which in turn substantially influence the further evolution of GIS. Among the contenders for the most relevant (to GIS) technologies and developments, I see:

  • indoor navigation,
  • the Internet of Things (IoT) and
  • real-time systems.

Future GIS applications will be increasingly smart and networked. They will require a technical infrastructure composed of several layers: embedded components, network communications, a cloud-based platform or system, tools providing authentication and authorization, and gateways to include external data sources as well as in-house data (see the figure below, adapted from Porter and Heppelmann).

The architecture of future smart, connected GIS applications (adapted from Porter and Heppelmann)

The IT Division of Ernst Basler + Partner (EBP Informatics) has already amassed solid experience with the components of such a system (see our reference projects). In our blog posts, too, we engage with these future developments, most recently with regard to the real-time quality assessment of data streams.

Do you have any questions or comments on these topics? We would like to hear from you!


Internet of Things: Live Data Quality Assessment for a Sensor Network

TL;DR: We believe that connected devices and real-time data analytics are the next big things in GIS. Here is a live dashboard for a sensor network in 7 cities around the world.

Geoinformation systems have evolved quite rapidly in recent years and the future seems to be more exciting than ever: All major IT trends such as the Internet of Things (IoT), big data or real-time systems are directly related to our professional domain: Smart devices are spatially located or even moving in time; big data and real-time systems almost always need locational analytics. This is why we got interested when we heard about the „Sense Your City Art Challenge“, a competition to make sense of a network of DIY sensors spread over 7 cities on 3 continents. To be honest, our interest was not drawn so much to the „art“ aspect; in the end we are engineers and feel more at home with data and technology. And there is real-time sensor data available within the challenge: About 14 sensor nodes in every city deliver approximately 5 measurements every 10 seconds, such as temperature, humidity or air quality. The sensor data is freely available. When we looked at the numbers, we realized that the data had some surprising properties, for example the temperature varies quite a bit within one city.

Screenshot Story Map
Screenshot of our story map for Sense Your City.


Our goal: Live data quality assessment for a sensor network

So, we took the challenge a bit differently and more from an engineering perspective: How do you implement a real-time quality assessment system for sensor data? As examples, we took the following questions, which need to be re-evaluated as new sensor data comes in:

  • Are there enough sensors that actually deliver measurements?
  • How much do the sensor measurements vary within a city?
  • How do the sensor measurements compare to external data?

Our solution: A live dashboard with real-time statistics 

My colleague Patrick Giedemann and I started late last week and developed a live dashboard with real-time statistics for the sensor network of seven cities. The dashboard is implemented with a story map containing one world view and seven views on city-level. The components of the views are:

  • A heatmap showing a condensed view of the analysis for each of the cities, labeled with numbers 2 to 8. For example, we visualize the number of sensor values for each city within a time frame of 30 seconds. The darker the blue bucket, the more sensor signals we got. Light buckets indicate a low number of signals in the time frame.
  • Another heatmap, which shows the coefficient of variation for each city, again with a time frame of 30 seconds.
  • A gauge showing the number of sensor signals for a city and a line chart with the minimum, maximum and average temperature for that city.
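The logic behind these views is essentially windowed aggregation: group the incoming readings into fixed 30-second buckets per city and compute summary statistics per bucket. In our live system this runs in Azure Stream Analytics, but the same computation can be sketched in a few lines of Python (the function and field names here are my own):

```python
from collections import defaultdict
from statistics import mean, pstdev

def window_stats(readings, window=30):
    """Aggregate (timestamp, city, value) readings into fixed time windows.
    Returns a dict keyed by (window_start, city) with the count, min, max,
    mean and coefficient of variation (stdev / mean) – the spread measure
    shown in the second heatmap. Illustrative sketch only."""
    buckets = defaultdict(list)
    for ts, city, value in readings:
        buckets[(ts - ts % window, city)].append(value)

    stats = {}
    for key, values in buckets.items():
        m = mean(values)
        stats[key] = {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": m,
            "cv": pstdev(values) / m if m else float("nan"),
        }
    return stats
```

The streaming version does the same thing continuously: each time a window closes, its statistics are emitted to the dashboard.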

We haven’t yet got around to showing real weather data, although it is already processed internally.

Some implementation details

For the technically inclined: Our implementation is based on Microsoft’s Azure, one of many cloud computing platforms available. Specifically, we used three main components: Event Hubs, Stream Analytics and WebSockets.

Graphic from Microsoft Azure documentation. We used Event Hubs, Stream Analytics and WebSockets instead of a data store.
  • We started building our solution using Azure Event Hubs, a highly scalable publish-subscribe infrastructure. It could take in millions of events per second, so we have plenty of room to grow with our mere 170’000 data points per hour. Every ten seconds, we pull the raw data from the official data sources and push the resulting data stream to an Azure Event Hub.
  • For the real-time analysis, we tried Azure Stream Analytics, a fully managed stream processing solution which can take event hubs as an input source. With Stream Analytics, you can analyze incoming data within a certain time window and immediately push the result back to another event hub. For our example, Stream Analytics aggregates the raw signal data every 3 to 4 seconds and calculates the average, minimum, maximum and standard deviation over 30 seconds within a city.
  • Finally, there is a server component, which transforms the event hub into WebSockets. With WebSockets, we can establish a direct connection between the data stream and a (modern) browser client.

What’s next?

Admittedly, this is a very early version of a live quality assessment system for real-time sensor data. However, it shows the potential: We can define a set of data quality indicators, such as the number of active sensors or the coefficient of variation. These indicators can be computed as the data streams into the system. Using Azure Stream Analytics, we could incorporate tens of thousands of sensors instead of only a hundred, and we’d still have the same performance without changing a line of code.

Of course, there is room for improvements:

  • Ideally, the sensors would push their data directly into the Azure Event Hub instead of going through a polling service as an intermediary.
  • Exploiting historical data, e.g. comparing the live data with data from a week ago.
  • Integrating more and different data sources for the data analysis.

Do you have any question? Send me an e-mail at stephan.heuel@ebp.ch or leave a comment.