[LeMO Project] Big Data Methodologies, Tools and Infrastructures

“The use of Big Data has the potential to change dramatically the way transportation is done.
The technologies needed to store, manage and analyse Big Data are complex and require high skills and expertise. Vendors, especially in the USA, are leading the game, by offering various software data platforms, which integrate both the storage and management of data together with advanced data analytics.
To choose the right technology for the domain at hand is a difficult task which requires a high knowledge of the software technologies provided and the requirements of the application.”

The goal is to create value out of this amount of data, by providing a comprehensive picture of what’s happening, using business analytics, leveraging big data tools and predictive analytics, to help transportation agencies improve operations, reduce costs and hopefully better serve travelers.

The technical challenge is that much of this Big Data is non-standard data (e.g., social, geospatial or sensor-generated data) that does not fit easily into traditional, structured, relational data warehouses or databases.

An additional challenge is that with such an amount of real-time structured and unstructured data captured from a variety of sources, it is difficult to determine which data is most valuable. The terabytes of data collected add complexity to the underlying IT infrastructure.

These terabytes of data require immense amounts of storage in silo after silo of transportation operator data centers. In order to analyze Big Data, an appropriate Data Infrastructure needs to be in place to:

  1. store and maintain data
  2. analyze data
  3. present results in a clear visual way

Several Big Data platforms have been proposed recently, open source and proprietary. In order to tackle the demands and challenges in the transportation domain, an optimal stack of Big Data technologies needs to be selected and designed based on the application requirements.

This is not an easy task.

This report, which is a follow-up to Deliverable 1.1, offers an in-depth introduction to relevant technologies for Big Data Analytics and Big Data Management. It also looks at how these technologies are applied to build a Big Data Platform suitable for the transport sector. We present in detail how application-specific benchmarking can be used to evaluate which Big Data technologies are best suited for the domain. We conclude the report with an applied example of using data analytics for urban mobility.

This document offers the reader a technical insight into existing Big Data technologies at various levels: software management, data platform, and application. In order to evaluate which specific software components in the Big Data stack are more suitable for transport applications, with high volume and high-velocity requirements, a benchmarking approach is presented.
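To make the idea of application-specific benchmarking concrete, the following is a minimal sketch in Python, not the method used in the report. It replays one synthetic, sensor-like workload against several candidate storage components and records ingest throughput for each. Every name here (`make_workload`, `ListStore`, `DictStore`, `benchmark`, `run_all`) is hypothetical, and the two in-memory stores are only stand-ins for real Big Data components.

```python
import random
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical record layout: (timestamp, latitude, longitude, speed_kmh)
Record = Tuple[int, float, float, float]


def make_workload(n: int, seed: int = 0) -> List[Record]:
    """Generate a reproducible synthetic stream of vehicle sensor readings."""
    rng = random.Random(seed)
    return [
        (i, rng.uniform(-90, 90), rng.uniform(-180, 180), rng.uniform(0, 130))
        for i in range(n)
    ]


class ListStore:
    """Stand-in for an append-only store (assumption, not a real product)."""

    def __init__(self) -> None:
        self.rows: List[Record] = []

    def write(self, rec: Record) -> None:
        self.rows.append(rec)

    def count(self) -> int:
        return len(self.rows)


class DictStore:
    """Stand-in for a key-value store keyed by timestamp (assumption)."""

    def __init__(self) -> None:
        self.rows: Dict[int, Record] = {}

    def write(self, rec: Record) -> None:
        self.rows[rec[0]] = rec

    def count(self) -> int:
        return len(self.rows)


def benchmark(factory: Callable[[], object], workload: List[Record]) -> Dict[str, float]:
    """Write the whole workload to a fresh store and measure throughput."""
    store = factory()
    start = time.perf_counter()
    for rec in workload:
        store.write(rec)
    elapsed = time.perf_counter() - start
    rate = len(workload) / elapsed if elapsed > 0 else float("inf")
    return {"records": store.count(), "seconds": elapsed, "records_per_second": rate}


def run_all(candidates: Dict[str, Callable[[], object]],
            workload: List[Record]) -> Dict[str, Dict[str, float]]:
    """Run the same workload against every candidate so results are comparable."""
    return {name: benchmark(factory, workload) for name, factory in candidates.items()}


if __name__ == "__main__":
    results = run_all({"list": ListStore, "dict": DictStore}, make_workload(100_000))
    for name, stats in results.items():
        print(f"{name}: {stats['records_per_second']:.0f} records/s")
```

The point of the design is that the workload is fixed and derived from the application (here, high-velocity sensor writes), so differences in the measurements can be attributed to the component under test rather than to the input.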

The future of data analytics in transportation has many applications and opportunities.

The main challenge is to use significantly improved technologies and methods to gather and understand the data, so that business decisions are informed by better insights.


This report is part of the LeMO project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 770038.

The content of this report reflects only the authors’ view. The European Commission and Innovation and Networks Executive Agency (INEA) are not responsible for any use that may be made of the information it contains.
