In the current era of big data, huge quantities of valuable data, which may be of different levels of veracity, are being generated at a rapid rate. Embedded into these big data are implicit, previously unknown and potentially useful information and valuable knowledge that can be discovered by data science solutions, which apply techniques like data mining. There has been a trend that more and more collections of these big data have been made openly available in science, government and non-profit organizations so that people could collaboratively study and analysis these open big data. In this article, we focus on open big data for public transit because public transit (e.g., bus) as a means of transportation is a vital part of many people's lives. As time is a precious resource, bus delays could negatively affect commuters' plans. Unfortunately, they are inevitable. Hence, many existing works focused on predicting bus delays. However, predicting on-time or early buses is also important. For instance, commuters who come to a bus stop on time may still miss their buses if the buses leave early. So, in this article, we examine open big data about bus performance (e.g., early, on-time, and late stops). We analyze the data with frequent pattern mining and make predictions with decision-tree based classification. For illustration, we perform predictive analytics on real-life open big data available on Winnipeg Open Data Portal, about bus performance from Winnipeg Transit. It shows the benefits of predictive analytics on open big data for supporting smart transportation services.

Predictive analytics on open big data for supporting smart transportation services

Cuzzocrea, Alfredo
2020-01-01

Abstract

In the current era of big data, huge quantities of valuable data, which may be of different levels of veracity, are being generated at a rapid rate. Embedded into these big data are implicit, previously unknown and potentially useful information and valuable knowledge that can be discovered by data science solutions, which apply techniques like data mining. There has been a trend that more and more collections of these big data have been made openly available in science, government and non-profit organizations so that people could collaboratively study and analysis these open big data. In this article, we focus on open big data for public transit because public transit (e.g., bus) as a means of transportation is a vital part of many people's lives. As time is a precious resource, bus delays could negatively affect commuters' plans. Unfortunately, they are inevitable. Hence, many existing works focused on predicting bus delays. However, predicting on-time or early buses is also important. For instance, commuters who come to a bus stop on time may still miss their buses if the buses leave early. So, in this article, we examine open big data about bus performance (e.g., early, on-time, and late stops). We analyze the data with frequent pattern mining and make predictions with decision-tree based classification. For illustration, we perform predictive analytics on real-life open big data available on Winnipeg Open Data Portal, about bus performance from Winnipeg Transit. It shows the benefits of predictive analytics on open big data for supporting smart transportation services.
Predictive analytics
Winnipeg open data
big data
frequent patterns
large-scale systems
on-time performance
open data
software engineering
transportation data
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11770/340258
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact