Dear Rodica and friends,
Thank you, Rodica, for the topic; it is extremely interesting and important to discuss. First, I would like to begin by asking for a general definition of Big Data, and a specific definition of Big Data related to public transport. What does Big Data mean? What does Big Data mean for sustainable public transportation?
The first traditional source of information (Wikipedia) can give us a first general definition:
"Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process data within a tolerable elapsed time. Big data "size" is a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data. Big data is a set of techniques and technologies that require new forms of integration to uncover large hidden values from large datasets that are diverse, complex, and of a massive scale.
In a 2001 research report and related lectures, META Group (now Gartner) analyst Doug Laney defined data growth challenges and opportunities as being three-dimensional, i.e. increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources). Gartner, and now much of the industry, continue to use this "3Vs" model for describing big data. In 2012, Gartner updated its definition as follows: "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." Additionally, a new V "Veracity" is added by some organizations to describe it.
If Gartner’s definition (the 3Vs) is still widely used, the growing maturity of the concept fosters a more sound difference between big data and Business Intelligence, regarding data and their use:
- Business Intelligence uses descriptive statistics with data with high information density to measure things, detect trends etc.;
- Big data uses inductive statistics and concepts from nonlinear system identification to infer laws (regressions, nonlinear relationships, and causal effects) from large sets of data with low information density to reveal relationships, dependencies and perform predictions of outcomes and behaviors.
A more recent, consensual definition states that "Big Data represents the Information assets characterized by such a High Volume, Velocity and Variety to require specific Technology and Analytical Methods for its transformation into Value"."
Please, if any of you can contribute a more specific definition of Big Data (from the public transport perspective), do not hesitate to share it in this space.
Secondly, I think it is important to identify the positive and negative impacts of using Big Data in urban public transport planning. One positive could be more and better information with which to study, analyse and plan (research and monitor) the past, present and future demand for flows of people (commuters, tourists, etc.), goods (deliveries, products, etc.), and information inside cities. Greater access to this information (as in Rodica's example link) gives inhabitants the opportunity to understand how these flows interact with the city, and to monitor local governments' decisions on the design, planning and construction of public transport infrastructure. The latter is extremely relevant if we want to improve public participation in cities.
On the other hand, one negative point could be the rise of a massive dependency on virtual, real-time information. In case of hacking (terrorist attacks) or energy problems (blackouts, natural disasters), the sudden and unexpected loss of this information could produce mass panic, especially in public transportation in the largest metropolises. This is only a first idea, and I probably need more arguments to develop it, but it is one example that comes to mind.
Dear friends, please I would like to kindly ask for your contribution. We need ideas, questions, definitions. This is an amazing topic, but also a strategic point to discuss in order to plan the future of our cities.
Thanks Rodica, I really hope this becomes an interesting discussion. Ricardo has already presented a definition of Big Data: the large amount of daily facts and figures that can be collected and that require special tools to be captured and processed so that they become information. As that information becomes available, assimilated and understood, it can turn into knowledge and, finally, change people's habits or behaviors. Big data is already having an impact on our mobility habits. Many people now turn to their phones for the latest transit or traffic information and plan their routes and daily commutes based on it. Although this is practically second nature now, it was not the case a mere 10 years ago. Rodica presented cases where the aggregation of information goes beyond the local context, so that we are able to see and compare the operation of different transport systems in different cities. Our travel prediction models, as Ricardo pointed out, will certainly become more refined, and the mobility needs of travelers will be better met. Now, going back to mobility habits: big data from areas other than mobility has the potential to affect how many people do their jobs. In many cases it may no longer be necessary to go out into the field to collect data every time we need an update; we can simply sit down in front of a computer and "harvest" the required information from the right source. We will be able to do more from our homes, reducing our need to travel, and this will also affect mobility patterns. I think big data has an enormous potential to affect mobility. I know this falls beyond our topic, as I am talking about data outside the mobility realm, but I actually see data from other areas having a larger potential to affect how we move in the future, by allowing us to reduce the need to travel more and more every day.
Thank you Rodica for the information. I think that we are making a remarkable leap in mobility. From now on, in my humble opinion, we could analyze traffic like a water supply system, with the streets as pipes; the number of things we are going to be able to do with this kind of data will be incredible. Not only can we control the flow, but we can also give information about what is going on in the city in real time, so that citizens can choose the best option.
In Spain there are now a couple of projects that I know of working in this field, mainly with Wi-Fi and Bluetooth signals, in the cities of Valencia and Vigo.
In my opinion the potential is vast, and we are going to see it in the coming years. Nevertheless, I can see one controversial problem, and that is privacy. For instance, in the Valencia project (linked above), each car is tracked with a specific code so that it is known where it is going throughout the city, 24 hours a day. So people have to be informed of the benefits and of how their data are going to be handled under strict security measures; otherwise there is going to be opposition.
Thank you for the information and your opinions. I would like to contribute one point that I believe is also important. I think all of us agree with the main idea that Big Data has the capacity to change mobility habits in cities around the globe. But this capacity could be enhanced or undermined by an efficient or inefficient use of Big Data. Big Data is definitely a powerful resource and tool to plan and monitor urban transportation flows. Nevertheless, if we are trying to change the bad mobility habits of today's cities, we should use Big Data not just as a tool to measure, monitor and evaluate. We need to use it to change bad mobility habits and to propose new, more sustainable ones too.
For example, one question could be: How could we use Big Data to improve walkability in cities around the globe? How could Big Data help to reduce the use of private cars in order to decrease the CO2 production in cities and promote healthy communities by walking?
Nowadays, there are apps on our smartphones that can guide us through cities by making efficient use of public transport. The best choice made by the app is generally the fastest and shortest in terms of time and distance. But what about motivating walking? Could we have apps that give us the best pathways or sidewalks in terms of quality of public space, green areas, discovering art and culture, meeting people, slowing down and enjoying life and time...? How could Big Data help me or motivate me to leave the car at home and take a healthy and pleasant walk?
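To make the question a bit more concrete, here is a minimal sketch of how such an app might rank walking routes. It is only an illustration under assumptions of my own: the street names, walking times and "quality" scores (0 = unpleasant, 1 = very pleasant) are invented, and `best_walk` and `pleasantness_weight` are hypothetical names, not part of any existing app. The idea is a standard shortest-path search (Dijkstra) where each street segment's walking time is penalized when its public-space quality is low.

```python
import heapq

# Hypothetical street graph: node -> list of (neighbor, minutes, quality 0..1).
# All values are invented for illustration only.
STREETS = {
    "home": [("main_road", 5, 0.1), ("park", 7, 0.9)],
    "main_road": [("office", 5, 0.1)],
    "park": [("office", 6, 0.9)],
}

def best_walk(graph, start, goal, pleasantness_weight=0.5):
    """Dijkstra search where low-quality segments cost more than their time.

    Segment cost = minutes * (1 + pleasantness_weight * (1 - quality)).
    With pleasantness_weight = 0 this is a plain fastest-route search.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, minutes, quality in graph.get(node, []):
            if neighbor not in visited:
                penalty = 1.0 + pleasantness_weight * (1.0 - quality)
                heapq.heappush(
                    queue, (cost + minutes * penalty, neighbor, path + [neighbor])
                )
    return float("inf"), []  # no route found

# Fastest route only, versus a route that also values pleasant streets.
_, fastest = best_walk(STREETS, "home", "office", pleasantness_weight=0.0)
_, nicest = best_walk(STREETS, "home", "office", pleasantness_weight=1.0)
print(fastest)
print(nicest)
```

In this toy example the fastest route follows the main road, while the weighted search accepts a few extra minutes to go through the park. The hard part in practice would be obtaining trustworthy quality scores for real streets, which is where Big Data could come in.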
Big Data has great potential to change our bad or unhealthy mobility habits, but we need to innovate in our transportation logic too.
Thank you Rodica for initiating such a captivating and timely discussion, and thank you Ricardo, Fernando, and Saul for your well-thought-out answers!
I believe some of the most applicable uses of Big Data in the context of public transport will be an increase in the quality of user experience and cost saving/cost cutting measures. With the advent of such enormous amounts of rider data, ideally we will not only be able to better time train/bus connections, but will also get a clearer aggregate sense of the different types of riders, and what features of transport will affect their modal choice (i.e., bus vs train vs walking vs Uber vs private transport). Data -- even before 'Big Data' -- has always helped increase efficiency, and hence an exponential increase in the amount of data available will help to increase efficiency and lower costs (albeit at a slower rate than the increase in data). We should renew our focus on effective data management solutions for megacities' transit data, and develop efficient frameworks for the flow and implementation of data to ensure proper utilization.
For all the benefits of Big Data, however, I believe that several important existing debates/issues will remain. Appropriate usage of tax dollars for public transport expansion, increasing access for marginalized communities to quality public transport, and disincentivizing automobile usage in American cities will continue to remain controversial issues that data alone can't solve.
Thank you Daniel, I definitely agree with your views. It was because of the elements you mention at the end of your comment that I was making the point about the need to go beyond mobility data in order to change mobility habits. The spread of information (based on data alone, not necessarily big data, as you rightly point out) will affect the need to travel more than other variables. People today can order groceries online instead of making a trip to the supermarket, and people can work from home instead of driving or taking the bus to their office. I believe it is in this area, information and online activity as a way to eliminate trips, that Data and Big Data can also have an impact on our mobility habits. Many issues will remain, of course; we need to make sure these options are available to everyone, for example. But I am sure that looking beyond transit data, or analyzing it to see why people travel and how we can avoid or minimize this need, will also help to modify travel behavior. This is certainly a very interesting topic and I have really learned a lot from the different views all participants have shared.
Thank you for sharing such interesting and fresh opinions! I really enjoyed watching this discussion unfold! Since this is such a multi-faceted issue, I suggest we add a new side-topic and continue our conversation there.
Nevertheless, please consider this discussion still open... and those of you who still want to comment here, please jump in!! The more diverse the commenters, the more useful the conversation!
Thanks again and please check out our eDiscussion 4 on Ride sharing, car sharing and resource efficiency.
Since our eDiscussion on Big Data has been such an interesting one and many of you have shared with me that you feel like the opinions expressed during this conversation broadened your views on the topic, here is what I suggest: let's continue our discussion on Big Data - a very complex, multifaceted topic - with a "side-topic" (one of those featured in our Ideation Contest): Ride sharing, car sharing and resource efficiency.
The reason I think it's interesting to connect the topics is that Big Data has a lot (if not everything!!!) to do with this topic. Handling Data (not only Big Data) has been critical to the rise of car sharing services. Car sharing services (or bike sharing services) use Data to keep track of large fleets of vehicles, monitoring their location etc. in (almost) real time.
Moreover, if, by using Big Data, it becomes easier to arrange convenient and less expensive transportation on demand, it is foreseeable that more people will question the need for a personal car.
On another note, Data can help with traffic jams and parking issues and therefore increase resource efficiency for those who use personal cars. Two of the largest sources of pollution are traffic jams and parking. Smartphone apps are already helping drivers to see traffic bottlenecks before they leave home, so they can delay their departure or find an alternate route before they are stuck in traffic. As for parking, I just read some interesting figures: it looks like cars looking for parking account for up to 45 percent of traffic in Manhattan. Therefore, current studies that are trying to develop systems to spot open parking spaces and send their locations to a central database, where drivers can see them, help with both traffic jams and pollution (not to mention frustration ;-).
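For those curious what the "central database" idea could look like in its simplest form, here is a rough sketch. It is not how any of the systems under study actually work: `ParkingRegistry`, its methods, the spot IDs and the coordinates are all hypothetical, and a real system would need persistence, stale-report handling and privacy safeguards. It only shows the basic loop: sensors report whether a space is free, and a driver asks for the nearest free one (using the standard haversine great-circle distance).

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

class ParkingRegistry:
    """Hypothetical in-memory stand-in for a central database of open spaces."""

    def __init__(self):
        self._open = {}  # spot_id -> (lat, lon) of spaces currently free

    def report(self, spot_id, lat, lon, is_free):
        """Called by a sensor (or app) whenever a space changes state."""
        if is_free:
            self._open[spot_id] = (lat, lon)
        else:
            self._open.pop(spot_id, None)

    def nearest(self, lat, lon):
        """Return the ID of the closest free space, or None if there is none."""
        if not self._open:
            return None
        return min(self._open, key=lambda s: haversine_km((lat, lon), self._open[s]))

# Invented example: two free spaces and one driver looking for parking.
registry = ParkingRegistry()
registry.report("A", 40.7580, -73.9855, is_free=True)
registry.report("B", 40.7484, -73.9857, is_free=True)
print(registry.nearest(40.7490, -73.9850))  # the space closest to the driver

registry.report("B", 40.7484, -73.9857, is_free=False)  # someone parks in B
print(registry.nearest(40.7490, -73.9850))  # now only A is left
```

Even a toy like this shows why less circling for parking follows directly from sharing the data: the driver is sent straight to a known free space instead of searching block by block.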
Looking forward to hearing your thoughts on the topic and to another extremely interesting conversation! Please jump in and share with us your input/opinion/suggestions.