Distributed Context Tree Weighting (CTW) for route prediction
- Vishnu Shankar Tiwari^{1} (corresponding author) and
- Arti Arya^{1}
https://doi.org/10.1186/s40965-018-0052-9
© The Author(s). 2018
- Received: 3 March 2018
- Accepted: 11 May 2018
- Published: 2 July 2018
Abstract
Route prediction plays a vital role in many important location-based applications such as resource prediction in grid computing, traffic congestion estimation, vehicular ad-hoc networks and travel recommendation. The goal of this work is to design a scalable route prediction application based on Context Tree Weighting (CTW) modeling of user travel data. CTW is a widely used technique for text compression as well as for string sequence indexing and prediction. Constructing a CTW tree from a huge volume of data by sequential processing is time-consuming in practical implementations. Existing techniques are designed for a single machine, and implementing them in a distributed environment is still a challenge. This work focuses on achieving horizontal scalability of CTW and addresses the challenges of distributed construction, such as reducing I/O, computing sequences in parallel and efficiently producing the final CTW tree in a distributed environment. The MapReduce framework running over the Hadoop file system is used for distributed processing. A large GPS data set is map-matched to a digitized road network obtained from OpenStreetMap, and a CTW model is built from it. A two-step construction of the CTW tree is proposed and implemented in the MapReduce framework. The resulting horizontally scalable CTW model is evaluated for route prediction on a huge corpus of historical GPS traces.
Keywords
- CTW
- Route prediction
- Scalable
- Map reduce
- Hadoop
Introduction
Route prediction is a key requirement in many important location-based applications such as vehicular ad-hoc networks, traffic congestion estimation, resource prediction in grid computing, vehicular turn prediction, travel pattern similarity and pattern mining. The route prediction problem is: given the sequence of road network graph edges already traveled by the user, predict the most probable edge of the network to be traveled next. Our approach is to build a CTW model from a huge corpus of sequential trajectories traveled by the user in the past. The resulting CTW model is probabilistic in nature. Context tree weighting (CTW) is widely used in data compression and machine learning applications [1]. Time-stamped GPS traces are collected over a long period of time. The long chronological sequence of GPS traces is broken down into smaller units called trips [2, 3]. Trips are mapped to the road network graph by a map matching process, which identifies the object’s location on the road network graph [4–6]. The CTW tree based model is constructed from trips, each composed of an ordered sequence of road network edges. Given a trajectory traveled by the user, a lookup is done in the CTW tree based model and the most likely next edge is found.
Willems et al. [7] presented the CTW algorithm, which is a strong lossless compression algorithm. Subsequently, much research on CTW focused on improving accuracy and reducing time complexity [8]. CTW models use a set of historical occurrences of sequences to predict the probability with which a specific symbol will appear at a given position in an input stream. CTW is a combination of many lower order Markov models. Real applications of CTW process huge data sets, and doing so sequentially is time-consuming and a hurdle in practical implementation. Existing techniques are designed for a single machine, and scalability is achieved by increasing system resources such as processors and memory on that single system [7, 9]. An alternative approach to scalability is to run the processing over a distributed cluster of independent systems. Construction of CTW in distributed mode remains a challenge. This work addresses this challenge by processing trips in parallel on distributed nodes and finally consolidating the results to form the CTW model. This is achieved in a two-step process. The set of user trips is decomposed into smaller sets, which are passed to compute modules known as mappers. Mappers compute the variable order contexts as key-value pairs. In each pair, the key is the context and the value is its occurrence frequency in the training set. Key-value pairs from the various mappers are emitted to the reducer node. The reducer consolidates the occurrences of the various contexts and inserts them into the CTW tree. The final tree produced by the reducer is the CTW model and is used for route prediction. The major contribution of this work is a technique for distributed computation of CTW and its application to route prediction. All experiments and implementation are done on real datasets openly available in the public domain.
Background
Comparison of the most important algorithms for CTW construction
Algorithm | Complexity | Parallel | Probabilistic |
---|---|---|---|
Begleiter et al. (2004) [1] | O(n^{2}) | No | yes |
Willems et al. (1997) [7] | O(n^{2}) | No | yes |
Tjalkens et al. (1997) [10] | O(n^{2}) | No | yes |
Begleiter et al. (2006) [8] | O(n^{2}) | No | yes |
Sadakane et al. (2000) [11] | O(n^{2}) | No | yes |
Tjalkens et al. (1997) [12] | O(n^{2}) | No | yes |
Volf, P. (2002) [13] | O(n^{2}) | No | yes |
Aberg et al. (1997) [33] | O(n^{2}) | No | yes |
Willems, F. (1998) [34] | O(n^{2}) | No | yes |
Willems et al. (1996) [35] | O(n^{2}) | No | yes |
Willems, F. (1996) [36] | O(n^{2}) | No | yes |
Proposed CTW | O(n^{2}) | yes | yes |
Methodology
CTW tree from user location traces
Here, a sequence S is an ordered sequence of road network edges. Let ∑ = {e_{1}, e_{2}, e_{3}, e_{4}, e_{5}} be the finite set of all edges of the digitized road network, and let ∑^{∗} represent all possible finite length trips. Any trip a user makes belongs to ∑^{∗}. Let X = e_{0}, e_{1}, …, e_{n − 1} with e_{i} ∈ ∑ and X ∈ ∑^{∗} be a trip; then the length of the trip is |X| = |e_{0}, e_{1}, …, e_{n − 1}| = n.
All contexts of length (D) ≤ 2 for e_{1}, e_{2}, e_{5}, e_{1}, e_{3}, e_{1}, e_{4}, e_{1}, e_{2}, e_{5}, e_{1}
S.No. | D | Context (s) | Symbol (σ) | sσ |
---|---|---|---|---|
1 | 1 | e_{1} | e_{2} | e_{1}, e_{2} |
2 | 1 | e_{2} | e_{5} | e_{2}, e_{5} |
3 | 1 | e_{5} | e_{1} | e_{5}, e_{1} |
4 | 1 | e_{1} | e_{3} | e_{1}, e_{3} |
5 | 1 | e_{3} | e_{1} | e_{3}, e_{1} |
6 | 1 | e_{1} | e_{4} | e_{1}, e_{4} |
7 | 1 | e_{4} | e_{1} | e_{4}, e_{1} |
8 | 2 | e_{1}, e_{2} | e_{5} | e_{1}, e_{2}, e_{5} |
9 | 2 | e_{2}, e_{5} | e_{1} | e_{2}, e_{5}, e_{1} |
10 | 2 | e_{5}, e_{1} | e_{3} | e_{5}, e_{1}, e_{3} |
11 | 2 | e_{1}, e_{3} | e_{1} | e_{1}, e_{3}, e_{1} |
12 | 2 | e_{3}, e_{1} | e_{4} | e_{3}, e_{1}, e_{4} |
13 | 2 | e_{1}, e_{4} | e_{1} | e_{1}, e_{4}, e_{1} |
14 | 2 | e_{4}, e_{1} | e_{2} | e_{4}, e_{1}, e_{2} |
Two phase CTW tree construction
The proposed technique for CTW construction is a two-phase process. The first phase computes all contexts of length ≤ d, where d denotes the maximum context length (in number of edges). For each symbol σ, all strings sσ are generated and put into a map which stores the sequence as key and its frequency as value. The second phase consolidates the occurrences of the various contexts and inserts them into a context tree. The final tree produced is the CTW model. Both phases are discussed below. In the next section, the two-phase implementation is extended to execute over the MapReduce framework.
All contexts of length (d) ≤ 2 for e_{1}, e_{2}, e_{5}, e_{1}, e_{3}, e_{1}, e_{4}, e_{1}, e_{2}, e_{5}, e_{1} with frequency
S.No. | sσ | Frequency(f) |
---|---|---|
1 | e_{1}, e_{2} | 2 |
2 | e_{2}, e_{5} | 2 |
3 | e_{5}, e_{1} | 2 |
4 | e_{1}, e_{3} | 1 |
5 | e_{3}, e_{1} | 1 |
6 | e_{1}, e_{4} | 1 |
7 | e_{4}, e_{1} | 1 |
8 | e_{1}, e_{2}, e_{5} | 2 |
9 | e_{2}, e_{5}, e_{1} | 2 |
10 | e_{5}, e_{1}, e_{3} | 1 |
11 | e_{1}, e_{3}, e_{1} | 1 |
12 | e_{3}, e_{1}, e_{4} | 1 |
13 | e_{1}, e_{4}, e_{1} | 1 |
14 | e_{4}, e_{1}, e_{2} | 1 |
The length of string X is denoted by n. All contexts of length d in X can be computed in linear time Θ(n) by scanning X from left to right while maintaining a window of size d. The window is advanced by one unit after each symbol is scanned. The maximum number of context strings of length ≤ d that can appear in the map is Θ(nd), which is Θ(n) when d ≪ n. This maximum is reached only if no context repeats; in practice contexts repeat and the number of distinct contexts is ≤ Θ(n).
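As an illustration of the first phase, the following is a minimal Python sketch that counts every string sσ for contexts of length 1 to d in the example trip. The names `extract_contexts` and `max_depth` are hypothetical and are not taken from Algorithm I; for simplicity the sketch makes one pass per context length rather than a single pass with one window.

```python
from collections import Counter


def extract_contexts(trip, max_depth):
    """Count each string s+sigma, where s is a context of length 1..max_depth.

    Hypothetical helper for illustration only; not the paper's Algorithm I.
    """
    counts = Counter()
    for depth in range(1, max_depth + 1):
        width = depth + 1  # the context plus the symbol that follows it
        # Slide a window of `width` edges over the trip, one edge at a time.
        for start in range(len(trip) - width + 1):
            counts[tuple(trip[start:start + width])] += 1
    return counts


# The example trip used in the tables of this section.
trip = ["e1", "e2", "e5", "e1", "e3", "e1", "e4", "e1", "e2", "e5", "e1"]
contexts = extract_contexts(trip, max_depth=2)

for sequence, frequency in contexts.items():
    print(", ".join(sequence), frequency)
```

For the example trip this yields 14 distinct strings, and their frequencies agree with the frequency table above.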
The second phase starts with an empty tree and inserts, one by one, the context sequences sσ obtained from the first phase. For a new context that has not been seen earlier, a completely new branch is created. Otherwise, the path in the tree that matches a prefix of the current context is found; the count of every node on this overlapping path is increased by the frequency of the context, and the remaining symbols are inserted as new nodes starting at the end of the overlapping path. All contexts computed by Algorithm II are shown in Table 3.
The height of the tree, h = Θ(d + 1) ≅ Θ(d), is linear in the length of the context. All branches are of equal length, and the length of each branch is necessarily Θ(d). As established earlier, the maximum number of context strings of length d that can appear in the map is Θ(n − d) ≅ Θ(n) when d ≪ n. As d approaches n, the total number of context strings approaches O(1). In practice, d ≪ n: d is far smaller than n and nearly constant, O(1), so we assume a maximum of Θ(n) context strings without loss of generality. Each string sσ of length |sσ| = d + 1, formed by concatenating a context s of length d and the target symbol σ, is iteratively inserted into the CTW tree (starting with the empty tree). Theoretically, the cost of each insertion is O(d). The number of such insertions is Θ(n), which leads to a total cost of O(nd), bounded by O(n^{2}) since d ≤ n. In actual practice, it is very likely that patterns repeat and contexts coincide. In that case the total number of distinct context strings n(sσ) ≪ Θ(n), and the cost of constructing the CTW tree on a single machine from the output of Algorithm I is ≪ Θ(nd). But as stated, d is far smaller than n and approximately constant, and hence the complexity of the algorithm approaches Θ(n). The combined running time of both phases is Θ(2n) = Θ(n).
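The insertion step of the second phase can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's Algorithm II: the `Node` class and the `insert` and `height` helpers are hypothetical names, and the input is the frequency map of the example trip e_{1}, e_{2}, e_{5}, e_{1}, e_{3}, e_{1}, e_{4}, e_{1}, e_{2}, e_{5}, e_{1} with d = 2.

```python
class Node:
    """One tree node: an occurrence count and children keyed by edge id."""

    def __init__(self):
        self.count = 0
        self.children = {}


def insert(root, sequence, frequency):
    """Walk or extend the path for `sequence`, adding `frequency` to each node."""
    node = root
    for edge in sequence:
        if edge not in node.children:
            node.children[edge] = Node()  # unseen suffix: start a new branch
        node = node.children[edge]
        node.count += frequency  # existing or new node on the path


def height(node):
    """Number of edges on the longest root-to-leaf path."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children.values())


# Output of the first phase for the example trip (strings s+sigma, d <= 2).
context_counts = {
    ("e1", "e2"): 2, ("e2", "e5"): 2, ("e5", "e1"): 2, ("e1", "e3"): 1,
    ("e3", "e1"): 1, ("e1", "e4"): 1, ("e4", "e1"): 1,
    ("e1", "e2", "e5"): 2, ("e2", "e5", "e1"): 2, ("e5", "e1", "e3"): 1,
    ("e1", "e3", "e1"): 1, ("e3", "e1", "e4"): 1, ("e1", "e4", "e1"): 1,
    ("e4", "e1", "e2"): 1,
}

root = Node()
for sequence, frequency in context_counts.items():
    insert(root, sequence, frequency)

print("height:", height(root))
print("count at e1:", root.children["e1"].count)
```

Each insertion touches at most d + 1 nodes, which is the O(d) per-insertion cost used in the analysis above; with d = 2 the resulting tree has height 3.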
Distributed construction of CTW tree
To achieve distributed construction of the CTW tree-based model, the two-phase process described in the previous section is extended to execute over a Hadoop cluster using the MapReduce computation framework. The first phase is executed by the mapper module. Map-matched GPS traces, decomposed into smaller units called trips, are partitioned into chunks, each containing a set of trips, and sent to the mapper modules. For each symbol σ in a trip, the mapper generates all strings sσ and puts them into a map which stores the sequence as key and its frequency as value. The implementation of the mapper for computing contexts under the MapReduce model is described in Algorithm III.
All contexts of length (d) ≤ 2 for e_{1}, e_{2}, e_{5}, e_{1}, e_{3}, e_{1}, e_{4}, e_{1}, e_{2}, e_{5}, e_{1} with frequency computed by m_{1}
S.No. | sσ | Frequency(f) |
---|---|---|
1 | e_{1}, e_{2} | 2 |
2 | e_{2}, e_{5} | 2 |
3 | e_{5}, e_{1} | 2 |
4 | e_{1}, e_{3} | 1 |
5 | e_{3}, e_{1} | 1 |
6 | e_{1}, e_{4} | 1 |
7 | e_{4}, e_{1} | 1 |
8 | e_{1}, e_{2}, e_{5} | 2 |
9 | e_{2}, e_{5}, e_{1} | 2 |
10 | e_{5}, e_{1}, e_{3} | 1 |
11 | e_{1}, e_{3}, e_{1} | 1 |
12 | e_{3}, e_{1}, e_{4} | 1 |
13 | e_{1}, e_{4}, e_{1} | 1 |
14 | e_{4}, e_{1}, e_{2} | 1 |
All contexts of length (d) ≤ 2 for e_{5}, e_{1}, e_{3}, e_{1}, e_{4}, e_{1}, e_{2}, e_{5}, e_{1}, e_{3}, e_{1} with frequency computed by m_{ 2 }
S.No. | sσ | Frequency(f) |
---|---|---|
1 | e_{5}, e_{1} | 2 |
2 | e_{1}, e_{3} | 2 |
3 | e_{3}, e_{1} | 2 |
4 | e_{1}, e_{4} | 1 |
5 | e_{4}, e_{1} | 1 |
6 | e_{1}, e_{2} | 1 |
7 | e_{2}, e_{5} | 1 |
8 | e_{5}, e_{1}, e_{3} | 2 |
9 | e_{1}, e_{3}, e_{1} | 2 |
10 | e_{3}, e_{1}, e_{4} | 1 |
11 | e_{1}, e_{4}, e_{1} | 1 |
12 | e_{4}, e_{1}, e_{2} | 1 |
13 | e_{1}, e_{2}, e_{5} | 1 |
14 | e_{2}, e_{5}, e_{1} | 1 |
Result of merging of intermediate key/value pairs by Map-Reduce framework
S.No. | Key(k) | Frequencies | <K,<value_list>> |
---|---|---|---|
1 | e_{1}, e_{2} | 2, 1 | < e_{1}, e_{2}, < 2, 1 > > |
2 | e_{2}, e_{5} | 2, 1 | < e_{2}, e_{5}, < 2,1 > > |
3 | e_{5}, e_{1} | 2, 2 | < e_{5}, e_{1}, < 2, 2 > > |
4 | e_{1}, e_{3} | 1, 2 | < e_{1}, e_{3}, < 1, 2 > > |
5 | e_{3}, e_{1} | 1, 2 | < e_{3}, e_{1}, < 1, 2 > > |
6 | e_{1}, e_{4} | 1, 1 | < e_{1}, e_{4}, < 1, 1 > > |
7 | e_{4}, e_{1} | 1, 1 | < e_{4}, e_{1}, < 1, 1 > > |
8 | e_{1}, e_{2}, e_{5} | 2, 1 | < e_{1}, e_{2}, e_{5}, < 2, 1 > > |
9 | e_{2}, e_{5}, e_{1} | 2, 1 | < e_{2}, e_{5}, e_{1}, < 2,1 > > |
10 | e_{5}, e_{1}, e_{3} | 1, 2 | <e_{5}, e_{1}, e_{3}, < 1, 2 > > |
11 | e_{1}, e_{3}, e_{1} | 1, 2 | < e_{1}, e_{3}, e_{1}, < 1, 2 > > |
12 | e_{3}, e_{1}, e_{4} | 1, 1 | < e_{3}, e_{1}, e_{4}, < 1, 1 > > |
13 | e_{1}, e_{4}, e_{1} | 1, 1 | < e_{1}, e_{4}, e_{1}, < 1, 1 > > |
14 | e_{4}, e_{1}, e_{2} | 1, 1 | < e_{4}, e_{1}, e_{2}, < 1, 1 > > |
Result of calculation of the sum of value list associated with keys
S.No. | Key(k) | <K, Sum(Values)> |
---|---|---|
1 | e_{1}, e_{2} | < e_{1}, e_{2}, < 3 > > |
2 | e_{2}, e_{5} | < e_{2}, e_{5}, < 3 > > |
3 | e_{5}, e_{1} | < e_{5}, e_{1}, < 4 > > |
4 | e_{1}, e_{3} | < e_{1}, e_{3}, < 3 > > |
5 | e_{3}, e_{1} | < e_{3}, e_{1}, < 3 > > |
6 | e_{1}, e_{4} | < e_{1}, e_{4}, < 2 > > |
7 | e_{4}, e_{1} | < e_{4}, e_{1}, < 2 > > |
8 | e_{1}, e_{2}, e_{5} | < e_{1}, e_{2}, e_{5}, < 3 > > |
9 | e_{2}, e_{5}, e_{1} | < e_{2}, e_{5}, e_{1}, < 3 > > |
10 | e_{5}, e_{1}, e_{3} | <e_{5}, e_{1}, e_{3}, < 3 > > |
11 | e_{1}, e_{3}, e_{1} | < e_{1}, e_{3}, e_{1}, < 3 > > |
12 | e_{3}, e_{1}, e_{4} | < e_{3}, e_{1}, e_{4}, < 2 > > |
13 | e_{1}, e_{4}, e_{1} | < e_{1}, e_{4}, e_{1}, < 2 > > |
14 | e_{4}, e_{1}, e_{2} | < e_{4}, e_{1}, e_{2}, < 2 > > |
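The map, merge and sum steps tabulated above can be imitated in a single Python process. The sketch below is only a simulation under stated assumptions: it does not use the Hadoop API, the names `mapper`, `shuffle` and `reducer` are hypothetical, and the insertion of the summed contexts into the CTW tree by the reducer is omitted.

```python
from collections import Counter, defaultdict


def mapper(trip, max_depth=2):
    """Emit (s+sigma, local frequency) pairs for one trip."""
    local = Counter()
    for depth in range(1, max_depth + 1):
        width = depth + 1  # context plus the following symbol
        for start in range(len(trip) - width + 1):
            local[tuple(trip[start:start + width])] += 1
    return list(local.items())


def shuffle(mapper_outputs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Sum the frequencies reported for one key by all mappers."""
    return key, sum(values)


# The two trips processed by mappers m1 and m2 in the tables above.
trip_m1 = ["e1", "e2", "e5", "e1", "e3", "e1", "e4", "e1", "e2", "e5", "e1"]
trip_m2 = ["e5", "e1", "e3", "e1", "e4", "e1", "e2", "e5", "e1", "e3", "e1"]

grouped = shuffle([mapper(trip_m1), mapper(trip_m2)])
totals = dict(reducer(key, values) for key, values in grouped.items())

for key, total in totals.items():
    print(", ".join(key), grouped[key], total)
```

Because each mapper aggregates frequencies locally before emitting, it sends at most one pair per distinct key, which is one simple way to reduce the data moved between mappers and the reducer.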
Route prediction using CTW
The cases below demonstrate the Route_Predict prediction function over the CTW model constructed by Algorithm IV.
Case I
Hence Route_Predict(ε) → e_{1}.
Case II
Another case we explore is when only edge e_{2} has been traversed so far, S = e_{2}. The input trajectory has length 1 and consists of a single edge. After e_{2} has been traversed there is only one candidate edge, e_{5}. In this case the probability of e_{5} occurring with e_{2} as context is p(e_{5}| e_{2}) = 1. Hence Route_Predict(e_{2}) → e_{5}.
Case III
Hence Route_Predict(e_{1}) → e_{3}.
Case IV
Next we consider a case where multiple edges have been traveled and the input to the Route_Predict function is {e_{1}, e_{2}}. The only candidate for the next edge is e_{5}, given that {e_{1}, e_{2}} has already been traveled. p(e_{5}|e_{1}, e_{2}) = 1 and hence Route_Predict(e_{1}, e_{2}) → e_{5}.
Case V
Next we consider a case where the user has traveled a path which has not yet been seen by the CTW model. For example, if the user has traveled path {e_{3}, e_{4}} but no such path exists in the CTW tree, this path has not occurred in the past. Hence the prediction function returns Route_Predict(e_{3}, e_{4}) → ε. This can happen either when the user has reached the destination and there is nothing to predict, or when the route is new. In the latter case, new routes, when found, should be sent to the model for learning.
Case VI
All scenarios described above predict a single next-hop edge. It is possible to predict multiple edges too. For example, at the root node \( p(e_{1} \mid \varepsilon) = \frac{16}{38} \approx 0.42 \) is the highest among all available candidates. For e_{1}, the next edge with the highest probability is e_{3}, with a probability of \( p(e_{3} \mid e_{1}) = \frac{9}{18} = 0.50 \).
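A lookup of this kind can be sketched in Python as below. This is a hedged illustration rather than the paper's Route_Predict: the tree is rebuilt from the summed frequencies of the two example trips, the helper names `build_tree` and `route_predict` are hypothetical, and the probability of a candidate edge is assumed to be its count divided by the total count of all candidates under the same context. The example only exercises the empty input, the inputs e_{2} and {e_{1}, e_{2}}, and the unseen path {e_{3}, e_{4}}.

```python
def build_tree(counts):
    """Insert every string into a nested-dict tree, adding its frequency
    to each node along the path."""
    root = {"count": 0, "children": {}}
    for sequence, frequency in counts.items():
        node = root
        for edge in sequence:
            node = node["children"].setdefault(
                edge, {"count": 0, "children": {}})
            node["count"] += frequency
    return root


def route_predict(root, travelled):
    """Return (most likely next edge, probability).

    Returns (None, 0.0) when the path is unseen or has no continuation;
    this stands in for the empty prediction written as epsilon in the text.
    """
    node = root
    for edge in travelled:
        node = node["children"].get(edge)
        if node is None:
            return None, 0.0
    candidates = node["children"]
    if not candidates:
        return None, 0.0
    total = sum(child["count"] for child in candidates.values())
    best = max(candidates, key=lambda edge: candidates[edge]["count"])
    return best, candidates[best]["count"] / total


# Summed frequencies of the strings s+sigma from the two example trips.
merged_counts = {
    ("e1", "e2"): 3, ("e2", "e5"): 3, ("e5", "e1"): 4, ("e1", "e3"): 3,
    ("e3", "e1"): 3, ("e1", "e4"): 2, ("e4", "e1"): 2,
    ("e1", "e2", "e5"): 3, ("e2", "e5", "e1"): 3, ("e5", "e1", "e3"): 3,
    ("e1", "e3", "e1"): 3, ("e3", "e1", "e4"): 2, ("e1", "e4", "e1"): 2,
    ("e4", "e1", "e2"): 2,
}
tree = build_tree(merged_counts)

for travelled in ([], ["e2"], ["e1", "e2"], ["e3", "e4"]):
    print(travelled, "->", route_predict(tree, travelled))
```

With these counts the empty input yields e_{1} with probability 16/38, and the inputs e_{2} and {e_{1}, e_{2}} both yield e_{5} with probability 1. Once the tree is built, a prediction costs one walk down at most d + 1 nodes.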
Results and discussion
Map data and GPS location traces are the two data sets required to implement the proposed CTW based route prediction. The data sets used are described below.
1. Non-spatial features such as the width, length, speed, name and turn restrictions of the road represented by an edge in the graph.
2. Spatial data that represents the geometry of the road network.
Comparison with the most important research in route prediction
Authors | Method for prediction | Horizontal scalability | Rate |
---|---|---|---|
Proposed CTW Route Prediction | Context Tree Weighting (CTW) | Yes | 40–90% |
Simmons et al. (2006) [37] | Hidden Markov Model | No | 70–80% |
Burbey et al. (2008) [38] | Prediction by Partial Match (PPM) | No | 92% |
Tiwari et al. (2012) [39] | Closest Match Algorithm | No | 40% |
Lung et al. (2014) [40] | Hidden Markov Model | No | 68.3% |
Neto et al. (2015) [41] | Prediction by Partial Match (PPM) | No | 46% |
Amirat et al. (2017) [42] | Graph Mining | No | 76% |
Froehlich et al. (2008) [3] | Closest match | No | 40–90% |
Tiwari et al. (2017) [43] | Probabilistic Suffix Tree | Yes | 85% |
Conclusions and future work
In this work, the focus was on constructing the CTW model in a distributed way from a huge corpus of GPS location traces. GPS location data was decomposed into smaller units called user trips. User trips were map-matched to the road network to convert the data into a set of edges. Map matching GPS data to road network edges reduces the data size and makes model construction faster than building the model from raw GPS data. The CTW model was constructed with the edges of the CTW tree annotated with their probability of occurrence. The model was then used to predict the route given a partial trajectory. We observed that model construction is the most time-consuming phase, but over a distributed cluster the processing time decreases linearly with the addition of nodes to the cluster. Once the model is constructed, route prediction is not a time-consuming process. It is important to note that the quality of the data used in such a system really matters. OSM is crowdsourced data, and data quality is a major concern [30]. However, a lot of research is in progress in this area and should be considered for future work [31, 32].
Declarations
Acknowledgements
We thank the Geospatial Information Science and Engineering (GISE) Lab, Indian Institute of Technology, IIT-Bombay, India, where some initial parts of this work were carried out. We would also like to thank the anonymous reviewers, whose reviews helped us bring this manuscript to its current form.
Funding
This work is purely the authors’ own, and the authors themselves provided the funding required to publish this research.
Availability of data and materials
All data and material used are open source. The GPS data points are mainly from the GPS trajectory dataset collected in the Geolife project of Microsoft Research Asia. The dataset has been made available for research by Microsoft Research since 2012 (https://geotime.com/general/geolife-project/). The map data used is from Open Street Map (OSM), which is an open project (www.openstreetmap.org).
Authors’ contributions
VST and AA discussed the idea of CTW with respect to Route Prediction and its implementation aspects. VST has implemented the idea and contributed towards the first draft of the paper under the guidance of AA. AA thoroughly proofread the manuscript and made all vital corrections. Both the authors have read and finally approved the manuscript.
Authors’ information
Vishnu Shankar Tiwari holds a postgraduate degree (Master of Technology, M.Tech.) in Computer Engineering from the Department of Computer Engineering, Indian Institute of Technology (IIT)-Bombay, Mumbai, India, and also holds an M.Tech. (Computer Applications) from YMCA University of Science and Technology, India and a Master of Computer Application (MCA) from Maharshi Dayanand University, India. Vishnu Shankar Tiwari has worked in the software industry for more than 8 years.
Arti Arya is Head of Department (HOD) and Professor at Department of Computer Application, PES Institute of Technology, Bangalore South Campus. She holds Ph.D in Computer Science from Faculty of Technology and Engineering, Maharshi Dayanand University, India. She has M.Tech in Computer Science from Allahabad Agricultural Institute, Master of Science (Mathematics) and Bachelor of Science (Mathematics) from Delhi University. Her areas of interests are Spatial Data Mining, Knowledge based systems, Machine Learning, Artificial Intelligence, Data Analysis. She has approx. 17 years of teaching experience (of which 10 years of research) at Undergraduate and Post Graduate level. She is Senior Member IEEE, Life Member CSI and Life Member IAENG.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Begleiter R, El-Yaniv R, Yona G. On prediction using variable order Markov models. J Artif Intell Res. 2004;22:385–21.
- Tiwari VS, Arya A, Chaturvedi S. Framework for horizontal scaling of map matching using map-reduce. In: IEEE, 13th International Conference on Information Technology, ICIT 2014; 2014.
- Froehlich J, Krumm J. Route prediction from trip observations, Society of Automotive Engineers (SAE) 2008 world congress, paper 2008–01-0201. 2008.
- Liu Y, Li Z. A novel algorithm of low sampling rate GPS trajectories on map-matching. EURASIP J Wirel Commun Netw. 2017; 2017:30. https://link.springer.com/article/10.1186/s13638-017-0814-6.
- Zhou J, Golledge R. A three-step general map matching method in the GIS environment: travel/transportation study perspective. Int J Geogr Inf Syst. 2006;8(3):243–60.
- Manikandan R, Latha R, Ambethraj C. An analysis of map matching algorithm for recent intelligent transport system. Asian J Appl Sci. 2017;05(01) (ISSN: 2321 – 0893). https://www.ajouronline.com/index.php/AJAS/article/view/4642.
- Willems F, Shtarkov Y, Tjalkens T. Reflections on The context-tree weighting method: basic properties. Newsl IEEE Inf Theory Soc. 1997; http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.109.1872.
- Begleiter R, Yaniv R. Superior guarantees for sequential prediction and lossless compression via alphabet decomposition. J Mach Learn Res. 2006;7:379–411.
- Willems F, Tjalkens T. Complexity reduction of the context-tree weighting algorithm: a study for KPN research, EIDMA report RS.97.01. Eindhoven: Technical University of Eindhoven; 1997. https://www.researchgate.net/publication/228732029_Complexity_reducing_techniques_for_the_CTW_algorithm. https://core.ac.uk/display/56576627.
- Tjalkens T, Willems F. Implementing the context-tree weighting method: arithmetic coding. In: International conference on combinatorics, information theory and statistics; 1997. p. 83.
- Sadakane K, Okazaki T, Imai H. Implementing the context tree weighting method for text compression, Proceedings DCC 2000. Data Compression Conference, Snowbird, UT, 2000, pp. 123–32. https://doi.org/10.1109/DCC.2000.838152. https://dl.acm.org/citation.cfm?id=789787.
- Tjalkens T, Volf P, Willems F. A context-tree weighting method for text generating sources. In: Data Compression Conference; 1997. p. 472.
- Volf P. Weighting techniques in data compression theory and algorithms. Ph.D. thesis: Technische Universiteit Eindhoven; 2002. https://www.researchgate.net/publication/238123916_Weighting_techniques_in_data_compression_Theory_and_algorithms. https://www.scribd.com/document/63172312/Weighting-Techniques-in-Data-Compression-Theory-and-Algoritms.
- Quddus MA, Noland RB, Ochieng WY. A high accuracy fuzzy logic based map matching algorithm for road transport. J Intell Transp Syst. 2006;10(3):103–15.
- Greenfeld JS. Matching GPS observations to locations on a digital map. 81th annual meeting of the transportation research board. 2002. p. 164–73. https://www.researchgate.net/publication/246773761_Matching_GPS_Observations_to_Locations_on_a_Digital_Map.
- Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters. In: Proceedings of the 6th conference on symposium on operating systems design & implementation. San Francisco; 2004. p. 10. https://dl.acm.org/citation.cfm?id=1327492.
- Lammel R. Google’s MapReduce programming model - revisited. Sci Comput Program. 2008;70:1–30.
- Chang F, Dean J, Ghemawat S, Hsieh W, Wallach D, Burrows M, Chandra T, Fikes A, Gruber R. Bigtable: a distributed storage system for structured data. ACM Trans Comput Syst. 2008;26(2):1–26.
- Haklay M, Weber P. OpenStreetMap: user-generated street maps. IEEE Pervasive Comput. 2008;7(4):12–8.
- Ranade A. Mumbai Navigator. Indian J Transp Manag. 2005; https://www.cse.iitb.ac.in/~ranade/.
- Rousell A, Hahmann S, Bakillah M, Mobasheri A. Extraction of landmarks from OpenStreetMap for use in navigational instructions. In: Proceedings of the AGILE conference on geographic information science. Lisbon; 2015. p. 9–12. https://agile-online.org/conference_paper/cds/agile_2015/posters/57/57_Paper_in_PDF.pdf. https://www.researchgate.net/publication/278301149_Extraction_of_landmarks_from_OpenStreetMap_for_use_in_navigational_instructions.
- Zipf A, Mobasheri A, Rousell A, Hahmann S. Crowdsourcing for individual needs—The case of routing and navigation for mobility-impaired persons. In: Capineri C, Haklay M, Huang H, Antoniou V, Kettunen J, Ostermann F, Puves R, editors. European Handbook of Crowdsourced Geographic Information. London: Ubiquity Press; 2016. pp. 325–37.Google Scholar
- Mobasheri A, Sun Y, Loos L, Ali AL. Are crowdsourced datasets suitable for specialized routing services? Case study of OpenStreetMap for routing of people with limited mobility. Sustainability. 2017;9(6):997.
- Sun Y, Fan H, Bakillah M, Zipf A. Road-based travel recommendation using geo-tagged images. Comput Environ Urban Syst. 2015;53:110–22.
- Bakillah M, Liang SHL, Mobasheri A and Zipf A. Towards an efficient routing web processing service through capturing real-time road conditions from big data, 2013 5th Computer Science and Electronic Engineering Conference (CEEC), Colchester, 2013, pp. 152–5. https://doi.org/10.1109/CEEC.2013.6659463.
- Haworth B, Bruce E. A review of volunteered geographic information for disaster management. Geography Compass. 2015;9(5):237–50.
- Zook M, Graham M, Shelton T, Gorman S. Volunteered geographic information and crowdsourcing disaster relief: a case study of the Haitian earthquake. World Med Health Policy. 2010;2(2):7–33.
- Sun Y, Mobasheri A, Hu X, Wang W. Investigating impacts of environmental factors on the cycling behavior of bicycle-sharing users. Sustainability. 2017;9(6):1060.
- Ganeshan K, Sarda L, Gupta S. Developing IITB smart CampusGIS grid. In: A2CWiC '10 Proceedings of the 1st Amrita ACM-W celebration on women in computing in India. New York: ACM; 2010.
- Senaratne H, Mobasheri A, Ali AL, Capineri C, Haklay M. A review of volunteered geographic information quality assessment methods. Int J Geogr Inf Sci. 2017;31(1):139–67.
- Mobasheri A, Huang H, Degrossi LC, Zipf A. Enrichment of OpenStreetMap data completeness with sidewalk geometries using data mining techniques. Sensors. 2018;18(2):509.
- Mobasheri A. A rule-based spatial reasoning approach for OpenStreetMap data quality enrichment; case study of routing and navigation. Sensors. 2017;17(11):2498.
- Aberg J, Shtarkov Y. Text compression by context tree weighting. In: Proceedings data compression conference (DCC); 1997. p. 377–86.
- Willems F. The context-tree weighting method: extensions. IEEE Trans Inf Theory. 1998;44(2):792–8.
- Willems F, Shtarkov Y, Tjalling T. Context weighting for general finite-context sources. IEEE Trans Inf Theory. 1996;42(5):1514–20.
- Willems F. Coding for a binary independent piecewise-identically-distributed source. IEEE Trans Inf Theory. 1996;42(11):2210–7.
- Simmons R, Browing B, Yilu Z, Sadekar V. Learning to predict driver route and destination intent. In: Intelligent transportation systems conference; 2006.
- Burbey I, Martin TL. Predicting future locations using prediction-by-partial-match. In: Proc. 1st ACM MELT; 2008. p. 1–6.
- Tiwari VS, Chaturvedi S, Arya A. Route prediction using trip observations and map matching, 2013 3rd IEEE International Advance Computing Conference (IACC), Ghaziabad, 2013, pp. 583–7. https://doi.org/10.1109/IAdCC.2013.6514292.
- Lung HY, Chung CH, Dai B-R. Predicting locations of mobile users based on behavior semantic mining. In: Trends and applications in knowledge discovery and data mining, lecture notes in computer science, vol. 8643; 2014.
- Neto FDN, Baptista CDS, Campelo CEC. Prediction of destinations and routes in urban trips with automated identification of place types and stay points. In: Proc. Brazilian Symposium on Geoinformatics; 2015. p. 80–91.
- Amirat H, Lagraa N, Fournier Viger P, Ouinten Y. MyRoute: a graph-dependency based model for real-time route prediction. J Commun. 2017; https://doi.org/10.12720/jcm.12.12.668-676.
- Tiwari VS, Arya A. Horizontally scalable probabilistic generalized suffix tree (PGST) based route prediction using map data and GPS traces. Journal of Big Data. 2017;4:23. https://journalofbigdata.springeropen.com/articles/10.1186/s40537-017-0085-4.