The ctu/personal dataset (v. 2012-03-15)
Mobile phone records of Czech Ph.D. student Michal Ficek.
Contributed by Michal Ficek.
This dataset contains 142 days of mobile phone records (also known as Call Data Records) and a ground-truth movement description of Czech Ph.D. student Michal Ficek, stored by his own mobile terminal in 2010-2011.
details of the ctu/personal dataset (v. 2012-03-15)
- last modified
- reason for most recent change
the initial version
- release date
- date/time of measurement start
- date/time of measurement end
- network type
GSM (Global System for Mobile Communications)
- collection environment
This dataset contains 142 days of mobile phone records (also known as Call Data Records) and cell transitions (a ground-truth movement description) of Czech Ph.D. student Michal Ficek, stored by his own mobile terminal in 2010-2011. The dataset covers more than 99.99% of 142 days of mobile phone usage in the mobile networks of 8 different providers in 5 countries: the Czech Republic, the Slovak Republic, Germany, Austria and the USA.
- network configuration
The phone was serviced mostly by Vodafone Czech Republic, the user's home network, while in the Czech Republic. The network providers abroad were as follows: Orange (Slovakia), A1 Telekom (Austria), T-Mobile Deutschland, Vodafone D2 and O2 (Germany), and T-Mobile and AT&T (USA).
- data collection methodology
The source of the data is the user's own mobile phone, a Nokia E52. The publicly available LogExport application was used to record the time and type of communication events (voice, SMS, data). For cell-transition recording, the free CellTrack91 application was used. The coordinates of positions within the cells were obtained by translating the Cell-IDs to geographical coordinates through queries to the Google Location API, as described in our MASS paper.
The Cell Global Identity of the cell the mobile phone is attached to is only partially anonymized. The Mobile Country Code (MCC) and the Mobile Network Code (MNC) keep their original values, so that it is possible to distinguish in which country the mobile phone was present and which provider serviced it. The Location Area Code (LAC) and the Cell-ID are anonymized, in other words, renumbered according to the time of their first occurrence in the dataset. Such an approach does not limit the data usage but helps the mobile providers not to feel threatened by the exposure of Cell-IDs together with the approximate geographical coordinates of the cell. This geographical information, the longitude/latitude coordinates of a cell, is not anonymized and thus provides a way to reconstruct a ground-truth movement trajectory of the mobile phone.
The spatial accuracy of the data is typical for a cellular network. It depends on the cell size and thus varies from tens to hundreds of meters in urban areas to several kilometers in rural areas.
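The renumbering of LAC and Cell-ID values by time of first occurrence, as described above, can be illustrated with a minimal sketch. It is not the script actually used to prepare the dataset: the function name `renumber_by_first_occurrence`, the choice to start numbering at 1, and the sample Cell-IDs are all assumptions made for illustration.

```python
from typing import Hashable, Iterable


def renumber_by_first_occurrence(values: Iterable[Hashable]) -> list[int]:
    """Replace each value by the rank of its first appearance (1, 2, 3, ...).

    The input must already be in chronological order. Starting the
    numbering at 1 is an assumption, not something the dataset states.
    """
    mapping: dict[Hashable, int] = {}
    anonymized = []
    for value in values:
        if value not in mapping:
            # First time this identifier is seen: give it the next number.
            mapping[value] = len(mapping) + 1
        anonymized.append(mapping[value])
    return anonymized


# Made-up Cell-IDs, in the order the phone attached to the cells.
observed_cell_ids = [40211, 40211, 17, 40211, 9932, 17]
print(renumber_by_first_occurrence(observed_cell_ids))  # [1, 1, 2, 1, 3, 2]
```

A mapping of this kind keeps every cell transition distinguishable while hiding the operator's real identifiers, which is why the renumbering does not limit use of the data.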
- disruptions to data collection
There are only three gaps in the data, when the cell-tracking application was turned off by accident: from 02-Oct-2010 22:42:06 to 03-Oct-2010 07:58:04, from 05-Oct-2010 15:08:42 to 05-Oct-2010 15:22:42, and from 09-Oct-2010 13:40:18 to 09-Oct-2010 15:49:32. Otherwise, the mobile phone was never switched off during the measurement period, except when on board an aircraft in flight.
The positions within the cells were obtained by querying the Google Location API. In our MASS paper, we showed, by comparison with data obtained from a large, cooperating mobile network provider, that the accuracy of this approach comes close to the cellular network operator's own approximation of position inside a cell.
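For anyone who needs to mask the three cell-tracking gaps listed above, their lengths can be worked out with a short sketch. The timestamps are copied from this page; the names `FMT`, `GAPS` and `gap_durations` are illustrative, and the parsing assumes English month abbreviations (the default C locale).

```python
from datetime import datetime, timedelta

# Timestamp layout used on this page; %b assumes English month names.
FMT = "%d-%b-%Y %H:%M:%S"

# The three gaps in cell tracking, as listed in the dataset description.
GAPS = [
    ("02-Oct-2010 22:42:06", "03-Oct-2010 07:58:04"),
    ("05-Oct-2010 15:08:42", "05-Oct-2010 15:22:42"),
    ("09-Oct-2010 13:40:18", "09-Oct-2010 15:49:32"),
]


def gap_durations(gaps):
    """Return the length of each (start, end) gap as a timedelta."""
    return [
        datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
        for start, end in gaps
    ]


durations = gap_durations(GAPS)
total = sum(durations, timedelta())

for (start, end), length in zip(GAPS, durations):
    print(f"{start} -> {end}: {length}")
print("total:", total)
```

The individual gaps come to 9:15:58, 0:14:00 and 2:09:14, a total of 11:39:12 without cell-transition records.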
This dataset contains the following traceset:
Mobile phone records of Czech Ph.D. student Michal Ficek collected in 2010-2011.
quick access to download the traceset
- download the ficek_personal_communication.csv.gz (from the ctu/personal/mobile/2010 trace) file
- from a CRAWDAD mirror: US UK
size="24KB" type="gz" md5="be33b354956287a768fb5446594d5900"
- download the ficek_personal_movement.csv.gz (from the ctu/personal/mobile/2010 trace) file
- from a CRAWDAD mirror: US UK
size="320KB" type="gz" md5="6ce11990c64d107c7ef55c1c94eb223c"
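After downloading, the two files can be checked against the MD5 values listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_md5` are illustrative, and the local file path is whatever location the download was saved to.

```python
import hashlib

# Checksums exactly as listed on this page, keyed by file name.
EXPECTED_MD5 = {
    "ficek_personal_communication.csv.gz": "be33b354956287a768fb5446594d5900",
    "ficek_personal_movement.csv.gz": "6ce11990c64d107c7ef55c1c94eb223c",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole file never has to sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, name):
    """Compare a local file with the checksum listed for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `matches_listed_md5("ficek_personal_movement.csv.gz", "ficek_personal_movement.csv.gz")` should return `True` for an intact download saved under its original name in the current directory.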
- Michal Ficek
Czech Technical University in Prague
Technicka 2, 166 27, Prague, Czech Republic
how to cite this dataset
When writing a paper that uses CRAWDAD datasets, we would appreciate it if you could cite both the authors of the dataset and CRAWDAD itself, and identify the exact dataset using the appropriate version number. For this dataset, this citation would look like:
Michal Ficek, CRAWDAD dataset ctu/personal (v. 2012‑03‑15), downloaded from https://crawdad.org/ctu/personal/20120315, https://doi.org/10.15783/C7059S, Mar 2012.
We also provide bibliographic information in common citation formats below:
If you do not use the provided citation formats, please include a reference with the same information, as described in the CRAWDAD FAQ.