# Temporal Distribution of Traffic Series (Mobile Traffic Big Data)

Background:

Mobile instant messaging (MIM) services greatly facilitate personal and business communications, but they also consume substantial network resources and can affect network stability. It is therefore worthwhile to examine the traffic nature of MIM services (e.g., WeChat/Weixin) carefully, so that MIM-oriented protocols can be designed to mitigate their negative influence on cellular networks.

MIM Working Mechanisms:

MIM services, which rely solely on the mobile Internet to exchange information, have working mechanisms quite different from those of traditional short messaging services. One prominent difference is that traditional short messaging services, built on standard protocols, can conveniently deliver information in a timely manner and provide an “always-online” service. In the packet-switched domain of the mobile Internet, however, a TCP connection is released once it stays idle longer than the TCP inactivity timer. Therefore, as depicted in the picture above, besides transmitting (TX) and receiving (RX) normal packets after logging onto a server, MIM services commonly take advantage of keep-alive mechanisms: they periodically send packets that carry little information in order to maintain a long-lived TCP connection.
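The interplay between the inactivity timer and the keep-alive packets can be illustrated with a minimal sketch. It is a toy timeline model, not the actual protocol of any MIM service; the function names, the 120 s timer and the 60 s keep-alive period are all hypothetical values chosen for illustration.

```python
def connection_survives(packet_times, inactivity_timeout, horizon):
    """Return True if no idle gap in [0, horizon] reaches the inactivity timeout.

    packet_times: sorted times (seconds) at which any packet is sent or received.
    """
    last = 0.0
    for t in packet_times:
        if t - last >= inactivity_timeout:
            return False  # the idle connection would already have been released
        last = t
    return horizon - last < inactivity_timeout


def with_keep_alive(data_times, period, horizon):
    """Merge normal packet times with periodic keep-alive packet times."""
    heartbeats = []
    t = period
    while t <= horizon:
        heartbeats.append(t)
        t += period
    return sorted(set(data_times) | set(heartbeats))


# Hypothetical scenario: two user messages in ten minutes, 120 s inactivity timer.
data_times = [5.0, 400.0]
without_ka = connection_survives(data_times, 120.0, 600.0)
with_ka = connection_survives(with_keep_alive(data_times, 60.0, 600.0), 120.0, 600.0)
print(without_ka, with_ka)  # False True
```

In this toy timeline the connection is released during the long silence between the two user messages, whereas a keep-alive period shorter than the timer keeps every idle gap below the threshold.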

Hereinafter, a message refers to a series of packets exchanged at the application layer between the user equipment (UE) and the servers of the service provider. The messages delivered over each TCP connection thus constitute the fundamental elements of MIM services and are referred to as individual message-level (IML) traffic. By comparison, the messages transmitted through one base station (BS) accumulate and can be regarded, from a slightly more macroscopic perspective, as the aggregated traffic.

The Statistical Pattern and Inherited Methodology of MIM Services:

Compared with the geometric and exponential distribution functions recommended by 3GPP, power-law and lognormal distribution functions are more suitable for modeling the statistical pattern of message lengths and of the inter-arrival times of consecutive messages, respectively.
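As a rough illustration of these two models, the sketch below draws synthetic message lengths from a power-law (Pareto) distribution and inter-arrival times from a lognormal distribution, using only the Python standard library. The parameter values (`alpha=1.5`, `min_len=100.0`, `mu=2.0`, `sigma=1.0`) are arbitrary assumptions, not fitted values from the dataset.

```python
import itertools
import random


def sample_message_lengths(n, alpha, min_len, rng):
    """Power-law (Pareto) lengths: P(L > x) = (min_len / x) ** alpha for x >= min_len."""
    return [min_len * rng.paretovariate(alpha) for _ in range(n)]


def sample_inter_arrival_times(n, mu, sigma, rng):
    """Lognormal gaps: ln(gap) is Gaussian with mean mu and standard deviation sigma."""
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
lengths = sample_message_lengths(1000, alpha=1.5, min_len=100.0, rng=rng)
gaps = sample_inter_arrival_times(1000, mu=2.0, sigma=1.0, rng=rng)

# Arrival instants of consecutive messages are the running sum of the gaps.
arrival_times = list(itertools.accumulate(gaps))
print(len(lengths), min(lengths) >= 100.0, arrival_times[-1] > 0.0)
```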

Owing to their generality, $$\alpha$$-Stable models are the most suitable for characterizing the aggregated traffic in cellular networks. Together with previous findings in fixed broadband networks, this shows that $$\alpha$$-Stable models can accurately describe the aggregated traffic all the way from cellular access networks to core networks.
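For reference, an $$\alpha$$-Stable random variable generally has no closed-form density and is usually specified through its characteristic function. The text above does not state which parameterization is used, so the form below and the symbols $$\beta$$, $$\sigma$$ and $$\delta$$ are assumptions based on one common convention:

$$
\Phi(\omega) = \exp\left\{ j\delta\omega - |\sigma\omega|^{\alpha}\left[1 - j\beta\,\mathrm{sgn}(\omega)\,\Theta(\omega,\alpha)\right] \right\},
\qquad
\Theta(\omega,\alpha) =
\begin{cases}
\tan\dfrac{\pi\alpha}{2}, & \alpha \neq 1, \\
-\dfrac{2}{\pi}\ln|\omega|, & \alpha = 1.
\end{cases}
$$

Here $$0 < \alpha \le 2$$ is the characteristic exponent that controls the tail heaviness, $$\beta \in [-1, 1]$$ the skewness, $$\sigma > 0$$ the scale and $$\delta$$ the location. The Gaussian distribution is recovered as the special case $$\alpha = 2$$.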

In addition, according to the generalized central limit theorem, the aggregated traffic within one BS, which follows $$\alpha$$-Stable models, can be explained as the accumulation of a large number of power-law distributed messages.
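The accumulation argument can be explored with a small simulation. The sketch below sums power-law distributed message lengths within each interval of a single, hypothetical BS; the tail index `alpha=1.5`, the number of messages per interval and the number of intervals are arbitrary assumptions, and the code only illustrates the mechanism rather than reproducing the fitting procedure of the referenced papers.

```python
import random


def aggregated_traffic(n_intervals, messages_per_interval, alpha, rng):
    """Sum power-law message lengths within each interval of one hypothetical BS."""
    return [
        sum(rng.paretovariate(alpha) for _ in range(messages_per_interval))
        for _ in range(n_intervals)
    ]


rng = random.Random(1)  # fixed seed so the sketch is reproducible
series = aggregated_traffic(n_intervals=2000, messages_per_interval=200, alpha=1.5, rng=rng)

ordered = sorted(series)
median = ordered[len(ordered) // 2]
print(f"median = {median:.1f}, max = {ordered[-1]:.1f}")
```

When the tail index is below 2, the individual messages have infinite variance, so the largest interval sums tend to lie far above the median. This heavy-tailed behaviour is what an $$\alpha$$-Stable limit allows for and a Gaussian limit does not.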

In addition, we have investigated and characterized various kinds of traffic in wireless cellular networks, based on a large amount of measured traffic data. In particular, our dataset consists of a large number of practical traffic records from one of the biggest cellular operators in an eastern provincial capital in China. The records originate from nearly 10000 BSs and involve more than 10 million subscribers. Each traffic record has a time resolution of 5 minutes and includes the timestamp, location area code (LAC), cell ID, application name and the corresponding volume of data traffic.
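The record layout described above could be represented as follows. This is a minimal sketch: the class name, field names, byte unit and sample values are all made up for illustration, since the actual storage format of the dataset is not given here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TrafficRecord:
    timestamp: datetime  # start of the 5-minute interval
    lac: int             # location area code
    cell_id: int
    application: str
    volume_bytes: int    # unit is an assumption; the text only says "volume"


def series_per_cell(records, application):
    """Group one application's volumes by (LAC, cell ID), ordered by timestamp."""
    grouped = defaultdict(list)
    for r in sorted(records, key=lambda r: r.timestamp):
        if r.application == application:
            grouped[(r.lac, r.cell_id)].append(r.volume_bytes)
    return dict(grouped)


# Made-up sample records, not taken from the dataset.
records = [
    TrafficRecord(datetime(2015, 1, 1, 0, 0), 1, 10, "WeChat", 1200),
    TrafficRecord(datetime(2015, 1, 1, 0, 5), 1, 10, "WeChat", 800),
    TrafficRecord(datetime(2015, 1, 1, 0, 0), 1, 10, "HTTP", 5000),
    TrafficRecord(datetime(2015, 1, 1, 0, 0), 1, 11, "WeChat", 300),
]
wechat = series_per_cell(records, "WeChat")
print(wechat)  # {(1, 10): [1200, 800], (1, 11): [300]}
```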

Concretely, WeChat/Weixin, HTTP web browsing and QQLive Video are selected as the representatives of three typical types of mobile service, namely IM, web browsing and video, respectively. In particular, WeChat/Weixin is a rapidly growing social IM service that allows over 600 million mobile users, in China and around the world, to exchange text messages and multimedia files such as voice clips, pictures and videos via smartphones. The summary information on the mobile traffic dataset under study is listed in the following Table and Figures (e.g., Traffic time series of different mobile service types during one day).

Remark 1. Application-level cellular data traffic series for IM, Web Browsing and video service appear bursty across a long range of time scales. The burstiness remains significant as the time scale increases.

A burst commonly implies a sharp increase in the volume of information exchange within seconds, which is potentially associated with unexpected events or concentrated human activities. Bursty phenomena are generally believed to be prominent in cellular data traffic series, which are closely related to people’s daily life. In this section, we take a brief look at the burstiness of application-level cellular data traffic at different time scales and validate this intrinsic characteristic.
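One simple way to look at burstiness across time scales is to aggregate the series over blocks of increasing length and compare a burstiness indicator at each scale. The sketch below uses the peak-to-mean ratio, which is an illustrative choice and not necessarily the measure used in the referenced papers; the series is made-up toy data.

```python
def aggregate(series, m):
    """Sum non-overlapping blocks of m consecutive samples (incomplete tail dropped)."""
    return [sum(series[i:i + m]) for i in range(0, len(series) - m + 1, m)]


def peak_to_mean(series):
    """Ratio of the largest sample to the average sample."""
    return max(series) / (sum(series) / len(series))


# Made-up 5-minute volumes with two bursts, not taken from the dataset.
series = [1, 1, 1, 9, 1, 1, 1, 1, 1, 1, 1, 9]
ratios = {m: round(peak_to_mean(aggregate(series, m)), 2) for m in (1, 2, 4)}
print(ratios)  # {1: 3.86, 2: 2.14, 4: 1.29}
```

A ratio that stays well above 1 as the block length grows indicates that the bursts are not simply averaged out at coarser time scales.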

Remark 2. Self-similarity exists widely in application-level cellular data traffic for IM, web browsing, and video services. Specifically, for IM and web browsing services, most traffic series exhibit a moderate degree of self-similarity, while the video service shows weaker self-similarity than the other two services under study.

The degree of self-similarity is commonly measured by the Hurst parameter H, which for self-similar traffic ranges from 0.5 to 1.0 and is positively correlated with the degree of self-similarity. That is to say, H = 0.5 indicates a lack of self-similarity, whereas a large value of H (i.e., close to 1.0) indicates a high degree of self-similarity. Graphical methods such as the variance-time plot and the R/S plot are generally used to test for self-similarity (see the following Figures).
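The variance-time method mentioned above can be sketched in a few lines. For a self-similar series, the variance of the block-averaged series decays as $$m^{2H-2}$$ with the block size $$m$$, so H can be read off the slope of a log-log fit. The implementation below is a minimal standard-library sketch with hypothetical function names, tested on synthetic white noise rather than on the traffic dataset.

```python
import math
import random


def aggregate_mean(series, m):
    """Average non-overlapping blocks of m samples (incomplete tail dropped)."""
    return [sum(series[i:i + m]) / m for i in range(0, len(series) - m + 1, m)]


def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def hurst_from_variances(block_sizes, variances):
    """Least-squares slope of log(variance) versus log(m); H = 1 + slope / 2."""
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(v) for v in variances]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return 1.0 + (num / den) / 2.0


def hurst_variance_time(series, block_sizes):
    """Estimate H from the variances of the block-averaged series."""
    variances = [variance(aggregate_mean(series, m)) for m in block_sizes]
    return hurst_from_variances(block_sizes, variances)


# Uncorrelated Gaussian noise has no self-similarity, so the estimate should be near 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64])
print(round(h_noise, 2))
```

The estimate is only as good as the linear fit, so in practice the plot itself is usually inspected before the slope is trusted.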

Remark 3. Judging by the small fitting errors, $$\alpha$$-Stable models are suitable for characterizing not only MIM (WeChat/Weixin) traffic but also all the other application-level data traffic in cellular networks.

In summary, we have demonstrated that burstiness and self-similarity exist universally in social mobile data traffic series and are of great significance. To capture these characteristics, the $$\alpha$$-Stable distribution has been adopted to model the traffic series. The small fitting errors for different service types verify the validity of $$\alpha$$-Stable models, and the estimated parameters reflect the characteristics of the traffic series well.

Related references:

1. Rongpeng Li, Zhifeng Zhao, Chen Qi, Xuan Zhou, Yifan Zhou, and Honggang Zhang, “Understanding the Traffic Nature of Mobile Instantaneous Messaging in Cellular Networks: A Revisiting to $$\alpha$$-Stable Models,” IEEE Access, vol. 3, pp. 1416-1422, 2015.

2. Rongpeng Li, Zhifeng Zhao, Jianchao Zheng, Chengli Mei, Yueming Cai, and Honggang Zhang, “The Learning and Prediction of Application-level Traffic Data in Cellular Networks,” IEEE Trans. Wireless Communications, March 2017.

3. Zhifeng Zhao, Meng Li, Rongpeng Li, and Yifan Zhou, “Temporal-Spatial Distribution Nature of Traffic and Base Stations in Cellular Networks,” IET Communications, August 2017.

4. Chen Qi, Zhifeng Zhao, Rongpeng Li, and Honggang Zhang, “Characterizing and Modeling Social Mobile Data Traffic in Cellular Networks,” 2016 IEEE 83rd Vehicular Technology Conference (VTC-Spring 2016), Nanjing, May 2016.