ABSTRACT
The Internet of Things is predicted to consist of over 50 billion devices aiming to solve problems in most areas of our digital society. A large part of the data communicated is expected to consist of multimedia content, such as live audio and video. This article presents a solution for the communication of high-definition video in low-delay scenarios (< 200 ms) under the constraints of devices with limited hardware resources, such as the Raspberry Pi. We verify that it is possible to enable low-delay video streaming between Raspberry Pi devices using a distributed Internet of Things system called the Sensible Things platform.
Specifically, our implementation transfers a 6 Mbps H.264 video stream of 1280 × 720 pixels at 25 frames per second between devices with a total delay of 181 ms on the public Internet, of which the distributed Internet of Things communication platform accounts for only 18 ms. We have found that the most significant bottleneck of video transfer on limited Internet of Things devices is the video coding and not the distributed communication platform, since the video coding accounts for 90% of the total delay.
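The figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming that all delay other than the platform overhead is attributable to video encoding and decoding; the function names are illustrative and are not part of the authors' implementation.

```python
# Values taken from the abstract.
TOTAL_DELAY_MS = 181.0        # source-to-sink delay on the public Internet
PLATFORM_OVERHEAD_MS = 18.0   # share attributed to the communication platform
BITRATE_BPS = 6_000_000       # 6 Mbps H.264 stream
FRAME_RATE_FPS = 25


def delay_budget(total_ms, platform_ms):
    """Return (coding delay in ms, coding share of the total delay).

    Assumption: whatever is not platform overhead is video coding delay.
    """
    coding_ms = total_ms - platform_ms
    return coding_ms, coding_ms / total_ms


def mean_frame_bytes(bitrate_bps, fps):
    """Average number of bytes available per coded frame at a given bitrate."""
    return bitrate_bps / fps / 8


coding_ms, coding_share = delay_budget(TOTAL_DELAY_MS, PLATFORM_OVERHEAD_MS)
print(f"coding delay: {coding_ms:.0f} ms ({coding_share:.0%} of total)")
print(f"mean frame size: {mean_frame_bytes(BITRATE_BPS, FRAME_RATE_FPS):.0f} bytes")
```

Under this assumption the coding delay comes to 163 ms, about 90% of the total, and each frame averages roughly 30 kB at the stated bitrate and frame rate.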
MATERIALS AND METHODS
Our approach is based on sending H.264 baseline coded NAL units as P2P packets over the Sensible Things Platform using Raspberry Pi 2 model B devices with attached camera modules. We selected this particular hardware to verify that our approach is viable for typical IoT devices. The Sensible Things platform was chosen since it is an openly available middleware platform for creating distributed IoT applications, capable of very low delay communication.
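To illustrate what sending NAL units as individual packets involves, the sketch below splits an H.264 byte stream into NAL units. It assumes the encoder emits the Annex B byte stream format, in which each NAL unit is preceded by a 0x000001 or 0x00000001 start code; this is an assumption about the input rather than a detail confirmed by the text. The function names are hypothetical, and the step that hands each unit to the Sensible Things platform is omitted because its API is not described here.

```python
START_CODE = b"\x00\x00\x01"  # a 4-byte start code is this preceded by one 0x00


def split_nal_units(stream: bytes) -> list:
    """Split an Annex B byte stream into NAL units without their start codes."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        payload_start = pos + len(START_CODE)
        next_pos = stream.find(START_CODE, payload_start)
        end = len(stream) if next_pos == -1 else next_pos
        # Zero bytes before the next start code are padding or belong to a
        # 4-byte start code; a NAL unit itself never ends in 0x00.
        unit = stream[payload_start:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = next_pos
    return units


def nal_unit_type(unit: bytes) -> int:
    """The NAL unit type is stored in the low 5 bits of the first header byte."""
    return unit[0] & 0x1F


if __name__ == "__main__":
    # Tiny synthetic stream: SPS (type 7), PPS (type 8), IDR slice (type 5).
    demo = (b"\x00\x00\x00\x01\x67\xaa"
            b"\x00\x00\x01\x68\xbb"
            b"\x00\x00\x00\x01\x65\xcc\xdd")
    for nal in split_nal_units(demo):
        print(nal_unit_type(nal), len(nal))
```

Each returned unit could then be wrapped in one P2P packet, or fragmented if it exceeds the packet size limit of the transport.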
An overview of our implementation can be seen in Figure 1. It consists of a Raspberry Pi 2 model B device with an attached camera module as the video source, the Sensible Things platform, which communicates the video data in a P2P manner, and a second Raspberry Pi 2 model B device, which acts as the video sink and renders the video stream on a display connected via High-Definition Multimedia Interface (HDMI).
RESULTS
The recorded clock was then displayed on the second display (Samsung Sync Master SA450, Samsung, Seoul, South Korea) to be compared with the live clock. This comparison was possible because the two displays were recorded simultaneously with a 300 FPS camera and the recording was saved for later analysis. The complete system delay could then be calculated from the clock difference, which was read from the recorded still frames of the two screens. The resulting view of the two displays can be seen in Figure 2.
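The clock-comparison method described above reduces to a subtraction per still frame, followed by summary statistics. The sketch below assumes each still frame yields a pair of readings in milliseconds (live clock, clock shown on the second display); the function names and the example readings in the usage section are hypothetical and are not measurements from the study.

```python
import statistics


def frame_interval_ms(camera_fps):
    """Time between consecutive camera frames, i.e. the sampling resolution."""
    return 1000.0 / camera_fps


def delays_ms(readings):
    """Delay per still frame: live clock minus the clock seen through the system."""
    return [live - shown for live, shown in readings]


def summarize(readings):
    """Return (mean delay, sample standard deviation) in milliseconds."""
    delays = delays_ms(readings)
    return statistics.mean(delays), statistics.stdev(delays)


if __name__ == "__main__":
    # A 300 FPS camera samples the two displays about every 3.3 ms.
    print(f"camera resolution: {frame_interval_ms(300):.2f} ms")

    # Hypothetical readings for illustration only (live_ms, shown_ms).
    example = [(1000, 840), (2000, 1830), (3000, 2820)]
    mean_delay, spread = summarize(example)
    print(f"mean delay: {mean_delay} ms, standard deviation: {spread} ms")
```

In practice the precision of each reading is also limited by the refresh rate of the displays and the resolution of the on-screen clock, so the camera frame interval is only a lower bound on the measurement uncertainty.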
The receiving Raspberry Pi then fed the received data directly to the decoder and displayed it on the screen. See Figure 5 for an overview of how this measurement was set up. The measurements performed in this configuration showed that the total delay of encoding and decoding on two Raspberry Pi 2 model B devices with a network between them was on average 163 ms, with a standard deviation of 18.5 ms. This was very close to the previous measurement without the network, which indicates that the network communication itself does not add any significant overhead on a local gigabit network.
DISCUSSION
This article focused on the problem of communicating high-definition live multimedia for IoT applications in scenarios with low delay under the constraints of typical IoT devices and hardware. We showed that this is possible by sending H.264 NAL units over a P2P-based IoT communication system on a typical IoT device. This article has also shown that our approach satisfies the three stated requirements. It has a low source-to-sink delay, which was requirement 1.
We measured a 181 ms delay from source to sink when both the source and the sink are behind NAT networks. The transferred video was of high-definition quality, 1280 × 720 at 25 FPS, which was requirement 2. Finally, it satisfies requirement 3, because it was shown to work on a Raspberry Pi 2 model B device, which can be considered a typical IoT device with resource-constrained hardware. In conclusion, when using a fully distributed IoT system, 90% of the total delay is due to the encoding and decoding of the video.
Source: Mid Sweden University
Authors: Ulf Jennehag | Stefan Forsstrom | Federico V. Fiordigigli