WIFI Contact Prediction

(Wojtusiak, Wang, Vakkalagadda, Durbha, Alemi, Roess)

Our team is studying how to use WIFI infrastructure data to predict potential contacts between people. The COVID-19 pandemic made everyone realize that public health requires being able to use existing datasets. In this specific case we use data that are routinely collected by IT departments as part of network maintenance: access logs for the WIFI networks. Such logs can tell us who was at a specific location at a given time. Then, when needed, the data can be used to backtrack who else was there. This can help in activities such as contact tracing during a disease outbreak.

The key part of the problem is that WIFI data are not very accurate. Even though there may be 100+ WIFI access points in a large building, each of them covers many rooms. This does not help in pinpointing a person’s exact location. Working around that limitation is what this research is about.

It is also important to note that there is a lot of published work on increasing the accuracy of WIFI location tracking. Our goal is to ignore most of it: our intention is not to use specialized apps on cell phones (to triangulate location), install additional scanning devices, or modify network infrastructure in any way. Such approaches are costly and often infeasible. Our intention is to use only data that are collected by the networks by default, and this is essentially just access logs. In addition, if available, we also use floorplan information and known locations of access points. Experimental results show that knowing floorplans significantly increases the performance of the method.

Here is an example of how a person enters a building and moves around. The short animation is based on the real movements of one of our study participants. It shows when the person’s cell phone connects to access points in different places.

How does it work?

First, floorplans are mapped to graphs. This is a relatively easy process, although it takes some time. Mapping one large 500+ room building may take a few hours. The graph is saved as a list of connected rooms and is used later by the prediction algorithm.
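
As a minimal sketch of what such a saved graph could look like, the Python snippet below builds an adjacency map from a list of connected room pairs. The room names and the `build_floorplan_graph` helper are hypothetical and made up for illustration; the only thing taken from the description above is that the graph is stored as a list of connected rooms.

```python
from collections import defaultdict

# Hypothetical room pairs: each tuple means a person can walk directly
# between the two spaces (through a door or an open passage).
CONNECTED_ROOMS = [
    ("lobby", "hall_1"),
    ("hall_1", "room_101"),
    ("hall_1", "room_102"),
    ("hall_1", "stairs"),
    ("stairs", "hall_2"),
    ("hall_2", "room_201"),
]


def build_floorplan_graph(pairs):
    """Turn a list of connected room pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


floorplan = build_floorplan_graph(CONNECTED_ROOMS)
print(sorted(floorplan["hall_1"]))
# prints ['lobby', 'room_101', 'room_102', 'stairs']
```

Storing neighbors as sets keeps the lookup "which rooms can be reached from here in one step" cheap, which is the question a movement predictor is likely to ask most often.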

Then, WIFI log data are extracted. This needs to be done by network support. Many enterprise networks can be configured to export logs automatically on a scheduled basis (we suggest daily for the purpose of contact prediction). Log data essentially include information about who connected to which access point and for how long.
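
To make the shape of such log data concrete, here is a small Python sketch that parses a log export into connection sessions. The CSV layout, the column names (`device_id`, `ap_id`, `connected_at`, `disconnected_at`), and the sample rows are all assumptions made for illustration; real exports differ between network vendors.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Hypothetical CSV export; real column names depend on the network vendor.
SAMPLE_LOG = """device_id,ap_id,connected_at,disconnected_at
dev_a,ap_1,2021-03-01T09:00:00,2021-03-01T09:20:00
dev_b,ap_1,2021-03-01T09:10:00,2021-03-01T09:45:00
dev_a,ap_2,2021-03-01T09:21:00,2021-03-01T10:00:00
"""


@dataclass(frozen=True)
class Session:
    """One connection: who, which access point, and for how long."""
    device_id: str
    ap_id: str
    start: datetime
    end: datetime

    @property
    def minutes(self):
        return (self.end - self.start).total_seconds() / 60


def parse_log(text):
    """Read CSV text and return one Session per row."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Session(
            row["device_id"],
            row["ap_id"],
            datetime.fromisoformat(row["connected_at"]),
            datetime.fromisoformat(row["disconnected_at"]),
        )
        for row in reader
    ]


sessions = parse_log(SAMPLE_LOG)
for s in sessions:
    print(s.device_id, s.ap_id, s.minutes)
```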

Once all the data are in, the prediction can start. The method first predicts where a person is likely to be and for how long. It is all probabilistic because WIFI data are not very precise. The method assigns probabilities to locations within access point (AP) range, and then predicts how a person moved between these locations. People don’t disappear in one place and magically appear in another. Our algorithm predicts which routes the person is likely to have taken from one location to another.
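
The two ideas in this step can be illustrated with a short Python sketch. It is not the method from the report: it assumes, for simplicity, that every room within an AP's range is equally likely, and it uses a plain breadth-first search to find a route with the fewest room transitions. The floorplan, the AP coverage table, and both function names are hypothetical.

```python
from collections import deque

# Hypothetical floorplan graph and AP coverage, for illustration only.
FLOORPLAN = {
    "lobby": {"hall_1"},
    "hall_1": {"lobby", "room_101", "room_102", "stairs"},
    "room_101": {"hall_1"},
    "room_102": {"hall_1"},
    "stairs": {"hall_1", "hall_2"},
    "hall_2": {"stairs", "room_201"},
    "room_201": {"hall_2"},
}
AP_COVERAGE = {
    "ap_1": ["lobby", "hall_1", "room_101", "room_102"],
    "ap_2": ["hall_2", "room_201"],
}


def location_probabilities(ap_id):
    """Assumption: every room in range of the AP is equally likely."""
    rooms = AP_COVERAGE[ap_id]
    return {room: 1.0 / len(rooms) for room in rooms}


def shortest_route(graph, start, goal):
    """Breadth-first search for a route with the fewest room transitions."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        if room == goal:
            # Walk the predecessor links back to the start.
            route = []
            while room is not None:
                route.append(room)
                room = previous[room]
            return route[::-1]
        for neighbor in sorted(graph[room]):
            if neighbor not in previous:
                previous[neighbor] = room
                queue.append(neighbor)
    return None  # no walkable connection between the two rooms


print(location_probabilities("ap_1"))
print(shortest_route(FLOORPLAN, "room_101", "room_201"))
# prints ['room_101', 'hall_1', 'stairs', 'hall_2', 'room_201']
```

Because the route is constrained to the floorplan graph, a person who moves from `ap_1` to `ap_2` is placed in the hallway and on the stairs along the way, rather than jumping between rooms.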

Finally, once we predict who was where, we can predict who was in contact with whom. We calculate a contact score. In its simplest form, it is the likelihood that two people were in contact multiplied by the duration of that contact. More complex calculations take into consideration other factors, such as the type of location.
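
The simplest form of the score can be sketched in a few lines of Python. How the contact likelihood itself is obtained is an assumption here: the sketch treats the two people's room probabilities as independent and sums the probability of both being in the same room. The function names and the sample numbers are hypothetical.

```python
def co_location_probability(probs_a, probs_b):
    """Probability both people are in the same room, assuming independence."""
    shared = set(probs_a) & set(probs_b)
    return sum(probs_a[room] * probs_b[room] for room in shared)


def overlap_minutes(start_a, end_a, start_b, end_b):
    """Length of time both intervals overlap; zero if they do not."""
    return max(0.0, min(end_a, end_b) - max(start_a, start_b))


def contact_score(probs_a, probs_b, interval_a, interval_b):
    """Simplest form: contact likelihood multiplied by contact duration."""
    likelihood = co_location_probability(probs_a, probs_b)
    duration = overlap_minutes(*interval_a, *interval_b)
    return likelihood * duration


# Hypothetical inputs: room probabilities and (start, end) times in minutes.
person_a = {"room_101": 0.5, "hall_1": 0.5}
person_b = {"room_101": 0.25, "room_102": 0.75}
score = contact_score(person_a, person_b, (0.0, 20.0), (10.0, 45.0))
print(score)  # 0.125 likelihood * 10 minutes of overlap = 1.25
```

A factor for the type of location could be added as an extra multiplier per room, but the weights would have to come from the study itself.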

This all sounds easy, but there are a lot of technical details to work on. The report below shows some of our initial investigation.

This work would not be possible without the sponsorship of the National Cancer Institute through a contract to Vibrent Health and our group. The movement data were collected thanks to the help of study participants, who put a lot of effort into walking around the building and recording locations.

References

Wojtusiak, J., Vakkalagadda, V., Wang, Y., Durbha, S., Alemi, F. and Roess, A., “Towards Wi-Fi Contact Prediction: Methods and Initial Results,” Reports of the Machine Learning and Inference Laboratory, MLI 21-1, 2021.