Biometric Mapping III

This is Part 3 in a series of updates on the Biometric Mapping project. See Part I and Part II.

Biometric Mapping is a workflow that creates heat maps of the unseen qualities of spaces. For example: environmental data such as CO2 concentration, temperature, and humidity, or biometric data such as an occupant's heart rate, blood oxygen level, and neural activity.

This project is an attempt to expand the scope, precision, and volume of data we can get out of our pre- and post-occupancy program. We are currently limited to anecdotes, surveys, energy monitoring, and on-site measurement with hand-held devices. By pairing data-collecting sensors with a timestamp and indoor positioning device, we can visualize this data spatially.

Previous prototypes of this workflow used GPS as the positioning device to match a data point with a location, but GPS is too imprecise for mapping interior spaces. The challenge then is to replace GPS with a practical positioning method. The following methods were considered as alternatives. 

Indoor positioning with a WiFi signal is feasible, but it requires placing beacons within a space before every survey. The ideal positioning system would stand alone as its own device and not require any setup work.

SLAM is an acronym for Simultaneous Localization And Mapping. It's a method used in robotics and autonomous vehicles in which a moving object uses sensors such as cameras, radar, or lidar (like radar, but with lasers) to map its surroundings and understand how it is moving relative to them. The choice between lidar and a camera is a choice between investing more heavily in hardware or in software: lidar requires more specialized hardware but less work to process the data, and vice versa. Lidar is also more expensive and not as universally available as video on a mobile phone, which makes vision-based SLAM the more practical option, if not the easier one.

A stationary array of data-collecting sensors bypasses active positioning altogether, though the trade-off is the cost of installation and a limited collection area. Finally, there is a human-powered compromise: manually marking one's location on a printed plan or touchscreen while the data is collected and timestamped separately.

Vision-based SLAM (loosely synonymous with Structure from Motion (SfM) or photogrammetric range imaging) is perhaps the most challenging option, especially without a background in robotics or computer vision, but it is the most promising given the ease and ubiquity of collecting data from a mobile phone. There is also reason to be more optimistic about upcoming advances in software than in hardware: many open-source visual SLAM packages have become available in recent years and are likely to keep improving.

Our toolkit for collecting spatial data pre- or post-occupancy can be as robust as we want it to be, but each device comes with a price tag. We are currently limited both by cost and by the amount of data we can collect. Most of these devices are not inexpensive, readily available, or discreet, and they do not include built-in geolocation and timestamping. The only way to gain valuable insights from spatial data collection is to collect a lot of it, which requires an easy workflow built on universally available devices.

The goal is to get the most data with as few devices as possible. The most versatile device is the mobile phone. A single video collects data such as sound (decibel level, sound recognition), image (color, light, vegetation, face, and object recognition), movement (from the built-in gyroscope and accelerometer), rough global positioning from GPS, and more precise spatial localization and mapping from computer vision techniques that turn multiple video frames into a 3D point cloud of the space.
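To give a sense of how much of that is within reach, here is a minimal sketch, assuming OpenCV (cv2) is installed and using a hypothetical filename, that pulls a per-frame timestamp, average color, and a rough brightness value out of a video. It isn't part of the current workflow, just an illustration of what a single recording contains.

```python
# Minimal sketch: per-frame timestamp, average color, and rough brightness
# from a phone video. Assumes OpenCV is installed; "walkthrough.mp4" is a
# hypothetical filename.
import cv2

cap = cv2.VideoCapture("walkthrough.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)

frame_index = 0
samples = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    timestamp_s = frame_index / fps        # seconds from the start of the video
    b, g, r = frame.mean(axis=(0, 1))      # average color of the frame (BGR order)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    brightness = gray.mean()               # rough proxy for light level
    samples.append((timestamp_s, r, g, b, brightness))
    frame_index += 1

cap.release()
```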

For this test, I have three devices: a mobile phone for video, an Arduino microcontroller for light and temperature, and a Fitbit watch recording heart rate. Each device has a universal timestamp so they can be synchronized later. The Arduino starts recording data as soon as I plug it into the battery pack, and then I start recording video.
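As an illustration of what "synchronized later" means, a minimal Python sketch (not the actual workflow, which runs in Grasshopper) might align the logs by nearest timestamp. It assumes each device's data has been exported to CSV on a shared clock, and that the camera path is the one recovered from the video further below; the filenames, column names, and 5-second tolerance are all assumptions.

```python
# Minimal sketch of synchronizing the three data streams by timestamp.
# Filenames and column names are hypothetical.
import pandas as pd

arduino = pd.read_csv("arduino_log.csv", parse_dates=["timestamp"])   # light, temperature
fitbit = pd.read_csv("fitbit_hr.csv", parse_dates=["timestamp"])      # heart_rate
camera = pd.read_csv("camera_path.csv", parse_dates=["timestamp"])    # x, y, z from the reconstruction

# For each camera position, take the nearest sensor reading in time,
# tolerating a few seconds of clock drift between devices.
merged = pd.merge_asof(camera.sort_values("timestamp"),
                       arduino.sort_values("timestamp"),
                       on="timestamp", direction="nearest",
                       tolerance=pd.Timedelta("5s"))
merged = pd.merge_asof(merged,
                       fitbit.sort_values("timestamp"),
                       on="timestamp", direction="nearest",
                       tolerance=pd.Timedelta("5s"))

merged.to_csv("synchronized_points.csv", index=False)
```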

I'm using Structure-from-Motion software called COLMAP 3.4, a pipeline with a graphical interface that takes photos as input and outputs a 3D point cloud. COLMAP has three major processes:

1) Each frame of the video is analyzed to determine the focal length, dimensions, and other properties of the image.
2) Unique features are identified in each frame and matched with unique features in other frames.
3) When the same unique feature appears in multiple frames, its position is triangulated and a point is created with a position and color value.

The reconstruction process continues from there, occasionally correcting for errors as the point cloud is solidified.
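These same steps can also be driven from a script through COLMAP's command-line interface instead of the GUI, which may matter later for automating the pipeline. A rough sketch, with hypothetical project paths, and with the sequential matcher chosen because video frames arrive in order:

```python
# Rough sketch of running the COLMAP pipeline from a script instead of the GUI.
# Paths are hypothetical; assumes the video has already been split into frames.
import subprocess

def run(args):
    subprocess.run(args, check=True)

# 1) Analyze each frame and extract unique features.
run(["colmap", "feature_extractor",
     "--database_path", "project/database.db",
     "--image_path", "project/frames"])

# 2) Match features between frames. The sequential matcher suits video,
#    where neighboring frames overlap the most.
run(["colmap", "sequential_matcher",
     "--database_path", "project/database.db"])

# 3) Triangulate matched features into a sparse point cloud and camera path.
run(["colmap", "mapper",
     "--database_path", "project/database.db",
     "--image_path", "project/frames",
     "--output_path", "project/sparse"])
```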

The reconstruction works best when the video frames are in focus, and the object or space is photographed from several vantage points.

As of now, the reconstruction in COLMAP is the most time-consuming part of the process. The following test at Confluence Park in San Antonio used a 10-minute video with about 4,000 individual frames. It took a few hours to match unique features, and at least 24 hours for a complete reconstruction of about 300,000 points. Of course, 4,000 frames is far more than necessary to identify the path of the camera, which is all I need to construct the map of data points. Even so, the current workflow with COLMAP is a bottleneck that needs to be replaced.
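One easy mitigation in the meantime is to subsample the video before reconstruction so COLMAP matches a few hundred frames instead of a few thousand. A minimal sketch, where the filename, output folder, and sampling interval are all assumptions:

```python
# Minimal sketch: keep every Nth frame of the video so the reconstruction has
# far fewer images to match. Paths and the sampling interval are assumptions.
import os
import cv2

VIDEO = "confluence_walkthrough.mp4"   # hypothetical filename
OUT_DIR = "project/frames"
EVERY_NTH = 10                         # roughly 400 frames instead of 4,000

os.makedirs(OUT_DIR, exist_ok=True)
cap = cv2.VideoCapture(VIDEO)

index = 0
saved = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if index % EVERY_NTH == 0:
        cv2.imwrite(os.path.join(OUT_DIR, f"frame_{saved:05d}.jpg"), frame)
        saved += 1
    index += 1

cap.release()
print(f"Saved {saved} of {index} frames")
```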

When the reconstruction is done, I export the point cloud and camera path data and process it in Grasshopper. This script synchronizes the location data with the corresponding light, temperature, and heart rate data and produces a heat map of each metric. The area of the map outside the path is interpolated with a non-linear regression model. I'm forgoing the Proving Ground Lunchbox ML regression components I used in previous scripts in favor of a simpler model that runs faster.
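For readers outside of Grasshopper, a loose Python analogue of that interpolation step might look like the sketch below. The k-nearest-neighbors regressor here is my stand-in for a simple, fast model, not necessarily what the Grasshopper script uses, and the file and column names are hypothetical.

```python
# Loose analogue of the heat-map interpolation step (the real script runs in
# Grasshopper). Fits a simple regressor on the sampled path positions and
# evaluates it over a grid to fill in the area the path never visited.
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

points = pd.read_csv("synchronized_points.csv")   # x, y, temperature, ... (hypothetical)

# Fit a simple, fast regressor on the sampled positions.
model = KNeighborsRegressor(n_neighbors=5, weights="distance")
model.fit(points[["x", "y"]].values, points["temperature"].values)

# Evaluate over a regular grid covering the site to produce the heat map.
xs = np.linspace(points["x"].min(), points["x"].max(), 200)
ys = np.linspace(points["y"].min(), points["y"].max(), 200)
grid_x, grid_y = np.meshgrid(xs, ys)
grid = np.column_stack([grid_x.ravel(), grid_y.ravel()])
heat = model.predict(grid).reshape(grid_x.shape)   # values behind the heat map
```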

There were no surprises with the daylight map, which ranges from about 1,000 lux in blue to about 100,000 lux in red. North is to the right on this map.

The temperature map produced a nice gradient from about 90 degrees on the sunlit, paved north side, to about 80 degrees on the vegetated south side.

My heart rate fluctuated between 70 and 90 bpm over the 10-minute survey. While biometrics are some of the most compelling data to be collected and mapped, they are also the most variable. A large sample size of biometric data is necessary before any real correlation can be drawn between the spaces we inhabit and our momentary biological response.

My next steps are to streamline the SLAM workflow to get it as close to real-time processing as possible. Volume of data is critical for this kind of work, so it needs to be processed quickly and easily. I might expand the types of biometrics being collected, for example adding galvanic skin response sensors to approximate stress levels. There is also a lot of data to be mined from the video itself, such as color, sound, and object recognition.

Other fun experiments might be recording video with a 360-degree camera for more efficient mapping, and viewing the model and map in VR.