SVDS

Trainspotting: real-time detection of a train’s passing from video

PyCon 2016

Chloe Mawer | @chloemawer | chloe@svds.com

CaltrainHeader

The Caltrain obsession

AppScreenShots
Tweet



Window

Did the train really leave?




Agenda

Window

OpenCV

Motion detection in Python: OpenCV

  • Written in optimized C/C++.
  • C++, C, Python and Java interfaces.
  • Windows, Linux, Mac OS, iOS and Android.
  • Free for academic and commercial use.
  • Can use multi-core processing.
  • Designed for computational efficiency with a strong focus on real-time applications.
  • To install, use Anaconda!
    • conda install -c https://conda.binstar.org/menpo opencv3

If you are not obsessed with trains...

  • Many other applications
  • Following along

The original video

In [ ]:
for j, frame in enumerate(original):
    
    cv2.imshow('Original video', frame)
    
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Original video')

Key

  • frame: An image represented by an M x N x 3 numpy ndarray.
  • original: A list containing a series of frames that make up the video.
  • cv2.imshow(): Displays an input frame.
  • cv2.waitKey(): Used to introduce delay between frames.
  • cv2.destroyWindow(): Will close a given window.
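The `original` list is never built in these slides; with OpenCV one would fill it from a file via `cv2.VideoCapture`. As a minimal, self-contained sketch of the data structure, with synthetic numpy frames standing in for a decoded video:

```python
import numpy as np

# Synthetic stand-in for a decoded video: a list of M x N x 3
# uint8 BGR frames, the same structure `original` holds above.
M, N, n_frames = 120, 160, 10
original = [np.zeros((M, N, 3), dtype=np.uint8) for _ in range(n_frames)]

# A "moving" white square, shifting right by 5 pixels per frame
for t, frame in enumerate(original):
    frame[40:60, 10 + 5 * t:30 + 5 * t] = 255
```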

The original video

If you are not obsessed with trains...

Get immediate feedback from your computer's camera!

In [ ]:
feed = cv2.VideoCapture(0) 

The following will read and display the feed in a window called 'Me' until you press q to quit:

In [ ]:
# feed.read() returns (ret, frame); ret is False when there is
# no frame to grab (e.g. camera not working)
while True:
    # Grab the current frame; stop if the camera fails
    ret, current_frame = feed.read()
    if not ret:
        break
    
    # Show the current frame in a window called "Me"
    cv2.imshow('Me', current_frame) 
    
    # Pauses to make computer time = real time
    # and allows pressing "q" on the keyboard
    # to break the loop
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

You must close the window with the following command when done:

In [ ]:
cv2.destroyWindow('Me')

To discontinue the feed capturing video, you must release it:

In [ ]:
feed.release()

Steps for train detection

  1. Is something moving?
  2. Is that moving thing a train?
  3. In what direction is it moving?

1. Is something moving?

A. Develop a model of the background.
B. Identify pixels in the current frame that do not match the background.

A. Develop a model of the background

Convert to one channel

In [ ]:
for j,frame in enumerate(original):
    
    # Convert the frame to one channel (e.g. gray scale)
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    cv2.imshow('Gray scale',gray_frame)
    
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Gray scale')

Convert to one channel

Computer camera

In [ ]:
while True:
    current_frame = feed.read()[1]
    gray_frame = cv2.cvtColor(current_frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('Gray scale',gray_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Gray scale')

Smooth the image

Gaussian blur
2D Gaussian kernel

\begin{equation} G(x,y) = Ae^{-\left(\frac{(x-\mu_x)^2}{2\sigma_x^{2}}+\frac{(y-\mu_y)^2}{2\sigma_y^{2}}\right)} \end{equation}

Implement in Python:

smooth_frame = cv2.GaussianBlur(gray_frame, (kx, ky), 0)
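The kernel in the equation above can also be evaluated directly; a minimal numpy sketch with A = 1, equal sigmas, and the mean at the kernel center (`gaussian_kernel_2d` is a hypothetical helper, not an OpenCV function):

```python
import numpy as np

def gaussian_kernel_2d(k, sigma):
    """Unnormalized 2D Gaussian evaluated on a k x k grid centered
    at the mean, matching G(x, y) above with A = 1 and
    sigma_x = sigma_y = sigma."""
    ax = np.arange(k) - (k - 1) / 2.0          # x - mu_x (and y - mu_y)
    xx, yy = np.meshgrid(ax, ax)
    return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))

G = gaussian_kernel_2d(31, 5.0)
```

The kernel peaks at its center and falls off symmetrically, which is why blurring with it averages each pixel most heavily with its nearest neighbors.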

Smooth the image

Smooth images

Smooth the image

In [ ]:
k = 31 # Define Gaussian kernel size

for j, frame in enumerate(original):
    
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    # Apply a Gaussian blur to the gray scale frame 
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    
    cv2.imshow('Smooth', smooth_frame)

    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break
        
cv2.destroyWindow('Smooth')

Computer camera

In [ ]:
k = 31
while True:
    current_frame = feed.read()[1]
    gray_frame = cv2.cvtColor(current_frame, cv2.COLOR_BGR2GRAY)
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    cv2.imshow('Smooth',smooth_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break
In [ ]:
cv2.destroyWindow('Smooth')

Smooth the image

Calculate the running average

At each time step, t, calculate at each pixel (x,y): \begin{equation} \mathbf{R}(x,y,t) = (1-\alpha)\mathbf{R}(x,y,t-1) + \alpha\mathbf{F}(x,y,t) \end{equation}

  • $\mathbf{R}:$ Running average
  • $\mathbf{F}:$ Frame being added to the running average
  • $\alpha:$ How heavily to weight the new frame

Implement in Python:

cv2.accumulateWeighted(np.float32(smooth_frame), running_avg, alpha)
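`cv2.accumulateWeighted` updates `running_avg` in place; the recurrence itself is simple enough to check in plain numpy (a sketch of the math above, not OpenCV's implementation):

```python
import numpy as np

def accumulate_weighted(frame, running_avg, alpha):
    """Pure-numpy version of the recurrence above:
    R(t) = (1 - alpha) * R(t-1) + alpha * F(t), updated in place."""
    running_avg *= (1.0 - alpha)
    running_avg += alpha * frame
    return running_avg

R = np.full((2, 2), 10.0)       # previous running average
F = np.full((2, 2), 20.0)       # new frame
accumulate_weighted(F, R, 0.2)  # R becomes 0.8*10 + 0.2*20 = 12
```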

Calculate the running average

In [ ]:
alpha = 0.2 # Define weighting coefficient
running_avg = None # Define variable for running average

k = 31
for j, frame in enumerate(original):
    
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    
    # If this is the first frame, make the running avg the current frame
    # Otherwise, update the running average with the new smooth frame
    if running_avg is None:
        running_avg = np.float32(smooth_frame) 
    else:
        cv2.accumulateWeighted(np.float32(smooth_frame), running_avg, alpha)
    
    cv2.imshow('Running average', cv2.convertScaleAbs(running_avg))

    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Running average')

Computer camera

In [ ]:
alpha = 0.02 # Define weighting coefficient
running_avg = None # Define variable for running average

k = 31
while True:
    current_frame = feed.read()[1]
    gray_frame = cv2.cvtColor(current_frame, cv2.COLOR_BGR2GRAY)
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    
    # If this is the first frame, make the running avg the current frame
    if running_avg is None:
        running_avg = np.float32(smooth_frame) 
    else:
        cv2.accumulateWeighted(np.float32(smooth_frame), running_avg, alpha)
        
    cv2.imshow('Running Average', cv2.convertScaleAbs(running_avg))
    
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break
In [ ]:
cv2.destroyWindow('Running Average')

Calculate the running average

B. Identify pixels in the current frame that do not match the background.

Calculate the difference

Background and foreground

Implement in Python:

diff = cv2.absdiff(np.float32(smooth_frame), np.float32(running_avg))

Calculate the difference

Difference frame

Calculate the difference

In [ ]:
alpha = 0.2 
running_avg = None 
k = 31
for j, frame in enumerate(original):

    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    
    # If this is the first frame, make the running avg the current frame
    if running_avg is None:
        running_avg = np.float32(smooth_frame) 
        
    # Find |difference| between current smoothed frame and running average
    diff = cv2.absdiff(np.float32(smooth_frame), np.float32(running_avg))
    
    # Then add the current smoothed frame to the running average
    cv2.accumulateWeighted(np.float32(smooth_frame), running_avg, alpha)

    cv2.imshow('Difference', diff)

    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Difference')

Computer camera

In [ ]:
alpha = 0.02 # Define weighting coefficient
running_avg = None # Define variable for running average

k = 31
while True:
    current_frame = feed.read()[1]
    gray_frame = cv2.cvtColor(current_frame, cv2.COLOR_BGR2GRAY)
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    
    # If this is the first frame, make the running avg the current frame
    if running_avg is None:
        running_avg = np.float32(smooth_frame) 
    
    # Find absolute difference between the current smoothed frame and the running average
    diff = cv2.absdiff(np.float32(smooth_frame), np.float32(running_avg))
    
    # Then add the current smoothed frame to the running average
    cv2.accumulateWeighted(np.float32(smooth_frame), running_avg, alpha)
        
    cv2.imshow('Difference', diff)
    
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Difference')

Calculate the difference

Threshold the difference

Deciding what constitutes motion and what is just noise.

Difference frame

Threshold the difference

In [ ]:
thresh = 35 # All pixels with differences above this value will be set to 1

alpha = 0.2
running_avg = None 
k = 31
for j, frame in enumerate(original):
    
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    
    if running_avg is None:
        running_avg = np.float32(smooth_frame) 
    
    diff = cv2.absdiff(np.float32(smooth_frame), np.float32(running_avg))
    cv2.accumulateWeighted(np.float32(smooth_frame), running_avg, alpha)
    
    # For all pixels with a difference > thresh, turn pixel to 1, otherwise 0
    _, subtracted = cv2.threshold(diff, thresh, 1, cv2.THRESH_BINARY)
    
    cv2.imshow('Thresholded difference', subtracted)
    
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Thresholded difference')
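For `cv2.THRESH_BINARY` with a maxval of 1, the thresholding step is equivalent to a one-line numpy comparison (a sketch for intuition; `threshold_binary` is a hypothetical helper):

```python
import numpy as np

def threshold_binary(diff, thresh, maxval=1.0):
    """Numpy equivalent of cv2.threshold(diff, thresh, maxval,
    cv2.THRESH_BINARY): pixels strictly above thresh become maxval,
    everything else becomes 0."""
    return np.where(diff > thresh, maxval, 0.0)

diff = np.array([[0.0, 34.0],
                 [35.0, 36.0]])
subtracted = threshold_binary(diff, 35)  # only 36.0 exceeds the threshold
```

Note the comparison is strict: a difference exactly equal to `thresh` stays 0.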

Computer camera

In [ ]:
thresh = 35 # All pixels with differences above this value will be set to 1
alpha = 0.02 # Define weighting coefficient
running_avg = None # Define variable for running average

k = 31
while True:
    current_frame = feed.read()[1]
    gray_frame = cv2.cvtColor(current_frame, cv2.COLOR_BGR2GRAY)
    smooth_frame = cv2.GaussianBlur(gray_frame, (k, k), 0)
    
    # If this is the first frame, make the running avg the current frame
    if running_avg is None:
        running_avg = np.float32(smooth_frame) 
    
    # Find absolute difference between the current smoothed frame and the running average
    diff = cv2.absdiff(np.float32(smooth_frame), np.float32(running_avg))
    
    # Then add the current smoothed frame to the running average
    cv2.accumulateWeighted(np.float32(smooth_frame), running_avg, alpha)
    
    # For all pixels with a difference > thresh, turn pixel to 1, otherwise 0
    _, subtracted = cv2.threshold(diff, thresh, 1, cv2.THRESH_BINARY)
    
    cv2.imshow('Difference', subtracted)
    
    if cv2.waitKey(1) & 0xFF == ord('q'): 
        break

cv2.destroyWindow('Difference')

Threshold the difference

2. Is that moving thing a train?

Fraction in motion

2. Is that moving thing a train?

Area threshold

2. Is that moving thing a train?

Area threshold

3. In what direction is it moving?

Area threshold

Define regions of interest (ROI)

Area threshold

Track activity in ROIs separately

Real-time direction detection

Fraction in motion

Real-time direction detection

  1. Initialize a dataframe, history.
  2. At each frame, t, add a row to the dataframe indicating whether the left and/or right ROI meets the motion threshold.
  3. Repeat #2 until the motion threshold has been met on one side for the last T frames.
  4. The side with detections for all T frames indicates the train's direction.
  5. Reinitialize the history dataframe to detect future trains.
  6. Return to #2.
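The steps above can be sketched in plain Python, with deques standing in for the history dataframe (`make_direction_detector` is a hypothetical helper; the per-frame motion flags are assumed to come from the ROI fractions described earlier):

```python
from collections import deque

def make_direction_detector(T):
    """Report a train's direction once one side's ROI has met the
    motion threshold for the last T consecutive frames, then reset
    the history so future trains can be detected."""
    history = {'left': deque(maxlen=T), 'right': deque(maxlen=T)}

    def update(left_motion, right_motion):
        history['left'].append(left_motion)
        history['right'].append(right_motion)
        for side in ('left', 'right'):
            if len(history[side]) == T and all(history[side]):
                # Reinitialize the history to detect future trains
                history['left'].clear()
                history['right'].clear()
                return side          # this side triggered first
        return None                  # no train detected yet

    return update

detect = make_direction_detector(T=3)
# A train entering from the left trips the left ROI on the 3rd frame
results = [detect(True, False) for _ in range(3)]
```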

Real-time deployment

Raspberry Pi setup

Cost ~ $130

Tweet

Detecting motion at night

Infrared photography

Photography from http://www.david-keochkerian.com/

PiCamera python package

import time
import picamera
import picamera.array
import cv2


camera = picamera.PiCamera()

# Adjust resolution of the camera
camera.resolution = (1024, 768)

# Adjust framerate of the camera 
camera.framerate = 30 # frames per second

# Save current frame to jpg
camera.capture('image-file.jpg')

# Start live streaming of video
camera.start_preview()

# Flip the camera
camera.vflip = True
camera.hflip = True

# Set the brightness and contrast 
camera.brightness = 60
camera.contrast = 30

Implementing the algorithm

with picamera.PiCamera() as camera:

    camera.start_preview()
    time.sleep(2)

    with picamera.array.PiRGBArray(camera) as stream:

        camera.capture(stream, format='bgr')

        frame = stream.array

        gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        ...

Video quality considerations

Recap of parameters

  • k : Size of the kernel used for Gaussian blur.
  • area_thresh : Minimum fraction of ROI that must be "in motion" to be considered train-like.
  • alpha : How heavily to weigh each new frame in running average.
  • history_length : Number of frames in motion required to be considered a train.

Resolution + k

Change k proportionally with the resolution.
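A sketch of that scaling (`scale_kernel` is a hypothetical helper); the adjustment at the end keeps k odd, since `cv2.GaussianBlur` requires odd kernel dimensions:

```python
def scale_kernel(k_old, width_old, width_new):
    """Scale the Gaussian kernel size proportionally with frame
    width, forcing the result to be odd (cv2.GaussianBlur requires
    odd, positive kernel dimensions)."""
    k = max(1, int(k_old * width_new / width_old))
    return k if k % 2 == 1 else k + 1

k_small = scale_kernel(31, 1024, 512)  # half the resolution, half the kernel
```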

Gaussian

Resolution and k

Gaussian

Resolution + area_thresh

Resized ROIs
Fraction in motion resized and original

Frame rate + alpha

  • Reminder: alpha determines how heavily to weigh new frame in running average.
  • As frame rate (fps) is lowered, alpha should increase.
  • For $fps_{new} =\frac{fps_{old}}{2}$, mathematically:
\begin{equation} \alpha_{new} = \alpha_{old}(2-\alpha_{old}) \end{equation}
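This follows because one update at the halved frame rate must equal two updates at the original rate, assuming the scene changes slowly enough that consecutive frames are similar; a quick numeric check:

```python
def running_avg_update(R, F, alpha):
    """One step of the running-average recurrence:
    R(t) = (1 - alpha) * R(t-1) + alpha * F(t)."""
    return (1 - alpha) * R + alpha * F

R0, F = 100.0, 50.0
alpha_old = 0.2
alpha_new = alpha_old * (2 - alpha_old)   # 0.36

# Two updates at the old frame rate (same frame F twice)...
two_old = running_avg_update(running_avg_update(R0, F, alpha_old), F, alpha_old)
# ...equal one update at the halved frame rate
one_new = running_avg_update(R0, F, alpha_new)
```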

Frame rate + history_length

Fraction in motion short


Frame rate limitations

Requires at least one frame with the train present in only one ROI.

Fraction in motion

What if the light changes?!

  • Background subtraction
    • Frame differencing
      • Good for indoors
      • Very fast computationally
      • Could train a threshold for different lighting conditions
    • Alternative: adaptive background subtraction
  • Camera settings
    • Can adjust brightness and contrast of the camera in real time

Gaussian Mixture-based Background/Foreground Segmentation Algorithm (MOG, MOG2)

  • Each pixel considered separately
  • Background model (mixture of k Gaussian models) continually updated
  • Threshold is based on the pdf

Implement in Python:

cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=False)

  • history = length of history to consider in describing the background model.
  • varThreshold = square of the number of standard deviations the pixel must be from its mean to be considered not background.
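The varThreshold test can be sketched per pixel, simplifying the full mixture to a single Gaussian per pixel (`is_foreground` is a hypothetical helper, not the OpenCV API):

```python
def is_foreground(pixel, mean, variance, var_threshold=16.0):
    """Per-pixel decision sketched from the description above: the
    pixel is foreground when its squared distance from the background
    mean exceeds var_threshold times the variance, i.e. it lies more
    than sqrt(var_threshold) standard deviations away."""
    return (pixel - mean) ** 2 > var_threshold * variance

# With mean 100 and variance 25 (std dev 5), the default threshold
# of 16 flags anything more than 4 standard deviations from the mean
fg = is_foreground(pixel=140.0, mean=100.0, variance=25.0)   # 8 sigma away
bg = is_foreground(pixel=110.0, mean=100.0, variance=25.0)   # 2 sigma away
```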

Gaussian Mixture-based Background/Foreground Segmentation Algorithm (MOG, MOG2)

subtractor = cv2.createBackgroundSubtractorMOG2(history=100, 
                                                varThreshold=25,
                                                detectShadows=False)
subtracted = subtractor.apply(frame)
MOG2 left right fractions

Must increase the history length to detect longer-lasting objects

subtractor = cv2.createBackgroundSubtractorMOG2(history=400, 
                                                varThreshold=25,
                                                detectShadows=False)
subtracted = subtractor.apply(frame)
MOG2 left right fractions

cmawer.github.io/trainspotting

End

@chloemawer | chloe@svds.com

Appendix

Architecture

Tweet