Dash Map Development

Trying to develop a combination map + 24-hr line graph using plotly's dash . The map will show route segments from the City's Bluetooth sensors (from the Open Data and when the user clicks on a given segment, a plot of the average travel times over 24 hours will be displayed for that segment.

In [1]:
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go
init_notebook_mode(connected=True)

I downloaded the 2017 data and the WGS84 shapefile for the routes. Since I'm remote I'm going to reimport the data into a PostgreSQL database on my local computer. After unzipping the shapefile, I imported it into the database using the following command in bash:

shp2pgsql -S -s 4326 -D bluetooth_routes_wgs84.shp bluetooth.routes | psql -d bigdata

The sql we used to create the data exports can be found on the City of Toronto's Github. Note that, in order to make the exported data match as close as possible the data from the live feed, a little switcheroo was performed:

REPLACE(to_char( (datetime_bin AT TIME ZONE 'America/Toronto' + interval '5 minutes'), 'YYYY-MM-DD HH24:MI:SSOF'), ' ', 'T')

So this will have to be switched back in order to import that data into the database.

In [2]:
from psycopg2 import connect
import psycopg2.sql as pgsql
con = connect(database='bigdata')
In [5]:
import pandas
import pandas.io.sql as pandasql
import sqlalchemy
In [6]:
bt_2017 = pandas.read_csv('bt_2017.csv')
bt_2017
Out[6]:
resultId timeInSeconds count updated
0 J_I 56 16 2017-01-01T00:05:00-05
1 J_I 60 9 2017-01-01T00:10:00-05
2 J_I 60 8 2017-01-01T00:15:00-05
3 J_I 60 8 2017-01-01T00:20:00-05
4 J_I 55 17 2017-01-01T00:25:00-05
5 J_I 53 15 2017-01-01T00:30:00-05
6 J_I 58 13 2017-01-01T00:35:00-05
7 J_I 55 18 2017-01-01T00:40:00-05
8 J_I 58 22 2017-01-01T00:45:00-05
9 J_I 54 19 2017-01-01T00:50:00-05
10 J_I 59 19 2017-01-01T00:55:00-05
11 J_I 57 11 2017-01-01T01:00:00-05
12 J_I 54 11 2017-01-01T01:05:00-05
13 J_I 59 15 2017-01-01T01:10:00-05
14 J_I 54 12 2017-01-01T01:15:00-05
15 J_I 57 23 2017-01-01T01:20:00-05
16 J_I 58 11 2017-01-01T01:25:00-05
17 J_I 51 15 2017-01-01T01:30:00-05
18 J_I 55 17 2017-01-01T01:35:00-05
19 J_I 53 8 2017-01-01T01:40:00-05
20 J_I 58 23 2017-01-01T01:45:00-05
21 J_I 52 21 2017-01-01T01:50:00-05
22 J_I 60 10 2017-01-01T01:55:00-05
23 J_I 57 10 2017-01-01T02:00:00-05
24 J_I 49 11 2017-01-01T02:05:00-05
25 J_I 55 13 2017-01-01T02:10:00-05
26 J_I 50 13 2017-01-01T02:15:00-05
27 J_I 52 19 2017-01-01T02:20:00-05
28 J_I 56 11 2017-01-01T02:25:00-05
29 J_I 52 10 2017-01-01T02:30:00-05
... ... ... ... ...
368339 BR2_BR3 354 1 2017-01-31T19:05:00-05
368340 BR2_BR3 484 1 2017-01-31T19:10:00-05
368341 BR2_BR3 390 1 2017-01-31T19:15:00-05
368342 BR2_BR3 281 2 2017-01-31T19:25:00-05
368343 BR2_BR3 300 1 2017-01-31T19:30:00-05
368344 BR2_BR3 300 1 2017-01-31T19:35:00-05
368345 BR2_BR3 249 3 2017-01-31T19:50:00-05
368346 BR2_BR3 275 1 2017-01-31T20:10:00-05
368347 BR2_BR3 274 1 2017-01-31T20:15:00-05
368348 BR2_BR3 252 1 2017-01-31T20:20:00-05
368349 BR2_BR3 256 2 2017-01-31T20:25:00-05
368350 BR2_BR3 243 4 2017-01-31T20:30:00-05
368351 BR2_BR3 183 1 2017-01-31T20:40:00-05
368352 BR2_BR3 266 1 2017-01-31T20:45:00-05
368353 BR2_BR3 210 1 2017-01-31T20:50:00-05
368354 BR2_BR3 226 1 2017-01-31T21:00:00-05
368355 BR2_BR3 153 1 2017-01-31T21:10:00-05
368356 BR2_BR3 229 3 2017-01-31T21:25:00-05
368357 BR2_BR3 202 1 2017-01-31T21:30:00-05
368358 BR2_BR3 259 3 2017-01-31T21:45:00-05
368359 BR2_BR3 240 1 2017-01-31T21:50:00-05
368360 BR2_BR3 216 1 2017-01-31T22:10:00-05
368361 BR2_BR3 255 1 2017-01-31T22:15:00-05
368362 BR2_BR3 241 1 2017-01-31T22:20:00-05
368363 BR2_BR3 249 4 2017-01-31T22:30:00-05
368364 BR2_BR3 248 2 2017-01-31T22:45:00-05
368365 BR2_BR3 236 2 2017-01-31T22:55:00-05
368366 BR2_BR3 328 2 2017-01-31T23:05:00-05
368367 BR2_BR3 644 1 2017-01-31T23:35:00-05
368368 BR2_BR3 202 1 2017-01-31T23:55:00-05

368369 rows × 4 columns

In [9]:
bt_2017['observed_time'] = bt_2017['updated'].str.replace('T', ' ')

del bt_2017['updated']

bt_2017.to_csv('bt_2017_fixed.csv', index=False)

I imported the fixed csv into PostgreSQL and checked the span of the data (January 2017). So we will create an average 24hr weekday plot for a segment on Richmond, BR2_BR3.

In [6]:
weekday_avg_sql = pgsql.SQL('''
SELECT updated::TIME AS "Time", AVG(timeinseconds)
FROM bluetooth.bt_2017
WHERE resultid = {resultid} and EXTRACT('isodow' FROM updated) <6
GROUP BY "Time" 
ORDER BY "Time" ''')
In [7]:
richmond_jan = pandasql.read_sql(weekday_avg_sql.format(resultid=pgsql.Literal('BR2_BR3')), con)
In [26]:
trace = go.Scatter(x=richmond_jan['Time'],
                   y=richmond_jan['avg'],
                   mode='lines')
layout = dict(title = "Average Weekday Travel Times",
              xaxis = dict(title="Time of Day"),
              yaxis = dict(title="Travel Time (s)"))
In [27]:
iplot(dict(data=[trace], layout=layout), filename='graph1.html')
plot(dict(data=[trace], layout=layout), filename='graph1.html', auto_open=False)
Out[27]:
'file:///home/rad/git/bdit/dash-env/graph1.html'

Great! Now testing to run this in Dash I saved this notebook as a python file and then added Dash imports:

import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html

app = dash.Dash()

And then adding to the bottom:

app.layout = html.Div(children=[dcc.Graph(id='travel-time-graph',
                                          figure=dict(data=[trace], layout=layout))])

if __name__ == '__main__':
    app.run_server(debug=True)

Finally running the Dash app from the command-line with python graph_map.py and opening up a browser... surprise! it works!

(yes I know there's an issue with missing data at 2:30 AM, I'll deal with that after)

The map

Now that we've created the simple line graph, let's see if we can display the segments on a plotly map. We're going to try to read the geometry from the database. From the plotly mapping documentation latitude and longitude have to be passed separately as:

lon (list, numpy array, or Pandas series of numbers, strings, or datetimes.)

This isn't really how most geographic libraries operate, instead requiring arrays of coordinates, so this approach might be a little... hacky.

In [10]:
geometry_sql = pgsql.SQL('''SELECT resultid, ST_ASGeoJSON(geom) geojson
FROM bluetooth.routes
WHERE resultid = {resultid} ''')
In [11]:
resultid = pgsql.Literal('BR2_BR3')
In [12]:
richmond_df = pandasql.read_sql(geometry_sql.format(resultid=resultid), con)

I'll be basing myself on this tutorial which, honestly, is pretty hacky.

In [13]:
import json
In [14]:
row = richmond_df.iloc[[0]]
In [17]:
# row['geojson'][0]
geojson = json.loads(row.geojson[0])
geojson['coordinates'][0]
Out[17]:
[-79.3691483292635, 43.6533779496959]
In [18]:
def get_lat_lon(geojson):
    lons, lats = [], []
    for coord in geojson['coordinates']:
        lons.append(coord[0])
        lats.append(coord[1])
    return lats, lons
In [19]:
segments = []

for row in richmond_df.itertuples():
    geojson = json.loads(row.geojson)
    lats, lons = get_lat_lon(geojson)
    segments.append( dict(type = 'scattergeo',
                          lon = lons, lat = lats,
                          mode = 'lines',
                          line = dict(width = 2)
                         ))
In [25]:
layout = dict(
        title = 'Bluetooth segments',
        showlegend = False,
        geo = dict(
            resolution=50,
            showland = True,
            showlakes = True,
            showcountries = True,
            scope = "North America",
            projection = dict(type = "mercator"),
            lonaxis = dict( range = [-79.39, -79.36]),
            lataxis = dict( range = [43.64, 43.66])
        )
    )
iplot(dict(data=segments,layout=layout), filename='map1.html')
plot(dict(data=segments,layout=layout), filename='map1.html')
Out[25]:
'file:///home/rad/git/bdit/dash-env/map1.html'
In [22]:
mapbox_token = 'pk.eyJ1IjoicmVtb3RlZ2VudHJpZnkiLCJhIjoiY2lnanJzMjJpMDA1dnYxbHo5MTZtdGZsYSJ9.gLE8d40zmDAtMSSZyd2h1Q'


data = go.Data([
    go.Scattermapbox(
        lat=lats,
        lon=lons,
        mode='lines',
        hoverinfo='text',
        showlegend=False
    )])
    
layout = go.Layout(
    title='Bluetooth Segments',
    autosize=True,
    hovermode='closest',
    mapbox=dict(
        accesstoken=mapbox_token,
        bearing=0,
        center=dict(
            lat=43.65,
            lon=-79.37
        ),
        pitch=0,
        zoom=12,
        style='light'
    )
)    

iplot(dict(data=data,layout=layout), filename='jupyter/map_better')
In [24]:
plot(dict(data=data,layout=layout), filename='map_better.html')
Out[24]:
'file:///home/rad/git/bdit/dash-env/map_better.html'