Interactive Visualization using D3.js

By: Srinivas Havanur

Figure 1: Overview-detail graph of Activity Tracker Dashboard

Abstract: The main aim of this project is to provide interactive, data-driven visualizations for a Department of Conservation client that can be used to gain insights about various statistics related to statewide conservation areas. Here, the interactive visualizations are produced using D3.js, a popular JavaScript visualization library for the web, to analyze activities over different conservation areas.

 

1. Introduction
The internal staff of this Department of Conservation are responsible for keeping conservation areas up to date so that recreation activities are easy to discover; people are safe, informed, and acting within the law while visiting; and return visits are encouraged. For this analysis, we used a dataset of statewide activities from an internal department application. The app lets staff receive various activity statuses depending on a work order and helps them approve or deny any requests made.
The visualizations produced here seek to help answer the following questions:
a. Which activity statuses occur across conservation areas?
b. Which area has the minimum or maximum number of activities?
c. What are the statuses of the activities in a specific area?
d. What is the volume of conservation areas by region and county?

 

2. Dataset
The raw data was available in a structured format in a SQL Server database. There are over 1,500 conservation areas in the state. For data analysis purposes, we extracted around 200-300 conservation area records from various tables in the database into CSV files. The R language was used to compile these CSV files into JSON, a format better suited for use with JavaScript and D3.
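The R script and the resulting JSON layout are not shown here, so the following is only an illustrative sketch, written in JavaScript rather than R, of what such a compile step might do: nest flat activity rows under their conservation area. All field names and sample values (areaId, areaName, status) are hypothetical.

```javascript
// Hypothetical flat rows, as they might look after parsing the CSV export.
// Field names and values are assumptions made for illustration only.
const rows = [
  { areaId: "A1", areaName: "Area One", status: "Approved" },
  { areaId: "A1", areaName: "Area One", status: "Pending for approval" },
  { areaId: "A2", areaName: "Area Two", status: "Pending for delete" },
];

// Nest activity rows under their area (one area -> many activities).
function nestByArea(flatRows) {
  const byId = new Map();
  for (const row of flatRows) {
    if (!byId.has(row.areaId)) {
      byId.set(row.areaId, {
        areaId: row.areaId,
        areaName: row.areaName,
        activities: [],
      });
    }
    byId.get(row.areaId).activities.push({ status: row.status });
  }
  // Map preserves insertion order, so areas appear in first-seen order.
  return Array.from(byId.values());
}

const areas = nestByArea(rows);
console.log(JSON.stringify(areas, null, 2));
```

A nested shape like this keeps each area's activities together, which tends to make it easier to bind one area to one bar or one pie chart.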

 

3. Overview of the Visualizations

3.1 Overview by Conservation Area ID
An overview bar chart shows the number of activities for each conservation area. A dropdown provides sorting and filtering by conservation area ID or by number of activities, in ascending or descending order.
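As a rough sketch of the logic behind such a dropdown, the functions below sort and filter the overview data before it is drawn. The data shape, the function names, and the "minimum activity count" filter are assumptions made for illustration; the dashboard's actual options are not listed here.

```javascript
// Hypothetical overview data: one entry per conservation area.
const overview = [
  { areaId: 103, activities: 7 },
  { areaId: 101, activities: 12 },
  { areaId: 102, activities: 3 },
];

// Return a sorted copy. `key` is "areaId" or "activities";
// `order` is "asc" or "desc". The input array is not modified.
function sortAreas(data, key, order = "asc") {
  const sign = order === "desc" ? -1 : 1;
  return [...data].sort((a, b) => sign * (a[key] - b[key]));
}

// Keep only areas with at least `minActivities` activities (assumed filter).
function filterAreas(data, minActivities) {
  return data.filter((d) => d.activities >= minActivities);
}

const byCountDesc = sortAreas(overview, "activities", "desc");
const byIdAsc = sortAreas(overview, "areaId");
const busyAreas = filterAreas(overview, 5);

console.log(byCountDesc.map((d) => d.areaId)); // [101, 103, 102]
console.log(byIdAsc.map((d) => d.areaId));     // [101, 102, 103]
console.log(busyAreas.map((d) => d.areaId));   // [103, 101]
```

Sorting a copy rather than the original array keeps the source data stable, so switching the dropdown back to another order does not depend on the previous choice.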


Figure 2: Overview bar graph with activities count

3.2 Detailed Visualization for Specific Conservation Area ID
The detailed visualization is a pie chart paired with a table. The pie chart shows the activity statuses: pending for approval, pending for deletion, and approved. The table shows details about the conservation area, including the Area ID, name, and volume of activities. This detailed visualization also acts as an overview: when you hover over any status in the pie chart, the bar chart updates to show the count of activities with the selected status. D3 makes this sort of data-driven interaction across multiple related charts simple and provides smooth animations while charts transform to fit a user’s filter. This visual aid helps draw the user’s attention to the changes occurring and tells a more compelling story with the data.
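The data side of this hover interaction can be sketched as a single function that recomputes the bar values for the hovered status. This is a minimal illustration under assumed names (counts, approved, pendingApproval, pendingDelete, barValues), not the dashboard's actual code.

```javascript
// Hypothetical per-area status counts (field names are assumptions).
const areas = [
  { areaId: "A1", counts: { approved: 5, pendingApproval: 2, pendingDelete: 1 } },
  { areaId: "A2", counts: { approved: 1, pendingApproval: 4, pendingDelete: 0 } },
];

// Values for the bar chart: the count for one status, or the total
// across all statuses when no slice is hovered (status === null).
function barValues(data, status = null) {
  return data.map((d) => ({
    areaId: d.areaId,
    value:
      status === null
        ? Object.values(d.counts).reduce((sum, n) => sum + n, 0)
        : d.counts[status],
  }));
}

const allActivities = barValues(areas);                  // no slice hovered
const pendingOnly = barValues(areas, "pendingApproval"); // slice hovered

console.log(allActivities.map((d) => d.value)); // [8, 5]
console.log(pendingOnly.map((d) => d.value));   // [2, 4]
```

In a D3 chart, a function like this would typically be called from the pie slices' mouseover and mouseout handlers, with the result rebound to the bars and animated through a transition.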


Figure 3: Detailed visualization of specific area.

 


Figure 4: Detailed visualization of specific activity status.

3.3 Bubble Chart for all Conservation Areas
This bubble chart represents quantitative details based on acres, region, and county.
a) Group by Acres: Shows all conservation areas sized by acreage, making it easy to identify which conservation areas are the largest and which are the smallest.
b) Group by Region: Shows the number of conservation areas belonging to each region, making it easy to identify the regions with the most and least conservation areas.
c) Group by County: As with Group by Region, shows the total number of conservation areas belonging to each county.
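The three grouping modes can be sketched as one function that produces a list of bubbles for the chosen mode. The field names, the bubbles function, and the square-root radius (a common convention so that bubble area, not radius, is proportional to the value) are assumptions for illustration; the chart's actual sizing rule is not specified here.

```javascript
// Hypothetical area records (all names and numbers are made up).
const areas = [
  { areaId: "A1", region: "Northeast", county: "County 1", acres: 400 },
  { areaId: "A2", region: "Northeast", county: "County 2", acres: 100 },
  { areaId: "A3", region: "Southwest", county: "County 3", acres: 900 },
];

// Build bubbles for a mode: "acres" gives one bubble per area sized by
// acreage; "region" or "county" gives one bubble per group sized by the
// number of areas in that group.
function bubbles(data, mode, maxRadius = 40) {
  let items;
  if (mode === "acres") {
    items = data.map((d) => ({ label: d.areaId, value: d.acres }));
  } else {
    const counts = new Map();
    for (const d of data) {
      counts.set(d[mode], (counts.get(d[mode]) || 0) + 1);
    }
    items = Array.from(counts, ([label, value]) => ({ label, value }));
  }
  const maxValue = Math.max(...items.map((b) => b.value));
  // Square-root scaling: bubble area grows linearly with the value.
  return items.map((b) => ({
    ...b,
    r: maxRadius * Math.sqrt(b.value / maxValue),
  }));
}

console.log(bubbles(areas, "acres"));
console.log(bubbles(areas, "region"));
```

With this sample data, the "region" mode yields two bubbles (Northeast with 2 areas, Southwest with 1), and the largest bubble in each mode gets the maximum radius.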


Figure 5: Bubble chart by acres, region and county

 


4. What-Why-How Framework
The what-why-how framework serves as a basic guideline for visualizing any form of data. This section discusses how the framework was applied here. A short summary can be found in Table 1.

 

4.1 What: Data
The original dataset is a multi-attribute table in a SQL Server database. It consists of many quantitative attributes and has one-to-many relationships between the area and activity tables. There are around 1,500 active conservation areas. For the analysis, we extracted the data for about 200 to 300 conservation areas as a CSV data dump and then compiled it into JSON with the required quantitative attributes.

 

4.2 What: Derived

The following derived attributes are computed from the final dataset used for visualizing the data.

  • Number of activities approved.
  • Number of activities which are in pending for approval status.
  • Number of activities which are in pending for delete status.
  • Total number of acres by region and by county.
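A minimal sketch of how these derived attributes could be computed from the JSON data follows. The record shapes, status strings, and function names are assumptions for illustration, not the project's actual schema.

```javascript
// Hypothetical records (field names and values are assumptions).
const activities = [
  { areaId: "A1", status: "Approved" },
  { areaId: "A1", status: "Pending for approval" },
  { areaId: "A2", status: "Pending for delete" },
  { areaId: "A2", status: "Approved" },
];
const areas = [
  { areaId: "A1", region: "Northeast", county: "County 1", acres: 400 },
  { areaId: "A2", region: "Northeast", county: "County 2", acres: 100 },
];

// Number of activities in each status.
function countByStatus(list) {
  const out = {};
  for (const a of list) out[a.status] = (out[a.status] || 0) + 1;
  return out;
}

// Total acres grouped by `key` ("region" or "county").
function acresBy(list, key) {
  const out = {};
  for (const a of list) out[a[key]] = (out[a[key]] || 0) + a.acres;
  return out;
}

const statusCounts = countByStatus(activities);
const acresByRegion = acresBy(areas, "region");
const acresByCounty = acresBy(areas, "county");

console.log(statusCounts);  // { Approved: 2, 'Pending for approval': 1, 'Pending for delete': 1 }
console.log(acresByRegion); // { Northeast: 500 }
console.log(acresByCounty); // { 'County 1': 400, 'County 2': 100 }
```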

 

System: Interactive Visualization using D3.js

What: Data (Table)
  • Multiple quantitative attributes.
  • Key attributes of multiple tables.

What: Derived
  • Number of activities approved.
  • Number of activities pending for approval.
  • Number of activities pending for delete.
  • Total number of acres by region and by county.

Why: Tasks
  • Present the overview of the dataset.
  • Locate an area.
  • Compare the activities of different areas.
  • Identify patterns and find the area with the most and least activities.
  • Summarize the data distribution.

How: Encode
  • Bar chart.
  • Pie chart.
  • Detail grid.
  • Bubble chart.

How: Manipulate
  • Select an interesting data range bin.
  • Select a certain number of bins to focus on using the horizontal scrollbar.
  • Reorder (by Area ID and Activity Frequencies).

How: Facet
  • Overview-detail.
  • Juxtapose multiple visualizations.

How: Reduce
  • Filtering by Area ID, Activity, and Frequencies.
  • Filtering areas by acres size, region and county.

Table 1: What-Why-How framework

 

4.3 Why: Abstract Tasks
There are three main actions that users can perform with this visualization: analyze, search, and query. In analyze, users are presented with an overview of the data so that they can choose interesting items for further analysis. In search, users can locate areas of interest by selecting a desired data range bin in the bar chart. In query, users can identify patterns and find the areas with the most and least activities. Users can also compare areas across bins to see summarized details such as the total number of activities and the number of activities in each status (pending for approval, pending for delete, and approved).

 

4.4 How: Encode
The following are the visualization charts used to visualize the data:

  1. Bar chart: This overview chart shows quantitative data: the count of all activities, or of activities with a specific status (pending for approval, approved, pending for delete), on the y-axis and Area IDs on the x-axis.
  2. Pie chart and table: This acts as both a detail and an overview graph. It shows the number of activities in each status, and hovering over any slice of the pie chart updates the stats and the bar chart.
  3. Bubble chart: This chart is used to compare quantitative sizes by acres, by region, and by county.

 

4.5 How: Manipulate, Facet, Reduce
To manipulate the view, users can choose how many “bins” (groupings of data based on a value range) to focus on using the horizontal scrollbar, and can select a bin of interest by hovering over it. Users can also reorder the bins by Area ID or by activity frequency using the dropdown at the top. To facet, we used overview-detail and juxtaposed multiple views with shared navigation: the overview bar chart and the pie chart. To reduce, users can filter by Area ID, activity, and frequency, and can filter areas by acreage, region, and county.
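One way to picture the scrollbar interaction is as a sliding window over the ordered list of bins. The sketch below assumes the scrollbar yields a start index and a window size; the function name and this mapping are hypothetical, since the dashboard's scrollbar logic is not described in detail.

```javascript
// Hypothetical ordered bins (here, simply Area IDs).
const bins = [101, 102, 103, 104, 105, 106, 107, 108];

// Return the `count` bins visible when the scrollbar is at index `start`.
// The start index is clamped so the window never runs past either end.
function visibleBins(allBins, start, count) {
  const first = Math.max(0, Math.min(start, allBins.length - count));
  return allBins.slice(first, first + count);
}

console.log(visibleBins(bins, 2, 3));  // [103, 104, 105]
console.log(visibleBins(bins, 7, 3));  // [106, 107, 108] (clamped at the end)
console.log(visibleBins(bins, 0, 20)); // all eight bins (window larger than data)
```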

 

5. Conclusion
We have shown how these interactive visualizations can be used to analyze activities across different areas and to help a user manage activity statuses across these areas. They also provide quantitative stats about each of the conservation areas. The visualizations were built with D3.js, and R was used to compile the extracted dataset into the required JSON format. Various interaction idioms were used to provide a user-friendly experience while exposing valuable insights into agency data.



 

About the Author:

Srinivas Havanur is an experienced full-stack developer at Timmons Group. He holds a Master’s degree in Computer Science from Old Dominion University in Norfolk, VA. His expertise is in developing responsive web applications using different technology stacks and building interactive visualizations using different JavaScript libraries and tools.
