Mace & Crown

CS 725 - Information Visualization

Dr. Michele C. Weigle


Project Proposal

We want to understand how faculty salaries at ODU vary across departments.
We also want to answer the following questions:

  • What is the median salary for each department by position? How does it compare with the median salary for the same department at other universities in the U.S.?
  • How geographically diverse is the faculty at ODU? In which state or country did they obtain their degrees?
  • How do faculty salaries compare with industry salaries for people holding the same degree?
  • Which department has the highest median salary?
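
The median questions above could be approached along the following lines in Ruby. This is only a sketch under stated assumptions: each row is a hash with `:department`, `:position`, and `:salary` keys (the same field names the extraction script writes), salaries are plain numeric strings, and the helper names `median` and `median_salary_by_group` are hypothetical. The sample rows are made up for illustration and are not real ODU data.

```ruby
# Median of a numeric array; returns nil for an empty array.
def median(values)
  return nil if values.empty?
  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Groups rows by the given keys and returns { [key values] => median salary }.
# Assumes row[:salary] is a plain numeric string such as "95000".
def median_salary_by_group(rows, *keys)
  rows.group_by { |row| keys.map { |k| row[k] } }
      .transform_values { |group| median(group.map { |row| row[:salary].to_f }) }
end

# Made-up sample rows, for illustration only.
SAMPLE_ROWS = [
  { department: 'Computer Science', position: 'Professor', salary: '120000' },
  { department: 'Computer Science', position: 'Professor', salary: '100000' },
  { department: 'Computer Science', position: 'Lecturer',  salary: '60000' },
  { department: 'History',          position: 'Professor', salary: '90000' }
].freeze

# Median per department and position.
by_dept_and_position = median_salary_by_group(SAMPLE_ROWS, :department, :position)
puts by_dept_and_position.inspect

# Department with the highest median salary.
top_department = median_salary_by_group(SAMPLE_ROWS, :department).max_by { |_key, med| med }
puts top_department.inspect
```

Passing the grouping keys as arguments keeps one helper usable for both the per-position question and the per-department question.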

Ruby script to extract data from the ODU directory


require 'mechanize'
require 'csv'

salary_data = CSV.read('salary_data.csv')
universities = CSV.read('postscndryunivsrvy2013dirinfo12.csv', encoding: "iso-8859-1:UTF-8", :headers => true)
states = CSV.read('state_table.csv', encoding: "iso-8859-1:UTF-8", :headers => true)
teachers = []

salary_data.each do |sd|
  teacher = {
    first_name: "",
    last_name: "",
    position: "",
    salary: "", 
    department: "",
    university: "",
    st_abbr: "",
    state: "",
    year: "",
    major: "",
    degree: ""
  }
  puts "-"*25
  mechanize = Mechanize.new

  first_name = sd[1]
  last_name = sd[0]

  teacher[:first_name] = first_name
  teacher[:last_name] = last_name
  teacher[:position] = sd[2]
  teacher[:salary] = sd[3]

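  # Pass the names as query parameters so Mechanize URL-encodes spaces, apostrophes, etc.
  page = mechanize.get('https://www.odu.edu/directory', { 'F_NAME' => first_name, 'L_NAME' => last_name, 'SEARCH_IND' => 'E' })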

  # puts page.inspect

  link = ""
  page.search("table.bordertable tr:nth-child(3) td:first a").each do |a|
    link = a['href'] 
    # puts a['href']
    puts "Name - " + a.text.strip
  end

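  # No directory match: keep the salary row with blank profile fields
  # instead of scraping the ODU home page by mistake.
  if link.empty?
    teachers << teacher
    next
  end

  page = mechanize.get('https://www.odu.edu' + link)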

  # puts page.inspect

  img_url = ""
  page.search("section.alpha ul.left_column li:nth-child(1) img").each do |img|
    img_url = img['src']
    puts "Image Url - " + img_url
  end

  department = ""
  if img_url == ""
    page.search("section.alpha ul.left_column li:nth-child(3)").each do |a|
      department = a.text.strip
      puts "Department - " + department
    end
  else
    page.search("section.alpha ul.left_column li:nth-child(4)").each do |a|
      department = a.text.strip
      puts "Department - " + department
    end
  end

  teacher[:department] = department

  education = []
  page.search(".tab-content ul.ul_in_tab li.fas_education dl").each do |a|
    edu = {}
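    # Initialize all four fields (the parallel assignment only set the first one).
    university = year = major = degree = ""
    # The <dt> holds "University, Year".
    a.search('dt').each do |dt|
      university, year = dt.text.strip.split(',')
    end
    # The first <dd> holds the major; drop its <strong> label before reading the text.
    a.search('dd:first').each do |dd|
      dd.search('strong').remove
      major = dd.text
    end
    # The last <dd> holds the degree; drop its <strong> label before reading the text.
    a.search('dd:last').each do |dd|
      dd.search('strong').remove
      degree = dd.text
    end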
    edu = {
      'university' => (university || "").strip,
      'year' => (year || "").strip,
      'major' => (major || "").strip,
      'degree' => (degree || "").strip,
    }
    education << edu
  end

  puts "education"
  puts education.first.inspect

  if education.any?
    teacher[:university] = education.first['university']
    uni = universities.select{ |u| u['INSTNM'] == education.first['university'] }
    puts uni.inspect
    if uni.any?
      teacher[:st_abbr] = uni.first['STABBR']
    end
    state = states.select{ |u| u['abbreviation'] == teacher[:st_abbr] }
    puts state.inspect
    if state.any?
      teacher[:state] = state.first['name']
    end
    teacher[:year] = education.first['year']
    teacher[:major] = education.first['major']
    teacher[:degree] = education.first['degree']
  end

  teachers << teacher
  puts "#{teacher[:first_name]} #{teacher[:last_name]} - Done"
end

puts teachers.inspect

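# Write a header row plus one row per faculty member.
# Skip the file entirely if nothing was scraped (teachers.first would be nil).
unless teachers.empty?
  CSV.open('processed_salary_data.csv', 'w') do |csv|
    csv << teachers.first.keys
    teachers.each do |t|
      csv << t.values
    end
  end
end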
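
The script above finds the state of each university by scanning the whole universities table with `select` for every faculty member, and it only matches when the names are identical. A possible alternative is to build a hash index once and look names up in it. The sketch below assumes the same `INSTNM` and `STABBR` columns the script reads. The helper names `build_state_index` and `lookup_state` are hypothetical, and the case-insensitive, whitespace-trimmed matching is an added assumption, not something the original script does. The inline CSV is a tiny stand-in for the real file.

```ruby
require 'csv'

# Builds { normalized institution name => state abbreviation } from CSV text.
# If a name appears more than once, the first occurrence wins.
def build_state_index(csv_text)
  CSV.parse(csv_text, headers: true).each_with_object({}) do |row, index|
    key = row['INSTNM'].to_s.strip.downcase
    index[key] ||= row['STABBR']
  end
end

# Returns the state abbreviation for a university name, or nil if unknown.
def lookup_state(index, university_name)
  index[university_name.to_s.strip.downcase]
end

# Stand-in rows for illustration; the real script reads a full survey file.
SAMPLE_UNIVERSITIES = <<~ROWS
  INSTNM,STABBR
  Old Dominion University,VA
  Stanford University,CA
ROWS

index = build_state_index(SAMPLE_UNIVERSITIES)
puts lookup_state(index, ' old dominion university ').inspect
puts lookup_state(index, 'Some Unlisted College').inspect
```

Normalizing both sides of the lookup trades a little strictness for fewer missed matches when the directory and the survey file format names differently.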

What-Why-How framework

  • What : Data – Tables with items and attributes. The attributes are used as filters, and the items are the values of each attribute.
  • The attribute types were mostly categorical and quantitative.
  • The entire dataset was a static file.
  • What : Derived – Original attributes are combined to form new attributes. For example, all Masters and Doctorate degrees are combined into just two attributes.
  • Why : Tasks – Compare across attributes, for example between colleges and departments, and find trends within the original or derived attributes.
  • How : Encode – Histograms, Line charts, Choropleth maps.
  • How : Reduce – Dynamic filtering and aggregation.
  • How : Manipulate – Navigate with zoom in and zoom out.
  • How : Facet – Multiple juxtaposed views with linked highlighting.
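
The derived degree attribute and the aggregation step listed above could be computed along these lines. This is a sketch, not the method used for the final visualizations: the pattern lists, the extra `Other` bucket, and the names `degree_category` and `count_by_category` are assumptions made for illustration, and real directory entries may need more patterns.

```ruby
# Assumed abbreviations and spellings; extend as needed for real data.
DOCTORATE_PATTERN = /\A(phd|edd|dsc|doctor|doctorate)\b/
MASTERS_PATTERN   = /\A(ms|ma|mba|meng|mfa|med|master|masters)\b/

# Maps a raw degree string such as "Ph.D." or "Master of Science"
# to a derived category: 'Doctorate', 'Masters', or 'Other'.
def degree_category(degree)
  text = degree.to_s.downcase.delete('.').strip
  case text
  when DOCTORATE_PATTERN then 'Doctorate'
  when MASTERS_PATTERN   then 'Masters'
  else 'Other'
  end
end

# Aggregates a list of raw degree strings into counts per derived category.
def count_by_category(degrees)
  degrees.each_with_object(Hash.new(0)) do |degree, counts|
    counts[degree_category(degree)] += 1
  end
end

puts count_by_category(['Ph.D.', 'M.S.', 'Doctor of Philosophy', 'B.A.']).inspect
```

Stripping periods and lowercasing before matching lets one pattern cover variants such as "PhD", "Ph.D.", and "ph.d".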

Planned milestone check-in dates

March 27th & April 27th

List of data sources

  • http://www.richmond.com/data-center/salaries-virginia-state-employees-2013/
  • http://www1.salary.com/Edu-Govt-and-Nonprofit-Industry-Education-Salaries.html
  • https://www.odu.edu/directory
  • http://www.payscale.com/
  • Mace & Crown
  • http://chronicle.com/article/2013-14-AAUP-Faculty-Salary/145679

Links to tool and technologies used

  • http://d3js.org/
  • http://www.tableau.com/

Link to the Visualization

Visualization 1

Visualization 2

Authors and Contributors

Prasanna Sajjan

Avinash Gosavi