Course Syllabus

Course Description

In this course, we will explore the moral, social, and ethical ramifications of the choices we make at the different stages of the data analysis pipeline, from data collection and storage to understanding feedback loops in analysis. Through class discussions, case studies and exercises, students will learn the basics of ethical thinking in science, understand the history of ethical dilemmas in scientific work, and study the distinct challenges associated with ethics in modern data science.

When: MWF 9:40-10:30

Where: WEB L110

Instructor: Suresh Venkatasubramanian

Co-Instructor: Katie Shelef

Office hours: (SV) MEB 3404, MF 3-4pm (note that on holidays like Labor Day there will be no office hours)

Text: Ethics for the Information Age, 7th Ed., by Michael Quinn


  • Weekly writing (or coding) assignments (60%)
  • Class participation (including scribe notes) (10%)
  • Project (2 people max): a case study of ethical decision-making in data analysis (30%)

Course Outline

  1. A quick tour through the foundations of ethics
  2. The data collection process
  3. Doing ethical data analysis
  4. Acting on your predictions
  5. Remedies and Responsibilities

Guidelines for Class Discussion

In this class we will often discuss issues that are controversial, touch on diverse and strongly held beliefs, and address deeply personal issues of identity and culture. While we want to have a healthy and vigorous debate, we must be able to express our views without attacking others in a personal way. To that end, I've prepared some guidelines for class discussion.


Class rosters are provided to the instructor with the student’s legal name as well as “Preferred first name” (if previously entered by you in the Student Profile section of your CIS account). While CIS refers to this as merely a preference, I will honor you by referring to you with the name and pronoun that feels best for you in class, on papers, exams, group projects, etc. Please advise me of any name or pronoun changes (and please update CIS) so I can help create a learning environment in which you, your name, and your pronoun will be respected.


Assignments will for the most part be essays that answer specific questions based on the assigned readings. Each assignment will generally be no more than 2 pages long (11 pt, single spaced) and should be turned in electronically (in PDF format, either generated directly or exported from a text editor).

Assignments will be graded based on your facility in

  • summarizing the problem statement or issue
  • considering context and assumptions inherent in the topic
  • communicating your own perspective or position
  • justifying your answer with evidence
  • using other perspectives to add context to your answer
  • following through on implications and consequences where they lead you
  • communicating effectively (with good organization, clean presentation and effective language)



For your project, I'd like you to undertake a more detailed analysis of the ethical considerations in a data science setting of your choice. As an example of what you might want to aspire to (although you may not be able to achieve the level of detail in these articles), I present three case studies developed by the Council on Big Data, Ethics and Society. 

  1. No Encore for Encore? Ethical questions for web-based censorship measurement

  2. The Ethics of Using Hacked Data: Patreon’s Data Hack and Academic Data Standards

  3. "It was a matter of life or death": A YouTube engineer's decision to alter data in the "It Gets Better" project.

These are merely ideas for how you might approach a particular scenario. But you should feel free to choose other topics/formats. 

Lecture Outline

  1. Aug 23: Introduction to the class, logistics. Overview of the course

  2. Aug 25: Discussion of reading, introduction to ethical frameworks

    • Reading: EIA Chapter 2.1-2.4
  3. Aug 28: Utilitarianism (by action and by rule)

  4. Aug 30: Utilitarianism (continued), social contracts. 

    • Reading: EIA Chapter 2.9
  5. Sep 1: Rawls and ideas of fairness in society

  6. Sep 6: Kant, deontological ethics and the categorical imperative

  7. Sep 8: Kant, continued. 

  8. Sep 11: Virtue Ethics

  9. Sep 13: Data Collection: where do ethical conundrums arise in the process of collecting data?

  10. Sep 15: Data Collection (continued)

  11. Sep 18: Data as commodity: Data Brokers

  12. Sep 20: Data as (personal) property: De-identification and anonymization

  13. Sep 22: Data as property continued.

  14. WEEK of Sep 25-29: Suresh is traveling (as it turns out, to an event on AI ethics). 
  15. Oct 2: Data as public resource: News and Medicine

  16. Oct 4: The Ethics of Data Analysis: Science and Behavior

  17. Oct 6: The Ethics of Data Analysis: Science and Behavior, continued

  18. Oct 16: The Mechanics of Data Analysis: Collection

  19. Oct 18: The Mechanics of Data Analysis: Model building

  20. Oct 20: The Mechanics of Data Analysis: Prediction and Feedback

  21. Oct 23: Data is humans: The history of human experimentation

  22. Oct 25: Codes of ethics in medical experimentation
  23. Oct 27: Modern experiments on humans
  24. Nov 13: Auditing black-box models
  25. Nov 15: Auditing black-box models (continued)
  26. Nov 20: Codes of Conduct
  27. Nov 22: Fiduciary Roles
  28. Nov 27: Data Scientists as Security Consultants
  29. Nov 29: A Scenario
  30. Dec 1: No class: Instead, you should all attend this talk on the foundations of data science. 

