Assignments

  • Project

    Final Portfolio

    In lieu of a written final exam, you will construct a course portfolio of three RMarkdown files containing annotated R functions, topic suggestions for a follow-up course to CDS-101, and a comparative discussion of two simulations.

  • Homework

    Homework 5

    For this homework assignment, you will be guided through the process of building a regression model that predicts the market value of condominiums in New York City using a dataset published by the New York City Department of Finance.

  • Homework

    Homework 4

    For this homework assignment, you will use statistical inference to answer a question about the National Survey of Family Growth, Cycle 6 dataset published by the National Center for Health Statistics.

  • Reading

    Reading 16

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading16

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, April 28th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 15

    Introductory Statistics with Randomization and Simulation

    Read the following:

    • From chapter 5: from the beginning through the end of section 5.1.4, and section 5.4.1

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading15

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Thursday, April 26th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 14

    Nature News Feature article

    Read the following article about p-values:

    Reading discussion

    Instead of posting a question as we’ve done for the other readings, please respond to the following prompts:

    1. Had you ever heard of this situation concerning p-values before this class?

      • If this is the first time you’ve heard this, did you find this surprising, and does it affect how you feel about science? Explain.

      • If you have heard about this situation before, did the article change your perspective in any way? Explain.

    2. Based on the article, what practical things can we do to make sure our claims are accurate and transparent? Mention any quantities that we should compute and what kinds of details we should try to include in our RMarkdown notebooks.

    Students who write a full and thoughtful response that addresses both prompts will receive both a question and an answer credit. A full response consists of at least two paragraphs, one for each prompt. Each paragraph must have at least three full sentences, and the content must be substantive. Posts that don’t fulfill these criteria will only be eligible for a question credit.

    Discussion hashtag
    #reading14


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 13

    Introductory Statistics with Randomization and Simulation

    Read the following:

    • From chapter 2: section 2.3 through to the end of section 2.5

    • From chapter 4: section 4.5 (skip 4.5.3)

    Reading discussion

    Discussion hashtag
    #reading13

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, April 19th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Homework

    Homework 3

    For this homework assignment, you will practice using the SelectorGadget Chrome extension to find the CSS selectors needed to scrape information from a webpage and use the rvest package to scrape data from the official Mason Patriots sports website.

  • Reading

    Reading 12

    Introductory Statistics with Randomization and Simulation

    Read the following:

    • From chapter 1: sections 1.3 (skip 1.3.4), 1.4.1, and 1.5

    Writeups

    Read the following writeups that supplement the content from reading 10:

    Reading discussion

    Discussion hashtag
    #reading12

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Sunday, April 15th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 11

    Introductory Statistics with Randomization and Simulation

    Read the following:

    • From chapter 2: from the beginning through to the end of section 2.2

    Reading discussion

    Discussion hashtag
    #reading11

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, April 14th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 10

    Writeups

    Read the following writeups on the probability mass function and cumulative distribution function:

    Reading discussion

    Discussion hashtag
    #reading10

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Thursday, April 12th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 9

    Tutorials

    Read the following tutorials on the rvest package and SelectorGadget Chrome extension.

    Beginner’s Guide on Web Scraping in R (using rvest) with hands-on example

    SelectorGadget Vignette


    Reading discussion

    Discussion hashtag
    #reading9

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, April 7th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Project

    Midterm Project

    For the midterm, you will conduct an exploratory data analysis of the U.S. Department of Education’s College Scorecard dataset in teams.

  • Homework

    Homework 2

    For your second homework assignment, you will explore a dataset about the passengers on the Titanic, the British passenger liner that crashed into an iceberg during its maiden voyage and sank early in the morning on April 15, 1912.

  • Reading

    Reading 8

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading8

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, March 3rd.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 7

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading7

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Thursday, March 1st.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Homework

    Homework 1

    Your first major assignment is a set of exercises based around a single dataset called rail_trail, which will provide you with practice in creating visualizations using R and ggplot2.

  • Reading

    Reading 6

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading6

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, February 24th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 5

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading5

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Thursday, February 22nd.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 4

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading4

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Thursday, February 15th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Mini-Assignment

    Visualization mini-assignment

    Mini-assignment to practice using RStudio to run code blocks in RMarkdown files and to create visualizations using ggplot2.

  • Reading

    Reading 3

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading3

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, February 10th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Reading

    Reading 2

    Introductory Statistics with Randomization and Simulation

    Read the following:

    • All of chapter 1 except sections 1.3 (but do read subsection 1.3.4), 1.4, and 1.5

    Reading discussion

    Discussion hashtag
    #reading2

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Thursday, February 8th.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Mini-Assignment

    RMarkdown mini-assignment

    Mini-assignment to practice editing RMarkdown files and saving them to GitHub.

  • Reading

    Reading 1

    R for Data Science

    Read the following:

    Reading discussion

    Discussion hashtag
    #reading1

    Remember to post your question about it to the #5-discussion channel in Slack by the due date. To receive an answer credit, reply to a posted question no later than 11:59pm on Saturday, February 3rd.


      Posting guidelines can be found in the Readings section of the syllabus.

  • Mini-Assignment

    Try R Tutorial

    Instructions

    Complete all the levels of the Try R tutorial on codeschool.com before class begins on Tuesday, January 30th. After you complete the interactive tutorial, you will receive a certificate of completion.

    It is recommended that you sign up for an account before starting, as this will let you save your progress.

    Submission

    Take a desktop screenshot of the certificate (Print Screen button) and send it to Dr. Glasbrenner as a Slack Direct Message. The screenshot should show some identifying information. For example, open a small notepad window and type your name there.

  • Mini-Assignment

    Introduce yourself; Twitter Data Science Study; GitHub signup

    Introduce yourself

    Write an introduction about yourself in the #3-members channel on Slack. Include your name, your major, and one thing you knew or had heard about data science before starting this class (this can be from the news, from your major, etc.).

    Can Twitter predict election results?

    Finish reading the editorial and skimming the white paper from the "Can Twitter predict election results?" activity we started during class on January 23rd. Then, post your answer to question 1 in the #5-discussion channel on Slack, using the hashtag #class01 somewhere in your message.

    GitHub account

    Sign up for an account on GitHub (http://github.com) using your @gmu.edu email address. After you sign up, send me your username in a Direct Message.