Example Dataset – Social Media

Do NOT use this dataset for your final Data Jam project.




Social media platforms are web-based tools that create a community for sharing content and messages online. The first social media sites were created in the late 1990s and early 2000s. These early sites have since been replaced by other sites and applications that are used by millions of people around the world. Social media now accounts for more than 24 billion dollars in global revenue.


Social media has become a major part of modern society, used by hundreds of millions of people around the world. It serves as an important marketing tool for businesses and allows users to stay up to date with current events. In 2019, people spent an average of more than two hours per day on social media.






Example of phone survey questionnaire.
Photo from: www.123rf.com

Researchers from the Pew Research Center studied trends in use of technology and social media in the United States. They collected survey data between January and April of 2018 using the following methods:

  • Surveys were conducted over the phone, by mail, and in person.
  • Researchers wrote survey questions based on their study goals. They tested the questionnaire by interviewing a small number of people, then revised the questions based on the quality of the responses and how well respondents understood the wording.
  • A total of 3,803 adults and teens living in the US responded to the survey.
  • Calls were placed to randomly generated phone numbers using random digit dialing software. People who responded to the survey on their cell phone were offered a $5 cash incentive for participating.
  • Researchers used a process called “data weighting” to ensure the data was representative of the gender, age, education level, and ethnicity of the US population.
  • Results were published in two different papers, one focusing on teen use of social media and one focusing on adult use.
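
To give a feel for what “data weighting” means, here is a minimal Python sketch of one simple approach, assuming only a single grouping variable. It is not the procedure the Pew researchers used, which balanced several characteristics at once. The function name, the group labels, the population shares, and the survey answers below are all made up for illustration. Each respondent gets a weight equal to their group's share of the population divided by that group's share of the sample, so groups that were over-sampled count a little less and groups that were under-sampled count a little more.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent: population share / sample share of their group."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    weights = []
    for group in sample_groups:
        sample_share = counts[group] / n
        weights.append(population_shares[group] / sample_share)
    return weights


# Hypothetical sample: 10 respondents, 7 adults and 3 teens.
respondents = ["adult"] * 7 + ["teen"] * 3
# Made-up answers to "Do you use social media?" (1 = yes, 0 = no).
uses_social_media = [1, 1, 1, 0, 1, 0, 1, 1, 1, 1]
# Made-up population shares (NOT real US figures).
shares = {"adult": 0.9, "teen": 0.1}

weights = poststratification_weights(respondents, shares)

# Compare the plain average with the weighted average.
unweighted = sum(uses_social_media) / len(uses_social_media)
weighted = sum(w * y for w, y in zip(weights, uses_social_media)) / sum(weights)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this made-up sample, teens are over-represented compared with the assumed population, so their answers are weighted down and the weighted average moves toward the adults' answers.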





CODAP is free educational software for data analysis. It is a product of The Concord Consortium (https://concord.org) and is funded by NSF grants.