The following research group descriptions are archived because they are no longer offered, the faculty member is on sabbatical, or the group is taking a break. Please contact the faculty member or an advisor to learn more about these groups.
- Exploring Engagement with Virtual Pet Sites
- Human-Centered Data Science and Large Language Models
- A Systematic Literature Review of Research on Recommender Systems
- Research Design for Games to Teach Data Ethics
- Comparing Content Recommendations based on User-Centered Content Analysis
- Data Visualization and Analytics for Diversity and Inclusion Research
- Turning Visualization Research into Product: Traffigram
- Human-Centered Natural Language Processing and Text Visualization
- Exploring Positionality in Qualitative Coding
- Safety Culture for Professional Pilots
- Research Design for Games to Teach Data Ethics
- Human Centered Natural Language Processing and Text Visualization
- Emotions and Relationship-Building in Online Fanfiction Communities
- Human Centered Natural Language Processing and Text Visualization
- Human Centered Natural Language Processing
- Distributed mentoring and fanfiction data analytics
- Games for Good: Designing a Data Science Ethics Game
- Cultural differences in data privacy perspectives on social media
- Distributed mentoring and fanfiction data analytics
- Data Science Ethnography
- Distributed Mentoring in Online Fanfiction Communities
- Qualitative Coding and Analysis of Affect (Emotion) in Text
- Visualization of Large Text Data Sets
- Analyzing Online Community Data
- Visualization of Large Text Data Sets
- Understanding and Analyzing Eye Tracking Data
- Reading Group: Remixing User Research Methods (Winter 2014)
- Informal Learning in Online Fan Communities
- Analyzing Emotional Content of Text-Based Communication
- Games for Good: Designing and Building Collaborative Games for Engineering and Science
- Social Media Qualitative Analysis: Methods and Practice (SMQA DRG)
- Understanding and Analyzing Eye Tracking Data
Spring 2024
Exploring Engagement with Virtual Pet Sites
Organizers: Alyse Marie Allred, Cecilia Aragon
Virtual pet sites, such as Neopets, are browser games built around the core mechanic of collecting digital pets, often with additional features such as minigames, forums, contests, dress-ups, and a tradesmarket. Although they rose to prominence in the early 2000s, virtual pet sites persist to this day. Moreover, while the original trend was associated primarily with children, many of the current users of these sites are adults, some of whom were the original children who have since grown up. This DRG seeks to understand how adults interact and play with these virtual pet sites, as a point of comparison to existing ethnographic data on how children engage with them. The goal is to produce a paper summarizing the findings and submit it to a conference or journal. Co-authorship on the paper is a possibility for motivated and engaged students.
What students should expect:
- Joining one of four virtual pet sites (Neopets, Flight Rising, Dappervolk, Lorwolf)
- Independently playing at least 30 minutes, three days of the week (1.5 hours playtime total)
- Weekly short reflections (1-3 paragraphs) relaying activities, observations, and other significant interactions
- Weekly 2-hour meetings with the team to share work and discuss observations (meeting time TBD)
- Full reflections at midterms and finals
For questions, please contact: Alyse Allred at alyse.allred@gmail.com.
Autumn 2023 - Spring 2024
Human-Centered Data Science and Large Language Models
This year-long Directed Research Group will explore questions in the field of human-centered data science, as it relates to the development, usability, evaluation and social impacts of the recent proliferation of large language models. Students will lead and participate in original research projects, as they discuss novel and impactful questions in the field by designing and executing their own studies. Students will be working directly with PhD candidate Sourojit Ghosh (G).
Winter 2024
A Systematic Literature Review of Research on Recommender Systems
Led by Sourojit Ghosh, HCDE PhD Candidate
Advised by Cecilia Aragon, HCDE Professor
The goal for this 3-credit DRG is to conduct a systematic literature review of research on recommender systems, to establish a comprehensive understanding of how researchers and designers of recommender systems define "success" in their work. Students who join the DRG will receive the experience of working on a full research project, from start to finish, which will be submitted to a conference at the end of the quarter.
Participants in this DRG will need to be available on Tuesdays at 11 a.m. and commit to closely reading and analyzing a body of academic research papers. Additionally, they will be expected to contribute to the writing and preparing of the submission.
We are looking for 6-8 students (grad or undergrad) who meet the following qualifications:
- Able to manage a heavy research-reading load throughout the quarter.
- Able to commit to at least 6 hours of work a week (including DRG meetings).
Autumn 2022 - Spring 2023
Research Design for Games to Teach Data Ethics
Co-directed by Cecilia Aragon, Bernease Herman, and Sarah Evans
This research group will co-design a game, along with faculty and students from the University of North Texas (UNT), a Hispanic-Serving Institution, to explore issues of ethics and diversity in data science. Students will be hands-on in exploring examples of educational games, brainstorming and providing ideas for games, creating prototypes, and playtesting. Some themes we may consider include data privacy, trust of algorithmic systems, predictive policing, fairness, and others. Our goal is to produce a working prototype of a game, playtest it, and study our own design processes to gain insight into how conflicts in norms and culture may change the learning process.
This will be a two-quarter (with option to continue for the full year) directed research group with the goal of writing and submitting a paper to a top venue in spring 2023. All group members will be offered the opportunity to be co-authors on the paper.
We are looking for a relatively small group of people who are each interested in between 2 and 5 credit hours of credit/no credit grade in HCDE 496/596 for Fall, Winter, and Spring Quarters in 2022-2023. Interested undergraduate and graduate students may apply. Graphic design experience and familiarity with a wide variety of games are recommended but not required for motivated students.
Winter 2023
Comparing Content Recommendations based on User-Centered Content Analysis
In this 2-credit DRG, we are looking for 2-5 students for a research project which intends to compare content recommendation processes for designing social recommender systems. Group members will analyze user-generated content on online fanfiction communities and provide content recommendations to direct users to consume. Some prior experience with qualitative coding or content analysis is preferred, but not required.
Data Visualization and Analytics for Diversity and Inclusion Research
Co-directed by Kimberly Perkins and Cecilia Aragon
Did you know that 95% of airline pilots are men? Did you know that 94% are white? We don’t know the LGBTQI+ percentages because nobody asked.
This research explores why a demographic majority persists despite the industry being open to diversity more than 50 years ago. PhD student Kimberly Perkins has collected over 26,500 answered survey questions (both text and quantitative) from pilots in leadership roles at one major airline based in the United States. The DRG will focus on data cleaning and using Tableau to gain insights and prepare data visualizations from this data set.
Students must have extensive experience with Tableau and data cleaning. Completion of HCDE 411, HCDE 511, or similar data visualization/analysis class is a plus.
We are looking for a relatively small group of people who are each interested in between 2 and 5 credit hours of credit/no credit grade in HCDE 496/596 for Winter Quarter 2023. Interested undergraduate and graduate students may apply. Successful completion of this research may result in co-authorship of an academic paper.
Winter 2023
Turning Visualization Research into Product: Traffigram
This DRG will be led by UW CSE alum/Microsoft software engineer Ken Aragon and HCDE professor Cecilia Aragon.
Are you interested in the process through which a novel visualization algorithm is taken from research prototype to viable product? Do you think interactive maps could be improved by combining the science of visual perception with efficient algorithms?
Work with industry engineers and HCDE faculty to turn novel visualization techniques sponsored by UW’s CoMotion Labs into a marketable product. Participants will have the opportunity to work in areas such as user research, design, and software engineering.
Preference will be given to those with proficiency in programming/software development, data visualization, usability testing, design, or entrepreneurship. Meetings will take place once a week on campus at a time (most likely late afternoon) that best suits all participants’ needs. We are looking for a team of 5-6 dedicated students with a variety of skills and backgrounds to work together with the ultimate goal of creating a viable and successful product.
More information about the research can be found here: Human-Centered Data Science Lab » Traffigram: A Design Methodology for Distance Cartograms.
Autumn 2022
Human-Centered Natural Language Processing and Text Visualization
This research group will apply human-centered techniques in the fields of natural language processing (NLP) and visualization to study very large text corpora, with a specific focus on text visualization. We’re looking for students with experience in either (a) programming and analysis of large text datasets or (b) machine learning and data science. Data visualization or NLP experience is a plus but not required.
Autumn 2022
Exploring Positionality in Qualitative Coding
This 2-credit DRG will take students through the process of qualitative coding of data. Students will be asked to perform open coding on datasets, and meet to discuss agreements/disagreements as they work towards formulating a codebook. As they code, they will be asked to consider how their own positionalities affect their interpretation of data, and what external contexts they applied in their work. The DRG will meet on Wednesdays, 10 a.m. - 12 noon.
We are looking for 5-7 students with little or no prior experience with qualitative coding, but an interest in qualitative research. If you are interested, please email Sourojit Ghosh (ghosh100@uw.edu) with a brief statement of intent, CV, and unofficial transcript. Please use the subject line "Interest in Positionality in Qualitative Coding DRG: <your name>".
Spring 2022
Safety Culture for Professional Pilots
This research group investigates the interactions between socio-cultural issues and the construction and maintenance of a safety culture for professional pilots in the US aviation industry. These interactions will be explored via several methods, including grounded theory explorations of survey data and trace ethnography as well as a genealogy of literature in the safety systems field.
We will explore the tension between ideas of excellence and competence, which are embedded in gendered understandings and prioritize individual pilots, and the understanding in the safety systems world that safety is achieved collectively. Specifically, in our early investigations we discovered that reactions to collective definitions of excellence and competence were highly gendered. Shifts in safety culture that stress the importance of revealing previously invisible power dynamics seem to produce a community threat response to protect traditional or historical ways of piloting. Changes to safety culture are politicized as socially progressive changes to the culture at large, and the culture at large becomes a reservoir for tools and mechanisms for reinforcing the hegemonic status quo.
The goal of this research group is to write a paper to submit to a top venue. All members of the group will be offered the opportunity to be co-authors on the paper. Themes for the paper may include but are not limited to: the collective ethos of professional pilots in the United States; individualist propensities as a threat to collective safety systems; gender as a lens of threat response to traditionalism or historical preservation; or politicking as a form of silencing equity.
This DRG will contain a subgroup to learn Python and use it to extract specific data points to augment the overall research.
This is a closed DRG and will not be accepting applications. The group will meet either via Zoom or in person on Tuesdays, Wednesdays, or Thursdays for two-hour blocks based on members’ availability. This DRG will be 3-6 credit hours of credit/no credit grade in HCDE 496 (undergrads)/596 (grads) for the Spring Quarter in 2022.
Winter 2022 - Spring 2022
Research Design for Games to Teach Data Ethics
Co-directed by Cecilia Aragon, Sarah Evans, Bernease Herman, and Andrea Figueroa
This research group will co-design a game, along with faculty and students from the University of North Texas (UNT), a Hispanic-Serving Institution, to explore issues of ethics and diversity in data science. Students will be hands-on in exploring examples of educational games, brainstorming and providing ideas for games, creating prototypes, and playtesting. Some themes we may consider include data privacy, trust of algorithmic systems, predictive policing, fairness, and others. Our goal is to produce a working prototype of a game, playtest it, and study our own design processes to gain insight into how conflicts in norms and culture may change the learning process.
This will be a two-quarter directed research group with the goal of writing and submitting a paper to a top venue in June 2022. All group members will be offered the opportunity to be co-authors on the paper.
We are looking for a relatively small group of people who are each interested in between 2 and 5 credit hours of credit/no credit grade in HCDE 496/596 for Winter and Spring Quarters in 2022. Interested undergraduate and graduate students may apply. Graphic design experience and familiarity with a wide variety of games are recommended but not required for motivated students.
The group will meet virtually over Zoom to accommodate the UNT students, although we may meet a few times in person at UW before the UNT semester starts. Meetings will be on Thursdays at either 11:30-1, 12-1:30, or 12:30-2 depending on group availability.
Human Centered Natural Language Processing and Text Visualization
This research group will apply human-centered techniques in the fields of natural language processing (NLP) and visualization to study very large text corpora, with a specific focus on text visualization. We’re looking for students with experience in either (a) programming and analysis of large text datasets or (b) machine learning and data science. Data visualization or NLP experience is a plus but not required.
Aviation Safety Research:
A subsection of the research group will focus on analyzing survey responses and creating various forms of data visualization to convey survey results. Experience with survey analysis and data visualization tools, and a working knowledge of the aviation industry, are a plus but not required.
Spring 2021
Emotions and Relationship-Building in Online Fanfiction Communities
Led by HCDE PhD student Sourojit Ghosh, with guidance from Professor Cecilia Aragon
This research group will investigate the role played by shared or conflicting emotions in the process of relationship-building in online communities. We aim to explore that role through extensive qualitative coding of individual fanfiction reviews. This work will be the final quarter of an ongoing research project on this topic, utilizing subsets of a large dataset of fanfiction data collected by the Human-Centered Data Science Lab in previous years. Past explorations with this dataset have put forward the theory of distributed mentoring, a phenomenon where people from all over the world and all age groups collaboratively give and receive support through an informal yet substantive network of constructive advice. The goal for this DRG will be to finish our qualitative coding via a novel collaborative coding and visualization tool, and to contribute to a research paper to be submitted to CSCW this academic year.
Participants in this DRG will gain hands-on experience with large datasets, learning to qualitatively analyze each data point for its rich content while also looking at it in the larger context of the entire set.
Human Centered Natural Language Processing and Text Visualization
This research group will apply human-centered techniques to the field of natural language processing (NLP) to study very large text corpora, with an additional focus on text visualization. We’re looking for students with experience in either (a) programming and analysis of large text datasets or (b) machine learning and data science. No NLP experience is required as we will be reading seminal papers in the field and applying those techniques to a text dataset.
We plan to use a previously-collected dataset of over 61.5 billion words (the largest fiction dataset outside of the Google Books corpus) of stories, reviews, and associated metadata from fanfiction sites as a test dataset for human-centered NLP techniques.
Human Centered Natural Language Processing
This research group will apply human-centered techniques to the field of natural language processing (NLP) to study very large text corpora. We’re looking for students with experience in either (a) programming and analysis of large text datasets or (b) machine learning and data science. No NLP experience is required as we will be reading seminal papers in the field and applying those techniques to a text dataset.
We plan to use a previously-collected dataset of over 61.5 billion words (the largest fiction dataset outside of the Google Books corpus) of stories, reviews, and associated metadata from fanfiction sites as a test dataset for human-centered NLP techniques.
Distributed mentoring and fanfiction data analytics
Spring 2019
Co-directed by Cecilia Aragon and Jenna Frens
This ongoing research project studies informal learning in online fanfiction communities. We are looking for students with experience in either (a) programming and analysis of large text datasets or (b) machine learning and data science, to join an existing research group.
We’ve collected a vast, rich text dataset of over 61.5 billion words (the largest fiction dataset outside of the Google Books corpus) of stories, reviews, and associated metadata from fanfiction sites and have applied both qualitative (ethnography) and quantitative techniques (machine learning, statistical analysis, data visualization) to investigate the relationship between distributed mentoring and writing quality (e.g., grammar, reading level). We have published multiple papers on our research and are in the process of submitting others.
We have found quantitative evidence that distributed mentoring plays a positive role in fanfiction authors’ development as writers, and this quarter’s project continues our efforts with a specific focus on quantitative analysis of our large dataset.
Games for Good: Designing a Data Science Ethics Game
Winter 2019
Co-directed by Cecilia Aragon and data scientist Bernease Herman
This research group will explore the use of analog and digital games to introduce users to ethical and human-centered issues in data science and computing. Students will be hands-on in exploring examples of educational games, brainstorming and providing ideas for games, and creating prototypes using paper and/or a computer game engine such as Unity3D. Some themes we will consider include data privacy, trust of algorithmic systems, predictive policing, fairness, and others.
At the end of ten weeks, we aim to produce a working prototype of the game, including several rounds of playtesting.
We are looking for a relatively small group of people who are each interested in between 2 and 5 credit hours of credit/no credit grade in HCDE 496/596. Interested undergraduate and graduate students may apply. Graphic design experience or programming experience is recommended, but not required for motivated students.
Cultural differences in data privacy perspectives on social media
Winter 2019
The Cambridge Analytica scandal has triggered a discussion about data privacy in social media. As the news regarding this issue has traveled around the world, a worldwide public discussion about data privacy has emerged. Motivated by this context, we aim to answer this research question: Does the public online debate reveal different perspectives on data privacy across countries/cultures? To do so, we have collected Twitter activity associated with data privacy and the Cambridge Analytica scandal in both English and Spanish. Our work will result in insights about the different aspects of data privacy that are emphasized by people in different countries; a characterization of how geography, time, and bots influence the worldwide online conversation on data privacy; and lessons learned about how best to apply human-centered data science techniques to support cross-cultural comparisons of social media data.
We have collected a large-scale Twitter dataset around this issue and are in the process of analyzing the data through both qualitative coding and automated analysis. The research group will take a mixed-methods approach to understanding the data, and as a result we are currently focused on qualitative coding of a large Twitter dataset.
The group is open to both graduate and undergraduate students. Qualitative research experience in grounded theory and qualitative coding is desirable but not required. Bilingualism is a plus, particularly in Spanish. We strongly encourage interested undergrads to apply, even if you have little or no experience with this type of research. This is an excellent opportunity to be introduced to the methods of human-centered data science, as well as a chance to gain valuable insight into the way that research is carried out.
Distributed mentoring and fanfiction data analytics
Co-directed by Jenna Frens, PhD student; Cecilia Aragon, Professor
Winter 2019
Are you interested in applying human-centered data science to study how people learn from online fandom?
This ongoing research project studies informal learning in online fanfiction communities. We are looking for students with experience in either (a) programming and analysis of large text datasets or (b) qualitative research in online fandoms, to join an existing research group. We have published multiple papers on our research and are in the process of submitting others.
We have found quantitative and qualitative evidence that distributed mentoring plays a positive role in fanfiction authors’ development as writers, and this quarter’s project continues our efforts with a specific focus on visual analytics of a large dataset. We’ve collected a vast, rich text dataset of over 61.5 billion words (the largest fiction dataset outside of the Google Books corpus) of stories, reviews, and associated metadata from fanfiction sites and have applied both qualitative (ethnography) and quantitative techniques (machine learning, statistical analysis, data visualization) to investigate the relationship between distributed mentoring and writing quality (e.g., grammar, reading level).
Data Science Ethnography
Big data, the data deluge, the information explosion... there have been many names to describe the overwhelming amount of data that is being generated in just about every scientific domain today. Data science is the term that has emerged to describe the study of the extraction of knowledge from this flood of data, and it can include elements of various fields from computer science to applied mathematics to human centered design and engineering.
However, little is known about the culture and human processes surrounding the emerging practice of data science. A recent five-year, $37.8 million award to UW, UC Berkeley, and NYU from the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation seeks to address this gap.
In this research group, we will utilize ethnographic practices including contextual inquiry, interviews, and participant observation to delve more deeply into the culture of data science on the UW campus. We will participate in scientific efforts in astronomy, oceanography, sociology, and other exciting data-driven fields on the cutting edge of science today.
We are looking for students with a background or interest in ethnography, who are each interested in between 2 and 5 credit hours of credit/no credit grade of HCDE 496/596. This group will meet Mondays from 3:30–4:30 p.m. in Sieg Hall, room 420.
Distributed Mentoring in Online Fanfiction Communities
Are you both a fan and a hacker? Are you interested in studying how people learn from online fandom?
This ongoing research project studies informal learning in online fanfiction communities. We are looking for a small number of experienced programmers interested in fandom to join an existing research group. We have already published one paper on our research (arXiv:1510.01425v2) and are in the process of submitting others.
We suspect that the novel concept of distributed mentoring plays a positive role in fanfiction authors’ development as writers, and this quarter’s project attempts to quantify this effect. We intend to scrape stories, reviews, and associated metadata from fanfiction sites and apply quantitative techniques (machine learning, statistical analysis, data visualization) to investigate the relationship between distributed mentoring and writing quality (e.g., grammar, reading level). Applicants must have spent substantial time outside of class writing scripts to scrape the web and process text, in languages such as Python, Perl, or Bash. No experience in machine learning or visualization is required, although it is a plus.
Qualitative Coding and Analysis of Affect (Emotion) in Text
Co-directed by Cecilia Aragon & Taylor Scott
We are studying creative collaboration in a distributed team of astrophysicists and have collected a large amount of longitudinal data in the form of chat logs. We have been qualitatively analyzing this data to detect and classify emotional content, relate it to events occurring in the group's history, and form a theoretical framework of Distributed Affect. Our initial methods have been successful and promising, and we plan to refine and verify them through further qualitative coding and analysis of the data.
We strongly encourage interested undergraduates to join this group, even if you have little or no experience with qualitative research. This is an excellent opportunity to be introduced to various methods of analyzing text data, and gain insight into the way that such research is carried out.
Participation in this research group should be a good opportunity to:
- Gain valuable practice in qualitative coding of chat log data
- Learn more about the application of methods and theoretical perspectives in qualitative data analysis
- Apply visual analysis as a means of exploring a large data set
- Discover how these methods can be applied to your own areas of interest and research
Visualization of Large Text Data Sets
The amount of informal text communication (e.g. chat, texting, microblogs) in the world is increasing exponentially. Submerged within this text data deluge lies a wealth of information that is potentially valuable to businesses, governments, social scientists, and all human communities. In this research group, we will develop text visualizations with a specific focus on visual concordances that can be applied to very large text data sets.
This will be a two-quarter directed research group with the goal of submitting a paper to Vis 2016 in March 2016. During the first quarter, we will sketch, stretch our visual imagination with hands-on design exercises and critiques, and build and test visualization prototypes in JavaScript and D3. During the second quarter, we will iterate on the research questions, refine our visual prototypes, conduct usability tests of our designs, and write a paper on our results.
Analyzing Online Community Data
- Experience how theory is used to guide analysis of data
- Understand how collaborative analysis of data can be organized
- Learn a new set of theories (externalization of knowledge, creative resonance)
- Learn about publication venues
interested in between 2 and 5 credits. The actual organization of the work will be based on the number of people interested.
Visualization of Large Text Data Sets
The amount of informal text communication (e.g. chat, texting, microblogs) in the world is increasing exponentially. Submerged within this text data deluge lies a wealth of information that is potentially valuable to businesses, governments, social scientists, and all human communities. In this research group, we will develop text visualizations with a specific focus on visual concordances that can be applied to very large text data sets.
This will be a two-quarter directed research group with the goal of submitting a paper to Vis 2016 in March 2016. During the first quarter, we will sketch, stretch our visual imagination with hands-on design exercises and critiques, and build and test visualization prototypes in JavaScript and D3. During the second quarter, we will iterate on the research questions, refine our visual prototypes, conduct usability tests of our designs, and write a paper on our results.
- What makes for compelling game mechanics and narrative storytelling?
- What is the role of social game play and how can game environments support collaboration?
- How are affect and emotion supported in a game environment to promote greater scientific creativity?
Cecilia Aragon
Social Media Qualitative Analysis: Methods and Practice (SMQA DRG)
- Methods
  - developing an integrated methodological framework that can be used by researchers in designing and evaluating studies of social media texts
  - relating existing practice with epistemological bases for existing empirical social science methodologies
  - involves understanding related methodological research in HCI and extensive analysis of existing literature on social media data studies
- Practice
  - identifying and engaging with stakeholders in social media data analysis (including both researchers and, for example, managers who make decisions on the basis of research findings) across both academia and industry
  - contextual inquiry into the practices of researchers and their tool ecosystem
  - involves understanding related CSCW studies of similar populations, designing and distributing surveys, conducting interviews, and observation
Winter 2019
Games for Good: Designing a Data Science Ethics Game
Co-directed by Cecilia Aragon and data scientist Bernease Herman
This research group will explore the use of analog and digital games to introduce users to ethical and human-centered issues in data science and computing. Students will be hands-on in exploring examples of educational games, brainstorming and providing ideas for games, and creating prototypes using paper and/or a computer game engine such as Unity3D. Some themes we will consider include data privacy, trust of algorithmic systems, predictive policing, fairness, and others.
At the end of ten weeks, we aim to produce a working prototype of the game, including several rounds of playtesting.
We are looking for a relatively small group of people who are each interested in between 2 and 5 credit hours of credit/no credit grade in HCDE 496/596. Interested undergraduate and graduate students may apply. Graphic design experience or programming experience is recommended, but not required for motivated students. To apply, fill out the following form explaining your interest in the project, and attach a resume and an unofficial transcript here.
Please send any questions to both Bernease Herman <bernease@uw.edu> and Cecilia Aragon <aragon@uw.edu>.
The group will meet on Thursdays from 3-4:20 p.m. in Sieg 427.
Winter 2019
Cultural differences in data privacy perspectives on social media
Note: This DRG is at capacity for Winter 2018
The Cambridge Analytica scandal has triggered a discussion about data privacy in social media. As the news regarding this issue has traveled around the world, a worldwide public discussion about data privacy has emerged. Motivated by this context, we aim to answer this research question: Does the public online debate reveal different perspectives on data privacy across countries/cultures? To do so, we have collected Twitter activity associated with data privacy and the Cambridge Analytica scandal in both English and Spanish. Our work will result in insights about the different aspects of data privacy that are emphasized by people in different countries; a characterization of how geography, time, and bots influence the worldwide online conversation on data privacy; and lessons learned about how best to apply human-centered data science techniques to support cross-cultural comparisons of social media data.
We have collected a large-scale Twitter dataset around this issue and are in the process of analyzing the data through both qualitative coding and automated analysis. The research group will take a mixed-methods approach to understanding the data, and as a result we are currently focused on qualitative coding of a large Twitter dataset.
The group is open to both graduate and undergraduate students. Qualitative research experience in grounded theory and qualitative coding is desirable but not required. Bilingualism is a plus, particularly in Spanish. We strongly encourage interested undergrads to apply, even if you have little or no experience with this type of research. This is an excellent opportunity to be introduced to the methods of human-centered data science, as well as a chance to gain valuable insight into the way that research is carried out.
Winter 2019
Distributed mentoring and fanfiction data analytics
Co-directed by Jenna Frens, PhD student; Cecilia Aragon, Professor
Note: This DRG is at capacity for Winter 2018
Are you interested in applying human-centered data science to study how people learn from online fandom?
This ongoing research project studies informal learning in online fanfiction communities. We are looking for students with experience in either (a) programming and analysis of large text datasets or (b) qualitative research in online fandoms, to join an existing research group. We have published multiple papers on our research and are in the process of submitting others.
We have found quantitative and qualitative evidence that distributed mentoring plays a positive role in fanfiction authors’ development as writers, and this quarter’s project continues our efforts with a specific focus on visual analytics of a large dataset. We’ve collected a vast, rich text dataset of over 61.5 billion words (the largest fiction dataset outside of the Google Books corpus) of stories, reviews, and associated metadata from fanfiction sites and have applied both qualitative (ethnography) and quantitative techniques (machine learning, statistical analysis, data visualization) to investigate the relationship between distributed mentoring and writing quality (e.g., grammar, reading level).