Introduction to Hyphe: A new web crawler for analyzing controversies

Workshops directed by Mathieu Jacomy, Médialab, Sciences Po, Paris
Monday, Jan 26, 2015 - 3:00pm to Wednesday, Jan 28, 2015 - 5:00pm
  • Monday, January 26, 3 - 5 p.m. and Wednesday, January 28, 3 - 5 p.m.
  • Laboratory for Digital Cultural Heritage, first floor, Research Library

Research on controversies, whether related to immigration, the environment, or police behavior, can be greatly facilitated by crawling the websites maintained by the actors involved in a controversy and analyzing their online connections.

These workshops introduce non-technical users to Hyphe, a new web crawler that lets researchers control how a web corpus is built (by filtering and qualifying the websites to include in the corpus) while providing powerful tools capable of handling the huge amount of data available on the web.

Built on modern and robust technologies such as Lucene, MongoDB, Scrapy, Twisted, Thrift, Domino.js, Sigma.js, and Bootstrap, Hyphe can manage multiple corpora within each instance, work around crawling issues (redirections, cookies, JavaScript-only pages, …), handle multi-website entities from the web interface, tag the results, and so on.

Hyphe is easy to use. Workshop participants will simply need a laptop equipped with a conventional web browser (Chrome, Firefox, etc.) and access to the internet. Depending on time and interest, the Wednesday workshop will also provide an overview of Gephi.

Mathieu Jacomy, a software developer, works at Médialab, Sciences Po, a research center in Paris connecting social scientists with new digital tools. Jacomy was part of the team that developed Gephi, a tool for visualizing networks.

Bastian, M., Heymann, S., and Jacomy, M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.

Thanks to support from: the UCLA International Institute; UCLA Interdisciplinary and Cross-campus Affairs; the UCLA School of Law; the UCLA Graduate School of Education and Information Studies; and the Irene Flecknoe Ross Lecture Series in the UCLA Department of Sociology. The Irene Flecknoe Ross Lecture Series is made possible by a gift from Ray Ross in memory of his wife.