Project 90 (Partial Data)

Project 90 was a prospective study of the influence of network structure on the dynamics of HIV transmission in a community of high-risk heterosexuals. The data were collected between 1988 and 1992 in Colorado Springs, CO, and the project was funded by the Centers for Disease Control and Prevention (CDC). For more details on the Project 90 study, please see the references below.

Stephen Muth and John Potterat kindly provided the data to Sharad Goel and Matthew Salganik in 2007, and the data were later used in their paper, S. Goel and M. J. Salganik (2010) "Assessing Respondent-Driven Sampling," Proceedings of the National Academy of Sciences (PNAS). The release of these data allows others to replicate the analyses of Goel and Salganik.

Data Release

Included in this release are two tab-separated files, edges.tsv and nodes.tsv, that describe the structure of the Project 90 network and the individual-level attributes of study participants.

edges.tsv: Each row indicates an edge in the network, specified by a pair of node ids. Edges represent social, sexual, and/or drug affiliation. Each edge is recorded twice, once in each direction: if the file contains an edge between 12 and 15, it also contains an edge between 15 and 12. There are 43,288 edges in the file.
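
Because each edge appears once in each direction, it can help to collapse the file into a set of unordered pairs before analysis. The following is a minimal sketch in Python, not part of the original release. It assumes the first two tab-separated fields of each row are integer node ids, and it skips any row that does not parse that way (for example a header row, if the file has one). The function name read_undirected_edges and the sample rows are illustrative only.

```python
import csv
import io


def read_undirected_edges(handle):
    """Collapse a symmetric edge list into a set of unordered (low, high) pairs."""
    edges = set()
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        try:
            a, b = int(row[0]), int(row[1])
        except ValueError:
            continue  # skip a header row, if present (assumption)
        # Store each pair in sorted order so (12, 15) and (15, 12) coincide.
        edges.add((min(a, b), max(a, b)))
    return edges


# Made-up sample rows in the assumed layout; this is not Project 90 data.
sample = "12\t15\n15\t12\n12\t20\n20\t12\n"
edges = read_undirected_edges(io.StringIO(sample))
print(len(edges), sorted(edges))
```

To apply the same function to the released file, open it with open("edges.tsv", newline="") and pass the handle in. If every edge is indeed listed in both directions, the resulting set should hold half as many pairs as the file has edge rows.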

nodes.tsv: Each row corresponds to a study participant. In addition to node id, the following attributes are listed for each individual:

  • Race (1 = Native American; 2 = Black; 3 = Asian/Pacific Islander; 4 = White; 5 = Other)
  • Gender (0 = Male; 1 = Female)
  • Sex Worker (0 = No; 1 = Yes)
  • Pimp (0 = No; 1 = Yes)
  • Sex Work Client (0 = No; 1 = Yes)
  • Drug Dealer (0 = No; 1 = Yes)
  • Drug Cook (0 = No; 1 = Yes)
  • Thief (0 = No; 1 = Yes)
  • Retired (0 = No; 1 = Yes)
  • Housewife (0 = No; 1 = Yes)
  • Disabled (0 = No; 1 = Yes)
  • Unemployed (0 = No; 1 = Yes)
  • Homeless (0 = No; 1 = Yes)

Missing values are denoted by 'NA'. There are 5,492 individuals in the file. Please direct any questions to Prof. Matthew Salganik.
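
The coding scheme above can be applied when loading nodes.tsv, for instance to treat 'NA' as missing and to tabulate one-way frequencies of an attribute. The sketch below is an illustration in Python under stated assumptions, not a documented interface to the data: it assumes the columns appear in the order listed above, with the node id first, and that node ids are numeric. The column names in COLUMNS, the function names, and the sample rows are all hypothetical.

```python
import csv
import io
from collections import Counter

# Race codes as given in the attribute list above.
RACE_LABELS = {"1": "Native American", "2": "Black",
               "3": "Asian/Pacific Islander", "4": "White", "5": "Other"}

# Hypothetical column names, assumed to follow the order of the attribute list.
COLUMNS = ["id", "race", "gender", "sex_worker", "pimp", "sex_work_client",
           "drug_dealer", "drug_cook", "thief", "retired", "housewife",
           "disabled", "unemployed", "homeless"]


def read_nodes(handle):
    """Read tab-separated rows into dicts, mapping 'NA' to None."""
    nodes = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or not row[0].isdigit():
            continue  # skip blank lines and a header row, if present (assumption)
        nodes.append({name: (None if value == "NA" else value)
                      for name, value in zip(COLUMNS, row)})
    return nodes


def one_way_frequencies(nodes, attribute):
    """Count how often each code (including None for missing) occurs."""
    return Counter(node[attribute] for node in nodes)


# Made-up sample rows in the assumed layout; this is not Project 90 data.
rows = [["1", "4", "0"] + ["0"] * 11,
        ["2", "2", "1"] + ["0"] * 11,
        ["3", "NA", "1"] + ["0"] * 11]
sample = "\n".join("\t".join(r) for r in rows)
nodes = read_nodes(io.StringIO(sample))

for code, count in one_way_frequencies(nodes, "race").items():
    print(RACE_LABELS.get(code, "missing"), count)
```

Codes are kept as strings so that a missing value stays distinct from the code 0. Frequencies computed this way on the released file could be compared against the one-way frequencies given in the readme.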

readme: In addition, this text file gives one-way frequencies of all the individual attributes.

References

  • Goel S., Salganik M. J. (2010) "Assessing Respondent-Driven Sampling," Proceedings of the National Academy of Sciences of the United States of America (PNAS) 107:6743-6747.
  • Potterat J. J., et al. (2004) "Network Dynamism: History and Lessons of the Colorado Springs Study" in Network Epidemiology: A Handbook for Survey Design and Data Collection, ed Morris M. (Oxford University Press, Oxford), pp 87-114.
  • Woodhouse D. E., et al. (1994) "Mapping a Social Network of Heterosexuals at High Risk for HIV Infection," AIDS 8:1331-1336.
  • Klovdahl A. S., et al. (1994) "Social Networks and Infectious Disease: The Colorado Springs Study," Social Science and Medicine 38:79-88.
  • Rothenberg R. B., et al. (1995) "Social Networks in Disease Transmission: the Colorado Springs Study" in Social Networks, Drug Abuse, and HIV Transmission, eds Needle R. H., Coyle S., Genser S. G., Trotter R. T. (National Institute on Drug Abuse) Vol. 151, pp 3-19.

Registration Required

To access these datasets, please log in or register as a user of the data archive.