DopeStats Data Sources Update

Post by armorm2 » Mon Mar 16, 2015 10:59 pm

To make our drug awareness maps more effective, I'm happy to announce that we will be importing data from reliable sources that have conducted similar surveys and studies in the past. The data we already have, and will continue to collect, from anonymous survey respondents will supplement these primary sources of drug use statistics. The end goal is accurate and complete data visualization of drug use throughout the United States. The primary fields we require survey respondents to provide are:

1. County
2. Date substance used or purchased
3. Number of times substance was used on date specified

To be considered, a data source must meet the following criteria:
  1. The source must be a federal, state or academic institution, a municipality, a non-profit, or a related institution (e.g., government-sponsored)
  2. The source's survey data must be compatible with our required fields

For example, a source might meet the first criterion but not ask survey respondents questions (or provide answers) that allow us to deduce the number of times a substance was used during a specific time period (last year, past three months, and daily are all examples of specific time periods). Another source might allow us to deduce the number of times a substance was used during specific time periods, but not provide the geographic breakdown of its survey respondents (i.e., results are not broken down by county or city). In both cases, there is no exact way for us to map the existing survey data onto our dynamic county-based maps. The small area estimation (SAE) methodology used in the National Survey on Drug Use and Health seems promising, but it ultimately produces an estimate for a geographic area and does not guarantee county-level or finer coverage.
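
The two criteria can be written down as a simple check. This is only a minimal sketch to make the rules concrete, not the code the site runs; the `SurveySource` class, its field names, and the sponsor categories are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical sponsor categories, paraphrasing criterion 1.
ACCEPTED_SPONSORS = {"federal", "state", "academic", "municipal", "non-profit"}

# Geographic levels from which a county can be determined (criterion 2).
COUNTY_RESOLVABLE = {"county", "city"}


@dataclass
class SurveySource:
    name: str
    sponsor_type: str       # e.g. "federal", "academic"
    geographic_level: str   # finest level reported: "state", "county", "city"
    has_time_period: bool   # asks about use within a specific time period
    has_use_count: bool     # lets us deduce a number of uses (at least 1)


def is_compatible(source: SurveySource) -> bool:
    """Return True only if the source meets both criteria."""
    meets_sponsor = source.sponsor_type in ACCEPTED_SPONSORS
    meets_fields = (
        source.geographic_level in COUNTY_RESOLVABLE
        and source.has_time_period
        and source.has_use_count
    )
    return meets_sponsor and meets_fields


# Made-up sources: one reports only by state, one reports by county.
state_only = SurveySource("Example State Survey", "federal", "state", True, True)
by_county = SurveySource("Example County Survey", "academic", "county", True, True)

print(is_compatible(state_only))  # False
print(is_compatible(by_county))   # True
```

A source that fails either criterion is rejected, which matches the two counterexamples described above.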

To start things off, the National Survey on Drug Use and Health is not compatible with our systems. The reasoning is as follows:

1. Go to 2013 NSDUH Public-use Data Files
2. Under Methodology, you'll find:

A multistage area probability sample for each of the 50 states and the District of Columbia has been used since 1999. A coordinated sample design was developed for the 2005 through 2009 NSDUHs. The 2013 NSDUH is an extension of the 5-year sample design. Although there is no overlap with the 1999-2004 samples, the coordinated design for 2005 through 2009 facilitated a 50 percent overlap in second-stage units (area segments [see below]) between each two successive years from 2005 through 2009. The 2010-2013 NSDUHs continue the 50 percent overlap by retaining half of the second-stage units from the previous year. This design was intended to increase precision of estimates in year-to-year trend analyses because of the expected positive correlation resulting from the overlapping sample between successive survey years. The 2013 design allows for computation of estimates by state in all 50 states plus the District of Columbia. States may therefore be viewed as the first level of stratification as well as a reporting variable. Eight states, referred to as the large sample states, had a sample designed to yield 3,600 respondents per state for the 2013 survey. This sample size was considered adequate to support direct state estimates. The remaining 43 states (which include the District of Columbia) had a sample designed to yield 900 respondents per state in the 2013 survey. In these 43 states, adequate data were available to support reliable state estimates based on small area estimation (SAE) methodology. Within each state, sampling strata called state sampling (SS) regions were formed. Based on a composite size measure, states were partitioned geographically into roughly equal-sized regions. In other words, regions were formed such that each area yielded, in expectation, roughly the same number of interviews during each data collection period. The eight large sample states were divided into 48 SS regions each. The remaining states were divided into 12 SS regions each. 
Therefore, the partitioning of the United States resulted in the formation of a total of 900 SS regions. Unlike the 1999 through 2004 surveys, the first stage of selection for the 2005 through 2013 NSDUHs was Census tracts....
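
As a quick sanity check on the figures in the quoted methodology, the arithmetic can be reproduced in a few lines. The variable names are my own; the numbers are taken directly from the quote.

```python
# Figures taken from the quoted 2013 NSDUH methodology text.
LARGE_STATES = 8
OTHER_STATES = 43  # includes the District of Columbia

# 48 SS regions per large sample state, 12 per remaining state.
ss_regions = LARGE_STATES * 48 + OTHER_STATES * 12

# 3,600 target respondents per large sample state, 900 per remaining state.
target_respondents = LARGE_STATES * 3600 + OTHER_STATES * 900

print(ss_regions)                        # 900
print(target_respondents)                # 67500
print(target_respondents // ss_regions)  # 75
```

The design target works out to 75 respondents per SS region in both groups of states (3,600 / 48 and 900 / 12), which is consistent with the regions being sized to yield roughly the same number of interviews.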

So the National Survey on Drug Use and Health does not meet criterion #2 because its data is state-based, and therefore not compatible with our county-based maps. On the other hand, if the data were city-based, we could deduce a county from each city, so it would be compatible.
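
Deducing a county from a city amounts to a lookup. Below is a minimal sketch, assuming a hand-built table; `CITY_TO_COUNTY` and `county_for_city` are hypothetical names, and the two entries are illustrative only. A real table would need to come from an authoritative source and handle cities that span more than one county.

```python
from typing import Optional

# Illustrative lookup only, keyed by (city, state abbreviation).
CITY_TO_COUNTY = {
    ("Cincinnati", "OH"): "Hamilton",
    ("Lebanon", "OH"): "Warren",
}


def county_for_city(city: str, state: str) -> Optional[str]:
    """Return the county for a (city, state) pair, or None if unknown."""
    key = (city.strip().title(), state.strip().upper())
    return CITY_TO_COUNTY.get(key)


print(county_for_city("cincinnati", "oh"))   # Hamilton
print(county_for_city("Springfield", "OH"))  # None
```

Returning `None` for unknown cities keeps unmapped responses out of the county maps instead of guessing.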

An example of a survey that is compatible is the PreventionFIRST! Student Drug Use Survey. As you can see in the survey results for Hamilton County, Ohio and Warren County, Ohio, the survey allows us to deduce the county, the dates used ("Past 30 day use"), and a minimum of 1 use for each survey respondent who provided the appropriate answer (13,962 students in Hamilton County, and 14,627 in Warren County).
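
To show how a "Past 30 day use" answer maps onto our three required fields, here is a minimal sketch. The `CountyUseRecord` class and `from_past_30_day` function are hypothetical names, and the survey date and respondent count in the example are placeholders, not figures from the survey.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class CountyUseRecord:
    county: str
    state: str
    period_start: date
    period_end: date
    min_uses: int  # lower bound: one use per respondent reporting any use


def from_past_30_day(county: str, state: str, survey_date: date,
                     respondents_reporting_use: int) -> CountyUseRecord:
    """Build a record from a count of respondents who reported past-30-day use."""
    return CountyUseRecord(
        county=county,
        state=state,
        period_start=survey_date - timedelta(days=30),
        period_end=survey_date,
        min_uses=respondents_reporting_use,  # at minimum 1 use each
    )


# Placeholder inputs for illustration only.
record = from_past_30_day("Hamilton", "OH", date(2014, 3, 1), 120)
print(record.period_start, record.period_end, record.min_uses)
```

Because the survey only tells us that a respondent used a substance at least once in the window, `min_uses` is a lower bound and not an exact count.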

Help us identify existing survey data by posting similar past surveys that you feel meet the criteria above. Use this thread only for posting relevant surveys. Any comments should go in the General Feedback & Support Forum.
