Data Issues
Problems with FEC source data:
- Individual contributions to candidates are reported in detail only when they exceed $200. To get a complete picture, the quarterly filings must also be taken into account.
- Committee contributions to candidates are stored in the Committee Contributions table, but they come from a total of three sources that must be combined to get contribution totals.
- Party identification is not validated and is not coded consistently. I've cleaned the glaring errors.
- The Individual and Committee contributions tables each contain State, City, and Zip columns. I normalized the tables by yanking State and City out into a snowflake, only to learn that the data is dirty. The database holds whatever the candidates and committees entered, with no data validation. As a result, there are invalid zip codes, invalid state codes, and other assorted garbage: cities entered in the Street Address column, zips entered in the City column, and so on. I put City and State back into the tables, so geographic analysis will be approximate.*
- State population data comes from the Census Bureau using figures released Dec 2007.
* This should help: Zip Codes
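For anyone who wants to measure how dirty the geographic columns are before relying on them, here is a minimal Python sketch of the kind of check described above. It is not the cleanup I ran. The function name `check_row`, the problem labels, and the list of valid state codes (50 states plus DC, territories left out) are all assumptions made for illustration, and a zip that passes the format test can still be a zip code that does not exist.

```python
import re

# Assumption: two-letter USPS codes for the 50 states plus DC.
# Territories and military codes are deliberately left out of this sketch.
VALID_STATES = frozenset("""
    AL AK AZ AR CA CO CT DE DC FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN
    MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA
    WV WI WY
""".split())

# Five digits, optionally followed by a four-digit extension (ZIP+4),
# with or without a hyphen. This checks format only, not existence.
ZIP_RE = re.compile(r"[0-9]{5}(-?[0-9]{4})?")


def check_row(state, city, zip_code):
    """Return a list of problems found in one row's State, City, and Zip values."""
    state = (state or "").strip().upper()
    city = (city or "").strip()
    zip_code = (zip_code or "").strip()

    problems = []
    if state not in VALID_STATES:
        problems.append("invalid state")
    if not ZIP_RE.fullmatch(zip_code):
        problems.append("invalid zip")
    # A City value that looks like a zip code was probably typed in the wrong column.
    if ZIP_RE.fullmatch(city):
        problems.append("zip in city column")
    return problems
```

Running `check_row` over every row and counting the non-empty results gives a rough sense of how approximate the geographic analysis is.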