JasonCochran/SeniorProject

SeniorProject

Greg and Jason's senior project repo. Contains useful links for research and tech information.

Installation instructions

  • Note: These instructions are for a production setup
  • Clone the git repository: git clone
  • Run 'docker-compose up --build'
  • Download the Chicago crime data in CSV format and put it in the 'crimeCSV' folder
  • Download the Chicago tileset from https://openmaptiles.com/downloads/tileset/osm/north-america/us/illinois/chicago/ and put it in the 'data' folder
  • Run crimecleaning.py in the root directory
  • Run db_create.py and db_load.py in the dblayer container
  • Do work (if you delete the PostGIS container, you have to re-create the database and re-load the data)

Ideas for development order

    1. Build the database -> PostGIS
    2. Set up a Docker deployment system to automate setup and testing
    3. Create the front-end GUI to view basic information from the database
    4. Create the prediction algorithm (PreCog)
    5. Create a caching layer for the GUI, emphasizing speed
    6. Create additional data aggregators and more prediction abilities
    7. Graduate

Purpose for each directory

  • dbLayer - Holds scripts to create a PostGIS database and upload data to it using SQLAlchemy + GeoAlchemy
  • webGUI - Acts as the GUI for the predictive policing software
  • precog - Basic machine-learning precog
  • stats_precog - Basic precog using simple statistics
  • cpdScraper - Scrapes the Chicago Police Department crime database daily for new information
  • twittersuicide - Scrapes Twitter for tweets that indicate possible suicide

Interesting predictive policing links

Algorithms links

Frameworks / tech stack links

Useful research links

Greg's plan

Pipeline: Chicago police data -> PostGIS -> algorithm -> web interface

    1. Divide Chicago into blocks whose size reflects the precision of the crime location data
    2. Map the available crime data onto those geoblocks
    3. Map additional factor data onto those geoblocks
    4. Calculate the information gain/entropy of each attribute across all blocks containing a targeted crime
    5. Sort the attributes by information gain
    6. Branch on the top-ranked attribute, recording its entropy and the attribute
    7. Repeat until the tree is finished: either all attributes have been used or all remaining attributes have low information gain
    8. Run
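
The geoblock and information-gain steps of the plan above can be sketched in plain Python. This is only an illustration, not the project's actual code: the function names, the grid origin, the cell size, and the block attributes ('near_bar', 'well_lit', 'burglary') are all hypothetical, and it uses the standard Shannon entropy and information-gain definitions as commonly applied in ID3-style decision trees.

```python
import math
from collections import Counter, defaultdict


def block_id(lat, lon, origin=(41.60, -87.95), cell_deg=0.01):
    """Map a coordinate to a (row, col) grid cell.

    The origin and cell size are assumed values for illustration only;
    the real block size should match the precision of the crime data.
    """
    row = math.floor((lat - origin[0]) / cell_deg)
    col = math.floor((lon - origin[1]) / cell_deg)
    return row, col


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def rank_attributes(rows, attributes, target):
    """Sort attributes from highest to lowest information gain."""
    return sorted(
        attributes,
        key=lambda a: information_gain(rows, a, target),
        reverse=True,
    )


# Hypothetical per-block records: one dict per geoblock.
blocks = [
    {"near_bar": True, "well_lit": True, "burglary": True},
    {"near_bar": True, "well_lit": False, "burglary": True},
    {"near_bar": False, "well_lit": True, "burglary": False},
    {"near_bar": False, "well_lit": False, "burglary": False},
]

ranking = rank_attributes(blocks, ["well_lit", "near_bar"], "burglary")
print(ranking)  # the first entry is the attribute to branch on
```

In this made-up data, 'near_bar' separates the burglary blocks perfectly, so it ranks first and would be the attribute to branch on. Building the full tree would mean repeating the ranking on each resulting subset of blocks until the remaining gains are low.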

Useful commands

  • Log in to the database (must be run from inside the PostGIS container): psql -h 127.0.0.1 -d predpol -U ppuser
  • Log in to a container: docker exec -it [container name] bash