Skip to content

cmwardcode/kickstarter-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

An Analysis of Kickstarter Campaigns

Overview of Project

Organizing, sorting, an analyzing excel data for a crowdfunding campaign. Deliverable 1: Outcome based on Launch Date Chart, Deliverable 2: Outcomes based on goals chart, Deliverable 3: Written analysis of results (README)

Skill Development

Import data into tables for analysis; apply filters, conditional formatting, and formulas; generate and interpret pivot tables; calculate summary statistics - central tendency, standard deviation, and variance; characterize data to identify outliers in data sets; perform an excel analysis with visualizations; interpret common Excel visualizations

Purpose

An up-and-coming playwright wants to design a crowdfunding campaign to fund her play “Fever” which she estimates will cost $12,000. The playwright has provided a dataset for which the goal was to determine whether there are specific factors that make a campaign successful. The data was organized and sorted in excel to prepare for analysis. The goal of the analysis was to understand the campaigns from start to finish, determine what factors that make a project's campaign successful, and how to develop a campaign to mirror those that were successful. After this initial analysis another was done to identify how different campaigns fared in relation to their launch dates and funding goals.

Analysis and Challenges

Initial Look

The dataset was comprised of 4113 crowdfunding campaigns. Information provided about each campaign: Id, Name, Blurb, Goal, Amount Pledged, Success Outcome, Country, Currency, Deadline Date, Launch Date, Staff Pick, Number of backers, spotlight, Category and Subcategory. Minor adjustments were made to clean the data through proper cell formatting.

Filtering & Formatting

The “Outcome” column was color-coded using conditional formatting to easily identify the outcome of the campaign. A new column was then added to compute the “Percentage Funded” to easily determine the deficit between the “Goal” and “Pledged” columns. The “Percentage Funded” used the ‘=ROUND’ function and value shading was added. It was then that an outlier was identified because of the limited color transition. “Average Donation” column was added and used the ROUND function again to average the pledged amount by backers. The “Category and Subcategory” column was filtered to only view “Theater” category. Two additional columns were added to filter the data to show more detail. The “Category and Subcategory” were split into “Parent Category” and “Subcategory” columns. The “Deadline” and “Launched_at” columns were in Unix Timestamps so they were converted using the Date Created Conversion, ‘=(((J2/60)/60)/24)+DATE(1970,1,1)’ and moved to new columns, “Date Created Conversion” and “Date Ended Conversion”.

Errors & Debugging

The Average Donation column threw some errors so to correct that the formula was altered to include the IFERROR formula. The new formula is the following: ‘=IFERROR(ROUND(E2/L2,2),0)’

Tables & Visualizations

A new sheet was created “Category Statistics” that created a visual representation of pivot table. The Pivot Chart showed Theater Outcomes in the US. Another Pivot Chart was added to allow for further detail. The next sheet is “Subcategory Statistics” which visualizes the subcategory of Theater: Plays and its outcomes. A pivot table and chart were made on the sheet “Theater Outcomes by Launch Date” to allow for easy determination of what month was the most successful. A sheet labeled “Edinburgh Research” was created to home in on plays that were a part of the Edinburgh Festival Fringe to identify how they were started and funded. The formula used to pull this information was ‘=VLOOKUP’. A new sheet named “Descriptive Statistics” was created to analyze the measure of central tendency and spread for both successful and failed goal and pledged amount of play campaigns in the US. The functions used were ‘=AVERAGE’, ‘=MEDIAN’, ‘=STDEV.P’, and ‘=QUARTILE.EXC’.

Analysis of Outcomes Based on Launch Date

A new column was added to the Kickstarter dataset for “Years” and the YEAR() function was used on the “Date Created Conversion” column. A new sheet was labeled “Theater Outcomes by Launch Date.” A pivot table was inserted into the sheet that filtered the “Parent Category” and “Years” against different outcomes of theater campaigns. A pivot table line chart was then inserted to visualize the relationship between launch month and outcomes. Theater_Outcomes_vs_Launch

Analysis of Outcomes Based on Goals

A new sheet was added and labeled “Outcomes Based on Goals.” A table was labeled with Goal amounts, number of successful, failed, and canceled, number of total projects, and then percentage of successful, failed, and canceled. The goal column contained ranges of amounts to group the plays by. The function used was the ‘=COUNTIF’ to identify the number of various outcomes. An example of the code is ‘=COUNTIFS(Kickstarter!D:D,"<1000", Kickstarter!F:F, "successful", Kickstarter!N:N, "theater/plays")’. The total projects used ‘SUM()’ and the percentage columns used cell formatting and dividing number of outcome by total projects. A line chart was then populated to reflect the results visually. Outcomes_vs_Goals

Challenges and Difficulties Encountered

The challenge was identifying if there was an outlier in the data. Because of the limited color transition in the “Percentage Funded” column of the Kickstarter Data. After reviewing the data in that column there was just large returns on the pledged amount with very small amounts on for the goal. They are valid values, so they remained included in the data.

Results

  • Two conclusions you can draw about the Outcomes based on Launch Date. The most successful month to launch the campaign would be in May as it had the highest amount of success. The second is that the highest fail rate was also in May but when comparing the failed to the success there was over 50% more successful than failed.
  • It can be concluded that the most successful goal amount was less than $1000 based on the Outcomes Based on Goals. The anticipated amount for the “Fever” production was $12000 which has a success rate of 54%.
  • Limitations of this dataset is the amount of unnecessary data. The extra data to include campaign types outside of theater and how many different countries it was drawn from created more of challenge to sort through. Other limitations would be more data on these campaigns such as including how the campaign was advertised and designed as that factor could contribute to the outcome of the campaign.
  • Other possible tables and graphs that could be to compare the duration of campaign to the percentage funded to see if there is a amount of time that leads to meeting or exceeding the funding goal.

About

Excel analysis with visualizations

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published