How’s Kickstarter Doing These Days?

What kind of projects fail?

We begin by getting an idea of the data set we are dealing with.

We have data of about 378,000 projects from March 2009 to March 2018 (incidentally Kickstarter was launched in April 2009, but whatever).

Lo and behold, the raw data:

ks
## # A tibble: 378,661 x 16
##         ID name  category main_category currency deadline     goal
##      <int> <chr> <chr>    <fct>         <chr>    <date>      <dbl>
##  1  1.00e9 The ~ Poetry   Publishing    GBP      2015-10-09   1000
##  2  1.00e9 Gree~ Narrati~ Film & Video  USD      2017-11-01  30000
##  3  1.00e9 Wher~ Narrati~ Film & Video  USD      2013-02-26  45000
##  4  1.00e9 Tosh~ Music    Music         USD      2012-04-16   5000
##  5  1.00e9 Comm~ Film & ~ Film & Video  USD      2015-08-29  19500
##  6  1.00e9 Mona~ Restaur~ Food          USD      2016-04-01  50000
##  7  1.00e9 Supp~ Food     Food          USD      2014-12-21   1000
##  8  1.00e9 Chas~ Drinks   Food          USD      2016-03-17  25000
##  9  1.00e9 SPIN~ Product~ Design        USD      2014-05-29 125000
## 10  1.00e8 STUD~ Documen~ Film & Video  USD      2014-08-10  65000
## # ... with 378,651 more rows, and 9 more variables: launched <dttm>,
## #   pledged <dbl>, state <fct>, backers <int>, country <chr>, `usd
## #   pledged` <dbl>, usd_pledged_real <dbl>, usd_goal_real <dbl>,
## #   wrap_main_category <chr>

Since we wanted to look at what kinds of projects get funded and what kinds fail, it makes sense to first store new variables that split failing projects from successful ones.

library(tidyverse)  # dplyr, ggplot2 and stringr (str_wrap)
library(ggthemes)   # theme_economist()

# Wrap long category names so the x-axis labels stay readable
ks$wrap_main_category <- str_wrap(ks$main_category, width = 5)
success.fail.bar <- ggplot(ks, aes(x = wrap_main_category, fill = state))
success.fail.bar +
  geom_bar() +
  theme_economist() +
  labs(x = "Project Category", y = "Count of Projects", title = "Kickstarter Projects State")

We see that for most project categories, the success rate is below 50%.
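Success rates are easier to read off a chart when every bar has the same height. Here is a minimal sketch of that idea, assuming a tiny made-up data frame called `toy` in place of the real `ks` data. The dashed 50% line is my own addition for illustration, not something from the original chart.

```r
library(ggplot2)

# Toy stand-in for the real `ks` data (hypothetical values)
toy <- data.frame(
  main_category = c("Dance", "Dance", "Dance", "Games", "Games", "Games", "Games"),
  state = c("successful", "successful", "failed", "failed", "failed", "canceled", "successful")
)

# position = "fill" rescales every bar to height 1, so the fill colours
# show the share of each state instead of the raw project count
share_bar <- ggplot(toy, aes(x = main_category, fill = state)) +
  geom_bar(position = "fill") +
  geom_hline(yintercept = 0.5, linetype = "dashed") +  # 50% reference line
  labs(x = "Project Category", y = "Share of Projects",
       title = "Kickstarter Projects State (proportions)")

share_bar
```

With the real data, swapping `toy` for `ks` and `main_category` for `wrap_main_category` should give the proportional version of the chart above.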

Below are more detailed numbers on that.

We create new variables counting how many projects are in each category, and how many projects fail in each category, and simply divide them to find the fail rate.

# Projects per category, all and failing
ks.all <- ks %>% group_by(main_category) %>% summarise(count = n()) %>% arrange(desc(count))
ks.allfail <- ks.fail %>% group_by(main_category) %>% summarise(count = n()) %>% arrange(desc(count))
# Row-by-row division: correct only if both tables list the categories in the same order
ks.rate <- ks.all %>% mutate(fail = ks.allfail$count / ks.all$count) %>% arrange(desc(fail))
ks.rate
## # A tibble: 15 x 3
##    main_category count  fail
##    <fct>         <int> <dbl>
##  1 Journalism     4755 0.787
##  2 Games         35231 0.742
##  3 Fashion       22816 0.729
##  4 Food          24602 0.700
##  5 Technology    32569 0.697
##  6 Publishing    39874 0.692
##  7 Theater       10913 0.685
##  8 Art           28153 0.658
##  9 Design        30070 0.649
## 10 Film & Video  63585 0.628
## 11 Comics        10819 0.619
## 12 Music         51918 0.534
## 13 Crafts         8809 0.497
## 14 Photography   10779 0.462
## 15 Dance          3768 0.380

Note: the column "count" is the number of projects belonging to the category in the data set; the column "fail" indicates how often projects in that category fail.

Turns out journalism-themed projects fail a lot, and at the opposite end, dance projects fail the least. Both categories have a relatively small number of projects.
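The fail rate per category can also be computed in a single grouped summary, which keeps each count and rate attached to its own category no matter how the rows are sorted. The following is only a sketch on a made-up `toy` table. It assumes that "fail" means any state other than "successful", which may or may not match how the failing subset was built for the table above.

```r
library(dplyr)

# Toy stand-in for `ks` (hypothetical rows, not real Kickstarter data)
toy <- tibble(
  main_category = c("Dance", "Dance", "Dance", "Dance",
                    "Journalism", "Journalism", "Journalism",
                    "Journalism", "Journalism"),
  state = c("successful", "successful", "successful", "failed",
            "failed", "canceled", "failed", "successful", "failed")
)

toy_rate <- toy %>%
  group_by(main_category) %>%
  summarise(
    count = n(),                        # projects in the category
    fail  = mean(state != "successful") # assumption: not successful = fail
  ) %>%
  arrange(desc(fail))

toy_rate
```

Because `count` and `fail` come out of the same `summarise()` call, there is no need to line up two separately sorted tables before dividing.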

Why? Mayhaps it has something to do with the funding goals? Are people deterred by projects with big funding goals? Or are projects with small funding goals deemed not ambitious enough? Are funding goals the answer?

We are going to take a 1% sample of the data.

options(scipen = 1000000)  # avoid scientific notation in the axis labels
ks.sample <- sample_frac(ks, 0.01)
viol <- ggplot(ks.sample, aes(x = state, y = goal))
viol +
  geom_violin(scale = "area") +
  coord_cartesian(ylim = c(0, 1000000)) +
  theme_economist() +
  labs(x = "Project Outcome", y = "Funding Goal", title = "Distribution of Funding States by Funding Goal")

Turns out they aren’t.

The plot only shows that there ARE some crazy ambitious projects (which failed or were cancelled), but funding goals do not seem to really affect success.

Makes sense.

Perhaps we need a closer look.

viol +
  geom_violin(scale = "area") +
  coord_cartesian(ylim = c(0, 100000)) +
  theme_economist() +
  labs(x = "Project Outcome", y = "Funding Goal", title = "Distribution of Funding States by Funding Goal")

See, most of the successfully funded projects sit at the lower end of the funding goal range, as opposed to the other states (such as failed), whose goals are spread out more evenly.
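A violin plot of a random sample changes a little on every run, so it can help to fix the seed and put a few numbers next to the picture. Here is a rough sketch on synthetic data. The `toy` table, the log-normal goal distributions and the seed value are all assumptions made up for illustration, not properties of the real data set.

```r
library(dplyr)

set.seed(42)  # arbitrary seed so the sample is the same on every run

# Synthetic stand-in for `ks`: 500 "successful" and 500 "failed" projects
toy <- tibble(
  state = rep(c("successful", "failed"), each = 500),
  goal  = c(rlnorm(500, meanlog = 8, sdlog = 1),
            rlnorm(500, meanlog = 9, sdlog = 1.5))
)

# Same kind of sampling as in the post, here 10% so the toy sample is not tiny
toy_sample <- sample_frac(toy, 0.1)

# Median and 90th percentile of the goal for each outcome
goal_summary <- toy_sample %>%
  group_by(state) %>%
  summarise(
    n           = n(),
    median_goal = median(goal),
    q90_goal    = quantile(goal, 0.9, names = FALSE)
  )

goal_summary
```

On the real data, a table like this would show whether successful projects really cluster at lower goals, independent of where the plot's y-axis is cut off.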

Does this imply that lower funding goals signify better-calculated projects, or just that they are easier to fund? Perhaps we ought to see how far the projects are from their funding goals.

ks2 <- ks %>%
  mutate(failhard = pledged / goal) %>%  # share of the goal that was pledged
  filter(state != "successful", state != "live") %>%
  select(ID:pledged, failhard, everything())
ks2
## # A tibble: 241,906 x 17
##         ID name  category main_category currency deadline     goal
##      <int> <chr> <chr>    <fct>         <chr>    <date>      <dbl>
##  1  1.00e9 The ~ Poetry   Publishing    GBP      2015-10-09   1000
##  2  1.00e9 Gree~ Narrati~ Film & Video  USD      2017-11-01  30000
##  3  1.00e9 Wher~ Narrati~ Film & Video  USD      2013-02-26  45000
##  4  1.00e9 Tosh~ Music    Music         USD      2012-04-16   5000
##  5  1.00e9 Comm~ Film & ~ Film & Video  USD      2015-08-29  19500
##  6  1.00e9 Chas~ Drinks   Food          USD      2016-03-17  25000
##  7  1.00e9 SPIN~ Product~ Design        USD      2014-05-29 125000
##  8  1.00e8 STUD~ Documen~ Film & Video  USD      2014-08-10  65000
##  9  1.00e8 Of J~ Nonfict~ Publishing    CAD      2013-10-09   2500
## 10  1.00e9 The ~ Crafts   Crafts        USD      2014-10-02   5000
## # ... with 241,896 more rows, and 10 more variables: launched <dttm>,
## #   pledged <dbl>, failhard <dbl>, state <fct>, backers <int>,
## #   country <chr>, `usd pledged` <dbl>, usd_pledged_real <dbl>,
## #   usd_goal_real <dbl>, wrap_main_category <chr>

I added a column called “failhard” to show how far a project is from being funded, computed by dividing the amount pledged by the goal amount.

It doesn’t show in the printout above, though, because only the first seven columns are displayed.

The new column gives the fraction of its funding goal that each project completed.

This way, we can gauge how far the projects are from getting funded.

options(scipen = 1000000)  # avoid scientific notation in the axis labels
ks.sample2 <- sample_frac(ks2, 0.01)
viol2 <- ggplot(ks.sample2, aes(x = state, y = failhard))
viol2 +
  geom_violin(scale = "area") +
  coord_cartesian(ylim = c(0, 1)) +
  theme_economist() +
  labs(x = "Project Outcome", y = "Funding Completion", title = "How Hard did They Fail?")

Most failed projects don’t even pass the 25% mark.
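The "25% mark" reading from the violin plot can be cross-checked with a one-line proportion. Below is a small sketch on made-up numbers. The `toy` table is hypothetical, and `failhard` is computed the same way as in the post (pledged divided by goal).

```r
library(dplyr)

# Toy stand-in for `ks` (hypothetical pledges and goals)
toy <- tibble(
  state   = c("failed", "failed", "failed", "canceled", "failed", "successful"),
  pledged = c(0, 50, 200, 900, 3000, 1200),
  goal    = c(1000, 1000, 1000, 1000, 5000, 1000)
)

# Keep only projects that were not funded, then compute the completion ratio
toy_unfunded <- toy %>%
  filter(state != "successful", state != "live") %>%
  mutate(failhard = pledged / goal)

# Share of unfunded projects that raised less than a quarter of their goal
share_below_quarter <- mean(toy_unfunded$failhard < 0.25)
share_below_quarter
```

Running the same two steps on `ks2` would put an actual number on how many unfunded projects stall below a quarter of their goal.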

Can we conclude that the problem is not that funding goals are too big, or that the funding time limit set by Kickstarter is too short? Perhaps the projects are simply not that interesting.

All in all, we can conclude that a fail rate of 60% is, frankly, pretty low! I initially thought only a very small fraction of projects would be successfully funded.

Turns out 40% of them hit the funding goal.

Projects either get funded or fall very far short of the goal.

This shows it’s still hit or miss, but hey, a 40% chance of funding is pretty big if you ask me.

And we’ll continue another time.

This is, after all, my first rodeo with R.
