Playground Earth

As the name reveals, this means we create a matrix of the number of occurrences for every possible combination of two of our 235 categories. This co-occurrence matrix is best visualized as a network graph, which shows the connections between our categories and highlights the activities and connections that co-occur most often.

From this chart we can immediately see some of the key categories that occur most often: food, park, museum, store, restaurant, historic site, natural feature and a few more. Some connections are also quite clear: ‘Place of Worship’ is clearly linked with ‘Church’, ‘Mosque’, ‘Hindu Temple’ and ‘Sacred & Religious site’, but also with ‘Architectural Building’, while ‘Store’, ‘Restaurant’, ‘Cafe’, ‘Bar’ and ‘Food’ often go hand in hand.

The goal is to define a manageable subset of root categories to which all other categories can be related back through these co-occurrences.

Fig. 2 Root categories

To establish these initial root categories, we use the related table behind the chart, which includes the weight of occurrence for each possible combination of categories. Looking at the distribution of this weight aggregated per activity, we start by filtering on categories where the weight > 200. This leads us to the root activities displayed in Fig. 2. The friendly names to which we mapped these categories give us 12 distinct categories, which we will now consider our root category subset for further analysis.

The next step is to derive the related root category for each of the 235 categories that is not in the above list. For this we can build a parent-child hierarchy and loop through the categories based on weight until we reach the first root value. In other words, for each category that is not in the above list, we look up the category it co-occurs with most by weight. If that category is not a root, we look up its most co-occurring category in turn, and continue until we find a category that is a root.

The example below shows how this works. In this case we want to find the root category for ‘Campground’, which turns out to be ‘Park’ (note that when weights are equal, alphabetical order prevails):

Fig. 3 Relation tree for category ‘Campground’ ending in root category ‘Park’

Applying this transformation to each category gives us one (or multiple) root categories for each activity. So, back to Fabrica la Aurora as an example: initially we saw the following categories from Google and TripAdvisor combined:

art galleries; art gallery; cafe; jewelry store; furniture store; home goods store; store; restaurant; food

The transformation implemented above generalizes these to the following root categories, respectively:

museum & gallery; museum & gallery; food & drinks; shopping; shopping; shopping; shopping; food & drinks; food & drinks

or, when duplicates are removed: Museum & Gallery, Shopping and Food & Drinks. Not bad! This means the activity will be included in the algorithm’s output whenever any (or all) of these three categories are given as input.

After this step, let’s take stock of the number of activities that will be filtered for each of these automatically derived root activities:

Fig. 4 Number of activities per root category, produced with Seaborn

Manual Intervention

This approach is obviously not watertight, so here I would like to intervene manually in some categories. Let’s take a more detailed look at which subcategories make up the four main categories and the ‘Other’ category (which we want to be as small as possible).
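
To make the co-occurrence weights and the root lookup described earlier more concrete, here is a minimal Python sketch. It is not the code behind this post: the function names, the tiny `activities` sample and the single-entry `roots` set are all made up for illustration, and skipping already-visited categories is an extra assumption added so the walk cannot loop between two categories that point at each other.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_weights(activities):
    """Count how often each unordered pair of categories shares an activity."""
    weights = Counter()
    for categories in activities:
        # sorted(set(...)) gives every pair once, in a stable (a, b) order.
        for a, b in combinations(sorted(set(categories)), 2):
            weights[(a, b)] += 1
    return weights


def neighbours(weights):
    """Turn pair counts into {category: {other category: weight}}."""
    linked = defaultdict(dict)
    for (a, b), w in weights.items():
        linked[a][b] = w
        linked[b][a] = w
    return linked


def find_root(category, linked, roots):
    """Follow the strongest co-occurrence link until a root category is hit."""
    seen = set()
    current = category
    while current not in roots:
        seen.add(current)
        # Assumption: ignore categories already visited to avoid cycles.
        candidates = {c: w for c, w in linked.get(current, {}).items()
                      if c not in seen}
        if not candidates:
            return None  # no path to any root category
        # Highest weight wins; equal weights fall back to alphabetical order.
        current = min(candidates, key=lambda c: (-candidates[c], c))
    return current


# Hypothetical sample data, not taken from the real dataset.
activities = [
    ["campground", "rv park", "park"],
    ["campground", "rv park"],
    ["rv park", "park", "natural feature"],
    ["park", "natural feature"],
]
roots = {"park"}

weights = cooccurrence_weights(activities)
linked = neighbours(weights)
print(find_root("campground", linked, roots))  # park
```

In this toy sample ‘campground’ first moves to ‘rv park’ (weight 2) and from there to the root ‘park’. Without the visited-set guard, the alphabetical tie-break at ‘rv park’ would send the walk straight back to ‘campground’, which is why the guard is included in the sketch.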
