Connecting Data: multi-domain graphs
In a post where I talked about modelling data in graphs, I stated that graphs can be used to model different domains, and discover connections. In this post, we’ll do an exercise in using multi-domain graphs to find relationships across them. An institution can have data from multiple sources, in different formats, with different structure and information. Some of that data, though, can be related. For example, we can have public data on civil marriages, employment records, and land/property ownership titles. What these data sets would have in common is the identity of individuals in our society, assuming of course, they were from the same locality or country. In these scenarios, in order to run a cross-domain analysis we need to find ways to connect the data in a meaningful way, to uncover new information, or discover relationships. We could do that to answer questions like “What percent of the married people own property vs those that don’t”, or more interestingly “who is recently married, and bought property near area X while changing jobs”. ...