existence is a core attribute in Factual's data. It is a machine-learned numerical value between 0.0 and 1.0 (rounded to the nearest tenth) applied to every POI record, and it indicates how confident we are that the POI is real, open, and not a duplicate. Records deemed not to meet those criteria are assigned an existence of 0.0. We derive these scores by training ML models on a variety of inputs, including social signals such as user check-ins and tags.

By providing a range of confidence values, we enable our partners to filter out data below a threshold that suits their particular needs. A higher threshold yields more accurate data but lower coverage (this tends to be the strategy for mapping and display use cases). A lower threshold yields more comprehensive data but a larger number of bad records (this tends to be the strategy for search and active-user use cases).
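The trade-off can be illustrated with a minimal Python sketch. It assumes each record has been loaded as a dictionary with an "existence" key; the record layout, the helper name filter_by_existence, and the threshold values 0.7 and 0.3 are all illustrative assumptions, not recommended settings or part of any Factual API.

```python
# Hypothetical sample records; real files carry many more attributes.
sample = [
    {"name": "A", "existence": 0.0},
    {"name": "B", "existence": 0.4},
    {"name": "C", "existence": 0.9},
]

def filter_by_existence(records, threshold):
    """Keep records whose existence score is at or above the threshold."""
    return [r for r in records if r["existence"] >= threshold]

# Stricter threshold: fewer records, higher confidence (e.g. display).
display_records = filter_by_existence(sample, 0.7)

# Looser threshold: more records, more bad ones let through (e.g. search).
search_records = filter_by_existence(sample, 0.3)
```

With this sample, the stricter threshold keeps only record C, while the looser one keeps B and C. Record A, with an existence of 0.0, is dropped in both cases.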

Important Considerations

  • Factual delivers country data in its entirety to our partners (i.e., our data files include records with existence equal to 0.0). We therefore strongly recommend that all partners use an existence threshold greater than zero to filter out known closed businesses, duplicates, and junk data.
  • existence is not a probabilistic score; it does not follow that 90% of records with an existence of 0.9 are real, open, and not duplicates.
  • Every country has its own existence model and distribution (e.g., a 0.5 score in the US likely does not represent the same level of confidence as a 0.5 in CA).
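
Because scores are not comparable across countries and files include records at 0.0, one possible approach is to keep a separate threshold per country with a non-zero fallback. The sketch below assumes records are dictionaries with "country" and "existence" keys; the key names, country codes, and every threshold value are hypothetical and would need to be tuned against your own data.

```python
# Hypothetical per-country thresholds; the values are illustrative only.
COUNTRY_THRESHOLDS = {"us": 0.6, "ca": 0.5}

# Fallback for countries without a tuned threshold. It is greater than
# zero, so records with existence 0.0 are always filtered out.
DEFAULT_THRESHOLD = 0.1

def keep_record(record):
    """Return True if the record meets the threshold for its country."""
    threshold = COUNTRY_THRESHOLDS.get(record["country"], DEFAULT_THRESHOLD)
    return record["existence"] >= threshold

records = [
    {"country": "us", "existence": 0.5},
    {"country": "ca", "existence": 0.5},
    {"country": "gb", "existence": 0.0},
    {"country": "gb", "existence": 0.2},
]
kept = [r for r in records if keep_record(r)]
```

In this sample the same 0.5 score is dropped for "us" but kept for "ca", and the "gb" record at 0.0 is removed by the fallback threshold.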