Existence
existence is a particularly special core attribute within Factual's data. It is a machine-learned numerical value between 0.0 and 1.0 (rounded to the nearest tenth) applied to every POI record, and it indicates how confident we are that the POI is real, open, and not a duplicate. Records deemed not to meet those criteria are set to existence equal to 0.0. We derive these scores by training ML models on a variety of inputs, including social signals such as user check-ins and tags.
By providing a range of confidence values, we enable our partners to filter out data below a threshold that suits their particular needs. The higher the filter threshold, the more accurate and precise the data will be, but the lower the coverage will be (this tends to be the strategy for mapping and display use cases). The lower the threshold, the more comprehensive the data will be, but the more bad records it will contain (this tends to be the strategy for search and active-user use cases).
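The threshold trade-off described above can be sketched in a few lines of Python. This is a minimal illustration, not Factual's tooling: only the existence field comes from the text, while the record layout, the sample names and scores, the helper names filter_by_existence and coverage, and the thresholds 0.7 and 0.2 are all assumptions chosen for the example.

```python
# Hypothetical records: only the "existence" field is described in the text;
# the names and scores here are made up for illustration.
records = [
    {"name": "Cafe A", "existence": 0.9},
    {"name": "Cafe B", "existence": 0.6},
    {"name": "Cafe C", "existence": 0.3},
    {"name": "Cafe D", "existence": 0.0},
]


def filter_by_existence(rows, threshold):
    """Keep rows whose existence score is at or above the threshold."""
    return [row for row in rows if row["existence"] >= threshold]


def coverage(rows, threshold):
    """Return the fraction of rows retained at the given threshold."""
    rows = list(rows)
    if not rows:
        return 0.0
    return len(filter_by_existence(rows, threshold)) / len(rows)


# A stricter threshold (assumed 0.7) for a mapping or display use case.
display_set = filter_by_existence(records, 0.7)

# A looser threshold (assumed 0.2) for a search use case.
search_set = filter_by_existence(records, 0.2)

print(len(display_set), coverage(records, 0.7))
print(len(search_set), coverage(records, 0.2))
```

Comparing the coverage figures at a few candidate thresholds is one way to pick a cutoff that fits a given use case.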
Important Considerations
- Factual will deliver country data in its entirety to our partners (i.e., our data files will include records with existence equal to 0.0). Therefore, it is strongly recommended that all partners use an existence threshold greater than zero to filter out known closed businesses, duplicates, and junky data.
- existence is not a probabilistic score; records with 0.9 existence will not necessarily be real, open, and non-duplicates 90% of the time.
- Every country has its own existence model and distribution (e.g., a 0.5 score in US likely is not the same level of confidence as a 0.5 in CA).
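Because each country has its own model and the delivered files include records with existence equal to 0.0, one possible approach is to keep a separate threshold per country and fall back to a nonzero default. The sketch below assumes each record carries a country field alongside existence. The field name, the helper keep_record, and every threshold value are placeholders for illustration, not recommendations from Factual.

```python
# Hypothetical per-country thresholds. The numbers are placeholders only.
COUNTRY_THRESHOLDS = {"US": 0.5, "CA": 0.6}

# Fallback for countries without a tuned threshold. With scores rounded to
# the nearest tenth, 0.1 is the smallest value that still excludes 0.0.
DEFAULT_THRESHOLD = 0.1


def keep_record(record, thresholds=COUNTRY_THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Return True if the record meets the threshold for its country."""
    threshold = thresholds.get(record["country"], default)
    return record["existence"] >= threshold


# Made-up sample records; "country" is an assumed field name.
records = [
    {"name": "Shop A", "country": "US", "existence": 0.5},
    {"name": "Shop B", "country": "CA", "existence": 0.5},
    {"name": "Shop C", "country": "FR", "existence": 0.0},
    {"name": "Shop D", "country": "FR", "existence": 0.1},
]

kept = [r["name"] for r in records if keep_record(r)]
print(kept)
```

With these placeholder values, the same 0.5 score passes in one country and fails in another, which reflects the point that scores are not directly comparable across countries.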