Auto-linking self relation working funky

Auto-linking a self-relation is a great way to find duplicates, even duplicates defined by a combination of fields (composite unique). It works, but a few things are a bit off:

  1. It shows its own entity inside the relationship.
  2. It creates two relationship fields for the same thing. This kind of makes sense, as one relationship is “all entities that are like me” and the other is “all entities I am like”, but for a self-relation the two are the same.
  3. It allows you to create a one-to-one or one-to-many relationship, but things don’t link if you do that. It must be many-to-many to work properly.

My suggestion is to have a special case for auto-linking a self-relation: only create one relation field, don’t show the entity itself in it, and force it to be to-many. Alternatively, disallow auto-linking on self-referencing relations and introduce a separate “Duplicates” field that essentially does the same thing, where you can set which fields to use to find duplicates.
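To make the suggested behaviour concrete, here is a minimal sketch in plain Python. It is not Fibery’s implementation or API; `find_duplicates`, the `id` key, and the sample fields are all hypothetical names chosen for illustration. It groups entities by a composite key and gives each entity a single to-many list of matches that leaves out the entity itself.

```python
from collections import defaultdict


def find_duplicates(entities, key_fields):
    """Map each entity id to the ids of the *other* entities sharing its composite key.

    Hypothetical helper for illustration only; not part of any Fibery API.
    """
    # Group entity ids by the tuple of values in the chosen fields.
    groups = defaultdict(list)
    for entity in entities:
        key = tuple(entity[field] for field in key_fields)
        groups[key].append(entity["id"])

    # One symmetric, to-many "Duplicates" list per entity, excluding the entity itself.
    duplicates = {}
    for ids in groups.values():
        for entity_id in ids:
            duplicates[entity_id] = [other for other in ids if other != entity_id]
    return duplicates


# Made-up sample data: two contacts share the same first and last name.
contacts = [
    {"id": 1, "first": "Ann", "last": "Lee"},
    {"id": 2, "first": "Ann", "last": "Lee"},
    {"id": 3, "first": "Bob", "last": "Lee"},
]
result = find_duplicates(contacts, ["first", "last"])
```

With this shape an entity with no duplicates has an empty list, so a rule such as “notify when the Duplicates field is not empty” would not need to account for the entity matching itself.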

This might break other people’s workspaces if they have duplicate-detection rules that expect 1 entity in the relation by default, but it would be better in the long term. It’s a clean way to find duplicates and even send automatic notifications when duplicates are created.

We are already implementing functionality to find (and, if necessary, remove) duplicates.

Assuming the underlying objective is to prevent duplicates from existing, the ‘unique fields’ feature is part of the long-term solution for this. It will probably be extended at some point to cover uniqueness as determined by multiple fields.
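As a rough illustration of what uniqueness across multiple fields could mean at entry time, here is a small Python sketch. It is only an assumption about the general idea, not how the ‘unique fields’ feature works; `UniqueStore`, `DuplicateError`, and the field names are hypothetical.

```python
class DuplicateError(ValueError):
    """Raised when a new entity collides with an existing one on the unique fields."""


class UniqueStore:
    """Hypothetical in-memory store that rejects entities with a repeated composite key."""

    def __init__(self, key_fields):
        self.key_fields = tuple(key_fields)
        self._by_key = {}  # composite key -> stored entity

    def add(self, entity):
        # Build the composite key from the configured fields.
        key = tuple(entity[field] for field in self.key_fields)
        if key in self._by_key:
            existing_id = self._by_key[key]["id"]
            raise DuplicateError(f"duplicate of entity {existing_id}")
        self._by_key[key] = entity
        return entity

    def __len__(self):
        return len(self._by_key)


store = UniqueStore(["first", "last"])
store.add({"id": 1, "first": "Ann", "last": "Lee"})
store.add({"id": 2, "first": "Bob", "last": "Lee"})

try:
    store.add({"id": 3, "first": "Ann", "last": "Lee"})
    rejected = False
except DuplicateError:
    rejected = True
```

The difference from a duplicates relation is that the colliding entity is never stored, so there is nothing to clean up afterwards.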

Apart from these things, what is the underlying use case for creating a db relationship to ‘duplicates’?

The auto-linking works better than the in-house implementation for a few reasons.

  1. You can set composite unique constraints, i.e. check whether a few values together are unique.
  2. It can be triggered by automation (either to merge automatically or to notify).
  3. You can have a live view of duplicates. The views created by Fibery’s de-duplication only show duplicates at a specific point in time, and then get outdated as the data grows.

This would indeed solve the use case I’m building! It would be even better, as it would prevent the duplicate data from being entered, as opposed to requiring it to be fixed afterwards.

That said, you don’t always want to remove duplicates, and in those cases it would be nice to just see a live “Duplicates” view (within an entity, and for the whole DB).