Public ID cannot be selected in the Manage duplicates section’s Update existing Test based on and Skip duplicates based on fields, because:
ISSUE: those two fields only allow the selection of a mapped field (i.e. one that is being imported and mapped to an existing field).
And of course Public ID cannot be imported, since it is immutable, so it is never exposed in those two fields and therefore cannot be used to match existing entities.
I have posted this under “Bugs & Issues” because the Public ID field is the most important/safe field when importing data for existing entities, since it is the immutable, truly unique ID of an entity across time (i.e. even if its associated entity is deleted, the released ID will never be re-assigned to another new entity).
It’s a valid comment, but there are some technical challenges with using Public Id as one of the import fields.
For example, there could be issues if the import rows include one where the Public Id has a value which is equal to the value of a deleted entity.
It presumably wouldn’t make sense to update the entity in the trash bin, but it wouldn’t be acceptable to create a new entity with a duplicate Id to the deleted one.
There might also be other issues that I haven’t yet thought of.
You’re right that using Public ID raises edge cases around deleted entities. But that applies even now: matching on “name/label” or any other field, such as “own_custom_unique_key”, can also produce rows whose value matches that field on a deleted entity, even though the field is not the Public ID.
The only difference I see:
In the case of using a non-Public-ID field
The importing user would have the valid expectation that even if a row matches an entity in the trash it should not update the deleted entity, but create a new entity.
But in the case of using the Public ID field
If it matches an entity in the trash, even if it does not update the deleted entity, it CANNOT create a new entity, since Public ID is immutable. As for the importing user’s expectation, I would argue that the very act of including a value for the immutable Public ID in a row, signals the update-only (not update or create new) intent for that row. Only an empty Public ID would mean create.
Import semantics
In fact you have raised a valid point about “import semantics”. Perhaps in the Manage duplicates component there should be a note stating the current behaviour, which I presume is: Entities in Trash are not checked.
And if Public ID is selected, then a better version of something like this: Rows with a non-empty Public ID which does NOT match any entity are ignored.
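To make the proposed semantics concrete, here is a minimal Python sketch of the per-row decision. It is only an illustration of the rules suggested above, not how the importer actually works: `plan_row`, `Action` and `active_ids` are hypothetical names, and it assumes the presumed current behaviour that entities in Trash are not checked.

```python
from enum import Enum
from typing import Optional, Set


class Action(Enum):
    CREATE = "create"
    UPDATE = "update"
    IGNORE = "ignore"


def plan_row(public_id: Optional[str], active_ids: Set[str]) -> Action:
    """Decide what to do with one import row under the proposed semantics.

    active_ids is assumed to hold the Public IDs of non-deleted entities only.
    """
    # An empty Public ID means the row describes a new entity.
    if public_id is None or public_id.strip() == "":
        return Action.CREATE
    # A non-empty Public ID signals update-only intent.
    if public_id.strip() in active_ids:
        return Action.UPDATE
    # No active match (an unknown ID, or one that belongs to an entity in
    # Trash): never create a new entity, just ignore the row.
    return Action.IGNORE


active = {"101", "102"}
actions = [plan_row(pid, active).value for pid in ["101", "", "999"]]
print(actions)  # ['update', 'create', 'ignore']
```

The point of the sketch is that a row with a Public ID can never end up in the create branch, which is what makes matching on Public ID safe.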
Thanks for posting here @Sev and your thoughts @Chr1sG.
A big plus one from me here. There definitely needs to be a “safe” way of overwriting data - where we can be 100% sure other users aren’t making edits. Typically this immutability is safeguarded by the primary key / unique ID (tbh I was genuinely taken aback to discover this limitation today, as importing / overwriting data is a typical use case for “organisations run by us nerds”).
I’m with @Sev in that including the Public ID would signal the user’s intent to update existing records, but I note the challenges observed regarding deleted entities.
My two pennies: would it be possible to “check” the uploaded data to see if the user has included Public IDs for entities which are deleted, and simply let them know with an error message?
“The uploaded file contains entity Ids that have been deleted from the database. Please either restore these entities or remove them from your upload file prior to continuing”.
Perhaps this is a foolish suggestion, but I definitely agree that these import semantics need resolving so that we can have confidence in data overwrite operations.
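For what it’s worth, the pre-import check suggested above could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `find_deleted_ids`, `validate_upload` and `trashed_ids` are made-up names, and it presumes the importer can look up the Public IDs of entities currently in Trash.

```python
from typing import Iterable, List, Optional, Set

ERROR_TEXT = (
    "The uploaded file contains entity Ids that have been deleted from the "
    "database. Please either restore these entities or remove them from "
    "your upload file prior to continuing."
)


def find_deleted_ids(
    row_ids: Iterable[Optional[str]], trashed_ids: Set[str]
) -> List[str]:
    """Return the Public IDs in the upload that belong to deleted entities."""
    seen: Set[str] = set()
    hits: List[str] = []
    for pid in row_ids:
        if pid is None:
            continue  # empty cell: the row is meant to create a new entity
        pid = pid.strip()
        if pid and pid in trashed_ids and pid not in seen:
            seen.add(pid)  # report each offending ID only once
            hits.append(pid)
    return hits


def validate_upload(
    row_ids: Iterable[Optional[str]], trashed_ids: Set[str]
) -> Optional[str]:
    """Return an error message if the upload references deleted entities."""
    deleted = find_deleted_ids(row_ids, trashed_ids)
    if not deleted:
        return None  # nothing in Trash is referenced, the import may proceed
    return ERROR_TEXT + " Affected Ids: " + ", ".join(deleted)
```

Running the check before any row is written means the import either proceeds cleanly or stops with a list of the offending Ids, rather than silently skipping rows.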