-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Distinguish ids from multiple providers #9896
Distinguish ids from multiple providers #9896
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do have some concerns about this approach - it seems to care only about what ends up in the EIDs array; with no support for the old userId
map. I think this means that we need to deprecate it, or if this catches on adapters that are using it will effectively become second class citizens (but I'll defer to @patmmccann for the decision).
By that I mean that for example, if your adapter uses userIdAsEids
you can get uid2 from liveintent; but if you're using userId
you won't.
It does have the advantage of being much simpler than a comprehensive solution.
modules/userId/index.js
Outdated
@@ -453,7 +453,7 @@ function getSubmoduleId(submodules, sourceName) { | |||
return {}; | |||
} | |||
const submodule = submodules.filter(sub => isPlainObject(sub.idObj) && | |||
Object.keys(sub.idObj).length && USER_IDS_CONFIG[Object.keys(sub.idObj)[0]]?.source === sourceName); | |||
Object.keys(sub.idObj).length && USER_IDS_CONFIG[Object.keys(sub.idObj)[0]]?.find(conf => conf.source === sourceName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this says "the module that corresponds to this source is the first one I find that can provide IDs from this source". That's not necessarily true is it? I think we need a way to mark the "owner" of a source outside of the EID object. It could be as simple as the first one declared is the one you own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, since this is only used in here:
Prebid.js/modules/userId/index.js
Lines 656 to 658 in d1b228c
function getUserIdsAsEidBySource(sourceName) { | |
return createEidsArray(getSubmoduleId(initializedSubmodules, sourceName))[0]; | |
} |
I think what we should do is reverse the filtering - instead of finding the module for a given source to get its EIDs, it makes more sense now to get all EIDs and filter them by source (since they can come from different modules).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly, "get all EIDs and filter them by source". If there is a collision, i.e., 2 or more eids with the same source check the provider and choose the one where the provider and source are the same. The question then becomes what do we do when one eid has no provider value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The eids#USER_IDS_CONFIG
could be adjusted to always have a global provider
for each user id that is then used as provider
for all uids
extracted based on it. That way every eid/uid
would have a provider
. For example:
'lipb': {
provider: 'liveintent.com',
extractors: [
// liveIntentId
{
getValue: function(data) {
return data.lipbid;
},
source: 'liveintent.com',
atype: 3,
getEidExt: function(data) {
if (Array.isArray(data.segments) && data.segments.length) {
return {
segments: data.segments
};
}
}
},
// uid2
{
source: 'uidapi.com',
atype: 3,
getValue: function(data) {
return data.uid2;
}
}
]
}
And eids#createEidObjects
would take care of building the uid.ext
using the global provider
value and output
[{
source: 'liveintent.com',
uids: [{
id: 'some-random-id-value',
atype: 1
}],
ext: {
provider: 'liveintent.com',
segments: ['s1', 's2']
}
},
{
source: 'uidapi.com',
uids: [{
id: 'some-random-id-value',
atype: 3,
ext: {
provider: 'liveintent.com'
}
}]
}]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One related question: what if there is a collision and neither of the providers is the owner of the id?
Maybe it would make sense to return an eid
with multiple uids
when getUserIdsAsEidBySource
is called and there is a collision?
For example: getUserIdsAsEidBySource('uidapi.com')
would then return
[{
source: 'uidapi.com',
uids: [{
id: 'some-random-id-value',
atype: 3,
ext: { provider: 'uidapi.com' }
},
{
id: 'some-random-id-value',
atype: 3,
ext: { provider: 'liveintent.com' }
}]
}]
when a UID2 could be resolved via LI's and the UID2 adapter. The caller then decides which uid
to pick if there are multiple. We also can make sure that the uid
of the id owner is always at index 0
in the uids
array.
Same can be applied when building the eids
array in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently getUserIdsAsEidBySource returns a single object, and it's part of the public API, so that's not really up for change. if there's more than one EID we have to pick one.
I agree that the API should not be changed in this regard. getUserIdsAsEidBySource
should return one single object - one eid
. An eid
, however, can have an array of uids
. My proposal is rather about - in case of collisions - exposing multiple uids
with their providers within the eid
that is returned. By providing the "provider" field the uids
become distinguishable and the caller can decide what id to use in case there are multiple based on the provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think to do that you have to make the assumption that all providers for a source will always generate "compatible" EID objects; for example, they share (edit - cancel that, atype
andatype
is part of UID) everything they put in ext
(through getEidExt
) always applies equally no matter the provider. I would be very hesitant to rely on this especially since the EID objects are only likely to get more complicated over time. Choosing one when there's a conflict is an acceptable compromise IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets not change the API, that will require changes to bid adapters and the servers behind them, that would be a massive undertaking. Can we make this configurable? Can the pub declare a list of priorities in the core user id module?
========================
In case of user id source collisions
- liveintent.com
- pubcid.org
- google.com
If this prioritization still doesn't produce a winner, pick one at random.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this configurable? Can the pub declare a list of priorities in the core user id module?
@jdwieland8282 Sounds good.
@jdwieland8282 What do you think, should that config allow to provide priority per identifier type - e.g. uid2
from uidapi.com
has higher priority than uid2
from liveintent.com
? Or should it be a config that defines that all identifiers from one user id module have priority over same types of identifiers exposed by any other module with lower priority? I personally tend towards the first option as it is more fine grained - but it is also more complex in the implementation and when writing the config.
Also, I would see if it is feasible to apply this prioritization logic when building the eids
array, not only in getUserIdsAsEidBySource
. Are we on the same page here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We met today in Identity PMC call and made the following decisions after vote:
https://docs.google.com/document/d/1dXrTaA-gFXayP-QWhFZN_91kLWp7qFO1CH1GmgafUdc/edit?usp=sharing
We want to go with Option I.I which is to give publisher the ability to provide the priority of the modules for each identifier. In the event where no configuration is provided, the first id/source that gets populated from any provider will get picked.
We want this configuration defined in the core user id module so that publishers can define which. sub adapters should override the other when there are id conflicts
The priority field should appear as an optional parameter right next to name here for each id: https://github.com/prebid/Prebid.js/blob/master/modules/userId/userId.md
I'm pretty okay with this approach; we should also remember that the user ID method is in the public api and encourage the 57 adapters not using the eids method of the advantages to changing. It isn't really discriminating against an adapter if they can all change implementations at any time. We hope live intent may help in this, letting their ssp partners know to prefer the eids method. We should make sure this information is well published. |
@patmmccann I can help come up with a public facing documentation explaining this. |
b8b0d17
to
c0e5c62
Compare
@jdwieland8282 @dgirardi @deepthivenkat Adjusted the PR to make submodule priority configurable per user id. Also, closed the magnite id PR and added magnite id handling here. |
The e2e test failures seem not to be caused by the changes in this PR. Ran the e2e test from local machine successfully. |
I'm not qualified to code review this, but maybe @dgirardi can when he has a free moment. I can review the docs pr does that exist yet @deepthivenkat ? |
modules/userId/index.js
Outdated
const currentIdPriorityByName = idPriority[key]?.reverse().indexOf(i.submodule.name); | ||
const currentIdPriorityByAlias = idPriority[key]?.reverse().indexOf(i.submodule.aliasName); | ||
// eslint-disable-next-line | ||
const currentIdPriority = currentIdPriorityByName ? currentIdPriorityByName : currentIdPriorityByAlias ? currentIdPriorityByAlias : - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this works (and I don't see it in a test case) - assuming you meant "use the name first, then the alias".
Because of how boolean casting works, this will always use the name (every value except 0
is truthy), except when it was the last element in the priority array (reversed becomes 0
), in which case it will always use the alias, even if it was not in the array.
I am not sure that considering aliases here is necessary, but if we do, my approach would be to "resolve" them when the id priority is passed to setConfig
; that would also help with performance. By that I mean, instead of just taking in idPriority
verbatim from user input, do some processing so that when you get here idPriority[key]
is already reversed, and does not contain any aliases - only the proper names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modules/userId/index.js
Outdated
.filter(i => isPlainObject(i.idObj) && Object.keys(i.idObj).length) | ||
.reduce((carry, i) => { | ||
Object.keys(i.idObj).forEach(key => { | ||
carry[key] = i.idObj[key]; | ||
const currentIdPriorityByName = idPriority[key]?.reverse().indexOf(i.submodule.name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, doing the reverse only once may be a requirement and not just a suggestion - because IIRC it modifies the array in place so on the second call you'd actually be restoring it to the original order.
edb0498
to
2256f92
Compare
@deepthivenkat This looks ready to merge once we have a docs pull drafted? |
Here is the docs pull: prebid/prebid.github.io#4675
…On Wed, Jun 21, 2023 at 10:38 AM Patrick McCann ***@***.***> wrote:
@deepthivenkat <https://github.com/deepthivenkat> This looks ready to
merge once we have a docs pull drafted?
—
Reply to this email directly, view it on GitHub
<#9896 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABYICH5AZWAJNJIQNXJYZWLXMMIOZANCNFSM6AAAAAAXURURWI>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
* Make id priority configurable * Use mocked modules * Support provided ext for medianet, uid2 and bidswitch eid.uid * Add support for magnite id * Delete unused findEid function * Add a tests for getEncryptedEidsForSource * Extract constant for the provider domain * Make updatePPID take priority into account * Compute effective priority only once and handle aliases * Fix after merge * Use Map#has instead of Map#get * Remove duplicate test * Add tests * Run linter
* Make id priority configurable * Use mocked modules * Support provided ext for medianet, uid2 and bidswitch eid.uid * Add support for magnite id * Delete unused findEid function * Add a tests for getEncryptedEidsForSource * Extract constant for the provider domain * Make updatePPID take priority into account * Compute effective priority only once and handle aliases * Fix after merge * Use Map#has instead of Map#get * Remove duplicate test * Add tests * Run linter
* Make id priority configurable * Use mocked modules * Support provided ext for medianet, uid2 and bidswitch eid.uid * Add support for magnite id * Delete unused findEid function * Add a tests for getEncryptedEidsForSource * Extract constant for the provider domain * Make updatePPID take priority into account * Compute effective priority only once and handle aliases * Fix after merge * Use Map#has instead of Map#get * Remove duplicate test * Add tests * Run linter
Type of change
Description of change
The goal of this PR is to sketch an implementation proposal for allowing multiple user id modules to expose the same kind of identifiers. E.g. two modules exposing UID2. The idea is that a config - like the one below - provides priority in case a collision happens.