Issues raised by March Wikidata harvest of identifiers

Mike Lichtenberg, BHL’s technical lead, has undertaken the March Wikidata Harvest of identifiers. Currently there are just over 9,000 identifiers needing review. I’m aware that several librarians in the BHL Cataloguing group are working on the BHL catalogue to rectify these issues. If anyone wants to contribute on the Wikidata side, the spreadsheet can be downloaded at this link.

@Ambrosia10 is it possible to explain to readers like me who don’t do this work, how we could help? For example, the first two rows of the data have the same creator id and the same BHL names, but link to the Wikidata entities for two different people (hence “BHL Author has more than one Wikidata identifier”). So it looks like an error in BHL. How do I add that information to the spreadsheet? Where do I send any corrections I might have?

| BHLEntityType | BHLEntityID | EntityDescription | IdentifierType | IdentifierValue | Message |
| --- | --- | --- | --- | --- | --- |
| Author | 525 | Kaliwoda, Leopold Johann, 1705-1781 [Person] | Wikidata | Q99094480 | BHL Author has more than one Wikidata identifier |
| Author | 525 | Kaliwoda, Leopold Johann, 1705-1781 [Person] | Wikidata | Q84497 | BHL Author has more than one Wikidata identifier |
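
For anyone wanting to check a downloaded copy of the report themselves, here is a minimal Python sketch of how the “BHL Author has more than one Wikidata identifier” cases could be found: group the rows by BHLEntityID and keep any author with more than one distinct Wikidata value. The column names come from the excerpt above. The function name is my own, and the two rows are copied from the excerpt rather than read from the real file.

```python
from collections import defaultdict

# Two rows copied from the report excerpt above. A real run would load
# the downloaded spreadsheet instead of this hard-coded list.
rows = [
    {
        "BHLEntityType": "Author",
        "BHLEntityID": "525",
        "EntityDescription": "Kaliwoda, Leopold Johann, 1705-1781 [Person]",
        "IdentifierType": "Wikidata",
        "IdentifierValue": "Q99094480",
    },
    {
        "BHLEntityType": "Author",
        "BHLEntityID": "525",
        "EntityDescription": "Kaliwoda, Leopold Johann, 1705-1781 [Person]",
        "IdentifierType": "Wikidata",
        "IdentifierValue": "Q84497",
    },
]


def authors_with_multiple_qids(report_rows):
    """Map each BHL author ID to its Wikidata IDs, keeping only conflicts."""
    qids_by_author = defaultdict(set)
    for row in report_rows:
        if row["BHLEntityType"] == "Author" and row["IdentifierType"] == "Wikidata":
            qids_by_author[row["BHLEntityID"]].add(row["IdentifierValue"])
    # An author is a conflict when more than one distinct QID is attached.
    return {
        author_id: sorted(qids)
        for author_id, qids in qids_by_author.items()
        if len(qids) > 1
    }


conflicts = authors_with_multiple_qids(rows)
print(conflicts)
```

For the two rows shown, this reports author 525 with both Q84497 and Q99094480. It only surfaces the conflict; deciding which Wikidata item is the right one still needs a person to look at both.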

I agree this needs documentation or a “how to”. I’m aware the BHL cataloguing group produced some documentation for this for BHL cataloguers wanting to correct Wikidata items but I’m not sure where this is lurking. Perhaps @emckinley might be able to provide a link. I’ve also been wanting to produce more detailed documentation for the Wiki editors but am time limited currently. It is on my “to do” list!


Hi @Ambrosia10 and @rdmpage,

At the moment, the Cataloging and Metadata Committee is focusing only on the data with the message, “Wikidata identifier associated with more than one BHL author.”

I’ve pulled those rows into a separate working Google spreadsheet, where volunteers document progress: Wikidata IDs associated with more than one BHL author or title. (This has been updated to reflect the most recent report, dated 20260331.)
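
As a rough illustration of how rows like these could be pulled out of the full report, here is a short Python sketch. It assumes the report has been exported as CSV with the column names from the excerpt earlier in the thread, and that the Message text matches the wording quoted above. The sample rows and their IDs are invented placeholders, not real BHL or Wikidata records, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

TARGET_MESSAGE = "Wikidata identifier associated with more than one BHL author"

# Invented sample in the report's column layout. The IDs are placeholders.
SAMPLE_CSV = """BHLEntityType,BHLEntityID,EntityDescription,IdentifierType,IdentifierValue,Message
Author,1001,"Example, A. [Person]",Wikidata,Q_EXAMPLE_1,Wikidata identifier associated with more than one BHL author
Author,1002,"Example, Alice [Person]",Wikidata,Q_EXAMPLE_1,Wikidata identifier associated with more than one BHL author
Author,1003,"Other, B. [Person]",Wikidata,Q_EXAMPLE_2,BHL Author has more than one Wikidata identifier
"""


def bhl_ids_by_qid(csv_file, message=TARGET_MESSAGE):
    """Group BHL entity IDs by Wikidata ID for rows carrying `message`."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Tolerate stray whitespace or a trailing full stop in the message.
        if row["Message"].strip().rstrip(".") == message:
            groups[row["IdentifierValue"]].append(row["BHLEntityID"])
    return dict(groups)


merge_candidates = bhl_ids_by_qid(io.StringIO(SAMPLE_CSV))
print(merge_candidates)
```

Each key is a Wikidata ID and each value is the list of BHL IDs that point at it, so every entry is a set of records to review as possible duplicates. To run it against a real export, pass an open file in place of the `io.StringIO` sample.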

We do have supporting documentation that notes how to merge duplicate IDs in BHL and then deprecate the appropriate BHL IDs in Wikidata.

For example, the report noted Richard Thomas Lowe had three separate Creator IDs in BHL.

[Screenshot 2026-03-31 164311]

After confirming they were all the same person, I merged the records so all forms of the name now appear under 5959.

I then deprecated the inactive IDs in Wikidata.

Open to any feedback regarding the process and documentation.

I haven’t personally worked with cases where a “BHL Author has more than one Wikidata identifier,” but in the screenshot example you provided, it looks like Q84497 (Jacquin, Nikolaus Joseph von) was mistakenly attached to BHL ID 575 (Kaliwoda, Leopold Johann, 1705-1781).

Based on my understanding, the next steps would be:

  • Deprecate BHL ID 575 in Wikidata Q84497 (which it looks like has already been done).
  • Add BHL ID 575 to the correct Wikidata Q99094480.
  • Remove the Q84497 identifier from BHL ID 575 in BHL Admin.

I’m not entirely sure which of these steps need to be done manually and which might be handled automatically during the next harvest/ingest.


Thanks so much for this @emckinley, particularly all the links to the guidance. Perhaps if Mike is on this forum he might be able to give you and other BHL folk more information on this. I’m afraid this is the point where my technical knowledge, and my knowledge of BHL’s backend (so to speak), rapidly come to a stop.

It’s perhaps also worth mentioning that Wikidata has “Constraint Reports” (generated automatically; constraints themselves are editable) that highlight such issues. For example:

Wikidata:Database reports/Constraint violations/P4081 - Wikidata
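
For those who prefer to query rather than browse the report page, a similar check can be sketched against the Wikidata Query Service. This is a minimal example under some assumptions: it uses P4081 (the property in the report title above, which I take to be the BHL creator ID property) and looks for values that appear on more than one item. The helper names and the User-Agent string are my own. The `wdt:` prefix only matches non-deprecated statements, so IDs that have already been deprecated should not show up.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def shared_value_query(prop="P4081", limit=50):
    """Build SPARQL listing values of `prop` that occur on more than one item."""
    return f"""
SELECT ?value (COUNT(DISTINCT ?item) AS ?items) WHERE {{
  ?item wdt:{prop} ?value .
}}
GROUP BY ?value
HAVING (COUNT(DISTINCT ?item) > 1)
LIMIT {limit}
""".strip()


def run_query(query):
    """Send the query to the public endpoint and return the result bindings."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    # Placeholder User-Agent; replace with one identifying your own tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "bhl-report-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


query = shared_value_query()
print(query)  # Paste into the query service, or pass to run_query(query).
```

The sketch only prints the query; calling `run_query(query)` needs network access and may be slow or time out on a property with many values, in which case the constraint report page is the easier route.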


Thanks Andy! Appreciate you sharing this.

The harvest process is a fully-automated >harvest<, full stop. It makes no attempt to correct discrepancies that are included in the report. It should be assumed that nothing will be automatically fixed during the next harvest.


Thanks for that Mike. Good to know.

There is also Mix’n’match:

https://mix-n-match.toolforge.org/#/catalog/506

which currently lists more than 180K people in BHL who are not yet matched to a Wikidata ID:

https://mix-n-match.toolforge.org/#/list/506/unmatched

I’m not sure how often newly-minted BHL creator IDs are added to that.
