I asked the TAXACOM mailing list for feedback on BHL. After extracting the responses I asked ChatGPT to summarise them; I have edited that summary and present it below. Nothing here will surprise seasoned BHL users: the difficulties searching, the taxonomic name finder misinterpreting scientific names (e.g., deciding that “scutellum” is always a scientific name rather than a key piece of insect anatomy, resulting in 704 pages of results, mostly not about Scutellum), the gaps in coverage, and the inability of users to flag OCR errors or bad taxonomic names.
Summary of TAXACOM responses:
Expanding and Completing the Content
A consistent message was the need to extend and complete BHL’s coverage.
Users highlighted:
- Keeping pace with the “moving wall” of public domain content
- Filling gaps in journal runs, including missing volumes
- Prioritising rare or at-risk materials held in obscure locations
- Expanding coverage of ephemeral publications such as nursery catalogues
- Including important but currently undigitised journals, particularly from underrepresented regions
In short: completeness matters as much as scale.
Strengthening Integration with Biodiversity Data
Many contributors see BHL not just as a library, but as infrastructure within a wider data ecosystem.
Key suggestions included:
- Integrating content directly rather than linking to external repositories
- Enabling better linking from taxonomic databases (e.g. to original species descriptions)
- Improving alignment between bibliographic metadata and how scientists actually cite literature
The goal is a more connected system where literature, names, and data work seamlessly together.
Making Content Easier to Find
Despite the breadth of content, finding specific items remains a challenge.
Common issues:
- Difficulty locating known works due to inconsistent metadata
- Variations in titles, publication dates, and author formats
- Limited advanced search functionality (e.g. AND, NOT, filtering within results)
For many users, discovery—not access—is the main barrier.
Improving OCR and Name Recognition
Automated text extraction and name indexing are powerful features, but users report significant quality issues.
These include:
- Poor OCR accuracy, especially in older texts or plates
- Problems recognising special characters (e.g. ligatures like æ/œ)
- Scientific name detection that:
  - Misses real names
  - Flags non-names as taxa
- Challenges handling historical formatting conventions
These issues create noise and reduce confidence in automated tools.
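As a rough illustration of the special-character problem, here is a minimal Python sketch of how a search layer might fold ligatures before comparing names, so that "Hæmatopus" and "Haematopus" match. This is not how BHL works; the `LIGATURES` table and the `fold` function are hypothetical names made up for this example, and a real system would need a fuller mapping.

```python
import unicodedata

# Hypothetical mapping for illustration only; extend as needed.
# Unicode normalisation does not expand these letters, so they are mapped by hand.
LIGATURES = {"æ": "ae", "Æ": "Ae", "œ": "oe", "Œ": "Oe"}


def fold(text: str) -> str:
    """Return a search key: ligatures expanded, accents stripped, case folded."""
    expanded = "".join(LIGATURES.get(ch, ch) for ch in text)
    # NFKD splits accented letters into base letter + combining mark,
    # and expands typographic ligatures such as "ﬁ" into "fi".
    decomposed = unicodedata.normalize("NFKD", expanded)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def same_name(a: str, b: str) -> bool:
    """Compare two name strings on their folded search keys."""
    return fold(a) == fold(b)


if __name__ == "__main__":
    print(same_name("Hæmatopus", "Haematopus"))
    print(same_name("Œdipoda", "Oedipoda"))
```

Folding is only applied to the search key, so the original spelling on the page stays untouched for display.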
Enabling Community Contributions
Several contributors suggested that users themselves could help improve BHL.
Ideas included:
- Allowing users to flag errors in OCR or metadata
- Introducing controlled editing by trusted contributors
- Enabling users to correct or remove incorrect taxonomic name tagging
A hybrid model combining automation with expert input could significantly improve data quality.
Keeping the Interface Stable (While Improving It)
There was strong support for maintaining the current interface and structure.
At the same time, users suggested targeted improvements:
- Preserve deep links and avoid disruptive redesigns
- Add a full-screen reading mode, especially for smaller screens
Stability is seen as a major strength—changes should be incremental, not disruptive.