PhySH – Physics Subject Headings

Short Description

PhySH (Physics Subject Headings) is a physics classification scheme developed by the American Physical Society to organize journal, meeting, and other content by topic, available for use starting in January 2016. It is intended initially to meet the specific goals of the APS, while a longer term goal is to make it available for use by the broader community. PhySH consists of hierarchies of concepts grouped into facets: Research Areas, Physical Systems, Properties, Techniques, and Professional Topics. The concepts are also organized by discipline for convenience. Individual concepts may belong to more than one facet or discipline.
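
As a rough illustration of the faceted structure described above, the sketch below models concepts that can sit in more than one facet or discipline. Only the five facet names come from the description; the class, its fields, and the example concepts and their assignments are hypothetical and are not taken from the PhySH data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# The five facet names come from the description above.
FACETS = {
    "Research Areas",
    "Physical Systems",
    "Properties",
    "Techniques",
    "Professional Topics",
}


@dataclass
class Concept:
    """Hypothetical record for one concept; not the actual PhySH schema."""
    label: str
    facets: Set[str]
    disciplines: Set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Reject facet names that are not part of the scheme.
        unknown = self.facets - FACETS
        if unknown:
            raise ValueError(f"unknown facet(s): {sorted(unknown)}")


# Illustrative entries only; the facet and discipline assignments are assumed.
CONCEPTS = [
    Concept("Superconductivity", {"Research Areas", "Properties"}, {"Condensed Matter"}),
    Concept("Neutron scattering", {"Techniques"}, {"Condensed Matter", "Nuclear Physics"}),
    Concept("Thin films", {"Physical Systems"}, {"Condensed Matter"}),
]


def by_facet(concepts: List[Concept]) -> Dict[str, List[str]]:
    """Group concept labels under every facet they belong to."""
    index: Dict[str, List[str]] = {facet: [] for facet in sorted(FACETS)}
    for concept in concepts:
        for facet in concept.facets:
            index[facet].append(concept.label)
    return index


for facet, labels in by_facet(CONCEPTS).items():
    print(f"{facet}: {labels}")
```

A concept assigned to two facets simply appears under both, which is the behaviour the description implies for concepts belonging to more than one facet.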

URL

http://physh.aps.org/

Owned/Developed by

  1. Name of Owner: American Physical Society
  2. Technical Contact: Arthur Smith, *protected email*
  3. License Contact: Mark Doyle, *protected email*

Adopted (as opposed to owned) by organizations/publishers:

How is this KM applied?

  1. Manually
  2. By Authors and Editorial Staff

How is this KM used?

  1. Classify manuscripts within the APS peer-review process
  2. Help editors in finding similar articles previously submitted and in finding suitable referees

Description of the current use case(s) of the KM

  1. Manuscripts are assigned concepts from PhySH by submitting authors; these assignments are reviewed and may be modified by editors.
  2. For each journal section, relevant concepts are assigned to the handling editor most knowledgeable in that area.
  3. Editors can search for other articles on the same or related topics

Description of the future/potential use cases of the KM (not yet realized)

  1. Improve finding relevant content across APS journals, which often have overlapping scopes
  2. Link relevant content from other APS non-journal areas (e.g., meetings)
  3. Map other classification systems (e.g., PACS) to PhySH to enable older material to be indexed in the same way
  4. Allow individuals (particularly referees) to identify their areas of expertise
  5. Integrate with other knowledge models – For example, an independent taxonomy for chemical substances or astronomical objects should be relatively easy to append to PhySH.
  6. PhySH concepts have permanent identifiers allowing them to be integrated into the web of linked data.

What are the main goals for using this KM?

  1. Improve the peer-review process for our journals by ensuring the right expertise is applied in reviews (PACS, which was previously used for this, ceased updating in 2010).
  2. Cover all of physics
  3. Improve discovery – Content properly tagged with PhySH is intended to enable new and useful ways to browse and search the content while providing the underpinnings for recommendation systems and other personalized services.

Rationale for KM vs other means of searching and browsing?

  1. By providing a standardized list of “keywords” rather than relying on author-supplied terms or full-text or statistical indexing, we can have a greater assurance that related articles are associated with one another.

Is the KM being actively developed?

  1. Yes, internally. Feedback is also welcome; see the physh.aps.org website for links.

License information:

  1. PhySH is copyrighted with all rights reserved by the American Physical Society. We are still considering what license we would use for any public release of PhySH.

Linking NPG ontologies to external datasets

 

We (Macmillan Science and Education) have begun linking our domain models to external datasets. Our article-types, journals and subjects models are now linked to DBpedia (Wikipedia) and Wikidata. Our subjects model is additionally linked to Bio2RDF and MeSH.

Also, our core model is now linked to a number of other external models (CIDOC, FaBIO, schema.org, etc.).

Our models, now including these links, are available to view or download at nature.com/ontologies.

We will continue to refine and expand these links and would be interested in any thoughts, ideas and feedback from the community, particularly around any additional datasets we should consider linking to.

NPG Article Types Ontology

The NPG ArticleTypes Ontology is a categorization of the kinds of publication used to index and group content published by Springer Nature. This taxonomy is organised into a single tree using the SKOS vocabulary. It includes article-types that are directly applied to content, such as Article, Review Article, News, or Book Review, plus higher-level groupings such as Research, News and Comment, or Amendments and Corrections.

URL OF KNOWLEDGE MODEL:

http://www.nature.com/ontologies/models/domain/article-types/

OWNED/DEVELOPED BY:

ADOPTED (AS OPPOSED TO OWNED) BY ORGANIZATIONS/PUBLISHERS:

Springer Nature

HOW IS THIS KM APPLIED?

Applied manually by authors or editorial staff as part of the standard publishing workflow.

DESCRIPTION OF THE CURRENT USE CASE(S) OF THE KM

This model allows us to categorize content based on the type of publication, allowing content of similar type to be grouped or filtered at varying degrees of granularity.
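
The grouping and filtering described above can be sketched as a walk up a single-parent tree. This is a minimal illustration, not the published model: the article-type names come from the description, but which group each type sits under, the `BROADER` table, the `CONTENT` list, and the function names are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical single-tree fragment: article-type -> broader grouping.
# The parent assignments are assumed for illustration.
BROADER: Dict[str, Optional[str]] = {
    "Research": None,
    "News and Comment": None,
    "Article": "Research",
    "Review Article": "Research",
    "News": "News and Comment",
    "Book Review": "News and Comment",
}

# Hypothetical content items tagged with a leaf article-type.
CONTENT: List[Tuple[str, str]] = [
    ("doc-1", "Article"),
    ("doc-2", "News"),
    ("doc-3", "Review Article"),
]


def is_a(article_type: str, group: str) -> bool:
    """True if article_type equals group or has group as an ancestor."""
    current: Optional[str] = article_type
    while current is not None:
        if current == group:
            return True
        current = BROADER[current]
    return False


def filter_by_type(group: str) -> List[str]:
    """Return ids of content whose type falls under the given type or group."""
    return [doc_id for doc_id, article_type in CONTENT if is_a(article_type, group)]


print(filter_by_type("Research"))  # coarse filter: every research type
print(filter_by_type("Article"))   # fine filter: one leaf type only
```

Because the tree has a single parent per type, the same function serves both coarse and fine filters; only the level of the term passed in changes.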

IS THE KM BEING ACTIVELY DEVELOPED?

Yes, internally

LICENSE INFORMATION:

CC0 – http://creativecommons.org/about/cc0

Auto-tagging

We have a deep taxonomy of CS concepts; at its deepest, the tree has seven levels. The most useful concepts for precision search are, of course, the most granular ones, represented by the leaves of the tree.

However, concepts can be multi-parented, so the accurate application of a concept to a text requires that the context within the tree, i.e., the correct branch, be understood.
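
To make the point about branches concrete, here is a minimal sketch that lists every root-to-concept path for a multi-parented concept. The taxonomy fragment and the `PARENTS` and `branches` names are invented for illustration and are not part of our taxonomy.

```python
from typing import Dict, List

# Hypothetical fragment of a multi-parented taxonomy: concept -> parents.
PARENTS: Dict[str, List[str]] = {
    "Computing": [],
    "Software": ["Computing"],
    "Hardware": ["Computing"],
    "Operating systems": ["Software"],
    "Embedded systems": ["Hardware"],
    "Schedulers": ["Operating systems", "Embedded systems"],
}


def branches(concept: str) -> List[List[str]]:
    """Return every root-to-concept path for a possibly multi-parented concept."""
    parents = PARENTS[concept]
    if not parents:
        # A concept with no parents is a root: the path is just itself.
        return [[concept]]
    paths: List[List[str]] = []
    for parent in parents:
        for path in branches(parent):
            paths.append(path + [concept])
    return paths


for path in branches("Schedulers"):
    print(" > ".join(path))
```

The same leaf label yields two distinct paths here, so applying the leaf to a text correctly means deciding which of those paths the text is actually about.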

Expert authors who apply the terms to their articles vary in how much interest and attention they give to this indexing task, but our experience shows that they rarely misapply terms. Sometimes they settle for assigning only high-level concepts such as “Software”, which is not very useful.

However, our experience with an auto-tagger shows that a huge amount of “noise” is created. We consider the noise unacceptable – presenting it to users will create distrust in the taxonomy itself.

We have been expanding the logical rules of the auto-tagger in an effort to reduce the noise to an acceptable level. So far, without success.

I have been trying to understand why.

So far, the best explanation I can come up with is that hierarchical context is readily understood by the human brain, whereas auto-taggers that rely on the statistical occurrence of a concept and its proximity to other words and concepts cannot accurately reproduce that context.

Any advice would be appreciated.

Springer Nature

Springer Nature is one of the world’s leading global research, educational and professional publishers, home to an array of respected and trusted brands providing quality content through a range of innovative products and services.

Springer Nature is the world’s largest academic book publisher, publisher of the world’s most influential journals and a pioneer in the field of open research. Springer Nature was formed in 2015 through the merger of Nature Publishing Group, Palgrave Macmillan, Macmillan Education and Springer Science+Business Media.

Springer Nature is embracing linked data technologies as an integral part of its content publishing operations and has developed a data model which is highly responsive to new and legacy business requirements. Linked data is central to the customer experience in providing content discovery applications and in facilitating emergent behaviours in interacting with content.

Many of the models developed have recently been published on our Ontologies Portal at nature.com/ontologies and are shared in order to contribute to the wider linked data community and to provide a public reference. These models cover publication things – articles, figures, etc. – and classification things – article-types, subjects, etc. – plus additional things used to manage our content publishing operation – assets, events, etc.

We have also published a model for conference proceedings on our LOD Conference Portal at lod.springer.com.

Knowledge Models Used

Contact

NPG Subjects Ontology

The NPG Subjects Ontology is a polyhierarchical categorization of scholarly subject areas which are used for the indexing of content by Springer Nature. It includes subject terms of varying levels of specificity such as Biological sciences (top level), Cancer (level 2), or B-2 cells (level 7). In total there are more than 2750 subject terms, organised into a polyhierarchical tree using the SKOS vocabulary.

URL

http://www.nature.com/ontologies/models/domain/subjects/

Owned/Developed by:

Adopted by Organizations / Publishers:

How is this KM applied?

Applied manually, either by authors or editorial staff as part of the standard publishing workflow, or by professional indexers.

Description of the current use cases of the KM

The NPG Subjects Ontology constitutes the main backbone of nature.com subject areas, a new section on nature.com that allows users to browse content topically rather than navigate via the more usual journal paradigm. Each of the terms in the ontology includes a link to the relevant subject page on nature.com.

Is the KM being actively developed?

Yes, internally

License Information:

CC0 – http://creativecommons.org/about/cc0

NICEM Thesaurus

Short Description

Thesaurus of the National Information Center for Educational Media (NICEM), used for indexing the records of NICEM’s bibliographic database.

Owned / Developed by

  1. Name of Owner – Access Innovations, Inc.
  2. Name of Developer – Access Innovations, Inc.
  3. Technical Contact – Mary Garcia, *protected email*
  4. License Contact – *protected email*

How is this KM applied?

  1. Manually | Auto-tagging software | Both
  2. By Authors | Editorial Staff | Professional indexers

How is this KM used?

  1. Direct Bibliographic Search | Indirect (e.g., used to expose content resulting from other user actions)
  2. Display | Grouping of results
  3. People search | Author profiles | Publication profile

Description of the current use cases of the KM

Used for indexing bibliographic records of non-print educational media in the NICEM database.

What are the main goals for using this KM?

  1. Enhance UX
  2. Increase Search Engine Ranking
  3. Increase time user spends on site
  4. Increase traffic
  5. Increase downloads

Rationale for KM vs other means of searching and browsing?

The thesaurus contains terms that reflect the subject matter of educational material, especially at the K-12 levels. An associated rule base that has been developed specifically for those terms enables appropriate indexing and accurate retrieval of bibliographic records, as well as user-friendly browsing in conjunction with a search interface.

Is the KM being actively developed?

  1. Yes, internally

License Information:

  1. Terms of license or link to license terms – Contact *protected email*.

PLOS

PLOS (Public Library of Science) is a nonprofit publisher and advocacy organization founded to accelerate progress in science and medicine by leading a transformation in research communication.

Our core objectives are to provide ways to overcome unnecessary barriers to immediate availability, access and use of research, pursue a publishing strategy that optimizes the quality and integrity of the publication process, and develop innovative approaches to the assessment, organization and reuse of ideas and data.

Knowledge Models Used:

  • PLOS Thesaurus

Contact:

NewsIndexer Thesaurus

Short Description

An indexing system for the newspaper industry: a specialized vocabulary built with the industry’s indexing needs in mind. The vocabulary is divided into sections that correspond to the sections of a typical newspaper. An accompanying rule base enables highly accurate categorization of newspaper articles.

URL

http://www.newsindexer.com

Owned/Developed by

  1. Name of Owner: Access Innovations, Inc.
  2. Name of Developer: Access Innovations, Inc.
  3. Technical Contact: Mary Garcia, *protected email*
  4. License Contact: Marjorie Hlava, *protected email*

Adopted (as opposed to owned) by organizations/publishers:

How is this KM applied?

  1. Manually and Auto-tagging software
  2. By Editorial Staff | Professional indexers

How is this KM used?

  1. Direct Bibliographic Search
  2. Display

Description of the current use case(s) of the KM

Customized version used by Acquire Media for categorization of news items and for RSS delivery according to customers’ interests.

Description of the future/potential use cases of the KM (not yet realized)

Categorization of news stories (including archived stories) by and for newspaper publishers; indexing of 20th and 21st century historical studies.

What are the main goals for using this KM?

  1. Increase downloads

Rationale for KM vs other means of searching and browsing?

Every news day, you can tag articles as they are produced, using NewsIndexer either as a cloud service or installed on your own local servers. We automatically feed this data through NewsIndexer, which scans every article and searches for terms similar to those in its controlled vocabulary. NewsIndexer then displays these terms for the human indexer’s review and approval. For backfile collections, you can simply accept the indexing as an automatic batch process. For ongoing daily feeds, you might want to review all or a random sample of the results on a regular basis for maintenance.
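
The suggest-then-review workflow described above can be sketched as follows. This is a simplified illustration and not the NewsIndexer rule base or its API: the vocabulary entries, the cue phrases, and the `suggest_terms` and `index_article` functions are all hypothetical, and plain phrase matching stands in for whatever matching rules the product actually applies.

```python
import re
from typing import Callable, Dict, List

# Hypothetical controlled vocabulary: preferred term -> cue phrases.
VOCABULARY: Dict[str, List[str]] = {
    "Elections": ["election", "ballot", "polling place"],
    "Baseball": ["baseball", "home run"],
    "Weather": ["storm", "forecast", "rainfall"],
}


def suggest_terms(article: str) -> List[str]:
    """Return vocabulary terms whose cue phrases occur in the article text."""
    suggestions: List[str] = []
    for term, cues in VOCABULARY.items():
        # Whole-word, case-insensitive match on any cue phrase.
        pattern = r"\b(?:" + "|".join(re.escape(cue) for cue in cues) + r")\b"
        if re.search(pattern, article, flags=re.IGNORECASE):
            suggestions.append(term)
    return suggestions


def index_article(article: str, approve: Callable[[str], bool]) -> List[str]:
    """Keep only the suggested terms that the reviewer approves."""
    return [term for term in suggest_terms(article) if approve(term)]


text = "Voters lined up at the polling place despite the storm."

# Backfile mode: accept every suggestion automatically.
print(index_article(text, approve=lambda term: True))

# Review mode: a stand-in reviewer that rejects one suggestion.
print(index_article(text, approve=lambda term: term != "Weather"))
```

Passing the reviewer in as a function keeps the two modes in the text, automatic batch acceptance and human review, on the same code path.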

Is the KM being actively developed?

  1. Yes, internally

License information:

http://www.newsindexer.com/contact.htm