This document can be used to aid the setup of the Curator Logger for use with context-dependent metadata values ingested to and served from IPV's Metadata Central (MDC) Teragator. These are ingested using SKOS (Simple Knowledge Organization System) XML format. To prepare an ingest, a standalone command line tool (IpvSkosGenerator) can be given a CSV file containing the data. Please see the prerequisites including those listed for Metadata Central Teragator.
In summary, to make a functional system, you will need:
- MDC Teragator 2.2.1 or newer installed and configured as described in the Pre-requisites section below.
- Curator Logger 2.3.1.67 or newer installed and configured as described in the Configuration steps to deploy Logger section below. This includes running a SQL script against the curatordb database.
- Data ingested to MDC Teragator with the lists of the hierarchical metadata. This requires preparing a list in CSV, converting it to SKOS format with the provided converter, and ingesting it via MDC Teragator's Watchfolder. See the Adding and Updating values for the Hierarchical Metadata section below for more details.
Pre-requisites
- Completed Curator System deployment including configured Ingest Functionality Module.
Curator 3.5.1 is recommended although it is possible to use Curator 3.4 as long as updates to individual minimum versions have been applied.
Minimum individual versions of software: Curator Logger 2.3.1.67 and MDC Teragator 2.2.1.
- Data structured into SKOS-compliant documents; several documents may be needed. SKOS, which is used to define the hierarchies, requires that every object within a document is unique, as denoted by its name (prefLabel). This means that no name can be repeated anywhere in the hierarchy: parents and descendants cannot share names, and names cannot be shared between different branches, so no child can have a descendant with the same name as a descendant of another child. If you need a descendant with the same name, you must move that branch to a new document. See Adding and Updating values for the Hierarchical Metadata below.
- You must have a folder for which MDC Teragator (owner of the service process) has Read access and users tasked with performing updates have Write access. This will be a Watchfolder for providing updates, as referred to in the EtlSources.config.
- Installed optional Curator Service Metadata Central Teragator 2.2.1 for Curator 3.5.1 and its prerequisites.
During installation:
- Set the Service address to the FQDN of the Teragator machine.
- Select Client Configuration: MetadataConfiguration_IpvSkosConfig.
- Under ETL Configuration, set ETL is Active to true.
Following installation, the ETL service (configured in [installation path]/Service/EtlSources.config) must be set up to ingest the required SKOS XML updates from the Watchfolder. The following example shows correctly configured paths to files in the Watchfolder:
<?xml version="1.0" encoding="utf-8" ?>
<etlSourcesSection>
<etlSourceCollection>
<!-- SKOS -->
<etlSource streamid="Topics SKOS" type="xmlfile" url="C:\ProgramData\IPV-Demo\WatchFolder\skosTopics.xml" ontology="http://teragator.com/ns/ontology/etl/cv/skos-1.0#" schema="ipvskosxmlv1"/>
<etlSource streamid="Countries and Cities SKOS" type="xmlfile" url="C:\ProgramData\IPV-Demo\WatchFolder\skosCountries.xml" ontology="http://teragator.com/ns/ontology/etl/cv/skos-1.0#" schema="ipvskosxmlv1"/>
</etlSourceCollection>
</etlSourcesSection>
- After changing this configuration, remember to restart the Teragator service if it was already running.
You can ingest one or more individual documents each containing unique terms that are not repeated within the same document.
For ingest, each document has to be declared with a unique streamid and must internally contain a unique document label that is not repeated in any other SKOS input file.
- You must have downloaded the Curator Logger for Hierarchical Metadata - Configuration package as found in the Software/Configuration Sets subfolder of Curator installers for the appropriate version of Curator you are deploying. The following script will be used to configure the database with the required definitions for hierarchical logging:
- Context MDC 2.2-SKOS Resource Metadata View.sql
- You must have downloaded the Windows executable IpvSkosGenerator [version].zip (for version 2.2.1) found in the above package. This is a tool for converting CSV tables into SKOS format accepted by Metadata Central Teragator. You will copy files from the folder space where the converter is used to the Watchfolder. For further instructions on use, see the Adding and Updating values for the Hierarchical Metadata section below.
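The requirement that each entry in EtlSources.config has its own streamid and its own Watchfolder file can be checked before restarting the service. The following is a minimal Python sketch, not an IPV tool: the function name check_etl_sources is hypothetical, and it relies only on the element and attribute names visible in the example config above (etlSource, streamid, url).

```python
import xml.etree.ElementTree as ET


def check_etl_sources(config_text):
    """Return a list of problems found in the text of an EtlSources.config file."""
    problems = []
    root = ET.fromstring(config_text)
    seen_ids = set()
    seen_urls = set()
    for source in root.iter("etlSource"):
        stream_id = source.get("streamid")
        url = source.get("url")
        # Each source must be declared with its own streamid.
        if not stream_id:
            problems.append("etlSource without a streamid")
        elif stream_id in seen_ids:
            problems.append(f"duplicate streamid: {stream_id}")
        else:
            seen_ids.add(stream_id)
        # Two sources pointing at the same file would ingest the same document twice.
        if not url:
            problems.append(f"{stream_id}: missing url")
        elif url.lower() in seen_urls:
            problems.append(f"duplicate url: {url}")
        else:
            seen_urls.add(url.lower())
    return problems
```

An empty list means no problem of these two kinds was found. It does not confirm that the files exist or that MDC Teragator accepts the configuration.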
Installation steps to deploy Logger
- Complete pre-requisites.
- Install Curator Logger (see Curator Web Applications Installation for more details).
Configuration steps to deploy Logger
- Edit the SQL script Context MDC 2.2-SKOS Resource Metadata View.sql (downloaded during the prerequisites steps).
- Set the 'MDCTeragatorHost' (line 14) to be the name of the machine on which MDC Teragator is installed. Make sure to enter the FQDN of the Teragator machine matching the configuration of the Service, see Prerequisites above.
- Confirm that "Topic" and "Countries" as top-level categories are suitable for your needs. If not, alter lines 49-50:
SET @topGeoValue = 'Countries'; -- sets the query to the top level preflabel
- Review the remaining metadata names to check if these are suitable for your needs.
- Run the script. It should complete without errors.
- If you already had a Rating metadata name defined, you may find it has extra options, as the script will add enum values of 1, 2, 3, 4, 5 with display values 1 Star, 2 Star, 3 Star, 4 Star, 5 Star. To avoid data loss, it will not delete previous values. Adjust these to the site requirements if they differ (the customer may want to transfer the site's values to the star system). Post-deployment, you may wish to visualise the values as stars (⭐). Use CSA to modify display values for the enum metadata Rating. Load the metadata, click on the values in the Enumeration values row, and enter stars for the display name as required, e.g. ⭐⭐⭐ where "3 Star" was. The star shown here is a standard emoji symbol, sometimes referred to as the "white star".
- Edit the Curator Logger config file.
- Set up your proxy streaming properties
- Replace the <viewSelections> section of web.config with code to display the ConceptInContextLoggingView logging view:
<viewSelections>
<view name="MultiLoggerMediaView" display="Media View">
<subClipViewSelections>
<view name="ConceptInContextLoggingView" display="Hierarchical Topics" numberOfColumns="2"/>
</subClipViewSelections>
</view>
</viewSelections>
- Alternatively, add the <view name="ConceptInContextLoggingView" display="Hierarchical Topics" numberOfColumns="2"/> to your existing subClipViewSelections, to create an additional view.
- If you want to use Batch logging (logging a set of media directly rather than creating and annotating subclips) please see further configuration needed under the heading Batch Logging.
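Adding the view to an existing subClipViewSelections section, as described above, can also be scripted. The sketch below is illustrative only: the function name add_hierarchical_view is hypothetical, and it assumes the config fragment is well-formed XML using the element names shown in the example. Note that Python's ElementTree discards XML comments and may change whitespace, so work on a copy of web.config.

```python
import xml.etree.ElementTree as ET


def add_hierarchical_view(config_xml):
    """Add ConceptInContextLoggingView to every subClipViewSelections that lacks it."""
    root = ET.fromstring(config_xml)
    for sub_views in root.iter("subClipViewSelections"):
        existing = [view.get("name") for view in sub_views.findall("view")]
        if "ConceptInContextLoggingView" not in existing:
            # Attribute values are copied from the example in this document.
            ET.SubElement(sub_views, "view", {
                "name": "ConceptInContextLoggingView",
                "display": "Hierarchical Topics",
                "numberOfColumns": "2",
            })
    return ET.tostring(root, encoding="unicode")
```

Running the function a second time on its own output leaves the configuration unchanged, so it is safe to re-run.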
Using Curator Logger for Hierarchical Metadata
The simplest use case for this is on a single primary asset (media, audio or image) with no relevant game metadata saved.
- Open a web browser and connect to the logger site.
NOTE: You may be asked to log in via Curator Gateway.
- The logger will open. You should see the familiar tree view and asset list page.
- Double-click a single Media or Audio, or select a collection, folder or set of assets and click Start Logging.
- If you have chosen to set a required parent media metadata, the Media metadata editor will open. Enter values and save.
- The Logging page will now be available.
- Where the Metadata is set in a Hierarchical dependency, you can see a yellow triangle next to the dependent metadata.
This indicates that first you must select a value for the parent metadata. You can see the parent metadata name in the Tooltip:
Once a value has been selected for parent metadata (in this case Topic), the values corresponding to the dependent metadata (in this case Keywords) will be available for selection:
- You are now ready to start logging using the video panel (set in-out points for clipmarks) and assigning data from the logging controls. The default logging view will be the first child view specified in the web config, unless you are using the viewSelectValue property with declared metadata values on the parent asset selecting the view.
Batch Logging
Batch logging allows log page data to be set against multiple primary assets (media, audio, image) with a single Save action.
NOTE: In this mode, it is not possible to mark in/out/bookmark values, and configured handles will be ignored.
Configuration
- Add an entry to the child view selection that includes the setting media="true", e.g.:
<view name="ConceptInContextLoggingView" display="Batch Concept in Context" media="true" numberOfColumns="2" />
- Optionally remove the parent media view, which is not needed in most cases. This will reduce the number of pop-ups appearing. To remove the media view, set the view name in viewSelections to an empty string, as in the example below.
- Sports Logging requires the Parent media view GameSelectionMediaView so it should not be removed even when you are batch logging on the media.
- Optionally set <add key="UseFirstAsSourceOfTruth" value="true" /> in the web.config file to remove the need for, and the ability to select, the asset to use as Source of Truth for parent metadata; the first loaded asset will always be used. This will reduce the number of pop-ups appearing, e.g.:
<viewSelections display="">
<view name="" display="">
<subClipViewSelections>
<view name="ConceptInContextLoggingView" display="Batch Concept in Context" numberOfColumns="2" media="true"/>
</subClipViewSelections>
</view>
</viewSelections>
...
<add key="UseFirstAsSourceOfTruth" value="true"/>
Use
- Select multiple primary assets from the Browse page (or from a Folder or a Collection containing primary assets) and click Start Logging.
- Avoid selecting clipmarks for now - any changes made here will affect their parent media/audio.
- The Logging page will appear, displaying multiple items in the Sidebar.
- If the parent media view was set, a Media Metadata dialog may open. This will apply to the highlighted asset only. Close it.
- Select the view that corresponds to the media="true" setting. This will have a media icon next to it rather than a clipmark icon.
- Open the sidebar and select the assets you wish to enter to batch for logging. Hover over a thumbnail to see the name and the checkbox on the right, then select Done or click outside the sidebar. Further logging will apply only to the assets selected to batch.
- Note that above the video player you now have a Batch Logging badge displayed. The Save button will now display Save Batch.
These act as a reminder that any action you take will apply to all assets in the selected batch.
- If there is a parent media view in use, you will be prompted for Media Metadata. Set these, save, and close the pop-up.
- For Sports, as on a single asset, you have to provide the 6 required metadata items so that the logging page can be set up for the appropriate Sport and the Game's rosters and actions.
- Start setting values on the Logging page. These values will be saved on all assets selected in the Batch.
- When complete, click Save Batch to save your settings. Batch mode will exit.
To exit batch mode before making a log:
- Open the sidebar by clicking on it.
- Select Edit Batch.
- Deselect assets individually until you have only one left OR deselect all assets using the checkbox above the first item.
- Click Done.
- Only a single item in the list will be highlighted.
- The Batch Logging badge will disappear.
- The Save button will revert back to Save.
- You will be logging metadata only to the media selected and displayed in the player.
Troubleshooting when no options for metadata are present in the Curator Client
The most common problems include:
- The higher hierarchy metadata was set up, but there are no values for child metadata in the hierarchy
Usually, this means that the data ingested via ETL contained no child values for that value of the parent (so there is no error, only a wrong expectation).
- Check in the CSV file for a value that has children in the hierarchy and select it as the value for the parent. If you now see the expected children, everything is working.
- There are no values presented for any of the metadata in the hierarchy
- Check MDC log files for the ETL ingest errors and correct these
The log file is located at C:\ProgramData\IPV\Metadata Central Service\Logs\Metadata Central.log on the MDC machine.
- Check MDC log files for errors connected to bad requests and report to IPV.
- Check Curator Server logs for errors received from MDC Teragator when GetOptions is requested.
- If you misconfigured the Teragator endpoint entries in the Teragator config file or the deployment script, there will be nothing in the Metadata Central log files. Curator Client applications will show no metadata options, and the Curator Server log files will report this or a similar error:
INFO OptionsProvider.Mdc.Slrql.MdcSlrqlQuery - [HttpProvider, MakeRestTransaction] - Received response code BadRequest from REST POST query.
ERROR Curator Server - Curator Server - [MetadataRepository, GetMetadataNameOptions] - One or more errors occurred.
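When the Curator Server log is large, the two entries above can be located with a short script. This is a minimal sketch: the function name find_options_errors is hypothetical, and the marker strings are taken only from the sample entries shown here, so a differently worded error would not be found.

```python
def find_options_errors(log_lines):
    """Return log lines that suggest a failed options request to MDC Teragator."""
    # Marker text copied from the sample log entries in this document.
    markers = ("Received response code BadRequest", "GetMetadataNameOptions")
    return [
        line.rstrip("\n")
        for line in log_lines
        if any(marker in line for marker in markers)
    ]
```

The function accepts any iterable of lines, so an open log file can be passed to it directly.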
Adding and Updating values for the Hierarchical Metadata
The process to add or update the values served to supply hierarchical metadata consists of three steps:
- Prepare a CSV document describing the terms and their hierarchy in the format described below under Preparation and required format of the CSV files. For each top category (Topics and Countries by default), one CSV file will contain a new complete set of values to be stored and returned to Curator applications.
- Use the IpvSkosGenerator provided as a zip file to generate the SKOS XML format.
- Place the created SKOS XML file in the MDC Teragator Watchfolder (see Pre-requisites) to ingest the metadata values. Ensure the file is named exactly as referred to in the ETLSources.config; in the prerequisites example these would be skosTopics.xml or skosCountries.xml.
Detailed steps
Extract the IpvSkosGenerator [version].zip. The top level contains a couple of example batch files; the Input folder includes example CSV files and the Output folder includes example SKOS XML files. Do not modify items in the System subfolder, as it contains the executable and the necessary libraries.
Once the CSV document is ready, it is easiest to make your own batch (.bat) files (content here assumes the batch is present in the top directory of IpvSkosGenerator) referencing your input CSV and name for the output XML and the report file:
System\IpvSkosGenerator.exe --input "Input\NewTopics_update_1 _20010101.csv" --create "Output\NewTopics_update_1 _20010101.xml" --report "Output\NewTopics_update_1 _20010101.txt"
After executing the batch file, copy the resulting SKOS XML file (in this example: Output\NewTopics_update_1 _20010101.xml) to the Watchfolder created in the prerequisites, renaming it to match the relevant file referred to in the ETLSources.config file, in the prerequisites example above skosTopics.xml or skosCountries.xml.
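As an alternative to a batch file, the convert-then-copy sequence can be sketched in Python. The command-line flags (--input, --create, --report) are those shown in the batch example above. The function names are hypothetical, and the sketch assumes that IpvSkosGenerator returns a non-zero exit code on failure, which this document does not confirm, so always check the report file as well.

```python
import shutil
import subprocess


def build_generator_command(csv_path, xml_path, report_path,
                            exe=r"System\IpvSkosGenerator.exe"):
    """Build the IpvSkosGenerator command line as an argument list."""
    return [exe, "--input", csv_path, "--create", xml_path, "--report", report_path]


def convert_and_publish(csv_path, xml_path, report_path, watchfolder_file):
    """Run the converter, then copy the result to the Watchfolder under the
    file name declared in the ETL sources configuration."""
    # Assumption: a failed conversion raises CalledProcessError via check=True.
    subprocess.run(build_generator_command(csv_path, xml_path, report_path), check=True)
    shutil.copyfile(xml_path, watchfolder_file)
```

For the prerequisites example, watchfolder_file would be C:\ProgramData\IPV-Demo\WatchFolder\skosTopics.xml. Run the script from the top directory of IpvSkosGenerator so that the relative paths resolve.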
Confirm that the ETL has ingested the update in the Teragator.log file (found in C:\ProgramData\IPV\Metadata Central Service\Logs path on the MDC Teragator server). When the database has successfully updated with the new set of values, this will be indicated by a "Database updated" statement similar to the following:
INFO TeragatorFramework.Api.ApiImplementation - ETL Ingest: timestamp=31/03/2022 10:23:17:
status=SuccessStageComplete: OntologyWriter.Process(): Database updated
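The check for the "Database updated" statement can be automated along the following lines. This is a sketch under assumptions: the function name database_updates is hypothetical, and the pattern is derived solely from the single sample entry above, so other log layouts may not match.

```python
import re

# Pattern modelled on the sample log entry in this document; the status may
# appear on the same line as the timestamp or on the following line.
_UPDATE_PATTERN = re.compile(
    r"ETL Ingest: timestamp=(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}):"
    r"\s*status=(\w+):[^\n]*Database updated"
)


def database_updates(log_text):
    """Return (timestamp, status) pairs for each 'Database updated' entry."""
    return _UPDATE_PATTERN.findall(log_text)
```

An empty result after copying a file to the Watchfolder suggests that the ingest has not completed or has failed, in which case the log should be read manually.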
Preparation and required format of the CSV files:
- The first row contains headers - place the column labels exactly as indicated:
document created date, document label, preferred label of concept, preferred label of narrower concept, alternate label of concept, definition of concept
- Below that, empty lines are allowed in order to make it easier to separate parts of the file.
- You can use only a certain list
- The first data line has the values for "document created date" and "document label". These must be set and apply to the entire document. The document created date is in the format DD/MM/YYYY. The document label has to be unique to the document to be updated. In the example below the unique document label is "Keyword Lists". If you require an additional document, it will need a different document label, e.g. "Campaigns".
- Each item to be ingested as a value for metadata must be added in the "preferred label of concept" column. The "alternate label of concept" and "definition of concept" are optional additional descriptions.
- To create a hierarchy, create pairs of "preferred label of concept" and "preferred label of narrower concept".
- You must declare the top-level hierarchy items as Topics and Countries to use them with the resource deployment script Context MDC 2.2-SKOS Resource Metadata View.sql with only the minor modifications commented within.
- If you require different top-level categories, you will have to further modify the resource deployment script as described in the section above: Configuration steps to deploy Logger
- Within Topics, declare the categories you require (in the below example Business and Leisure) as "preferred label of narrower concept".
- Declare each of these categories as "preferred label of concept".
- Declare for each of these the list of the "preferred label of narrower concept".
- Finally, declare the lowest level as "preferred label of concept" in their own right - these will not have any narrower concept declared.
- You can continue declaring the hierarchy deeper in, but remember that to use these you will need to declare the metadata for these and create and assign the resources appropriately, which requires more complicated intervention to the scripts.
- Remember to save the file as a CSV.
| document created date | document label | preferred label of concept | preferred label of narrower concept | alternate label of concept | definition of concept |
| 23/01/2001 | Keyword Lists | Topics | Business | | |
| | | | Leisure | | |
| | | Business | Banking | | |
| | | | Markets | | |
| | | | Retail | | |
| | | Leisure | Fashion | | |
| | | | Food | | |
| | | | Travel | | |
| | | Banking | | | |
| | | Markets | | | |
| | | Retail | | | |
| | | Fashion | | | |
| | | Food | | | |
| | | Travel | | | |
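The format rules above can be checked before running the converter. The following Python sketch is not part of IpvSkosGenerator and the name validate_skos_csv is hypothetical. It assumes that a blank "preferred label of concept" cell continues the concept from the previous row, as the example table suggests. It checks the header row, the DD/MM/YYYY date, the document label, that no name sits under two different parents, and that every narrower concept is also declared as a concept in its own right.

```python
import csv
import io
from datetime import datetime

EXPECTED_HEADER = [
    "document created date",
    "document label",
    "preferred label of concept",
    "preferred label of narrower concept",
    "alternate label of concept",
    "definition of concept",
]


def validate_skos_csv(csv_text):
    """Return a list of problems found in CSV text prepared for conversion."""
    # Empty lines are allowed in the file, so drop them before checking.
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if any(c.strip() for c in r)]
    if not rows:
        return ["file is empty"]
    if [c.strip() for c in rows[0]][:6] != EXPECTED_HEADER:
        return ["header row does not match the required column labels"]
    data = [[c.strip() for c in r] + [""] * 6 for r in rows[1:]]
    if not data:
        return ["no data rows"]
    problems = []
    created, label = data[0][0], data[0][1]
    try:
        datetime.strptime(created, "%d/%m/%Y")
    except ValueError:
        problems.append(f"document created date is not DD/MM/YYYY: {created!r}")
    if not label:
        problems.append("document label is missing")
    parent_of = {}    # narrower label -> parent label
    declared = set()  # labels that appear in the concept column
    current = ""
    for row in data:
        concept, narrower = row[2], row[3]
        if concept:
            # Assumption: a blank concept cell continues the previous concept.
            current = concept
            declared.add(concept)
        if narrower:
            if narrower in parent_of and parent_of[narrower] != current:
                problems.append(
                    f"{narrower!r} appears under both {parent_of[narrower]!r} and {current!r}")
            parent_of.setdefault(narrower, current)
    for narrower in parent_of:
        if narrower not in declared:
            problems.append(f"{narrower!r} is never declared as a concept")
    return problems
```

An empty list means none of these checks failed. The converter's own report file remains the authoritative result.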