NeoVoc/language/hr: Difference between revisions

From eneoli wikibase
recreate wiki page using upload_lang_page.py
Line 2: Line 2:
== Workflow steps ==
=== Concepts and multilingual equivalents ===
The validated starting lists of terms (i.e., results of ENEOLI WG1 Task 1.1) are in French or English and have been uploaded to Wikibase. As soon as Task 1.1 provides more French or English terms, they will be uploaded, too. For each of these terms, the task leaders create a new concept entity. That is, entities describing concepts will have only French or English equivalents in the beginning. To ease the manual provision of equivalents in more languages (an equivalent is a term in another language representing the same concept), and to obtain draft glosses (glosses are very short sense definitions that help a human user discriminate word senses), we automatically import what is available for the concept on Wikidata.


The first goal is to validate all multilingual equivalents that have been imported for Croatian from Wikidata. For this, use the second query on this page (concepts with a warning), which lists the entries that have a warning attached to the Croatian label. Click on the ID of the entry (first column), and decide:
* If the equivalent imported to the "statements" section of the entry is fine, remove the warning (click "edit" next to the equivalent in Croatian, send the "warning" qualifier to the trash, and save).
* If the imported equivalent needs to be replaced, click "edit" next to the equivalent in Croatian and correct it. Also, send the Wikidata warning to the trash before saving.
Line 10: Line 10:
* From time to time, the label in the upper part of the entry ("labels" section) will be updated according to what you enter as "equivalent" in Croatian in the "statements" section. That means you don't have to edit the equivalent manually in both places; the "equivalent" in the "statements" section is the one we will use. The reason why we have to enter new equivalents in the “statements” section (instead of simply adding labels in the “labels” section) is that we want to make statements about these equivalents (for example, that there is no "warning" stating that it is still unvalidated, or where we have found the equivalent).
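The validation decision described in the list above can be sketched in code. This is only an illustration under stated assumptions: it assumes the entity is available in the standard Wikibase JSON layout, that the "equivalents" property is P57 (as in the example URL on this page) and holds monolingual text values, and that the warning is the P58 qualifier. The entity shown is hypothetical and trimmed to the fields used.

```python
from typing import Any

EQUIVALENT = "P57"  # "equivalents" property (assumed from Item:Q1083#P57)
WARNING = "P58"     # "warning" qualifier


def equivalents(entity: dict[str, Any], lang: str) -> list[tuple[str, bool]]:
    """Return (text, has_warning) for each equivalent statement in `lang`."""
    result = []
    for statement in entity.get("claims", {}).get(EQUIVALENT, []):
        # Statements without a value (e.g. "no value" snaks) are skipped.
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value")
        if not isinstance(value, dict) or value.get("language") != lang:
            continue
        has_warning = bool(statement.get("qualifiers", {}).get(WARNING))
        result.append((value["text"], has_warning))
    return result


# Hypothetical entity, not real ENEOLI data.
entity = {
    "id": "Q0",
    "claims": {
        "P57": [
            {
                "mainsnak": {"datavalue": {"value": {"text": "neologizam", "language": "hr"}}},
                "qualifiers": {"P58": [{"datavalue": {"value": "imported from Wikidata"}}]},
            },
            {
                "mainsnak": {"datavalue": {"value": {"text": "néologisme", "language": "fr"}}},
            },
        ]
    },
}

for text, has_warning in equivalents(entity, "hr"):
    print(text, "-> needs validation" if has_warning else "-> validated")
```

An equivalent counts as validated here simply because no P58 qualifier is present, which mirrors the rule that removing the warning is what marks an equivalent as checked.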


The second goal will be to provide missing equivalents in Croatian. For this, use the first query on this page ("all concepts"). The first rows are the concepts for which no Croatian equivalent is defined yet. To define an equivalent, open the entry page by clicking on the ID in the first column, and click on "add value" in the “statements” section, where values for the “equivalents” property are listed (for example, at https://eneoli.wikibase.cloud/wiki/Item:Q1083#P57).
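As a rough sketch of how such a to-do list could be produced from exported data, the following helper lists the concepts that still lack an equivalent in a given language and builds a link to the "equivalents" statements of each one. The input mapping (concept ID to language-keyed equivalents) and the function name `todo_links` are assumptions made for illustration; only the URL pattern is taken from the example above.

```python
BASE = "https://eneoli.wikibase.cloud/wiki/Item:"


def todo_links(concepts: dict[str, dict[str, str]], lang: str, prop: str = "P57") -> list[str]:
    """Return links to concepts that have no (non-empty) equivalent in `lang`."""
    missing = [qid for qid, equivalents in concepts.items() if not equivalents.get(lang)]
    # Sort numerically so that Q12 comes before Q1083.
    missing.sort(key=lambda qid: int(qid[1:]))
    return [f"{BASE}{qid}#{prop}" for qid in missing]


# Hypothetical export: placeholder strings stand in for real terms.
concepts = {
    "Q1083": {"fr": "terme A"},
    "Q7": {"fr": "terme B", "hr": "pojam B"},
    "Q12": {"fr": "terme C", "hr": ""},
}

for link in todo_links(concepts, "hr"):
    print(link)
```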
=== Linguistic description (lexicographical view) ===
For Croatian concept equivalents from which you have removed the warning, or which you have entered in the "statements" section of the entity page without adding a warning, we will automatically create a dictionary entry. The linguistic description of the term will go there. If you are not sure about an equivalent, you may add a warning with any text you like (for example, "not sure").
Line 18: Line 18:
* “concept entry”: On a Wikibase, an entity URI starting with “Q” describes an ontological concept (an entry in a concept-centered termbase), which is “labelled” with one preferred label and several “alternative” labels. Wikibase labels are not supposed to be terms, at least not on Wikidata; they are strings that users might want to enter in a search when trying to find the concept entry. That allows fuzziness and redundancy. We enter our exact and validated terms (the multilingual equivalents that denote the concept we have in front of us) in the “statements” section, where we can further describe them.
* “lexeme entry”: On a Wikibase, lexical dictionary-like entries (lemma-centered entries that offer a linguistic description) are by default modelled according to Ontolex-Lemon. Their URI starts with an “L”. Each Lexeme entity has “Sense” and “Form” subentities; these are displayed on the same entity page (e.g. https://eneoli.wikibase.cloud/wiki/Lexeme:L1). The “sense” section lists dictionary senses, and the “forms” section lists (inflected) word forms together with a morphological description of each form (on Wikibase called “grammatical features”, like genitive, plural, etc.). Lexeme entries do not have labels; instead, they have lemmata associated with language codes (they can have more than one, see https://www.wikidata.org/wiki/Lexeme:L791). The further linguistic description of the lexeme consists of statements attached at the appropriate level (entry, sense, form). Most important for us is that dictionary senses will be linked to ontology items. This link is referred to in Ontolex as ontolex:reference, on Wikidata as http://www.wikidata.org/entity/P5137, and on our Wikibase as https://eneoli.wikibase.cloud/entity/P12 (“concept for this sense”). This is what links lexical entries to concept entries; by exploiting that link, data involving concept entries and lexical entries can be brought together.
* A "term" in our database will appear twice: (1) as equivalent to a concept, in the "statements" section of a concept entry, and (2) as lemma of a lexical entry. Equivalents with no warning attached to them are validated terms. For these, lexical entries will be created automatically as soon as the task leaders run the maintenance script; the equivalent statement and the lexical entry sense will be linked to each other. You have to make sure that concept equivalents without a warning attached are indeed validated.
* A "warning" is a [[Property:P58|P58]] qualifier to an equivalent statement. To add a warning to an equivalent, click on "edit" next to the equivalent and add a qualifier: type "warning" or "P58" in the qualifier property field, enter any value in the qualifier value field, then click "save".
* "Wikibase" is a software platform, and we are running our own instance of it, this one, ENEOLI Wikibase. "Wikidata" is another Wikibase instance, available at https://www.wikidata.org.
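To make the relation between the two entry types more concrete, here is a minimal data-model sketch. It is not how Wikibase stores the data; the class and function names are hypothetical, and only the idea that each sense points to a concept through P12 (“concept for this sense”) comes from the definitions above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sense:
    sense_id: str                  # Wikibase sense IDs look like "L1-S1"
    concept: Optional[str] = None  # value of P12, "concept for this sense"


@dataclass
class Lexeme:
    lexeme_id: str                 # URI suffix starting with "L"
    lemmata: dict[str, str]        # language code -> lemma (can be more than one)
    senses: list[Sense] = field(default_factory=list)


def lexemes_for_concept(lexemes: list[Lexeme], qid: str) -> list[str]:
    """Return IDs of lexemes that have at least one sense linked to concept `qid`."""
    return [lx.lexeme_id for lx in lexemes if any(s.concept == qid for s in lx.senses)]


# Hypothetical entries, for illustration only.
lexemes = [
    Lexeme("L1", {"hr": "neologizam"}, [Sense("L1-S1", concept="Q1")]),
    Lexeme("L2", {"hr": "posuđenica"}, [Sense("L2-S1")]),  # sense not yet linked
]

print(lexemes_for_concept(lexemes, "Q1"))
```

Following the link from the sense side keeps the concept entry free of lexical detail: the concept only lists its equivalents, while everything linguistic stays on the lexeme.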
== See the content of NeoVoc for Croatian ==
=== All NeoVoc concept entries ===