NeoVoc for Hebrew

Workflow steps

Concepts and multilingual equivalents

Validated lists of terms to start with (i.e., results of ENEOLI WG1 Task 1.1) are in French or English and have been uploaded to Wikibase. As soon as Task 1.1 provides more French or English terms, they will be uploaded, too. For each of these terms, the task leaders create a new concept entity. That is, entities describing concepts will have only French or English equivalents in the beginning. In order to ease the manual provision of equivalents (an equivalent is a term in another language representing the same concept) in more languages, and to get drafted glosses (glosses are very short sense definitions, used by a human reader to discriminate word senses), we automatically import whatever is available for the concept on Wikidata.
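
As a rough illustration of what that import retrieves (a minimal sketch, not the actual import script; the QID in the VALUES clause is a placeholder that would be replaced by the Wikidata item linked from the concept's "exact match on Wikidata" (P1) statement):

#title: Sketch: Hebrew label and description available on Wikidata for one matched item
# run against https://query.wikidata.org/, not against the ENEOLI Wikibase

select ?wd_item ?equiv_he ?gloss_he

where {
  values ?wd_item { wd:Q131395 }                                                    # placeholder QID; take it from the concept's P1 statement
  optional { ?wd_item rdfs:label ?equiv_he. filter(lang(?equiv_he)="he") }          # candidate Hebrew equivalent
  optional { ?wd_item schema:description ?gloss_he. filter(lang(?gloss_he)="he") }  # candidate Hebrew gloss
}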

The first goal is to validate all multilingual equivalents that have been imported for Hebrew from Wikidata. For this, use the second query on this page (concepts with a warning) to see a list of entries that contain a warning attached to the Hebrew equivalent. Click on the ID of the entry (first column), and decide:

  • If the equivalent imported to the "statements" section of the entry is fine, remove the warning (click "edit" next to the equivalent in Hebrew, send the "warning" qualifier to the trash, and save).
  • If the imported equivalent is to be replaced, replace it by clicking "edit" next to the equivalent in Hebrew, and correct it. Also, send the Wikidata warning to the trash before saving.
  • If, in addition to a drafted equivalent, you find a gloss (short description) in the upper "description" section, please review that as well. If you regard it as inappropriate, please edit it; if there is no gloss yet, you may provide one (you are encouraged to do so, but the equivalents are more important than the glosses, and unvalidated glosses will not carry "warnings").
  • From time to time, the label in the upper part of the entry ("labels" section) will be updated according to what you enter as "equivalent" in Hebrew in the "statements" section. That means you don't have to edit the equivalent manually in both places; the "equivalent" in the "statements" section is the one we will use for creating the lexical entry. The reason why we enter new equivalents in the “statements” section (instead of simply adding labels in the “labels” section) is that we want to make statements about these equivalents (for example, that there is no "warning" marking it as still unvalidated, or where we have found the equivalent).

The second goal will be to provide missing equivalents in Hebrew. For this, use the first query on this page ("all concepts"); the first rows are those for which no Hebrew equivalent has been defined yet. To define an equivalent, open the entry page by clicking on the ID in the first column, and click on "add value" in the "statements" section, where the values of the "equivalents" property are listed (for example at https://eneoli.wikibase.cloud/wiki/Item:Q1083#P57).
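
The following minimal sketch (not one of the maintained queries below) restricts the list to concepts that do not yet have any Hebrew equivalent in their "statements" section:

#title: Sketch: NeoVoc concepts still lacking a Hebrew equivalent

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>
PREFIX enp: <https://eneoli.wikibase.cloud/prop/>
PREFIX enps: <https://eneoli.wikibase.cloud/prop/statement/>

select ?concept ?label_en ?label_fr

where {
  ?concept endp:P5 enwb:Q12.                                    # instances of "NeoVoc Concept"
  optional {?concept rdfs:label ?label_en. filter(lang(?label_en)="en")}
  optional {?concept rdfs:label ?label_fr. filter(lang(?label_fr)="fr")}
  filter not exists {                                            # keep only concepts without a Hebrew equivalent statement
    ?concept enp:P57 ?equiv_st. ?equiv_st enps:P57 ?equiv_he. filter(lang(?equiv_he)="he")
  }

} order by lcase(?label_en)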

NEW DECISION (July 15, 2024): We have decided to annotate all validated equivalents with one piece of metadata: who has validated the equivalent. That will be a P64 qualifier on the equivalent statement, as in this example: https://eneoli.wikibase.cloud/wiki/Item:Q1080#P57 (Basque equivalent). The reasons for this decision are: (1) instead of defining validation as the absence of “warnings” on the equivalent, we want an explicit statement here: “validated by”; (2) the annotation of persons in the “validated by” qualifier enables us to create real-time summaries of who has validated which (and how many) terms (a sketch of such a summary query is shown after the list below).

  • In order to add the “validated by” qualifier, edit the equivalent (click “edit” next to the equivalent), then “add qualifier” (NOT “add reference”), start to type “validated by” or type “P64” in the qualifier property field, and then start to write the name of the validator in the qualifier value field. That will suggest matching entities describing persons.
  • The person who has validated the equivalent needs to have a “person entry” (an entity describing the person) on this Wikibase. We already have entities for those who authored articles in our Zotero collection, e.g. https://eneoli.wikibase.cloud/entity/Q624 for John Humbley.
  • It is easy to create a new person entity: go to https://eneoli.wikibase.cloud/wiki/Special:NewItem, enter the name of the person as the English (en) label and “a person” as the description, and create the entity. You should then add the statement “instance of” (P5) “human” (Q5) and, if available, the exact match on Wikidata (P1).
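
As a minimal sketch (assuming the modelling described above: Hebrew equivalents as P57 statements carrying a “validated by” P64 qualifier that points to a person entity), such a real-time summary could look like this:

#title: Sketch: how many Hebrew equivalents each person has validated

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>
PREFIX enp: <https://eneoli.wikibase.cloud/prop/>
PREFIX enps: <https://eneoli.wikibase.cloud/prop/statement/>
PREFIX enpq: <https://eneoli.wikibase.cloud/prop/qualifier/>

select ?validator (count(?equiv_st) as ?validated_equivalents)

where {
  ?concept endp:P5 enwb:Q12.                                                               # instances of "NeoVoc Concept"
  ?concept enp:P57 ?equiv_st. ?equiv_st enps:P57 ?equiv_he. filter(lang(?equiv_he)="he")   # Hebrew equivalent statements
  ?equiv_st enpq:P64 [rdfs:label ?validator]. filter(lang(?validator)="en")                # the person named in the "validated by" qualifier

} group by ?validator order by desc(?validated_equivalents)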

Linguistic description (lexicographical view)

For concept equivalents in Hebrew that do not have any remaining warnings AND to which you have added the “validated by” qualifier (see above), we will automatically create a dictionary entry. The linguistic description of the term will go there.

  • We will soon decide on the type of information to be collected in each term's dictionary entry.
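
Once such a dictionary entry exists, the equivalent statement is linked to the corresponding dictionary sense via the P63 qualifier (the qualifier that also appears in the "validated equivalents" query below). A minimal sketch, assuming that link is already in place, for getting from a Hebrew equivalent statement to its dictionary entry and lemma:

#title: Sketch: from a Hebrew equivalent statement to the linked dictionary entry

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>
PREFIX enp: <https://eneoli.wikibase.cloud/prop/>
PREFIX enps: <https://eneoli.wikibase.cloud/prop/statement/>
PREFIX enpq: <https://eneoli.wikibase.cloud/prop/qualifier/>

select ?concept ?equiv_he ?sense ?lexical_entry ?lemma

where {
  ?concept endp:P5 enwb:Q12.                                   # instances of "NeoVoc Concept"
  ?concept enp:P57 ?equiv_st. ?equiv_st enps:P57 ?equiv_he. filter(lang(?equiv_he)="he")
  ?equiv_st enpq:P63 ?sense.                                   # the dictionary sense created for this equivalent
  ?lexical_entry ontolex:sense ?sense; wikibase:lemma ?lemma.  # the dictionary (lexeme) entry and its lemma

}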

Meta-terminology

The meta-terminology used for this task might be unusual for terminologists and/or lexicographers; for obvious reasons, we tend to call things what they are called on a Wikibase. We hope this very short glossary is helpful:

  • “concept entry”: On a Wikibase, an entity whose ID starts with “Q” describes an ontological concept (an entry in a concept-centered termbase where concepts can have terminological relations to each other), which is “labelled” with one preferred label and several “alternative” labels. Wikibase labels are not supposed to be terms, at least not on Wikidata; they are strings that users might want to enter in a search when trying to find the concept entry. That allows fuzziness and redundancy. We enter our exact and validated terms (the multilingual equivalents that denote the concept we have in front of us) in the “statements” section, where we can further describe them.
  • “lexeme entry”: On a Wikibase, lexical dictionary-like entries (lemma-centered entries that offer a lexicographic description) are by default modelled according to Ontolex-Lemon. Their ID starts with an “L”. Each Lexeme entity has “Sense” and “Form” subentities; these are displayed on the same entity page (e.g. https://eneoli.wikibase.cloud/wiki/Lexeme:L1). The “senses” section lists dictionary senses, and the “forms” section lists (inflected) word forms together with a morphological description of the form (on Wikibase called “grammatical features”, like genitive, plural, etc.). Lexeme entries do not have labels; they have lemmata associated with language codes instead (they can have more than one, see https://www.wikidata.org/wiki/Lexeme:L791). The further linguistic description of the lexeme consists of statements attached at the appropriate level (entry, sense, form). Most important for us is that dictionary senses will be linked to concept entries. That link or "property" is referred to in Ontolex as ontolex:reference, on Wikidata as http://www.wikidata.org/entity/P5137, and on our Wikibase as https://eneoli.wikibase.cloud/entity/P12 (“concept for this sense”). This is what links lexical entries to concept entries; exploiting that link, data involving concept entries and lexical entries can be brought together. Lemmata of lexical entries are the strings to find in the NeoCorpus articles (if a lemma is found in lemmatized text, the article metadata gets enriched with that information - that is our plan). We will thus be able to ask in what articles a term occurs, and do that multilingually, exploiting links to commonly denoted concepts (a query sketch combining these links follows after this glossary).
  • A "term" in our database will appear twice: (1) as equivalent to a concept, in the "statements" section of a concept entry, and (2) as lemma to a lexical entry. Equivalents with no warning attached to it are validated terms. For these, lexical entries will be created automatically as soon as the task leaders run the maintanance script; the equivalent statement and the lexical entry sense will be linked to each other. You have to make sure that concept equivalents without a warning attached are indeed validated.
  • A "warning" is a P58 qualifier to an equivalent statement. For adding a warning to an equivalent, click on "edit" next to the equivalent, and add a qualifier, typing "warning" or "P58" in the qualifier property field, and entering any value in the qualifier value field, then click "save".
  • The “validated by” P64 qualifier is created in the same way. Only equivalents that have that qualifier will be taken as validated. That means that you can safely DRAFT equivalents directly on Wikibase; until somebody adds the “validated by” qualifier indicating the person who signs off on the validation, they will not appear when querying for validated terms.
  • "Wikibase" is a software platform, and we are running our own instance of it, this one, ENEOLI Wikibase. "Wikidata" is another Wikibase instance, available at https://www.wikidata.org.

See the content of NeoVoc for Hebrew

All NeoVoc concept entries

#title: All NeoVoc concepts with labels and glosses in Hebrew, and English and French labels and glosses.

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>
PREFIX enp: <https://eneoli.wikibase.cloud/prop/>
PREFIX enps: <https://eneoli.wikibase.cloud/prop/statement/>
PREFIX enpq: <https://eneoli.wikibase.cloud/prop/qualifier/>

select ?concept ?equiv_mylang ?warning ?descript_mylang (iri(concat(str(wd:),?wd)) as ?wikidata) ?label_en ?descript_en ?label_fr ?descript_fr

where {
  ?concept endp:P5 enwb:Q12. # instances of "NeoVoc Concept"
  optional {?concept endp:P1 ?wd.}
  optional {?concept rdfs:label ?label_en. filter(lang(?label_en)="en")}
  optional {?concept rdfs:label ?label_fr. filter(lang(?label_fr)="fr")}
  optional {?concept enp:P57 ?equiv_st. ?equiv_st enps:P57 ?equiv_mylang. filter(lang(?equiv_mylang)="he")
           optional {?equiv_st enpq:P58 ?warning.}}
  optional {?concept schema:description ?descript_en. filter(lang(?descript_en)="en")}
  optional {?concept schema:description ?descript_fr. filter(lang(?descript_fr)="fr")}
  optional {?concept schema:description ?descript_mylang. filter(lang(?descript_mylang)="he")}

} order by lcase(?equiv_mylang)

NeoVoc entries with equivalents for Hebrew that still have warnings

#title: NeoVoc concepts with warnings on Hebrew equivalents.

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>
PREFIX enp: <https://eneoli.wikibase.cloud/prop/>
PREFIX enps: <https://eneoli.wikibase.cloud/prop/statement/>
PREFIX enpq: <https://eneoli.wikibase.cloud/prop/qualifier/>

select ?concept ?equiv_mylang ?warning ?descript_mylang (iri(concat(str(wd:),?wd)) as ?wikidata) ?label_en ?descript_en ?label_fr ?descript_fr

where {
  ?concept endp:P5 enwb:Q12. # instances of "NeoVoc Concept"
  optional {?concept endp:P1 ?wd.}
  optional {?concept rdfs:label ?label_en. filter(lang(?label_en)="en")}
  optional {?concept rdfs:label ?label_fr. filter(lang(?label_fr)="fr")}
  ?concept enp:P57 ?equiv_st. ?equiv_st enps:P57 ?equiv_mylang. filter(lang(?equiv_mylang)="he")
           ?equiv_st enpq:P58 ?warning.
  optional {?concept schema:description ?descript_en. filter(lang(?descript_en)="en")}
  optional {?concept schema:description ?descript_fr. filter(lang(?descript_fr)="fr")}
  optional {?concept schema:description ?descript_mylang. filter(lang(?descript_mylang)="he")}

} order by lcase(?equiv_mylang)

Only those NeoVoc entries with validated equivalents in Hebrew (equivalents that are qualified with "validated by")

#title: NeoVoc concepts with validated Hebrew equivalents.

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>
PREFIX enp: <https://eneoli.wikibase.cloud/prop/>
PREFIX enps: <https://eneoli.wikibase.cloud/prop/statement/>
PREFIX enpq: <https://eneoli.wikibase.cloud/prop/qualifier/>

select
?concept ?equiv_mylang ?validator ?descript_mylang ?sense (iri(concat(str(wd:),?wd)) as ?wikidata) ?label_en ?descript_en ?label_fr ?descript_fr

where {
  ?concept endp:P5 enwb:Q12. # instances of "NeoVoc Concept"
  optional {?concept endp:P1 ?wd.}
  optional {?concept rdfs:label ?label_en. filter(lang(?label_en)="en")}
  optional {?concept rdfs:label ?label_fr. filter(lang(?label_fr)="fr")}
  ?concept enp:P57 ?equiv_st. ?equiv_st enps:P57 ?equiv_mylang. filter(lang(?equiv_mylang)="he")
           ?equiv_st enpq:P64 [rdfs:label ?validator]. filter(lang(?validator)="en") # This restricts to equivalents that have the "validated by" qualifier
           optional {?equiv_st enpq:P63 ?sense.}
  optional {?concept schema:description ?descript_en. filter(lang(?descript_en)="en")}
  optional {?concept schema:description ?descript_fr. filter(lang(?descript_fr)="fr")}
  optional {?concept schema:description ?descript_mylang. filter(lang(?descript_mylang)="he")}

} order by lcase(?equiv_mylang)

NeoVoc lexical entries in Hebrew

  • We will create these after providing equivalents in the concept entries
#title: All NeoVoc Hebrew lexical entries

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>

select ?lexical_entry ?lemma ?posLabel ?sense ?sense_gloss ?concept ?conceptLabel (iri(concat(str(wd:),?wd)) as ?wikidata)

where {
  ?lexical_entry a ontolex:LexicalEntry; dct:language enwb:Q985; wikibase:lemma ?lemma; wikibase:lexicalCategory ?pos; ontolex:sense ?sense.
  optional {?sense endp:P12 ?concept. optional {?sense skos:definition ?sense_gloss.}
  optional {?concept endp:P1 ?wd.}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "he,en,fr". }}
  }

NeoVoc Hebrew terms and number of occurrences in Hebrew NeoCorpus articles

  • This lists the validated Hebrew terms, and in how many articles they have been found
#title: NeoVoc lexical entry lemmata, and in how many articles they occur

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>

select ?lexical_entry ?lemma (count(?bib_item) as ?in_how_many_articles)

where {
  ?lexical_entry endp:P5 enwb:Q13; wikibase:lemma ?lemma. filter(lang(?lemma)="he")
  ?bib_item endp:P65 ?lexical_entry.
} group by ?lexical_entry ?lemma order by desc(?in_how_many_articles)

NeoCorpus articles in Hebrew, and NeoVoc terms found in them

  • This lists the NeoCorpus articles in Hebrew, and which validated NeoVoc terms have been found in each of them
#title: NeoVoc articles in Hebrew, and terms found in them

PREFIX enwb: <https://eneoli.wikibase.cloud/entity/>
PREFIX endp: <https://eneoli.wikibase.cloud/prop/direct/>

select ?bib_item ?authors_year ?title (group_concat(?lemma; SEPARATOR="; ") as ?terms)

where {
  ?bib_item endp:P7 enwb:Q985; endp:P65 ?lexical_entry.
  ?bib_item rdfs:label ?title; schema:description ?desc.
  filter(lang(?title)="en") filter(lang(?desc)="en") bind(replace(?desc, "Publication by ", "") as ?authors_year)
  ?lexical_entry endp:P5 enwb:Q13; wikibase:lemma ?lemma.

} group by ?bib_item ?authors_year ?title


This page will be updated automatically; do not edit it (last edited on November 5, 2024).

For discussion in the group working on Hebrew, you may use this page: Talk:NeoVoc/language/he.