Ta̱mpi̱let:auto cat
(This template should be used on pages in the Category: namespace.)
This template is used on category pages to automatically add a category boilerplate template. It deciphers the category name and transcludes the correct template with the correct parameters.
No parameters are needed in most cases. A few categories have optional or required parameters to help out the categorization; see below.
Note: all categories should now be handled by {{auto cat}}. See Module:category tree/poscatboiler/data/documentation for more information, and for specifics regarding the category tree.
Most categories have an Edit category data button in the upper right that takes you directly to the module that implements the category.
To add this template more easily, place importScript('User:Erutuon/addAutoCat.js'); in your common.js. This adds buttons directly below the first heading. Click them to add the template and save, or to add the template and preview.
Parameters
Some categories allow or require parameters to {{auto cat}} to help out categorization.
Affix categories
These are categories such as Category:Latin terms suffixed with -inus and Category:Japanese terms prefixed with 真っ. The types of affixes currently recognized are prefix, suffix, infix, interfix, circumfix and transfix. For these categories, the following parameters are allowed (none are required):
|alt=
- The affix with diacritics. Only needed for languages with extra diacritics in their headwords (e.g. Latin, Russian, Arabic and Old English), and only if those diacritics are present. For example, for Category:Latin terms suffixed with -inus, specify {{auto cat|alt=-īnus}} so that the suffix in the category description, and the breadcrumb at the top of the page, are displayed as -īnus, with a macron.
|sort=
- The sort key. Mostly only needed for Japanese. For example, for Category:Japanese terms prefixed with 真っ, use {{auto cat|sort=まっ}} so that the page is properly sorted in its parent category Category:Japanese terms by prefix.
|tr=
- Manual transliteration. Occasionally needed for non-Latin-script languages if the automatic transliteration is incorrect or if the language doesn't have automatic transliteration (e.g. Persian and Hebrew).
|sc=
- Script code. Almost never needed.
Language categories
These are categories such as Category:French language and Category:Proto-Indo-European language. These are the root categories for the various languages represented in Wiktionary. These categories have required parameters specifying the country or countries where the language is spoken, as well as additional optional parameters:
|1=, |2=, |3=, ...
- The country or countries where the language is spoken. See Category:Languages by country and its subcategories. Make sure to include the word the if warranted, e.g. the Philippines or the United States. At least one country is required unless the language is reconstructed (e.g. Proto-Indo-European) or constructed/artificial (e.g. Esperanto). If the country is truly unknown, use the value UNKNOWN.
|extinct=1
- Specify that this language is extinct (no longer spoken).
|setwiki=
- The name of the Wikipedia article about the language to link to. If omitted, the Wikipedia article in the language's Wikidata entry or the language's category name will be used (e.g. French language). Specify |setwiki=- to show no Wikipedia article link. Preferably, the Wikidata entry for the language should be added to the language's data file rather than specified manually in this template.
|setwikt=
- The code of the language's Wiktionary edition. If omitted, the Wiktionary language code for the language will be used. Specify |setwikt=- to show no Wiktionary edition link.
|setsister=
- The name of the category on Wikimedia Commons with files related to the language. If omitted, the category name will be used (e.g. French language). Specify |setsister=- to show no Commons category link.
|entryname=
- The English term on this Wiktionary to link to. If omitted, the canonical name of the language will be used (e.g. French). Specify |entryname=- to show no entry link.
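The "omitted means default, - means no link" convention shared by |setwiki=, |setwikt=, |setsister= and |entryname=, together with the country requirement, can be sketched as follows. This is an illustrative Python sketch only (the actual handling lives in Lua modules); the function names `resolve_link` and `check_countries` are hypothetical and not part of the template.

```python
def resolve_link(value, default):
    """Resolve a link parameter such as |setwiki= (hypothetical helper).

    value is None when the parameter is omitted.
    """
    if value is None:
        return default      # omitted: fall back to the default target
    if value == "-":
        return None         # "-": show no link at all
    return value            # anything else: use the given target


def check_countries(countries, reconstructed=False, constructed=False):
    """Check the positional country parameters (hypothetical helper)."""
    # At least one country is required unless the language is
    # reconstructed or constructed; UNKNOWN is accepted as a value.
    if not countries and not (reconstructed or constructed):
        raise ValueError("at least one country is required (use UNKNOWN if truly unknown)")
    return list(countries)
```

For example, `resolve_link(None, "French language")` yields the default article name, while `resolve_link("-", "French language")` yields no link.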
Lect categories
These are categories that refer to regional, temporal and sociolectal varieties of languages such as Category:Latin American Spanish (regional), Category:Early Modern English (temporal) and Category:Classical Persian (sociolectal), including those that don't have the containing language in their name (e.g. Category:Provençal and Category:Dari) or have only part of the containing language in their name (e.g. Category:Walser German, which is a variety of the Alemannic German language and not a variety of German). Because of the diversity of naming conventions, {{auto cat}} won't recognize or process such categories unless |lect=1 is given. The handler in {{auto cat}} that handles such categories will attempt to infer the relevant properties of the lect in various ways:
- It will infer some default properties from the name of the lect itself. For example Category:Jakarta Indonesian is assumed to be a variety of Indonesian and spoken in Jakarta.
- It will look for a language-specific label that categorizes into the category in question. If and only if the label definition of that label has a Ta̱mpi̱let:cd field, the relevant fields of the label definition will be used to establish the lect's properties. This is now the preferred method of specifying a lect's properties, because it centralizes all information about the lect in the language-specific label modules. As an example, consider the category Category:Javanese Indonesian. The Indonesian label data module Module:labels/data/lang/id defines a label Ta̱mpi̱let:cd that has the setting Ta̱mpi̱let:cd (causing that label to categorize into Category:Javanese Indonesian) and also has a setting Ta̱mpi̱let:cd (indicating that the label is a lect whose parent lect is defined by the Ta̱mpi̱let:cd label). As a result of the Ta̱mpi̱let:cd setting, the properties of the Ta̱mpi̱let:cd label will be taken from the fields specified for this label, such as Ta̱mpi̱let:cd and Ta̱mpi̱let:cd, which together cause the description of the category to read Terms or senses in Indonesian as spoken on the island of Java. (See below for fields like Ta̱mpi̱let:cd and Ta̱mpi̱let:cd.) The value of the Ta̱mpi̱let:cd field, Ta̱mpi̱let:cd, will cause the category associated with that label (Category:Indonesian Indonesian) to be the parent category of Category:Javanese Indonesian. Top-level lects should use Ta̱mpi̱let:cd, which puts them directly under e.g. Category:Regional Indonesian (unless Ta̱mpi̱let:cd is also given, in which case they go under the higher-level category Category:Varieties of Indonesian; non-regional lects such as Category:Classical Indonesian and Category:Prokem Indonesian should use this). NOTE: The value of the Ta̱mpi̱let:cd field is a label, not a category. Labels and categories do not have to match exactly in name (cf. the label Ta̱mpi̱let:cd vs. the corresponding category Category:Javanese Indonesian, and the label Ta̱mpi̱let:cd vs. 
the corresponding category Category:Indonesian Indonesian). However, if they do not match, the connection between them should still be clear.
- It will take properties from parameters directly specified to {{auto cat}}. Generally, the recognized parameters to {{auto cat}} (described below) have the same names and semantics as the fields in label definitions that control lect properties.
In the above list, properties specified lower down override those specified higher up. In other words, default properties inferred from the name itself are overridden by properties derived from a lect label that categorizes into the category in question, which are in turn overridden by properties directly specified by {{auto cat}} parameters.
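The override order can be pictured as a layered merge. The following Python sketch is only an illustration of that precedence, not the module's code (which is written in Lua); the function name and the dictionary keys are hypothetical.

```python
def merge_lect_properties(from_name, from_label, from_params):
    """Merge lect properties; later sources override earlier ones.

    All three arguments are dicts with hypothetical keys. A value of
    None means "not specified by this source".
    """
    merged = {}
    for source in (from_name, from_label, from_params):
        for key, value in source.items():
            if value is not None:
                merged[key] = value
    return merged


# Category:Puerto Rican Spanish: the name alone suggests "Puerto Rican",
# but an explicit parameter corrects it to "Puerto Rico".
props = merge_lect_properties(
    {"region": "Puerto Rican", "lang": "Spanish"},  # inferred from the name
    {},                                             # no lect label found
    {"region": "Puerto Rico"},                      # given to {{auto cat}}
)
```

Here `props["region"]` ends up as the explicitly given value, while `props["lang"]` keeps the value inferred from the name.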
The following parameters to {{auto cat}} are recognized; each is listed along with the corresponding field in the lect label definition:
{{auto cat}} param |
Lect label field | Definition |
---|---|---|
|lect=1 |
Ta̱mpi̱let:cd | |lect=1 must be specified for {{auto cat}} to process the category, and Ta̱mpi̱let:cd must be given for a label to be treated as a lect label (along with the fact that the label must categorize into the category in question). The value of Ta̱mpi̱let:cd is either the containing lect label (not the category name) or true if the lect is a top-level lect (its parent will be Category:Regional LANG where LANG is the L2 language in question, or Category:Varieties of LANG if the field Ta̱mpi̱let:cd is set).
|
|1= |
Ta̱mpi̱let:cd | English description of the location where the lect is spoken (for regional lects), the time period where the lect was spoken (for temporal lects) or the Ta̱mpi̱let:w of the lect (for sociolects). The text normally appears after the words "Terms or senses in LANGUAGE as spoken in", although both the verb ("spoken") and the preposition ("in") can be customized. Normally, the description will be linked using {{l|en}} ; use |nolink=1 to disable this (see below). If the description names a country (or in some cases a sub-country entity such as California), and a category named Languages of country exists, the lect will automatically be categorized into this category. You can override the country or countries of the lect using |country= . If omitted, the default description is inferred from the lect name by subtracting the containing language (see |cat= and |lang= below). For example, for Category:Texas German, the containing language will be inferred as 'German', and after subtracting this, the default description becomes Texas . In some cases, this will be wrong, especially if the location is named in the lect using the adjectival form of the location, and the description must be given explicitly. For example, Category:Puerto Rican Spanish will result in a default description Puerto Rican when it should be Puerto Rico . If it's not possible to match the containing language in the lect name, |1= must be specified or an error results (unless |def= or |fulldef= are given; see below).
|
|cat= |
Ta̱mpi̱let:cd | The parent category. This is the first containing category listed at the bottom of the page and determines the trail of breadcrumbs displayed at the top of the page. This should be used to express containment relationships of regional and temporal lects. For example, Category:Durham University English has Category:Durham English as its parent, which in turn has Category:Northumbrian English as its parent, which in turn has Category:Northern England English as its parent, etc. For lect labels, the parent category is the regional or plain category that the parent label categorizes into. If the parent category is omitted, the default depends on the containing language, according to the following algorithm:
|
|lang= |
(no equivalent) | Override the containing language. See |cat= above for more details. The containing language determines the default parent category (see above) and the default breadcrumb (see below). Note that if the lect directly names an etymology-only language, |lang= will automatically be inferred to be that language, and the corresponding language code(s) will be shown as part of the "additional" text following the category description. There is no equivalent because the containing language is taken from whatever language-specific label module the lect label was found in. Note that this does not have to match the name of the category. For example, all Chinese-related lect labels are centralized in Module:labels/data/lang/zh, meaning that a category like Category:Beijing Mandarin will have its lect label definition in Module:labels/data/lang/zh, not in Module:labels/data/lang/cmn (corresponding to the Category:Mandarin language), and this will not cause a problem.
|
|breadcrumb= |
Ta̱mpi̱let:cd | Override the default breadcrumb displayed for the lect in the trail of breadcrumbs displayed at the top of the page. The default breadcrumb is normally the portion of the lect's name minus the containing language suffix. For example, Category:Southern Brazilian Portuguese has containing language 'Brazilian Portuguese' and hence will have default breadcrumb Southern . If the containing language cannot be matched in the lect's name, the code will try matching any parent languages of the containing language. For example, Category:Bahian Portuguese is a sublect of Brazilian Portuguese; if |lang=pt-BR is given to set the containing language appropriately, the name 'Brazilian Portuguese' is not a suffix of 'Bahian Portuguese', but its parent language Portuguese is, so the default breadcrumb will be Bahian . If neither the containing language nor any parent language matches, the breadcrumb is based on the entire lect (e.g. for lects like Category:Provençal).
|
|noreg=1 |
Ta̱mpi̱let:cd | Indicate that this lect is not a regional lect. This is only necessary when |cat= isn't explicitly given to {{auto cat}} (or equivalently, Ta̱mpi̱let:cd is given in the lect label definition), as its only purpose is to control the default parent category. See |cat= and Ta̱mpi̱let:cd for more information.
|
|nolink=1 |
Ta̱mpi̱let:cd | Don't automatically link the description in |1= using {{l|en}} . This should be specified if |1= contains a description such as from the 15th to the 18th centuries that is not a Wiktionary entry, and does not have any links in it (either bare or specified using {{l}} , {{w}} or the like). (If the value of |1= has bare links in it, the effect of wrapping with {{l|en}} is simply to convert those bare links into links pointing to the English section of the page in question, which is generally correct.)
|
|verb= |
Ta̱mpi̱let:cd | Override the verb "spoken" that normally appears in the category's description. Example values are formerly spoken for an extinct lect; chiefly spoken for a lect mostly spoken in the location specified in |1= but also spoken elsewhere; written for a written-only lect; etc.
|
|prep= |
Ta̱mpi̱let:cd | Override the preposition "in" that normally appears in the category's description. Example values are on if the location in |1= is an island; by if |1= specifies a group of people speaking the language (e.g. Ta̱mpi̱let:ws); etc. Use - to suppress the preposition (e.g. Category:Overseas Chinese sets |1=outside of [[China]] and [[Taiwan]] and |prep=- ).
|
|def= |
Ta̱mpi̱let:cd | Override the whole description following the words "Terms or senses in". The final period should not be included. |
|fulldef= |
Ta̱mpi̱let:cd | Override the entire description. The final period should not be included. |
|addl= |
Ta̱mpi̱let:cd | Specify additional text to display after the "Terms or senses in ..." category description, and before any category TOC (table of contents) bar. If this is given, include the final period. Note that if the lect directly names an etymology-only language, the additional text will automatically include the language code(s) of this etymology-only language. In such a case, any text specified using |addl= will follow this auto-added text.
|
|othercat= |
Ta̱mpi̱let:cd | Any additional category or categories to place the lect in. Separate multiple categories with a comma, without a following space (if a space follows the comma, it will not be considered a delimiter; this allows for embedded commas in categories, which are nearly always followed by a space). Unlike the value in |cat= , there are no restrictions on what sort of categories can be specified here.
|
|country= |
Ta̱mpi̱let:cd | Override the country or countries where the lect is spoken. See |1= above. Separate multiple countries with a comma without a following space, as with |othercat= . The purpose of this parameter is to add the lect to additional categories named Languages of country , so that such categories will be populated with all lects spoken in the country. If the Languages of country category does not already exist, the lect will not be added to it. As mentioned in |1= above, if |1= names a country and a corresponding Languages of country category exists, the lect will automatically be added to it, so |country= does not need to be specified. As a rule, do not specify |country= for sub-country lects. For example, Category:Texas English should not have |country=the United States specified, since Category:Texas English is a subcategory of Category:American English, which is in Category:Languages of the United States. An exception is when a language is spoken in only a portion of a country. For example, Category:Texas Silesian should have |country=the United States specified because there is no lect named Category:American Silesian (Silesian is not normally spoken in the United States except in Texas).
|
|wp= |
Ta̱mpi̱let:cd | Wikipedia link to include on the lect's page. This can be a single Wikipedia page or a comma-separated list of such pages (without any space after the comma; if a space follows the comma, it will not be considered a delimiter, to allow for embedded commas in Wikipedia page names). A given Wikipedia page can be prefixed with a language code to link to a page in a non-English Wikipedia. For example, Category:Japanese Korean specifies |wp=Zainichi Korean language,ko:재일조선어 to link to the Ta̱mpi̱let:w page on the English Wikipedia as well as the page Ta̱mpi̱let:w on the Korean Wikipedia. If the value of a Wikipedia page is + , 1 , yes , true , on or similar, the Wikipedia page will be taken from the lect name. Note that if the lect names an etymology-only language (e.g. Category:Provençal or Category:Brazilian Portuguese), the correct Wikipedia article for this lect will automatically be fetched based on the relevant Wikidata entry and added to the category page. To prevent this, specify an explicit value for |wp= ; use - , 0 , no , false , off or similar if you don't want any Wikipedia page displayed.
|
|type= |
Ta̱mpi̱let:cd | Specify the type of lect (extinct , extant , reconstructed , unattested or constructed ). Extinct lects are categorized into Category:All extinct languages. Reconstructed lects are categorized into Category:Reconstructed languages. Unattested lects are categorized into Category:Unattested languages. Constructed lects are categorized into Category:Constructed languages. In all cases an "additional text" message is placed indicating that the lect is (respectively) extinct, reconstructed, unattested or constructed. If the type is not given, it is inferred based on various factors (the type of the parent category, the type of the language that the lect belongs to, and whether the name of the category or language begins with "Proto-"). If no type can be inferred, it defaults to extant .
|
|pagename= |
(no equivalent) | Act as if the pagename is the specified value rather than its actual value. Any inferred parameters will be based off of the specified value. This is useful for testing and demonstration purposes (e.g. in documentation pages). |
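The default description (|1=) and the default breadcrumb are both described above as the lect name minus the containing language. A minimal Python sketch of that idea follows; it is an illustration under stated assumptions, not the Lua implementation, and the function names are hypothetical. The caller is assumed to pass the containing language first, followed by any parent languages.

```python
def strip_language_suffix(lect_name, language_names):
    """Return lect_name without a trailing language name, or None if none match."""
    for language in language_names:
        suffix = " " + language
        if lect_name.endswith(suffix):
            return lect_name[: -len(suffix)]
    return None


def default_breadcrumb(lect_name, language_names):
    """Fall back to the whole lect name when no language name matches."""
    stripped = strip_language_suffix(lect_name, language_names)
    return lect_name if stripped is None else stripped


texas = strip_language_suffix("Texas German", ["German"])
puerto = strip_language_suffix("Puerto Rican Spanish", ["Spanish"])
bahian = default_breadcrumb("Bahian Portuguese", ["Brazilian Portuguese", "Portuguese"])
```

With these inputs, `texas` is `Texas`, `puerto` is `Puerto Rican` (the case where |1= must be given explicitly), and `bahian` is `Bahian` because the parent language Portuguese matches where Brazilian Portuguese does not.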
Examples
1. For Category:Hong Kong English, use:
{{auto cat|lect=1|cat=Chinese English}}
Here, |1= does not need to be specified because the inferred description "Hong Kong" is correct. The language is automatically inferred as English (and in any case, this is an etymology-only language with code en-HK, from which the language can be inferred). The parent category is set to Category:Chinese English in place of the default Category:Regional English.
2. For Category:Durham University English, use:
{{auto cat|lect=1|prep=at|{{w|Durham University}} in [[Durham]]|cat=Durham English|othercat=en:Universities}}
Here, we specify the region description in |1= but the language is automatically inferred as English. The parent category is set to Category:Durham English in place of the default Category:Regional English (which leads to a breadcrumb chain Regional » European » British » English » Northern England » Northumbrian » Durham » Durham University based on parent categories). Category:en:Universities is added as an additional parent category.
3. For Category:Limburgan-Ripuarian transitional dialects, you could use:
{{auto cat|lect=1|lang=gmw-cfr|the tri-state region of <country>|cat=Ripuarian Franconian|country=Belgium,the [[Netherlands]],Germany|wp=Southeast Limburgish dialect}}
This is a more complex example. We have to set the language (Central Franconian) explicitly using |lang= because it is not inferrable from the name and the category does not refer to an etymology-only language. The description in |1= contains <country>, which substitutes the countries mentioned in |country= (which also cause the category to be added to Category:Languages of Belgium, Category:Languages of the Netherlands and Category:Languages of Germany). We also specify a parent category and Wikipedia page to link to.
However, this is now handled in the preferred way, using properties of the lect label, as follows (as found in Module:labels/data/lang/gmw-cfr):
labels["Limburgan Ripuarian"] = {
	region = "the tri-state region of <country>",
	country = "Belgium,the [[Netherlands]],Germany",
	aliases = {"Tri-state Limburgish", "Limburgan-Ripuarian", "Southeast Limburgish dialect", "Limburgan-Ripuarian Transitional Dialects"},
	Wikipedia = "Southeast Limburgish dialect",
	plain_categories = "Limburgan-Ripuarian transitional dialects",
	parent = "Ripuarian",
}
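The comma convention used by |country= and |othercat= (a comma followed by a space is not a delimiter) and the resulting Languages of country categories can be sketched in Python as below. This is an illustration only: the function names are hypothetical, and stripping the [[...]] link brackets with a regular expression is an assumption about how the bare country name is obtained.

```python
import re


def split_comma_list(value):
    """Split on commas that are NOT followed by a space."""
    return re.split(r",(?! )", value)


def strip_links(text):
    """Turn [[target|display]] into display and [[x]] into x (assumed behavior)."""
    return re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)


def country_categories(country_param):
    """Candidate 'Languages of ...' category names for a |country= value."""
    return ["Languages of " + strip_links(c) for c in split_comma_list(country_param)]


cats = country_categories("Belgium,the [[Netherlands]],Germany")
```

For the value from example 3, `cats` names the three categories listed above. As noted for |country=, a category that does not already exist would be skipped; that existence check is outside this sketch.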
4. For Category:Dobhashi, use:
{{auto cat|lect=1|lang=bn|def=a literary register of Bengali that was in common use from the 14th century to the 19th century|type=extinct|noreg=1|wp=1}}
Here, we have to set the language (Bengali), and we override the definition after "Terms or senses in" using |def= in place of specifying |1=. Since this isn't a regional lect, we set |noreg=1 so the parent defaults to Category:Bengali language. We set |type=extinct because this lect is extinct and this cannot be inferred from the parent (which is not extinct). We also use |wp=1 to link to Ta̱mpi̱let:w on Wikipedia.
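The ways a |wp= value can be written (a flag such as 1 or +, a suppressing value such as -, or a comma-separated list with optional language prefixes) can be sketched in Python as follows. This is a hypothetical illustration, not the module's Lua code: `parse_wp` is an invented name, only the values listed in the table are handled (the documentation says "or similar"), and the prefix pattern is a simplifying assumption that would misread a page title containing a lowercase word followed by a colon.

```python
import re

TRUE_VALUES = {"+", "1", "yes", "true", "on"}
FALSE_VALUES = {"-", "0", "no", "false", "off"}


def parse_wp(value, lect_name):
    """Return (wikipedia_language, page_title) pairs for a |wp= value."""
    if value in FALSE_VALUES:
        return []                              # no Wikipedia link at all
    pages = []
    for item in re.split(r",(?! )", value):    # comma + space is not a delimiter
        if item in TRUE_VALUES:
            pages.append(("en", lect_name))    # page name taken from the lect name
            continue
        match = re.match(r"^([a-z][a-z-]*):(.+)$", item)
        if match:                              # e.g. "ko:..." for Korean Wikipedia
            pages.append((match.group(1), match.group(2)))
        else:
            pages.append(("en", item))
    return pages


dobhashi = parse_wp("1", "Dobhashi")
korean = parse_wp("Zainichi Korean language,ko:재일조선어", "Japanese Korean")
```

In this sketch, `dobhashi` holds one English Wikipedia page named after the lect, and `korean` holds one English and one Korean Wikipedia page.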
5. For Category:The BMAC substrate, use:
{{auto cat|lect=1|def=the [[substrate]](s) spoken in the {{w|Bactria–Margiana Archaeological Complex}} and possibly found as a {{w|substratum in Vedic Sanskrit}}|breadcrumb=BMAC}}
Here, the language in question is an etymology-only substrate language whose actual name begins with a lowercase letter (the BMAC substrate), but the lect handler automatically takes care of the mismatch and recognizes the etymology-only language. The type is automatically inferred to be unattested based on it being a substrate language (this is done by checking the code; all substrate language codes begin with qsb-). Based on the type, the default parent is Substrate languages. We set a breadcrumb to override the default breadcrumb The BMAC substrate.
"Languages of COUNTRY" categories
These are categories such as Category:Languages of India and Category:Languages of the United States. These categories contain subcategories for all the languages and sublects spoken in the country in question. The following parameters are allowed (none are required):
|flagfile=
- An image file specifying the flag of the country in question, displayed in the upper right corner of the category page. The File: prefix should be omitted. An example is |flagfile=Flag of Afghanistan (2013–2021).svg for Afghanistan. The default is Flag of country.svg; if this file does not exist, no flag is displayed. Use |flagfile=- to cause the flag to be omitted even if the appropriate flag file is present.
|wp=
- A link to a Wikipedia article describing the languages of the country, such as Ta̱mpi̱let:w. Use |wp=+ or |wp=1 to specify that the name of the Wikipedia article is the same as the category name.
|commonscat=
- A link to a Commons category describing the languages of the country, such as Commons:Category:Languages of Chad. Use |commonscat=+ or |commonscat=1 to specify that the name of the Commons category is the same as the category name.
User language competency categories
These are categories such as Category:User fr-4, indicating that the user speaks French at near-native competency. The following parameters are allowed (none are required):
|text=
- The native-language text: a translation, into the language in question, of the English text describing the competency of the users in the category. An example is |text=Ces utilisateurs parlent <<français>> à un niveau '''comparable à la langue maternelle'''. for the translation of "These users speak French at a near-native level." The text describing the level of competency should be boldfaced and the text specifying the language should be surrounded in double angle brackets, as shown. The language in double angle brackets will be boldfaced and linked to the higher-level user-competency category (e.g. Category:User fr); in that category, double angle bracket text is linked to the language category (e.g. Category:French language). If the text is omitted, the category is placed in two cleanup categories: Category:Requests for translations in user-competency categories by language and Category:Requests for translations in user-competency categories with ##-## users (e.g. Category:Requests for translations in user-competency categories with 16-31 users). (The purpose of the latter categories is to segment the categories with missing text by number of users so that the ones with more users can be focused on first.)
|verb=
- The correct verb to use in the English text, in place of "speak" or (for sign languages) "communicate in". For example, protolanguages may prefer the verb "know".
|langname=
- Override the name of the language. This is chiefly used in user competency categories for invalid language codes (e.g. eml for Emiliano-Romagnol; on Wiktionary, this code is represented by two languages, Emilian with code egl and Romagnol with code rgn). Such categories should be actively eliminated by moving the users in them to the nearest valid Wiktionary code and then deleting the category when empty.
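The treatment of double angle brackets in |text= can be illustrated with a short Python sketch. It is hypothetical: the function name is invented, and the exact wikitext emitted by the real (Lua) implementation is not documented here, so the bold-plus-category-link output below is an assumption based on the description above.

```python
import re


def render_competency_text(text, language_code):
    """Replace <<name>> with a boldfaced link to Category:User <code> (assumed markup)."""
    target = "Category:User " + language_code
    return re.sub(
        r"<<(.+?)>>",
        lambda m: "'''[[:{}|{}]]'''".format(target, m.group(1)),
        text,
    )


rendered = render_competency_text(
    "Ces utilisateurs parlent <<français>> à un niveau "
    "'''comparable à la langue maternelle'''.",
    "fr",
)
```

The boldface already present around the competency level is left untouched; only the bracketed language name is rewritten.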
Spelled-with categories
These are categories such as Category:English terms spelled with É, Category:Japanese terms spelled with 愛 and Category:Ladino terms derived from the Hebrew root ח־ב־ר. Normally, what follows spelled with is a single character, but occasionally multiple characters are used, as in Category:Ladino terms derived from the Hebrew root ח־ב־ר. For these categories, the following parameters are allowed (none are required):
|sort=
- The sort key; used to sort the page in its parent category (e.g. Category:Japanese terms by their individual characters). Only needed if the automatically generated sort key is wrong. Examples are Category:Spanish terms spelled with Î, which should use {{auto cat|sort=I}}, and Category:Japanese terms spelled with 衛, which should use {{auto cat|sort=行10}}. Japanese and Okinawan terms use Module:Hani-sortkey to generate the sort key, but currently this always generates Chinese sort keys, which in rare cases are wrong for Japanese (for example, the autogenerated sort key for Category:Japanese terms spelled with 衛 is 行09 instead of 行10).
|char=
- If the category name has a descriptive word in it, such as gershayim, this should be the actual character referred to (in this case, {{auto cat|char=״}}). Otherwise, it should be left out.
|context=, |context2=
- Provided for compatibility purposes, but unused.
Japonic "terms spelled with KANJI read as READING" categories
These are categories such as Category:Japanese terms spelled with 学 read as がく and Category:Okinawan terms spelled with 光 read as ふぃちゃい. These contain terms spelled with individual kanji read in particular ways (where the reading is written in hiragana). These categories have required parameters specifying the type(s) of reading(s):
|1=, |2=, |3=, ... (required)
- The reading type(s); one or more of kun, on, goon, kan'on, kan'yōon, tōon, sōon or nanori. For example, Category:Japanese terms spelled with 学 read as がく should use {{auto cat|goon|kan'on}} and Category:Okinawan terms spelled with 光 read as ふぃちゃい should use {{auto cat|kun}}. The particular reading type(s) can often be found on the page dedicated to the kanji in question.
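A check of the positional reading-type parameters against the list above might look like the following Python sketch. The function name is hypothetical and the error messages are invented for illustration; the real template's behavior on invalid input is not documented here.

```python
READING_TYPES = ("kun", "on", "goon", "kan'on", "kan'yōon", "tōon", "sōon", "nanori")


def check_reading_types(args):
    """Validate the positional parameters of a 'read as' category (sketch)."""
    if not args:
        raise ValueError("at least one reading type is required")
    unknown = [a for a in args if a not in READING_TYPES]
    if unknown:
        raise ValueError("unrecognized reading type(s): " + ", ".join(unknown))
    return list(args)


# {{auto cat|goon|kan'on}} on Category:Japanese terms spelled with 学 read as がく
gaku = check_reading_types(["goon", "kan'on"])
```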
Japonic "terms with KANJI replaced by daiyōji DAIYOJI" categories
These are categories such as Category:Japanese terms with 諒 replaced by daiyōji 了. These contain terms spelled with specific uncommon kanji that normally have that character replaced by another homophonic character (a daiyōji) chosen only for the sound and not the meaning. These categories have a required parameter specifying the sort key:
|sort= (required)
- The sort key; in this case, hiragana representing the pronunciation of the character in question.
Japonic "kanji read as READING" categories
These are categories such as Category:Japanese kanji read as ゐ. These are umbrella categories grouping categories for kanji read with specific readings that have specific origins (e.g. kun, on). These categories have the following optional parameters:
|histconsol=modern
- Specify that this is a historical reading and that such readings are normally consolidated into the modern reading Ta̱mpi̱let:cd (with modernized pronunciation). A message to this effect appears in the category text.