CALLHOME TRANSCRIPTION CONVENTIONS - General

What to transcribe:

10 minutes (600 seconds) from the recorded telephone conversations. This should not include the beginning of the conversation, where the speakers are getting permission to be recorded.

Definition of turns:

Separate turns are defined by the following criteria:

(1) speaker change, e.g.

    A: Well I was thinking about that
    B: I know I talked to &Jan about it yesterday

(2) within one speaker's stretch of talk, a long turn should be broken up according to what makes grammatical/semantic sense, e.g.

    A: And I told her um I didn't I wasn't setting you up to be a spiritual director or anything {laugh} but I did say to her that if she were to talk if she felt that she wanted to talk about her prayer experience in Spanish
    A: that you would probably be able to certainly to understand her but to empathize a little bit with what she was experiencing

(3) if there is an extra-long pause within a single speaker's turn, break the turn up into two turns, e.g.

    B: When we were fishing out on &Lake &Travis last August I thought I saw, uh [[long pause]]
    B: uh, &George &Martin, but I wasn't sure it was him.

Timestamps:

Each speaker turn is marked with a unique timestamp (in seconds). The timestamps mark the beginning and end time of each turn relative to the beginning of the recording. Each timestamp is precise to the 100th of a second, and is in the format: beginning time [space] ending time, followed by the turn. Some samples:

    27.98 28.72 A: You know so
    137.49 139.47 A: yeah {breath} (( )) [distortion]
    284.54 286.79 B: %ah &Lydia &Van &Damme.

Special Conventions:

Acronyms

Acronyms pronounced like a word are written in all caps with no spaces, e.g.
    AIDS
    NARAL

Acronyms pronounced like the individual letters are written in all caps with spaces between the letters:

    C I A
    H I V
    C E O

Numbers

Write all numbers out; do not use digits:

    twenty-two
    nineteen-ninety-five

Non-lexemes

Use the most standard spelling (as given on the lexicon list, if it's there); don't try to represent lengthening by writing repeated letters (like 'ooooh').

    uh-huh  mm-hm  uh-oh  okay  jeez  uh

Special symbols:

Noises, conversational phenomena, foreign words, etc. are marked with special symbols. In the table below, "text" represents any word or descriptive phrase.

{text}            sound made by the talker
                  {laugh} {cough} {sneeze}
[text]            sound not made by the talker (background or channel)
                  [distortion] [background noise] [buzz]
[/text]           end of continuous or intermittent sound not made by the talker (beginning marked with previous [text])
[[text]]          comment; most often used to describe unusual characteristics of immediately preceding or following speech (as opposed to separate noise event)
                  [[previous word lengthened]] [[speaker is singing]]
((text))          unintelligible; text is best guess at transcription
                  ((coffee klatch))
(( ))             unintelligible; can't even guess text
<language text>   speech in another language
<? (( ))>         ? indicates unrecognized language; (( )) indicates untranscribable speech
-text text-       partial word
#text#            simultaneous speech on same channel (simultaneous speech on different channels is not explicitly marked, but is identifiable as such by reference to time marks)
//text//          aside (talker addressing someone in background)
                  //quit it, I'm talking to your sister!//
+text+            mispronounced word (spell it in usual orthography)
                  +turbot+
**text**          idiosyncratic word, not in common use
                  **poodle-ish**
%text             This symbol flags non-lexemes, and is added automatically to the transcripts according to the non-lexeme list given in the document 'Principles of Transcribing ECA'.
&text             used to mark proper names and place names
                  &Mary &Jones &Arizona &Harper's &Fiat &Joe's &Grill
text --           marks end of interrupted turn
-- text           marks continuation of same turn after interruption, e.g.

    A: I saw &Joe yesterday coming out of --
    B: You saw &Joe?!
    A: -- the music store on &Seventeenth and &Chestnut.


PRINCIPLES OF TRANSCRIBING ECA (as of 9/10/96)

1. Spelling

When a question arises about the proper spelling of a word (such as /H/ or /h/, /a/ or /i/), our "authoritative" source is the Badawi & Hinds "Dictionary of Egyptian Arabic". In general, we avoid writing long vowels at the ends of words (with some exceptions below). Initial glottal stops are not written, since they are fully predictable and occur before all word-initial vowels.

2. Definite articles

The definite article /il/ is followed by a "+" if immediately followed by a noun, regardless of its actual pronunciation. Some examples:

    il+rAgil      "the man"
    il+salAm      "the peace"
    il+qizAzaB    "the bottle"

The exception to this involves a high frequency set phrase:

    ilHamdulillA  "Thank God"

When the definite article 'il' precedes a word that begins with 'k' or 'g', assimilation of the 'l' is variable, producing either /ikk/ or /ilk/, and /igg/ or /ilg/ respectively. In devtest02 and train03, the particular pronunciation of each such case in the transcripts is notated as:

    il+k          unassimilated
    il(k)+k       assimilated
    il+g          unassimilated
    il(g)+g       assimilated

2.1. Definite articles and proper names

If a proper name is preceded by the definite article /il/, place the "&" symbol after the "+" before the name itself:

    il+&sucudiyyaB
    il+&raml

3. tEh marbUta "B"

In ECA, many feminine nouns and some feminine adjectives ending in /-a/ can be pronounced as either [-a] or [-it], depending upon what word comes after it in a sentence.
To capture the generalization that only the pronunciation is changing, all words which have the tEh marbUta in MSA are written with a final /-B/ for ECA, regardless of the actual pronunciation. The rules for deriving the set of pronunciations are in the Lexicon. Examples:

    HAgaB ca$araB diyya tuscumiyyaB tuscumiyyaB wi xamsIn tuscumiyyaB dulAr baqiyyaB mAmaB (many speakers say "mamti" for 'my mama')

However, in devtest02 and train03 transcripts, words that end in the orthographic symbol 'B' (e.g. feminine nouns), which may be pronounced either /a/ or /it/, are coded for the specific pronunciation used in each case with the alternatives 'B~' and 'B(t)', respectively:

    B~      /a/
    B(t)    /it/

4. Verbal prefixes and suffixes (not pronominal suffixes)

Verbal prefixes and suffixes will be written as part of the verb (just as in MSA), without the use of "+" or the inclusion of a space. The vowel deletions which occur in such forms will be recorded in the spelling. Some examples:

    biyitxAniq (not bi+yitxAniq)  "he is fighting"
    biyifham                      "he understands"
    Hayifham                      "he will understand"
    mafhimti$                     "I don't understand"

5. Pronominal suffixes

Pronominal suffixes are also written as part of the word, without a "+" or space between the verb and the suffix. The reason for this is to maintain consistency with negated verbs such as /mafahimtaha$/ "I don't understand it", where the /$/ remains attached to the verb as in (4) above. Examples:

    katabha     "he wrote it (fem.)"
    katabu      "he wrote it (masc.)"
    katabtaha   "I wrote it (fem.)"

6. "Inseparable" prepositions

The "inseparable" prepositions /bi-/ "with", /li-/ "to, for", /ka-/ "like" are all written with the following word, separated by a "+". If the definite article comes between the inseparable preposition and the word stem, it is written in the same manner. Example:

    bi+il+lEl
    li+il+madInaB   "to the city"

*NOTE*: only prepositions and the definite article are separated by the "+".

7. Numerals

The numerals should all be written in citation form.
The lexicon will include the rules for deriving their pronunciation, since numerals behave differently from other adjectives. Examples:

    xamsaB          "five"
    ca$araB         "ten"
    ca$araB ayyAm   "ten days"

Note: The word for "days" will always be written /ayyAm/ even though it is pronounced [iyyAm] after the numerals 3-10. We will include this as a rule in the lexicon.

8. Foreign words and placenames

Foreign words are transcribed using the convention <language text>. However, there are some instances where the words or placenames have been nativized. These words should be written as pronounced. Some examples:

    &niujirsi     "New Jersey"
    &niuyOrk      "New York"
    &lusanjilus   "Los Angeles"
    yA            "yeah"
                  "Seven Up"

9. Standard spellings

The words below should be transcribed as shown, regardless of variant pronunciations.

    matuqcudI$    "don't sit down"
    kuwayyis      "well"
    nuSS          "half"
    bass          "enough"
    bAba          "father"
    mAmaB         "mother"
    diqiqtEn      "two minutes"
    mazilt        "still"
    ya xUya       "my brother"
    ya xti        "my sister"
    walla         "or / a short version of wallAhi"
    la            "no"

10. Words with variable spellings

The words below should be transcribed as shown, depending upon what one hears:

    iwci / iwca             "don't"
    buqq / buqqi            "mouth, my mouth"
    ca / cala               "on"
    kat / kAnit             "was"
    laHsan / li+aHsan       "for the better"
    ca$An / cala$An         "because"
    ana xadt / ana axadt    "I took it"

11. Miscellaneous cases

The following phrases are written as one word, the reason being that they are high frequency and occur as set phrases:

    in$ACallA / in$alla   "God willing" (depending upon what is said)
    biCiznillA            "God willing"
    wallAhi               "swear to God"
    liCinn                "because"
    allAhuakbar           "God is greatest"

12. List of standardized non-lexemes

The following are marked automatically with '%' in the transcripts:

    ah Ah aha E Eyy hm M mm mhm O OhO yaa yOO yuu uh ayyO yA yO yOO ha hE wAw Hay

13.
On indicating dialectal words

If a speaker pronounces a word in marked dialect (especially if the word changes shape due to the dialect), the word will be flagged as if it is a foreign word with either or , the two main dialect areas of Egypt.
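
Illustration (not part of the conventions): the timestamped turn format and a few of the special symbols from the general conventions can be sketched in Python as below. This is a minimal sketch under stated assumptions, not a description of any tool used to produce the transcripts. The names Turn, parse_turn and tokenize, the regular expressions, and the assumption that a speaker label is a letter optionally followed by letters or digits are all hypothetical choices made for this example. Only a subset of the symbols is handled.

```python
import re
from typing import List, NamedTuple, Tuple


class Turn(NamedTuple):
    start: float   # beginning time in seconds
    end: float     # ending time in seconds
    speaker: str   # speaker label, e.g. "A" or "B"
    text: str      # transcribed turn, symbols left in place


# "beginning time [space] ending time, followed by the turn";
# both times carry two decimals (100th of a second).
# Assumption: the speaker label is a letter followed by optional word characters.
TURN_RE = re.compile(r"^(\d+\.\d{2})\s+(\d+\.\d{2})\s+([A-Za-z]\w*):\s*(.*)$")

# One named alternative per symbol class. Order matters: [[text]] must be
# tried before [text], and marked tokens before plain words.
TOKEN_RE = re.compile(
    r"(?P<comment>\[\[.*?\]\])"            # [[text]] comment
    r"|(?P<unintelligible>\(\(.*?\)\))"    # ((text)) or (( ))
    r"|(?P<speaker_noise>\{.*?\})"         # {text} sound made by the talker
    r"|(?P<background_noise>\[.*?\])"      # [text] or [/text] background sound
    r"|(?P<non_lexeme>%\S+)"               # %text non-lexeme
    r"|(?P<proper_name>\S*&\S+)"           # &text, also il+&text
    r"|(?P<word>\S+)"                      # anything else
)


def parse_turn(line: str) -> Turn:
    """Split one transcript line into times, speaker label and turn text."""
    m = TURN_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a timestamped turn: {line!r}")
    start, end = float(m.group(1)), float(m.group(2))
    if end < start:
        raise ValueError(f"ending time precedes beginning time: {line!r}")
    return Turn(start, end, m.group(3), m.group(4))


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Return (class, token) pairs for the symbol classes handled above."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


if __name__ == "__main__":
    turn = parse_turn("137.49 139.47 A: yeah {breath} (( )) [distortion]")
    print(turn.speaker, turn.start, turn.end)
    for kind, token in tokenize(turn.text):
        print(kind, token)
```

Punctuation stays attached to the token it follows (for example "&Damme."), and symbols not listed in TOKEN_RE, such as #text# or //text//, fall through to the generic word class in this sketch.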