CALLHOME TRANSCRIPTION CONVENTIONS - General


What to transcribe:     10 minutes (600 seconds) from the recorded
                        telephone conversations.  This should not
                        include the beginning of the conversation
                        where the speakers give permission to be
                        recorded.

Definition of turns:    Separate turns are defined by the following
                        criteria:

                (1) speaker change, e.g.

                        A:  Well I was thinking about that

                        B:  I know I talked to &Jan about it yesterday

                (2) within one speaker's stretch of talk, a long
                turn should be broken up in terms of what makes
                grammatical/semantic sense, e.g.

                        A: And I told her um I didn't I wasn't
                        setting you up to be a spiritual director or
                        anything {laugh} but I did say to her that if she
                        were to talk if she felt that she wanted to
                        talk about her prayer experience in Spanish

                        A: that you would probably be able to certainly
                        to understand her but to empathize a little bit
                        with what she was experiencing

                (3) If there is an extra-long pause within a
                single speaker's turn, break the turn up into two
                turns, e.g.

                        B: When we were fishing out on &Lake &Travis last
                        August I thought I saw, uh [[long pause]]

                        B: uh, &George &Martin, but I wasn't sure it was him.


Timestamps:             Each speaker turn is marked with a unique timestamp
                        (in seconds). The timestamps mark the beginning and
                        end time of each turn relative to the beginning of the
                        recording. Each timestamp is given to a hundredth of a
                        second, in the format: beginning time [space]
                        ending time, followed by the turn.
                        Some samples:

                27.98 28.72 A: You know so

                137.49 139.47 A: yeah {breath} (( )) [distortion]

                284.54 286.79 B: %ah &Lydia &Van &Damme.
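
The timestamp format above can be read with a short line parser. What follows
is only a sketch, not part of any official tooling: the names TURN, Turn and
parse_turn are illustrative, and allowing an optional digit after the speaker
letter is an assumption (the samples here show only A and B).

```python
import re
from typing import NamedTuple, Optional

# start time, end time, speaker label, then the turn text.
# Times are written with exactly two decimals (hundredths of a second).
TURN = re.compile(r"^\s*(\d+\.\d{2})\s+(\d+\.\d{2})\s+([A-Z]\d*):\s?(.*)$")


class Turn(NamedTuple):
    start: float   # seconds from the beginning of the recording
    end: float     # seconds from the beginning of the recording
    speaker: str
    text: str


def parse_turn(line: str) -> Optional[Turn]:
    """Parse one transcript line; return None if it is not a turn line."""
    m = TURN.match(line)
    if m is None:
        return None
    start, end, speaker, text = m.groups()
    return Turn(float(start), float(end), speaker, text.strip())


if __name__ == "__main__":
    print(parse_turn("27.98 28.72 A: You know so"))
```

Lines that do not match (blank lines, headers) come back as None, so a caller
can skip them without special handling.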


Special Conventions:


    Acronyms            Acronyms pronounced like a word are written in all caps
                        with no spaces, e.g.

                        AIDS    NARAL

                        Acronyms pronounced like the individual letters are
                        written in all caps with spaces between the letters:

                        C I A           H I V           C E O

    Numbers             Write all numbers out; do not use digits.

                        twenty-two      nineteen-ninety-five

    Non-lexemes         Use the most standard spelling (as given on the
                        lexicon list, if it's there); don't try to
                        represent lengthening by repeating letters
                        (like 'ooooh').

                        uh-huh  mm-hm   uh-oh   okay    jeez    uh


Special symbols:


    Noises, conversational phenomena, foreign words, etc. are marked
    with special symbols.  In the table below, "text" represents any
    word or descriptive phrase.

    {text}              sound made by the talker

                        {laugh} {cough} {sneeze}

    [text]              sound not made by the talker (background or channel)

                        [distortion]    [background noise]      [buzz]

    [/text]             end of continuous or intermittent sound not made by
                        the talker (beginning marked with previous [text])

    [[text]]            comment; most often used to describe unusual
                        characteristics of immediately preceding or following
                        speech (as opposed to separate noise event)

                        [[previous word lengthened]]    [[speaker is singing]]

    ((text))            unintelligible; text is best guess at transcription

                        ((coffee klatch))

    (( ))               unintelligible; can't even guess text

                        (( ))

    <language text>     speech in another language

    <? (( ))>           ? indicates unrecognized language; (( )) indicates
                        untranscribable speech

    -text
    text-               partial word

    #text#              simultaneous speech on same channel
                        (simultaneous speech on different channels is not
                        explicitly marked, but is identifiable as such by
                        reference to time marks)

    //text//            aside (talker addressing someone in background)

                        //quit it, I'm talking to your sister!//

    +text+              mispronounced word (spell it in usual orthography)

                        +turbot+

    **text**            idiosyncratic word, not in common use

                        **poodle-ish**

    %text               This symbol flags non-lexemes, and is added
                        automatically to the transcripts according to
                        the non-lexeme list given in the document
                        'Principles of Transcribing ECA'.

    &text               used to mark proper names and place names

                        &Mary &Jones    &Arizona        &Harper's
                        &Fiat           &Joe's &Grill


    text --             marks end of interrupted turn and continuation
    -- text             of same turn after interruption, e.g.

                        A: I saw &Joe yesterday coming out of --

                        B: You saw &Joe?!

                        A: -- the music store on &Seventeenth and &Chestnut.
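
The delimited symbols in the table above can be picked out of a turn with a
small scanner. This is a rough sketch under stated assumptions, not a
reference implementation: the labels and the name find_markup are invented
for illustration, partial words and the "--" interruption marker are not
handled, and +text+ is only recognized when set off by whitespace (an
assumption made so that the "+" used inside Arabic forms such as il+rAgil is
not mistaken for it).

```python
import re

# Order matters: [[text]] and [/text] must be tried before plain [text].
PATTERNS = [
    ("comment",        re.compile(r"\[\[(.*?)\]\]")),
    ("noise_end",      re.compile(r"\[/(.*?)\]")),
    ("noise",          re.compile(r"\[(.*?)\]")),
    ("speaker_sound",  re.compile(r"\{(.*?)\}")),
    ("unintelligible", re.compile(r"\(\((.*?)\)\)")),
    ("simultaneous",   re.compile(r"#(.*?)#")),
    ("aside",          re.compile(r"//(.*?)//")),
    ("mispronounced",  re.compile(r"(?<!\S)\+(\S+?)\+(?!\S)")),
    ("idiosyncratic",  re.compile(r"\*\*(.*?)\*\*")),
    ("non_lexeme",     re.compile(r"%([\w-]+)")),
    ("proper_name",    re.compile(r"&([\w'-]+)")),
]


def find_markup(turn_text):
    """Return (label, inner_text) pairs in order of appearance."""
    found = []
    pos = 0
    while pos < len(turn_text):
        for label, pattern in PATTERNS:
            m = pattern.match(turn_text, pos)
            if m:
                found.append((label, m.group(1).strip()))
                pos = m.end()
                break
        else:
            pos += 1  # ordinary character, move on
    return found


if __name__ == "__main__":
    print(find_markup("yeah {breath} (( )) [distortion]"))
```

An empty inner text under "unintelligible" corresponds to (( )), where no
guess at the words was possible.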



            PRINCIPLES OF TRANSCRIBING ECA (as of 9/10/96)


1.  Spelling

When a question arises about the proper spelling of a word (such as
/H/ or /h/, /a/ or /i/), our "authoritative" source is the Badawi &
Hinds "Dictionary of Egyptian Arabic".  In general, we are avoiding
writing long vowels at the ends of words (with some exceptions below).
Initial glottal stops are not written, since they are fully predictable
and occur before all word-initial vowels.


2.  Definite articles

The definite article /il/ is followed by a "+" if immediately followed
by a noun, regardless of its actual pronunciation.  Some examples:

        il+rAgil        "the man"
        il+salAm        "the peace"
        il+qizAzaB      "the bottle"

The exception to this involves a high frequency set phrase:

        ilHamdulillA     "Thank God"

When the definite article 'il' precedes a word that begins with 'k' or
'g', assimilation of the 'l' is variable, producing either /ikk/ or
/ilk/, and /igg/ or /ilg/, respectively.  In devtest02 and train03, the
pronunciation used in each such case is notated in the transcripts as:

il+k    unassimilated
il(k)+k assimilated
il+g    unassimilated
il(g)+g assimilated
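
For readers processing devtest02 or train03, the four codes above can be
decoded mechanically. The sketch below is an illustration only: the function
name decode_article is made up, it looks only at tokens that begin with the
article, and the sample tokens in the usage lines are constructed for the
example rather than taken from the transcripts.

```python
import re

# il(k)+k... or il(g)+g... : the consonant in parentheses marks assimilation.
# An optional "&" (proper-name marker) may sit between "+" and the stem.
CODED = re.compile(r"^il\((?P<c>[kg])\)\+(?P<stem>&?(?P=c).*)$")
PLAIN = re.compile(r"^il\+(?P<stem>&?[kg].*)$")


def decode_article(token):
    """Return (uncoded_form, status) where status is 'assimilated',
    'unassimilated', or None when the k/g coding does not apply."""
    m = CODED.match(token)
    if m:
        return "il+" + m.group("stem"), "assimilated"
    if PLAIN.match(token):
        return token, "unassimilated"
    return token, None


if __name__ == "__main__":
    for tok in ["il(k)+kitAb", "il+gamal", "il+rAgil"]:  # constructed tokens
        print(tok, decode_article(tok))
```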


2.1.  Definite articles and proper names

If a proper name is preceded by the definite article /il/, place the
"&" symbol after the "+" and before the name itself:

        il+&sucudiyyaB
        il+&raml


3.  tEh marbUta "B"

In ECA, many feminine nouns and some feminine adjectives ending in
/-a/ can be pronounced as either [-a] or [-it], depending upon what
word follows in the sentence.  To capture the generalization that
only the pronunciation is changing, all words which have the tEh 
marbUta in MSA are written with a final /-B/ for ECA, regardless of
the actual pronunciation.  The rules for deriving the set of pronunciations
are in the Lexicon.  Examples:

        HAgaB
        ca$araB
        diyya
        tuscumiyyaB
        tuscumiyyaB wi xamsIn
        tuscumiyyaB dulAr
        baqiyyaB
        mAmaB   (many speakers say "mamti" for 'my mama')

However, in devtest02 and train03 transcripts, words that end in the 
orthographic symbol 'B' (e.g. feminine nouns) which may be pronounced 
either /a/ or /it/, are coded for the specific pronunciation used in each
case with the alternatives 'B~' and 'B(t)', respectively:

B~      /a/
B(t)    /it/
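
The two codes above can be stripped back to the plain 'B' spelling while
keeping the pronunciation as a separate value. This is a minimal sketch; the
name decode_teh_marbuta is invented here, only a word-final code is handled,
and the coded inputs in the usage lines are formed by applying the rule to
HAgaB from the examples in this section.

```python
import re

# Word-final B~ means /a/, word-final B(t) means /it/.
CODE = re.compile(r"B(~|\(t\))$")


def decode_teh_marbuta(token):
    """Return (citation_form, pronunciation); pronunciation is 'a', 'it',
    or None when the token carries no pronunciation code."""
    m = CODE.search(token)
    if m is None:
        return token, None
    citation = token[:m.start()] + "B"
    return citation, "a" if m.group(1) == "~" else "it"


if __name__ == "__main__":
    for tok in ["HAgaB~", "HAgaB(t)", "HAgaB"]:
        print(tok, decode_teh_marbuta(tok))
```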


4.  Verbal prefixes and suffixes (not pronominal suffixes)

Verbal prefixes and suffixes will be written as part of the verb (just
as in MSA), without the use of "+" or the inclusion of a space.  The
vowel deletions which occur in such forms will be recorded in the
spelling.  Some examples:

        biyitxAniq      (not bi+yitxAniq)       "he is fighting"
        biyifham        "he understands"
        Hayifham        "he will understand"
        mafhimti$       "I don't understand"


5.  Pronominal suffixes

Pronominal suffixes are also written as part of the word without a "+"
or space between the verb and the suffix.  The reason for this is to
maintain consistency with negated verbs such as /mafahimtaha$/ "I
don't understand it", where the /$/ remains attached to the verb as in
(4) above.  Examples:

        katabha         "he wrote it (fem.)"
        katabu          "he wrote it (masc.)"
        katabtaha       "I wrote it (fem.)"


6.  "Inseparable" prepositions

The "inseparable" prepositions /bi-/ "with", /li-/ "to, for", /ka-/
"like" are all written with the following word, separated by a "+".
If the definite article comes between the inseparable preposition and
the word stem, it is written in the same manner.  Examples:

        bi+il+lEl
        li+il+madInaB   "to the city"

*NOTE*:  only prepositions and the definite article are separated by
the "+".
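
Because only prepositions and the definite article are set off with "+", a
token can be split into its prefixes and stem with very little code. The
following is a sketch under assumptions: split_clitics and its labels are
illustrative names, the pronunciation-coded article forms such as il(k) are
not treated specially, and the mispronunciation marker +text+ is assumed to
have been handled separately.

```python
PREPOSITIONS = {"bi", "li", "ka"}   # the "inseparable" prepositions
ARTICLE = "il"


def split_clitics(token):
    """Split a token on '+' into labelled prefixes and a stem.

    Returns (prefixes, stem, is_proper_name), where prefixes is a list of
    (label, form) pairs and a leading '&' on the stem is removed."""
    *prefix_forms, stem = token.split("+")
    prefixes = []
    for form in prefix_forms:
        if form in PREPOSITIONS:
            prefixes.append(("preposition", form))
        elif form == ARTICLE:
            prefixes.append(("article", form))
        else:
            prefixes.append(("unknown", form))
    is_proper = stem.startswith("&")
    if is_proper:
        stem = stem[1:]
    return prefixes, stem, is_proper


if __name__ == "__main__":
    for tok in ["li+il+madInaB", "il+&raml", "biyifham"]:
        print(tok, split_clitics(tok))
```

Verbal prefixes stay inside the stem, so biyifham comes back with no
prefixes at all, as sections 4 and 5 require.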


7.  Numerals

The numerals should all be written in citation form.  The lexicon will
include the rules for deriving their pronunciation, since numerals
behave differently from other adjectives.  Examples:

        xamsaB          "five"
        ca$araB         "ten"
        ca$araB ayyAm   "ten days"

Note: The word for "days" will always be written /ayyAm/ even though
it is pronounced [iyyAm] after the numerals 3-10.  We will include
this as a rule in the lexicon.


8.  Foreign words and placenames

Foreign words are transcribed using the convention <language text>.
However, there are some instances where the words or placenames have
been nativized.  These words should be written as pronounced.  Some
examples:

        &niujirsi       "New Jersey"
        &niuyOrk        "New York"
        &lusanjilus     "Los Angeles"
        yA              "yeah"
              "Seven Up"


9.  Standard spellings

The words below should be transcribed as shown, regardless of variant
pronunciations.

        matuqcudI$      "don't sit down"
        kuwayyis        "well"
        nuSS            "half"
        bass            "enough"
        bAba            "father"
        mAmaB           "mother"
        diqiqtEn        "two minutes"
        mazilt          "still"
        ya xUya         "my brother"
        ya xti          "my sister"
        walla           "or / a short version of wallAhi"
        la              "no"


10.  Words with variable spellings

The words below should be transcribed as shown depending upon what one
hears:

        iwci     /      iwca            "don't"
        buqq     /      buqqi           "mouth, my mouth"
        ca       /      cala            "on"
        kat      /      kAnit           "was"
        laHsan   /      li+aHsan        "for the better"
        ca$An    /      cala$An         "because"
        ana xadt /      ana axadt       "I took it"


11.  Miscellaneous cases

The following phrases are written as one word, the reason being that
they are high frequency and occur as set phrases:

        in$ACallA / in$alla     "God willing" (depending upon what is said)
        biCiznillA              "God willing"
        wallAhi                 "swear to God"
        liCinn                  "because"
        allAhuakbar             "God is greatest"


12.  List of standardized non-lexemes.  The following are marked
automatically with '%' in the transcripts:

        ah
        Ah
        aha
        E
        Eyy
        hm
        M
        mm
        mhm
        O
        OhO
        yaa
        yOO
        yuu
        uh
        ayyO
        yA
        yO
        yOO
        ha
        hE
        wAw
        Hay


13.  On indicating dialectal words

If a speaker pronounces a word in marked dialect (especially if the
word changes shape due to the dialect), the word will be flagged as if
it were a foreign word, using a tag that names one of the two main
dialect areas of Egypt.